Learning Path: "Voice and Sound Recognition"

Chapter 1: SOUND AND WAVEFORMS


From this content I learnt the fundamental concepts of sound and waveforms. Sound is produced by vibrating objects, which cause air molecules to oscillate and create waves. Sound waves are mechanical waves, so they need a medium to propagate, most often air. A sound wave consists of alternating regions of compression and rarefaction, which produce variations in air pressure that can be visualized with a pressure plot.

Waveforms, graphical representations of air pressure changes over time, are discussed. Complex sounds are depicted using waveforms, enabling visualization of musical elements like notes and durations. Sound is categorized into periodic (repeating patterns) and aperiodic (non-repeating) types. Periodic sounds include single sine waves, while complex sounds result from the combination of multiple sine waves.
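As a minimal sketch of the simplest periodic sound, the snippet below samples a single sine wave and checks that it repeats after one period. The function name `sine_wave`, the sample rate, and the 100 Hz tone are illustrative assumptions, not values from the course.

```python
import math

def sine_wave(amplitude, frequency_hz, phase_rad, sample_rate_hz, n_samples):
    """Sample y(t) = A * sin(2*pi*f*t + phase) at evenly spaced times."""
    return [
        amplitude * math.sin(2 * math.pi * frequency_hz * (n / sample_rate_hz) + phase_rad)
        for n in range(n_samples)
    ]

# Assumed values: a 100 Hz tone sampled at 8000 Hz, so one period spans 80 samples.
SAMPLE_RATE = 8000
FREQUENCY = 100
wave = sine_wave(amplitude=1.0, frequency_hz=FREQUENCY, phase_rad=0.0,
                 sample_rate_hz=SAMPLE_RATE, n_samples=160)
period_samples = SAMPLE_RATE // FREQUENCY

# A periodic sound repeats itself: each sample equals the one a full period later.
is_periodic = all(
    math.isclose(wave[n], wave[n + period_samples], abs_tol=1e-9)
    for n in range(period_samples)
)
```

An aperiodic sound, such as a single clap, would fail this check for every candidate period.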


Sound:

● Produced by vibration of an object

● Vibrations cause air molecules to oscillate

● Change in air pressure creates a wave

Mechanical wave:

● Oscillation that travels through space

● Energy travels from one point to another

● The medium is deformed

[Figure: Sound waves]

Waveform:

● Carries multifactorial information:

○ Frequency

○ Intensity

○ Timbre


The parameters of a sine wave are explained: amplitude (perturbation height), frequency (cycles per second), and phase (position at time zero). Frequency and amplitude shape how a sound is perceived: a higher frequency is heard as a higher pitch, and a larger amplitude as a louder sound. The concept of pitch is introduced, highlighting that it is perceived logarithmically. MIDI notes, a convention for representing musical notes as numbers, are explained in the context of a piano keyboard. A MIDI note number p is mapped to a frequency in Hz by the formula:


F(p) = 2^((p - 69)/12) * 440
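The pitch-to-frequency mapping can be sketched in a few lines of Python. Here MIDI note 69 is A4 at 440 Hz, and the exponent is (p - 69)/12, so every 12 notes (one octave) doubles the frequency. The function names are illustrative, and the inverse mapping is an addition that is not part of the original notes.

```python
import math

def midi_to_frequency(p, a4_hz=440.0):
    """Frequency in Hz of MIDI note number p, with note 69 tuned to a4_hz."""
    return a4_hz * 2 ** ((p - 69) / 12)

def frequency_to_midi(f_hz, a4_hz=440.0):
    """Inverse mapping (an added assumption): fractional MIDI note number for f_hz."""
    return 69 + 12 * math.log2(f_hz / a4_hz)

a4 = midi_to_frequency(69)        # reference note, 440 Hz
a5 = midi_to_frequency(81)        # one octave (12 notes) above A4
middle_c = midi_to_frequency(60)  # roughly 261.63 Hz
```

Because each octave doubles the frequency, equal steps in pitch correspond to equal ratios in frequency, which is the logarithmic perception of pitch mentioned above.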

Hearing Capacity:

[Figure: Hearing sensitivity]

[Figure: Classification of sound waves]


Chapter 2: Intensity, Loudness and Timbre


Sound Power:

Sound power is a physical measure of how much energy a sound source gives off. It is the rate at which energy is transferred, that is, the energy per unit time emitted by a sound source in all directions. It is measured in watts (W).

Sound Intensity: Sound power per unit area, measured in W/m^2.

Threshold of hearing (TOH): Humans can perceive sounds with very small intensities.

TOH = 10^-12 W/m^2

Intensity Level: The range of sound intensities we can hear is very large, which is why intensity level is expressed on a logarithmic scale and measured in decibels (dB). It is computed from the ratio between two intensity values, using the threshold of hearing (TOH) as the reference intensity.

dB(I) = 10 * log10(I / I_TOH)
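A minimal sketch of the intensity-level formula, using the threshold of hearing (10^-12 W/m^2) as the reference. The function name and the sample intensities are illustrative assumptions.

```python
import math

TOH = 1e-12  # threshold of hearing in W/m^2, used as the reference intensity

def intensity_level_db(intensity, reference=TOH):
    """Intensity level in dB: 10 * log10(I / I_ref). Intensity must be positive."""
    if intensity <= 0:
        raise ValueError("intensity must be a positive value in W/m^2")
    return 10 * math.log10(intensity / reference)

at_threshold = intensity_level_db(TOH)      # 0 dB: exactly the threshold of hearing
million_times = intensity_level_db(1e-6)    # intensity 10^6 times the threshold
doubled = intensity_level_db(2 * TOH)       # doubling intensity adds about 3 dB
```

The logarithm compresses the range: multiplying the intensity by 10 adds 10 dB, so a millionfold increase is only 60 dB.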

Complex Sounds: A superposition of simpler sounds (sinusoids). A partial is a sinusoid used to describe a sound. The lowest partial is called the fundamental frequency. A harmonic partial is a partial whose frequency is an integer multiple of the fundamental frequency.
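As a rough sketch of superposition, the snippet below builds harmonic partials as integer multiples of a fundamental and sums them sample by sample into one complex tone. The fundamental of 110 Hz, the 1/k amplitudes, and the function names are illustrative assumptions.

```python
import math

def harmonic_partials(fundamental_hz, n_partials):
    """Frequencies of the first n harmonic partials: 1*f0, 2*f0, 3*f0, ..."""
    return [k * fundamental_hz for k in range(1, n_partials + 1)]

def complex_tone(partials, sample_rate_hz, n_samples):
    """Sum sinusoids given as (frequency_hz, amplitude) pairs, sample by sample."""
    return [
        sum(a * math.sin(2 * math.pi * f * n / sample_rate_hz) for f, a in partials)
        for n in range(n_samples)
    ]

# Assumed example: fundamental at 110 Hz with three harmonics above it.
freqs = harmonic_partials(110.0, 4)
amps = [1.0 / k for k in range(1, 5)]  # weaker upper partials, chosen arbitrarily
tone = complex_tone(list(zip(freqs, amps)), sample_rate_hz=8000, n_samples=800)
```

Changing the amplitudes while keeping the same frequencies alters how energy is distributed across partials, which is one of the factors behind timbre.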

Timbre recap:

  1. Multifactorial sound dimension
  2. Amplitude envelope
  3. Distribution of energy across partials
  4. Signal modulation (frequency/amplitude)

