Who is the singer?

INTRODUCTION

This month we are talking about the different categories our AI can discern. We have already discussed moods and valences and taken a detailed overview of the rhythmic features. This week we will explain features such as the type of vocals or the main instruments in the mix. We will also talk about the scale, the main key, and distortion levels. These categories require a great amount of data to be properly identified, and for this we make use of our great friend: metadata.

VOCALS:


It is easy for our ears, in most cases, to identify whether a song has female vocals, male vocals, or no vocals at all. As humans, we don't have the best hearing system in the animal realm. Nonetheless, we are specialized in identifying the frequency ranges and nuances of human voices. We can resolve them from an instrumental background, and we can, almost seamlessly, tell whether a voice sounds feminine or masculine. When we feed a musical composition to a computer, however, all the sounds in the spectrum are mixed into one complex soundwave. The frequencies of the human voice merge with those of other instruments and effects, making the analysis harder. We help our AI during training with a mix of annotated metadata and human curation. If we do not need to listen to every track and annotate this information manually, we gain more time to check for more complex musical features.
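For readers who like to peek under the hood, here is a minimal sketch of what a spectrogram-based vocal classifier can look like, assuming the librosa and scikit-learn libraries. The file names, labels, and model choice are purely illustrative; the production system is far more elaborate.

```python
# A minimal sketch of spectrogram-based vocal classification.
# File paths and labels below are hypothetical placeholders.
import librosa
import numpy as np
from sklearn.linear_model import LogisticRegression

def mel_features(path, sr=22050, n_mels=64):
    """Average a mel spectrogram over time into one feature vector."""
    y, sr = librosa.load(path, sr=sr, mono=True)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    log_mel = librosa.power_to_db(mel)   # perceptual (dB) scale
    return log_mel.mean(axis=1)          # collapse the time axis

# Hypothetical annotated catalogue: (file, label) pairs from metadata.
tracks = [("fast_car.wav", "female"), ("feel_it_still.wav", "male"),
          ("instrumental.wav", "none")]
X = np.stack([mel_features(p) for p, _ in tracks])
y = [label for _, label in tracks]

clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict(X[:1]))  # e.g. ['female']
```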

Have you ever listened to a new song and not been sure whether the singer is a boy or a girl? Just like us, the AI can get it wrong sometimes. Tracy Chapman, for example, is one of those cases that get everybody (humans and AIs) confused:

Female singer identified as male voice:

Fast Car by Tracy Chapman

Male singer identified as female voice:

Feel It Still by Portugal. The Man


DOMINANT INSTRUMENT:


As the name says, this category indicates the most dominant instruments in the song, besides the vocals. Our AI is trained to tag several instruments, such as piano, electronics, guitar, strings, synthesizer, wind, saxophone, flute, trumpet, drum kit, keys, accordion, violin, harpsichord, choir, cello, and electric bass. For this category, we also took metadata information into account during training.

Recently, we have started to implement an extra layer of specialization, based on a thesaurus, to pull up less common instruments through natural-language correlations.
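As a toy illustration of that thesaurus layer, the hand-written mapping below folds less common instrument names into the tags the model was trained on. The real thesaurus is derived from natural-language correlations and is far larger; every entry here is just an invented example.

```python
# A toy thesaurus: map less common instrument names onto trained tags.
# The entries are hypothetical illustrations, not the real mapping.
INSTRUMENT_THESAURUS = {
    "ukulele":   "guitar",
    "mandolin":  "strings",
    "clavinet":  "keys",
    "bugle":     "trumpet",
    "recorder":  "flute",
    "bandoneon": "accordion",
}

def normalize_instrument(name: str) -> str:
    """Fall back to a related trained tag for unseen instruments."""
    name = name.lower().strip()
    return INSTRUMENT_THESAURUS.get(name, name)

print(normalize_instrument("Bandoneon"))  # -> 'accordion'
```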

SOUND GENERATION:

This category refers to whether the music contains acoustic, electric, electronic, or mixed instruments and sound textures. To train the AI for this category, we need a mix of metadata and human listening feedback. Can you tell the difference?

Acoustic:

Don't Know Why by Norah Jones

Electric:

Back in Black by AC/DC

Electronic:

Rain on Me by Lady Gaga featuring Ariana Grande
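To make the "mix of metadata and human listening feedback" concrete, here is a hypothetical sketch of how the two label sources could be fused into one training label. The precedence rule (human annotations win on disagreement) is our illustrative assumption, not a description of the actual pipeline.

```python
# Hypothetical fusion of metadata tags and human listening feedback
# into one training label; the precedence rule is an assumption.
from typing import Optional

def fuse_labels(metadata_label: Optional[str],
                human_label: Optional[str]) -> str:
    if human_label is not None:
        return human_label        # curated ears take precedence
    if metadata_label is not None:
        return metadata_label     # otherwise trust the catalogue metadata
    return "unknown"              # no label at all: queue for human review

print(fuse_labels("electric", "acoustic"))  # -> 'acoustic'
print(fuse_labels("electronic", None))      # -> 'electronic'
```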

ROUGHNESS:


This category refers to how rough the song sounds. Roughness is a complex feature related to the kind of distortion present in the instruments or vocals. You can perceive the distortion when you visualize the sound waves, and even though it is an objective trait, training the AI for it still requires human hearing. Distortion depends on secondary frequencies derived from the primary frequency of a voice or instrument; those secondary frequencies build up what we call the "timbre". Normal harmonics and harmonic distortion build pleasant, warm timbres that feel smooth, full, and soft. When the distortion is non-harmonic, the secondary frequencies clash with the primary ones, generating a sense of roughness. Old magnetic tapes, when saturated, produce pleasant harmonic distortion that is desirable in many musical styles. Digital systems, in contrast, clip the tracks abruptly when they reach saturation, generating non-harmonic distortion. Our AI categorizes the roughness of a track as clear, moderate roughness, or distorted.

Clear:

Sloop John B by The Beach Boys

Moderate Roughness:

Kokomo by The Beach Boys

Distorted:

I’m Waiting for the Day by The Beach Boys
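A few lines of numpy make the contrast described above visible in numbers. Both curves below distort a pure sine into harmonics, but the abrupt digital clip pours far more energy into the upper ones; on a full mix, those extra components intermodulate into the non-harmonic roughness we just discussed. This is a self-contained illustration, not our analysis code.

```python
# Compare tape-style soft saturation (tanh) with abrupt digital
# clipping (np.clip) on a deliberately overdriven 440 Hz sine.
import numpy as np

sr = 44100
t = np.arange(sr) / sr                    # exactly one second of audio
tone = 1.5 * np.sin(2 * np.pi * 440 * t)  # peaks well above full scale

soft = np.tanh(tone)                      # smooth, tape-like saturation
hard = np.clip(tone, -1.0, 1.0)           # hard digital clip

def harmonic_level(x, k):
    """Magnitude of the k-th harmonic of 440 Hz (1 s signal: bin == Hz)."""
    return np.abs(np.fft.rfft(x))[440 * k]

for k in (1, 3, 5, 7, 9):
    print(f"harmonic {k}: soft={harmonic_level(soft, k):8.1f} "
          f"hard={harmonic_level(hard, k):8.1f}")
```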

SCALE:


Our AI can also differentiate songs written in major, minor, and neutral keys. We consider the scale neutral when the song uses alternative scales or presents a mix of major and minor keys in different sections. The scale refers to the relative distances between the notes used in the melodic components of the track.
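The "relative distances between the notes" can be made concrete with a small sketch: given a tonic and the set of pitch classes a melody uses, we can check whether the intervals fit the major or the natural minor pattern, and call everything else neutral. The tonic is assumed to be known here; in practice it has to be estimated first (see the MAIN KEY section below).

```python
# Classify a scale from interval distances relative to the tonic.
# Pitch classes: C=0, C#=1, ..., B=11. Ambiguous subsets that fit
# both patterns default to major in this toy version.
MAJOR = {0, 2, 4, 5, 7, 9, 11}  # whole/half-step pattern W W H W W W H
MINOR = {0, 2, 3, 5, 7, 8, 10}  # natural minor pattern W H W W H W W

def classify_scale(tonic, pitch_classes):
    rel = {(pc - tonic) % 12 for pc in pitch_classes}
    if rel <= MAJOR:
        return "major"
    if rel <= MINOR:
        return "minor"
    return "neutral"  # alternative scales, or mixed major/minor

# "Here Comes the Sun" is in A major (tonic pitch class 9):
print(classify_scale(9, {9, 11, 1, 2, 4, 6, 8}))  # -> 'major'
```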

Major key scale:

Here Comes the Sun by The Beatles

Minor key scale:

Girl by The Beatles

Neutral key scale:

A Day in the Life by The Beatles


MAIN KEY:


This category refers to the dominant key of the song. The key is the principal frequency (that is, the musical note) around which the other notes used within the melody work together. Only expert human ears can "easily" identify the key of a song, so to properly train our AI for this category we need a mix of annotated metadata and human curation. Luckily, the key of a song is the reference piece of information needed to transpose a composition, so it is easy to retrieve the key of a recording from several databases accessible online. We used this information to help our AI during its learning process. Nonetheless, this feature is still challenging, because some songs do not fit in a particular key or present key changes along the track. In these cases, our AI tags them as "unclear".
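As a hedged sketch of how such a tagger can work, the classic Krumhansl-Schmuckler approach correlates a track's averaged chroma vector with all 24 rotated key profiles and picks the best match; when no correlation stands out, the song gets the "unclear" tag. The 0.6 threshold below is an invented illustration, and this is textbook code rather than our production model.

```python
# Krumhansl-Schmuckler key finding with an "unclear" fallback.
import numpy as np

MAJOR_PROFILE = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                          2.52, 5.19, 2.39, 3.66, 2.29, 2.88])
MINOR_PROFILE = np.array([6.33, 2.68, 3.52, 5.38, 2.60, 3.53,
                          2.54, 4.75, 3.98, 2.69, 3.34, 3.17])
NOTES = ["C", "C#", "D", "D#", "E", "F",
         "F#", "G", "G#", "A", "A#", "B"]

def estimate_key(chroma, threshold=0.6):
    """Return e.g. 'C major', or 'unclear' when every correlation is weak."""
    best_name, best_r = "unclear", threshold
    for mode, profile in (("major", MAJOR_PROFILE), ("minor", MINOR_PROFILE)):
        for tonic in range(12):
            r = np.corrcoef(np.roll(profile, tonic), chroma)[0, 1]
            if r > best_r:
                best_name, best_r = f"{NOTES[tonic]} {mode}", r
    return best_name

# Toy chroma vector with energy only on the C major scale degrees:
chroma = np.array([1.0, 0, 0.5, 0, 0.6, 0.4, 0, 0.9, 0, 0.3, 0, 0.2])
print(estimate_key(chroma))  # -> 'C major'
```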


CONCLUSION

We hope this served as an overview of this penultimate subgroup of categories. This time we focused on tags where metadata and human hearing combine in the training of the AI engine.

Next week we will be talking about the last group of categories our AI can identify, along with more music examples.
