Analyzing scale degree distributions in Major and Minor Keys

Introduction:

How useful is it to know the tonal context of a musical piece? This may sound like a question for MIDI-nerds or music theory fanatics, but I hope that this article can stimulate your curiosity on the subject.

Goal:

Evaluate the tonal context of Standard MIDI File (SMF) songs.

Content:

How I conducted the analysis and the results obtained.

Data and Tools Used:

  • I performed the analysis on a pool of ~10,000 SMF songs in which the KeySignature and Mode info were correctly programmed. I compiled this pool from pieces that are heterogeneous in musical genre, period and place of origin, without making the unrealistic claim that it proportionally represents all the world's tonal music. Pop genres are prevalent, but jazz, baroque and classical pieces are also present.
  • The analysis was carried out using an SMF parser that I wrote in Python. This parser can classify and sort large quantities of SMFs into folders based on configurable criteria, and can generate reports in CSV (Comma Separated Values) format.
  • CSV files were loaded and visualized as histograms in Google Sheets.

Steps Followed:

Through the SMF parser, I divided the SMF pool into three groups based on these criteria:

  1. Songs entirely in major keys.
  2. Songs entirely in minor keys (identified by evaluating the second data byte of the Key Signature meta-event, which indicates the minor mode when set to 1).
  3. Songs containing internal Key Signature changes, whether in major or minor mode.
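
As a rough illustration of this grouping (a sketch, not the parser's actual code), the snippet below decodes the two data bytes of each Key Signature meta-event (FF 59 02 sf mi) and assigns a song to one of the three groups. The function names and group labels are hypothetical, and it assumes that any song with more than one distinct key signature belongs to the third group.

```python
from typing import Iterable, Tuple


def decode_key_signature(data: bytes) -> Tuple[int, int]:
    """Decode the data bytes of a Key Signature meta-event (FF 59 02 sf mi).

    sf is a signed count of sharps (positive) or flats (negative);
    mi is 0 for major and 1 for minor.
    """
    if len(data) != 2:
        raise ValueError("a Key Signature meta-event carries exactly 2 data bytes")
    sf = data[0] - 256 if data[0] > 127 else data[0]  # interpret as signed byte
    mi = data[1]
    return sf, mi


def classify_song(key_signatures: Iterable[bytes]) -> str:
    """Return 'major', 'minor', 'mixed' or 'unknown' (hypothetical labels)."""
    decoded = {decode_key_signature(d) for d in key_signatures}
    if not decoded:
        return "unknown"  # no Key Signature meta-event found
    if len(decoded) > 1:
        return "mixed"    # assumption: any internal key change goes to group 3
    _, mi = next(iter(decoded))
    return "minor" if mi == 1 else "major"


if __name__ == "__main__":
    # 0xFD is -3 as a signed byte: three flats, minor mode (C minor).
    print(classify_song([bytes([0xFD, 1])]))
```

Extracting the meta-event bytes from the file is left out here; any SMF reader that exposes raw meta-events would do.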

Using the SMF parser, I generated three CSV files from the three groups.

For each song, the CSV file included a row with 12 numbers indicating the weight of each scale degree in relation to the current Key.

For example, in a major Key, the first number is the weight of the Tonic, and the following numbers proceed semitone by semitone up to the twelfth, which is the weight of the Leading Tone.
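
To make the row layout concrete, here is a minimal sketch of how such a report could be written and then averaged per scale degree. The column layout (a song name followed by 12 weights), the degree labels and the function names are assumptions for illustration; the real report format may differ.

```python
import csv
import io
from typing import Iterable, List, Sequence, Tuple

# Hypothetical column labels: semitone steps above the Tonic.
DEGREE_LABELS = ["1", "b2", "2", "b3", "3", "4", "#4", "5", "b6", "6", "b7", "7"]


def write_report(rows: Iterable[Tuple[str, Sequence[float]]], out) -> None:
    """Write one CSV row per song: its name followed by 12 degree weights."""
    writer = csv.writer(out)
    writer.writerow(["song"] + DEGREE_LABELS)
    for name, weights in rows:
        if len(weights) != 12:
            raise ValueError("expected exactly 12 weights per song")
        writer.writerow([name] + [f"{w:.2f}" for w in weights])


def average_weights(csv_text: str) -> List[float]:
    """Average each of the 12 degree columns over all songs in the report."""
    reader = csv.reader(csv_text.splitlines())
    next(reader)  # skip the header row
    sums = [0.0] * 12
    count = 0
    for row in reader:
        for i, cell in enumerate(row[1:13]):
            sums[i] += float(cell)
        count += 1
    return [s / count for s in sums] if count else sums


if __name__ == "__main__":
    buffer = io.StringIO()
    write_report([("song_a", [100.0] + [0.0] * 11)], buffer)
    print(buffer.getvalue())
```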

The SMF analysis in detail:

  • The "weight" of the notes was derived from their length.
  • Drum tracks were excluded from the analysis; tracks on MIDI Ch. 10 or with CC#0 Bank Select MSB set to 120 or 127 were considered Drum tracks.
  • Notes played with Prog.Change values in the range 112-127 were excluded from the analysis, as these Prog.Change values relate to sounds that do not follow semitone pitch tracking.
  • For each note not excluded by the criteria above, its length is calculated and added to a 12-element 'notes' array indexed from 0 to 11. The index is the result of the 'modulo 12' operation applied to the note after transposing it to the '0' reference key (C major). The resulting values are then normalized to a range between 0 and 100.
  • A "2X" factor was applied to the tracks identified as "Bass", i.e., those with the current Program Change in the range 32-39. I am aware that this factor is arbitrary, but it seemed useful to give a "numerical" impact to the role of the bass.
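
The steps listed above can be sketched roughly as follows. This is not the parser's actual code: the `Note` record, the constant names and the functions are hypothetical, channels are assumed to be 0-based (so MIDI Ch. 10 is 9), the Tonic is assumed to land at index 0 for both modes, and the 0-100 normalization is assumed to mean percentages of the total weight.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Note:
    pitch: int          # MIDI note number, 0-127
    duration: float     # note length in ticks (any consistent unit works)
    channel: int        # 0-based channel, so MIDI Ch. 10 is 9
    program: int        # Program Change value in effect (0-based)
    bank_msb: int = 0   # CC#0 Bank Select MSB in effect


DRUM_CHANNEL = 9
DRUM_BANK_MSBS = {120, 127}
UNPITCHED_PROGRAMS = range(112, 128)  # sounds without semitone pitch tracking
BASS_PROGRAMS = range(32, 40)
BASS_FACTOR = 2.0                     # the arbitrary "2X" bass emphasis


def tonic_pitch_class(sf: int, mi: int) -> int:
    """Tonic pitch class (C = 0) from the Key Signature bytes sf and mi."""
    return (7 * sf + (9 if mi == 1 else 0)) % 12


def is_excluded(note: Note) -> bool:
    return (
        note.channel == DRUM_CHANNEL
        or note.bank_msb in DRUM_BANK_MSBS
        or note.program in UNPITCHED_PROGRAMS
    )


def degree_weights(notes: Iterable[Note], tonic: int) -> List[float]:
    """Accumulate note lengths per scale degree and normalize to sum to 100."""
    bins = [0.0] * 12
    for note in notes:
        if is_excluded(note):
            continue
        factor = BASS_FACTOR if note.program in BASS_PROGRAMS else 1.0
        # Transpose to the reference key, then reduce modulo 12.
        bins[(note.pitch - tonic) % 12] += note.duration * factor
    total = sum(bins)
    if total == 0:
        return bins
    return [100.0 * b / total for b in bins]


if __name__ == "__main__":
    g_major = tonic_pitch_class(sf=1, mi=0)  # one sharp, major mode
    print(degree_weights([Note(67, 480, 0, 0), Note(74, 240, 0, 0)], g_major))
```

Scaling so that the largest bin equals 100 would be an equally plausible reading of the normalization step; only the last line of `degree_weights` would change.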

Results:

The following graph shows the average distribution of scale degrees in the pool of songs entirely in major modes.

The following graph shows the average distribution of scale degrees in the pool of songs entirely in minor modes.

The following graph compares the three song pools (major modes only, minor modes only, "assorted" modes).

Observations:

Graphs of averages calculated over entire song pools show the expected distributions, but examination of individual songs may reveal significant differences. For example, the next two graphs display the distributions of four major and four minor songs, highlighting substantial differences among individual songs.

Conclusions and Applications:

With the data obtained, it becomes possible to derive empirical rules to evaluate the correctness of the Key and Mode info of SMFs, or to calculate the Key and Mode of SMFs from scratch.

I have implemented test versions of such rules in my parser and they are providing promising results, so this may be the topic of a forthcoming article.

Possible Developments:

I believe it would be worth trying a classifier based on Machine Learning instead of empirical rules, taking full advantage of the data collected for model training and result checking.
