Part-1 Tutorial on Singing Transcription (by Roger Jang)
This is Part 1 of a tutorial on singing transcription from polyphonic music, written for the AI Cup Competition. This part focuses on MATLAB functions for plotting and playing back the PV (pitch vector of the singing voice) and music notes. Part 2 focuses on the transcription itself.
Preprocessing
Before we start, let's add the necessary toolboxes to the MATLAB search path:
addpath d:/users/jang/matlab/toolbox/utility
addpath d:/users/jang/matlab/toolbox/sap
addpath d:/users/jang/matlab/toolbox/machineLearning
All the above toolboxes can be downloaded from Roger's toolbox page. Make sure you are using the latest toolboxes to work with this script.
For compatibility, here we list the platform and MATLAB version that we used to run this script:
fprintf('Platform: %s\n', computer);
fprintf('MATLAB version: %s\n', version);
fprintf('Date & time: %s\n', char(datetime));
scriptStartTime=tic;	% Timing for the whole script
Platform: PCWIN64
MATLAB version: 9.6.0.1214997 (R2019a) Update 6
Date & time: 31-May-2020 22:53:55
Basic operations
Let's look at song number 6. Its link file contains:
linkFile='d:\dataSet\public\MIR-ST500\6\6_link.txt';
youtubeLink=deblank(fileread(linkFile));
fprintf('Youtube link = "%s"\n', youtubeLink);
%web(youtubeLink)
Youtube link = "https://www.youtube.com/watch?v=be2wvNFTLMc"
You can open the link above in a browser to listen to the song, 隱形的翅膀.
Once you have located the original music link, you can perform the following tasks if you want to use the original music for analysis:
- Download the music using any online service. You can search for "mp3 youtube download" to find many online services for this purpose.
- Extract the singing voice using SOVIA.
- Extract the vocal pitch using the SAP toolbox (more about this later). In fact, the extracted pitch is already available in the MIR-ST500 corpus.
All features related to the music are stored in *_feature.json. Due to intellectual property (IP) restrictions, we cannot distribute the music itself, but you can still download it for personal fair use. For this final project, you do not need to analyze the audio directly if you don't want to; instead, you can use the feature files provided in the MIR-ST500 corpus for your analysis and modeling. For instance, the feature file of 隱形的翅膀 contains time stamps, vocal pitch, and other features for ST (singing transcription). We can read the feature file and plot the singing pitch versus time as follows:
feaFile='d:\dataSet\public\MIR-ST500\6\6_feature.json';
fea=jsondecode(fileread(feaFile));
pv.pitch=fea.vocal_pitch;
pv.time=fea.time;
pv.name='隱形的翅膀';
opt=pvPlot('defaultOpt');
opt.showPlayButton=0;
figure; pvPlot(pv, opt);
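To relate pitch values to physical frequency, the conversion can be sketched as below. This is a minimal sketch, assuming the pitch vector is expressed in semitones (MIDI note numbers, with 69 mapping to 440 Hz) and that 0 marks frames without singing voice. Both are assumptions about the data, and demoPitch is a made-up vector rather than values from the corpus.

```matlab
% Hypothetical pitch vector in semitones; 0 is ASSUMED to mean "no singing voice"
demoPitch = [0 0 60 62 64 0 69];

% Convert only the voiced frames; unvoiced frames keep a frequency of 0
voiced = demoPitch > 0;
demoFreq = zeros(size(demoPitch));
demoFreq(voiced) = 440 * 2.^((demoPitch(voiced) - 69) / 12);

% Print the frequency of each voiced frame
fprintf('%7.2f Hz\n', demoFreq(voiced));
```

Because the conversion is applied only to voiced frames, the zero placeholders are not mistaken for very low pitches.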
You can plot the pitch of the first phrase:
timeInterval=[27 33.5];	% The first phrase
pv=pvSubsequence(pv, timeInterval);
pv.name='Singing pitch of the first phrase of 隱形的翅膀';
figure; pvPlot(pv);
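Conceptually, taking a subsequence means keeping only the frames whose time stamps fall inside the given interval. The sketch below illustrates that idea with plain logical indexing on a hypothetical structure demoPv. It is not the implementation of pvSubsequence, whose internals are not shown in this tutorial, and the time stamps and pitch values are invented.

```matlab
% Hypothetical pitch vector sampled every 0.5 seconds (illustrative values only)
demoPv.time  = 0:0.5:5;
demoPv.pitch = [0 0 60 60 62 62 64 64 0 0 0];

% Keep the frames whose time stamps lie within the closed interval [1, 3]
demoInterval = [1 3];
keep = demoPv.time >= demoInterval(1) & demoPv.time <= demoInterval(2);

% Apply the same mask to every per-frame field so they stay aligned
demoSub.time  = demoPv.time(keep);
demoSub.pitch = demoPv.pitch(keep);
disp([demoSub.time; demoSub.pitch]);
```

Using one logical mask for both fields keeps the time and pitch vectors the same length.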
You can play the singing pitch of the first phrase, and save it to an audio file:
opt=pvPlay('defaultOpt');
opt.method=2;
opt.auFileName='pv.wav';
showPlot=1;
figure; pvPlay(pv, opt, showPlot);
Saving pv.wav (within note2au)...
You can then play the saved pv.wav.
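For readers curious how a pitch vector can be turned into sound, here is a rough sketch using only built-in MATLAB functions. It is not what pvPlay does internally. The sampling rate, frame duration, output file name pvSketch.wav, and pitch values are all assumptions, as is the convention that pitch is given in semitones with 0 meaning unvoiced.

```matlab
fs = 16000;                      % sampling rate in Hz (assumed)
frameDuration = 0.032;           % seconds covered by each pitch frame (assumed)
demoPitch = [60 60 62 62 0 64 64];   % hypothetical semitone pitch; 0 = unvoiced (assumed)

% Hold each frame's pitch constant for the duration of the frame
samplesPerFrame = round(fs * frameDuration);
pitchPerSample = repelem(demoPitch, samplesPerFrame);

% Semitone-to-Hz conversion for voiced samples only
voiced = pitchPerSample > 0;
freq = zeros(size(pitchPerSample));
freq(voiced) = 440 * 2.^((pitchPerSample(voiced) - 69) / 12);

% Accumulate frequency into phase so the sinusoid stays continuous across pitch changes
phase = 2 * pi * cumsum(freq) / fs;
demoSignal = 0.5 * sin(phase) .* voiced;   % silence the unvoiced samples

% Write a mono file; audiowrite expects one column per channel
audiowrite('pvSketch.wav', demoSignal(:), fs);
```

Accumulating the phase, rather than computing sin(2*pi*f*t) per frame, is a design choice that avoids clicks at frame boundaries where the pitch changes.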
You can also plot the ground-truth notes within the same time interval:
gtFile='d:\dataSet\public\MIR-ST500\6\6_groundtruth.txt';
note=noteFileRead(gtFile);
figure; note=noteSubsequence(note, timeInterval, 1);	% Take the note vector within the time interval
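The sketch below shows one way to turn a plain-text note list into a simple piano-roll plot with built-in functions. It assumes each line of the note file holds three numbers (onset time in seconds, offset time in seconds, and pitch in semitones); that layout is an assumption and should be checked against the actual ground-truth file. The three notes in demoText are invented, and the code is not the implementation of noteFileRead.

```matlab
% Hypothetical note list: one "onset offset pitch" triple per line (ASSUMED layout)
demoText = sprintf('27.10 27.55 64\n27.60 28.20 66\n28.30 29.00 68\n');

% sscanf fills a 3-by-N matrix column by column; transpose to get one row per note
demoNotes = sscanf(demoText, '%f', [3 Inf])';
onset  = demoNotes(:, 1);
offset = demoNotes(:, 2);
pitch  = demoNotes(:, 3);
duration = offset - onset;

% Draw each note as a horizontal bar from its onset to its offset
figure; hold on;
for i = 1:size(demoNotes, 1)
    plot([onset(i) offset(i)], [pitch(i) pitch(i)], 'LineWidth', 3);
end
hold off;
xlabel('Time (sec)'); ylabel('Pitch (semitone)');
title('Piano-roll sketch of hypothetical notes');
```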
You can play the note vector and save it to another audio file:
fprintf('Play the groundtruth music notes...\n');
opt=notePlay('defaultOpt');
opt.auFileName='note.wav';
figure; signal=notePlay(note, opt, showPlot);
Play the groundtruth music notes...
Saving note.wav (within note2au)...
You can then play the saved note.wav.
We shall save "pv" and "note" for further analysis:
fprintf('Saving transparentWings.mat...\n');
save transparentWings.mat pv note
Saving transparentWings.mat...
Summary
This is a brief introduction to the functions for display and playback of PV and music notes. For singing transcription, please refer to the second part of the tutorial.
Appendix
List of functions and scripts used in this tutorial:
Date and time when finishing this script:
fprintf('Date & time: %s\n', char(datetime));
Date & time: 31-May-2020 22:54:05
Overall elapsed time:
toc(scriptStartTime)
Elapsed time is 10.028268 seconds.
Jyh-Shing Roger Jang, created on
datetime
ans =
  datetime
   31-May-2020 22:54:05
If you are interested in the original MATLAB code for this page, you can type "grabcode(URL)" in MATLAB, where URL is the web address of this page.