Music Mood Classification Dataset Version 1.0 (NJU-MusicMood-v1.0)

This page contains the experimental dataset for music mood classification based on the audio and lyric information of music.

Dataset:

The dataset consists of 777 music clips in 4 mood categories (angry, happy, relaxed, and sad), of which 400 clips form the training set and the remaining 377 form the testing set. The distribution of clips across the 4 mood categories is shown in the following table:
Mood Category    Training Samples    Testing Samples
angry            100                 71
happy            100                 106
relaxed          100                 101
sad              100                 99
Note: Files in each sample set may not be numbered continuously in their file names.
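
As a quick sanity check on the table above, the per-category counts can be summed and compared with the stated set sizes. This is only an illustrative sketch; the dictionary layout and variable names are assumptions made here, not part of the dataset package.

```python
# Per-category (training, testing) clip counts copied from the table above.
counts = {
    "angry": (100, 71),
    "happy": (100, 106),
    "relaxed": (100, 101),
    "sad": (100, 99),
}

train_total = sum(train for train, _ in counts.values())
test_total = sum(test for _, test in counts.values())

print(train_total, test_total, train_total + test_total)

# Share of each category in the testing set; the training set is balanced,
# but the testing set is not (angry has the fewest clips).
for mood, (_, test) in counts.items():
    print(f"{mood}: {test / test_total:.1%} of testing clips")
```

Because the testing set is not balanced across categories, per-category measures may be worth reporting alongside overall accuracy.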

For each music clip in the dataset, a plain text (.txt) file is provided containing every sentence of the lyrics, along with its time tag (i.e. the time offset [hour:minute.second] of the sentence relative to the start of the music).
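
A lyric file of this kind could be read with a small parser such as the sketch below. It assumes each lyric line begins with one bracketed tag of the form [a:b.c] followed by the sentence text; the exact layout of the files is not specified on this page, and the function name `parse_lyric_line` is hypothetical. The three numbers are returned unconverted: the page describes them as hour:minute.second, whereas the similar LRC lyric format uses minute:second.hundredths, so it may be worth checking a few tags against the clip durations before converting them to seconds.

```python
import re
from typing import Optional, Tuple

# Assumed tag layout: "[a:b.c]" at the start of the line, then the lyric text.
TAG_RE = re.compile(r"^\[(\d+):(\d+)\.(\d+)\](.*)$")

def parse_lyric_line(line: str) -> Optional[Tuple[Tuple[int, int, int], str]]:
    """Return ((a, b, c), text) for a tagged lyric line, or None if untagged."""
    match = TAG_RE.match(line.strip())
    if match is None:
        return None
    fields = (int(match.group(1)), int(match.group(2)), int(match.group(3)))
    return fields, match.group(4).strip()

# Made-up example line, not taken from the dataset.
print(parse_lyric_line("[00:12.50]Hello world"))
```

Lines that do not match the assumed pattern (for example blank lines or metadata) yield `None`, so a caller can skip them instead of failing.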

For copyright reasons, the audio data of the music clips cannot be provided here. The audio can usually be found via web search engines and downloaded from the Internet, using the information about each music clip given in the text file "info.txt" in each training/testing sample set, which lists 4 information fields per clip, separated by colons:

  1. Music's number in the package (as part of the name of the sample file)
  2. Music's title
  3. Performer's name
  4. Duration of music (in seconds)
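
The four fields listed above could be read along the following lines. This is a minimal sketch, assuming one record per line with the fields in the listed order; the names `ClipInfo` and `parse_info_line` are hypothetical. Since a title may itself contain a colon, the sketch takes the number from the first field and the performer and duration from the last two, and treats everything in between as the title. That choice is an assumption and would misread a performer name containing a colon.

```python
from typing import NamedTuple

class ClipInfo(NamedTuple):
    number: str        # clip number, as used in the sample file name
    title: str
    performer: str
    duration_s: float  # duration in seconds

def parse_info_line(line: str) -> ClipInfo:
    """Parse one 'number:title:performer:duration' record (assumed layout)."""
    parts = line.strip().split(":")
    if len(parts) < 4:
        raise ValueError(f"expected at least 4 colon-separated fields: {line!r}")
    number = parts[0].strip()
    performer = parts[-2].strip()
    duration_s = float(parts[-1].strip())
    # Any extra colons are assumed to belong to the title.
    title = ":".join(parts[1:-2]).strip()
    return ClipInfo(number, title, performer, duration_s)

# Made-up example record, not taken from the dataset.
print(parse_info_line("12:Some Title:Some Artist:215"))
```

The number is kept as a string so that it can be matched against the sample file names without assuming any particular zero-padding.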

Reference:

The dataset was created by Hao Xue, Like Xue, Hailiang Xu, and Feng Su, as part of their work on the following paper:
Multimodal Music Mood Classification by Fusion of Audio and Lyrics. Hao Xue, Like Xue, Feng Su. In Proc. of MMM 2015, LNCS 8936, pp 26-37.
Please consider citing the above paper if you use the dataset.

Contact:

For any questions, please send an e-mail to Dr. Feng SU (suf@nju.edu.cn).