February 2014 Archives

The Latin Music Database (LMD) was created around 2007 by Silla et al. for use in a comparative evaluation of particular approaches for music genre classification. It has been used in the MIREX Latin music genre recognition task since 2009.

LMD is described in C. N. Silla, A. L. Koerich, and C. A. A. Kaestner, "The Latin music database," in Proc. ISMIR, 2008. That paper describes the LMD as 3,227 song recordings, each labeled with one of ten classes: Axé, Bachata, Bolero, Forró, Gaúcha, Merengue, Pagode, Salsa, Sertaneja, and Tango. This dataset is notable among those created for music genre recognition because it contains music outside the realm of Western popular music. As in the Ballroom dataset, each music recording is assigned a single label by "experts in Brazilian dance" according to the appropriate dance. However, unlike GTZAN and Ballroom, the audio data is not freely available; only pre-computed features are available for download.

Searching through the references of my music genre recognition survey, I find this dataset (or portions of it) has been used in the evaluations of music genre recognition systems in at least 16 conference papers and journal articles:

  1. Y. M. G. Costa, L. S. Oliveira, A. L. Koerich, and F. Gouyon. Music genre recognition using spectrograms. In Proc. Int. Conf. Systems, Signals and Image Process., 2011.
  2. Y. M. G. Costa, L. S. Oliveira, A. L. Koerich, and F. Gouyon. Comparing textural features for music genre classification. In Proc. IEEE World Cong. Comp. Intell., June 2012.
  3. Y. M. G. Costa, L. S. Oliveira, A. L. Koerich, F. Gouyon, and J. G. Martins. Music genre classification using LBP textural features. Signal Process., 92(11):2723-2737, Nov. 2012.
  4. S. Doraisamy and S. Golzari. Automatic musical genre classification and artificial immune recognition system. In Z. W. Ras and A. A. Wieczorkowska, editors, Advances in Music Information Retrieval, pages 390-402. Springer, 2010.
  5. N. A. Draman, C. Wilson, and S. Ling. Modified AIS-based classifier for music genre classification. In Proc. ISMIR, pages 369-374, 2010.
  6. T. Lidy, C. Silla, O. Cornelis, F. Gouyon, A. Rauber, C. A. A. Kaestner, and A. L. Koerich. On the suitability of state-of-the-art music information retrieval methods for analyzing, categorizing and accessing non-western and ethnic music collections. Signal Process., 90(4):1032-1048, 2010.
  7. M. Lopes, F. Gouyon, A. Koerich, and L. E. S. Oliveira. Selection of training instances for music genre classification. In Proc. ICPR, Istanbul, Turkey, 2010.
  8. G. Marques, T. Langlois, F. Gouyon, M. Lopes, and M. Sordo. Short-term feature space and music genre classification. J. New Music Research, 40(2):127-137, 2011.
  9. G. Marques, M. Lopes, M. Sordo, T. Langlois, and F. Gouyon. Additional evidence that common low-level features of individual audio frames are not representative of music genres. In Proc. SMC, Barcelona, Spain, July 2010.
  10. A. Schindler and A. Rauber. Capturing the temporal domain in echonest features for improved classification effectiveness. In Proc. Adaptive Multimedia Retrieval, Oct. 2012.
  11. C. Silla, A. Koerich, and C. Kaestner. Improving automatic music genre classification with hybrid content-based feature vectors. In Proc. Symp. Applied Comp., Sierre, Switzerland, Mar. 2010.
  12. C. N. Silla, A. Koerich, and C. Kaestner. Automatic music genre classification using ensembles of classifiers. In Proc. IEEE Int. Conf. Systems, Man, Cybernetics, pages 1687-1692, 2007.
  13. C. N. Silla, A. L. Koerich, and C. A. A. Kaestner. Feature selection in automatic music genre classification. In Proc. IEEE Int. Symp. Multimedia, pages 39-44, 2008.
  14. C. N. Silla, A. L. Koerich, and C. A. A. Kaestner. A feature selection approach for automatic music genre classification. Int. J. Semantic Computing, 3(2):183-208, 2009.
  15. C. Silla, C. Kaestner, and A. Koerich. Time-space ensemble strategies for automatic music genre classification. In Jaime Sichman, Helder Coelho, and Solange Rezende, editors, Advances in Artificial Intelligence, pages 339-348. Springer Berlin / Heidelberg, 2006.
  16. C. N. Silla and A. A. Freitas. Novel top-down approaches for hierarchical classification and their application to automatic music genre classification. In IEEE Int. Conf. Systems, Man, and Cybernetics, San Antonio, USA, Oct. 2009.

However, as with GTZAN and Ballroom, it appears that researchers have taken the integrity of LMD for granted. I have acquired the audio for LMD, which has 3,229 song files (two more than stated by Silla et al.). Using my fingerprinting method (just a little Shazam-like implementation), I compare all songs within each class and find 213 replicas. This is in spite of the precautions Silla et al. (2008) describe taking in creating LMD.
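
A within-class search of this kind can be sketched roughly as follows. This is only a minimal illustration, not the fingerprinting implementation used for the results above. It assumes each recording has already been reduced to a set of fingerprint hashes, and the function names, the Jaccard overlap score, the 0.5 threshold, and the toy data are all hypothetical choices made for the example.

```python
from collections import defaultdict
from itertools import combinations


def jaccard(a, b):
    """Overlap between two sets of fingerprint hashes, from 0.0 to 1.0."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_replicas_within_classes(tracks, threshold=0.5):
    """Return (label, name_a, name_b, score) for suspiciously similar pairs.

    tracks: iterable of (name, label, hashes), where hashes is a set of
    fingerprint hashes. Only tracks sharing a label are compared.
    The threshold is an assumed value, not one taken from the post.
    """
    by_label = defaultdict(list)
    for name, label, hashes in tracks:
        by_label[label].append((name, hashes))

    matches = []
    for label, items in by_label.items():
        # Compare every unordered pair of tracks inside this class.
        for (name_a, hashes_a), (name_b, hashes_b) in combinations(items, 2):
            score = jaccard(hashes_a, hashes_b)
            if score >= threshold:
                matches.append((label, name_a, name_b, score))
    return matches


# Toy data: made-up file names and hashes, not real LMD fingerprints.
toy_tracks = [
    ("tango_01.mp3", "Tango", {1, 2, 3, 4, 5, 6}),
    ("tango_02.mp3", "Tango", {1, 2, 3, 4, 5, 7}),  # near copy of tango_01
    ("tango_03.mp3", "Tango", {20, 21, 22, 23}),
    ("salsa_01.mp3", "Salsa", {1, 2, 3, 4, 5, 6}),  # same hashes, other class
]
replicas = find_replicas_within_classes(toy_tracks)
for label, name_a, name_b, score in replicas:
    print(f"{label}: {name_a} ~ {name_b} (overlap {score:.2f})")
```

On the toy data, only the two similar Tango files are reported. The Salsa file shares all of its hashes with the first Tango file but goes unnoticed, because pairs are only formed between tracks with the same label.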

So far, I have only looked for replicas within each class, and not across classes; but we now know at least 6.5% of the dataset is replicated (which is greater than the 5% in GTZAN, and which we already know cannot be ignored). Below, I list the replicas I find.
