Since about 2001, many studies in this area have used the 1.2 GB dataset assembled by George Tzanetakis, who was one of the first to work on the problem. The typical approach has been to design a set of acoustic features, test it with a classifier, and then report the mean accuracy from cross-validation, perhaps along with a confusion table. Below is a confusion table I just created from this dataset using one classification method and one set of acoustic features.
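For readers who have not built one, here is a minimal sketch of how a confusion table and an overall accuracy are tallied from true and predicted labels. The function names and the toy labels are made up for illustration; they are not the results discussed in this post, and the classifier and features that produced my table are not shown.

```python
from collections import Counter

# The ten genre labels of the dataset.
GENRES = ["blues", "classical", "country", "disco", "hiphop",
          "jazz", "metal", "pop", "reggae", "rock"]

def confusion_table(true_labels, predicted_labels, genres=GENRES):
    """Return table[true][predicted] = number of excerpts with that pairing."""
    counts = Counter(zip(true_labels, predicted_labels))
    return {t: {p: counts[(t, p)] for p in genres} for t in genres}

def accuracy(table):
    """Fraction of excerpts on the diagonal (predicted label equals dataset label)."""
    total = sum(sum(row.values()) for row in table.values())
    correct = sum(table[g][g] for g in table)
    return correct / total if total else 0.0

def confused_as(table, target):
    """Genres with at least one excerpt classified as `target` (excluding target itself)."""
    return [g for g in table if g != target and table[g][target] > 0]

# Hypothetical toy labels, NOT the results from this post.
true_labels = ["classical", "classical", "classical", "jazz", "jazz", "disco"]
predicted_labels = ["classical", "jazz", "classical", "classical", "jazz", "disco"]

table = confusion_table(true_labels, predicted_labels)
print(table["classical"]["jazz"])       # classical excerpts labeled jazz
print(confused_as(table, "classical"))  # genres mistaken for classical
print(round(accuracy(table), 3))
```

In a cross-validation setting, one such table would be tallied over the held-out excerpts of every fold. Note that the table only counts agreement with the dataset's labels; it says nothing about whether those labels are right.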
First, we listen to the misclassified Classical excerpts, of which there are four out of the one hundred tested. The excerpt classical.00045 was labeled Jazz:
I would say that this confusion is acceptable because of the excerpt's use of subject and variations and its general lack of tonal center; and probably most of that passage was notated with figured bass and only later written out. (The pops are present in the original file.)
Next is classical.00039, misclassified as Disco:
Unlike the previous mistake, I say that though this confusion is enormously amusing, it is completely unacceptable. First, no part of that excerpt reminds me of Gloria Gaynor. Second, it lacks all of the defining characteristics of disco: a bouncy driving beat, 16th-note hi-hats, and of course sequins. (I would also say this excerpt is not Classical, but oh well.)
The excerpt classical.00049 is misclassified as Rock:
and I am sure Mozart would be happy with that, but I'm not. Also, here we hear some of the distortion present in the dataset. It sounds as if the excerpt was amplified beyond quantization limits, which makes me wonder if this excerpt is classified as Rock because of its lack of dynamic range. Still, genre is a quality that transcends distortion, such as that of AM radio.
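One rough way to check a suspicion like this is to measure how many samples sit at full scale and how small the peak-to-RMS ratio is. The sketch below is only that: the function names, the tolerance, and the synthetic sine wave standing in for an excerpt are all my assumptions, and a high clipped fraction would not by itself show why the classifier chose Rock.

```python
import math

def clipped_fraction(samples, full_scale=1.0, tolerance=1e-3):
    """Fraction of samples whose magnitude is within `tolerance` of full scale."""
    if not samples:
        return 0.0
    limit = full_scale * (1.0 - tolerance)
    return sum(1 for s in samples if abs(s) >= limit) / len(samples)

def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB; clipping or heavy compression pushes it down."""
    if not samples:
        raise ValueError("need at least one sample")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0.0:
        raise ValueError("signal is silent")
    peak = max(abs(s) for s in samples)
    return 20.0 * math.log10(peak / rms)

# Synthetic stand-in for audio normalized to [-1, 1]; not data from the dataset.
n = 1000
clean = [0.5 * math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
# Amplify by 4 and hard-limit, imitating gain pushed past the quantization limits.
overdriven = [max(-1.0, min(1.0, 4.0 * s)) for s in clean]

print(clipped_fraction(clean), clipped_fraction(overdriven))
print(crest_factor_db(clean), crest_factor_db(overdriven))
```

Running the same two measurements over every excerpt, grouped by genre, would be one way to see whether clipping and reduced dynamic range line up with particular labels.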
Finally, classical.00051 is misclassified as Metal:
This is going to annoy my metal friends, but I think that is acceptable considering the presence of dramatic dynamic shifts, tutti power chords, rumbling bass, and sforzando kettles. Mussorgsky always was a bad boy, like a past-day Glenn Branca.
Now, what about things misclassified as Classical? From the confusion table we see that excerpts from all but two genres (Hiphop and Metal) are misclassified as Classical. First, we have jazz.00000 and jazz.00001.
One of the few things you might notice is that BOTH OF THESE ARE NOT JAZZ. Have you ever heard an entire orchestra improvise, and at the same time quote Stravinsky? (Maybe inadvertently John Cage, but not Stravinsky.) The musical from which these excerpts come, West Side Story, is not Jazz. I think this is a problem in this dataset.
The Stan Getz in jazz.00057, which I would label Jazz, is misclassified as Classical:
A similar thing happens with country.00069, which is also apparently Classical:
It is as if this classifier misses the front and center saxophone, and Willie Nelson's voice --- so rare are these instruments in Classical music --- and focuses instead on the narrow definition "uses strings." But, probably, the classifier is not listening at all, and is instead sensitive to particular methods of mastering typical of genres --- which would explain the above misclassification of Mozart as Rock. (Another Willie Nelson tune, "Uncloudy Day", is also misclassified as Classical.)
Then there is blues.00004:
which are audibly not Classical, and have no bowed strings either.
Blues, Jazz and Country can't have all the fun; disco.00020 is thought to be Classical as well:
which is Clarence Carter's "Patches". Though it is certainly not Classical, I think we can all agree that if this were played at a disco, everyone would stop dancing and start crying. Just as Bernstein's West Side Story above is mislabeled Jazz, Carter's "Patches" is mislabeled Disco. And on top of that, disco.00047, misclassified as Classical
is in my opinion mislabeled itself, even though it comes from a Disco piece. This excerpt comes from the most un-Disco-like portion of Barbra Streisand and Donna Summer singing "No More Tears (Enough is Enough)". Listen to when the real Disco portion breaks loose at 1m50! Still, in my opinion, this excerpt has no business being used to represent Disco.