It was one of those ideas that came to me in bed. No, not one of those ideas. The sort of thing that drifts into your mind between waking and getting up. Or, if you’re me, vice versa.
It was about music genres. Actually, I’m not very good at identifying and naming genres in the detail that some people thrive on. I don’t know how Classic Death Metal differs from Dark Death Metal, or what exactly Ambient House has over Ambient Trance; but I can tell Jazz from Drum’n’Bass and Country & Western. That’s a key point: you know it when you hear it, but putting that into words is a different matter.
My idea was that there must be various features of music that vary along a continuum, each of which could serve as an axis in a feature space (a Hilbert space, if you like). Say, to simplify, you had a harmony axis, ranging from “harmonious” to “discordant”, and at right angles, an axis of rhythmic complexity, from “simple” to “complex”. You could place any type of music at a point in the plane of those axes. A basic C&W tune has mellow harmonies and a simple beat. Romantic classical music has more diverse rhythms, but still has sweet harmonies. Stockhausen is standing in a corner on his own. Once you add more than three dimensions, it’s hard to visualize, but mathematically it’s not a problem. Physicists routinely deal with spaces composed of an infinite number of dimensions.
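To make the idea concrete, here is a minimal sketch in Python. The axis names and every coordinate are invented purely for illustration; the point is just that once each style is a point in the space, “how alike do these sound?” becomes a distance calculation.

```python
import math

# Toy feature space: each style is a point on two invented axes,
# harmony (0 = harmonious .. 1 = discordant) and
# rhythm  (0 = simple .. 1 = complex). All coordinates are made up.
STYLES = {
    "country": (0.1, 0.1),       # mellow harmonies, simple beat
    "romantic": (0.2, 0.5),      # sweet harmonies, more diverse rhythms
    "stockhausen": (0.95, 0.9),  # off in a corner on his own
}

def distance(a: str, b: str) -> float:
    """Euclidean distance between two styles in the feature space."""
    return math.dist(STYLES[a], STYLES[b])

def nearest(style: str) -> str:
    """The other style that sits closest in the space."""
    return min((s for s in STYLES if s != style),
               key=lambda s: distance(style, s))
```

With these made-up coordinates, `nearest("country")` comes out as `"romantic"`, while Stockhausen is far from both; adding more axes only means longer tuples, not a different algorithm.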
I suspected that, like many early-morning thoughts, my idea was neither profound nor original. Naturally, the Internet was the place to confirm that. One hit was a paper by Slaney & White of Yahoo!, who tried to analyze playlists for the purpose of music recommendation. They used the methodology of a much earlier IEEE paper by Tzanetakis, which I remembered having read before (God knows why), based on extracting acoustic properties of the music: audio spectrum features, acoustic energy content, and the like.
I wasn’t satisfied with those approaches, whose fundamental aim is to enable computers to recognize music genres, partly because they don’t seem to relate to how humans hear music, and partly because they don’t work very well. Untrained humans can spot broad genres better than 75% of the time, while the best computer algorithms average around 55%.
I know the latter figure because, to my joy, I discovered that there is an annual competition, the Music Information Retrieval Evaluation eXchange (MIREX), which includes automatic genre recognition. I have no doubt that computers will beat humans one day, and that you’ll be able to tell your iPod “play me some Shoegaze”, but I’m still not happy.
You see, the thing is that the machines don’t understand music. They feel nothing. They can’t tug on your sleeve and say “Hey, listen to this: it’s fabulous.” That’s what friends are for.