Hi friends. Today is January 28, 2022.

And for much of the past week, major music charts have been absolutely dominated by the surprise Disney hit “We Don’t Talk About Bruno.”

“Bruno” doesn’t precisely scream “Billboard Hot 100.” It’s a minor earworm off Lin-Manuel Miranda’s Encanto soundtrack. For three-and-a-half minutes, an ensemble cast trades anecdotes about the movie’s creepy uncle character over a slinky, salsa-y, off-beat rhythm. Its lyrics make absolutely no sense unless you’ve seen the movie (unlike, say, “How Far I’ll Go,” the big hit from Moana). And listeners can’t exactly generalize its themes to their own lives (unlike “Let It Go,” which I have personally blasted to exorcise the memory of many a bad work/family/romantic encounter).

This is often how hit songs work, though: It’s hard to predict what will stick. Until very recently, in fact, even algorithmic models trained on tens of thousands of songs couldn’t forecast a new track’s success — a conundrum known, in computer science, as the “hit song science” problem. Even now, in the age of algorithmic market predictions and early prison releases and God knows what else, these musical models aren’t nearly as accurate as you might expect.

So — how do you predict a hit? Early challengers tried to tackle the problem by focusing on songs’ features: things you could easily identify and tag, like the genre or tempo or topic of the song. The first model of this type, published in 2005, claimed it could identify common threads between hits in different genres. (That was largely debunked.) Another attempt at hit song science, conducted three years later, found no direct link between song popularity and any particular feature. A third go, in 2011, also failed to predict the popularity of YouTube music videos.

Some computer scientists began to wonder if popularity was just too messy, too human, for algorithms to grasp. Writing for an academic publication in 2011, François Pachet (then a researcher at the Sony Computer Science Laboratory in Paris) argued that computer models would struggle to account for variables even music psychologists didn’t understand.

After all, people don’t just like a song because of its lyrics or its proportion of percussive to harmonic sounds: They’re also influenced by other people. They gravitate toward or away from the songs they hear most often, according to where they were or how they felt when they heard them. And they encounter many songs not through personal choice, but through third-party radio DJs or Spotify recommendations. How could even a really good AI predict that “Bruno” would become a breakout TikTok trend?

But at the same time that computer scientists were pondering these questions, consumers were changing how and where they listened to music, giving researchers access to an abundance of information they hadn’t previously had. While the earliest hit song science model was trained on only 1,700 songs, researchers were running machine-learning algorithms on tens of thousands of Spotify songs by the mid-to-late 2010s.

In 2019, two undergrads in California claimed their model (based on a data set of 1.8 million songs, and built using “the entire fleet of computers available to [the] University of San Francisco’s Computer Science Department”) could predict Billboard hits with 88% success. Another service, founded by a senior researcher at the Finnish Centre for Interdisciplinary Music Research, also claims to accurately identify high-potential songs, and shows some of its work in a weekly forecast.

But the “magic formula” for hit prediction still eludes us, two data science professors wrote just over a year ago. No one has achieved perfect accuracy in their predictions, and the field hasn’t coalesced around one particular model.

That might be a good thing, given that the ability to predict the popularity of a piece of music could change which songs (and books and movies and TV series and fashion lines and art exhibits) get produced at all. Would a hit-song algorithm have green-lit the creepy-uncle Latin-pop song…? Dunno, but it’s hard to imagine.

That’s it for this week! Until the next one. Warmest virtual regards.

— Caitlin