And now for something completely different

This is (nominally) a Data Science blog, but I do have other interests. One of those other interests is music, and now that I have a platform for forcing my opinion onto others, I am going to do so and I apologize. 

(For what does it mean to be human if not having unreasonably strong opinions about things that don’t matter, and then placing those opinions into the form of a list?)

Usually, I prefer discussing albums in their entirety, but unfortunately, I haven’t really had time to dig into albums these past six months. Indeed, I’ve had the busiest year I can remember having, and I’m not saying that in the usual, look-at-me-I’m-just-so-busy-and-important Holiday Card type of way. I’m saying it in the real world, I’m-scared-and-excited-and-tired-and-there’s-pros-and-cons-to-everything type of way. 

Which is all to say: these songs are ones that have been meaningful to me at one of the most exciting, fluid, unpredictable and exhausting times of my life. That’s got to be worth something, and I’m sharing them in the hopes that you get that something — or even some unrelated thing — out of them, too. 

And don’t worry! I’m going to perform hierarchical clustering on my selections at the end of this post so that it’s (nominally) about Data Science. 

My Top 10

1. “An Ocean in Between the Waves” by The War on Drugs (from Lost in the Dream)

The War on Drugs broke through earlier this year with Lost in the Dream. This song will show you why.


2. “A Dream of You and Me” by Future Islands (from Singles)

On “A Dream of You and Me,” Future Islands unleash one of the biggest hooks of their career, and frontman Sam Herring owns every moment.


3. “Heavenly Father” by Isaiah Rashad (from Cilvia Demo)

Isaiah Rashad’s debut is my favorite rap album of the year. His influences are self-evident, but luckily, he’s got great influences. On “Heavenly Father,” his sing-song flow and gospel-intoned beat remind me of the best of his Southern rap predecessors. 


4. “Your Love is Killing Me” by Sharon Van Etten (from Are We There)

Occasionally, songs come around that force you to stop what you’re doing. This is one of them. 


5. “Carissa” by Sun Kil Moon (from Benji)

Much like “Your Love is Killing Me” at Number 4, Sun Kil Moon’s “Carissa” is a straightforwardly devastating song. It also happens to be one of the more thoughtful discussions of death and family and humanity I’ve ever heard.


6. “Afraid” by Posse (from Soft Opening)

“Afraid” is a song that could have only originated in the Pacific Northwest, with a melody that would sound perfect on a gloomy Seattle morning. 


7. “Move That Dope” by Future (from Honest)

I can’t think of a bigger moment in hip hop this year than “Move That Dope,” and in a genre that’s never lacking for capital-E Events, that’s saying something. Every verse in this song has a moment that’ll have rap nerds geeking out, and the whole thing has a big screen quality that lesser artists would be too scared to try. 


8. “Do It Again” by Röyksopp and Robyn (from Do It Again)

Robyn is the master of huge-sounding pop music, but more importantly, she’s an expert at combining those huge, radio-ready beats with insightful writing and beautiful vocals.


9. “Habit” by Ought (from More Than Any Other Day)

When crafting a list like this, it’s hard to figure out how to balance “songs I play all the time” versus “powerful songs that are too emotionally impactful to listen to all the time, but that I still really, really love.” If you’re interested, the ratio is currently 6:3, and “Habit” is the third in the latter category. Ought’s frontman delivers his vocals with a strained intensity that makes you feel every word and every guitar pluck.


10. “If It Wasn’t True” by Shamir (from Northtown)

Shamir’s music feels completely untethered from reality and influence, even though he sings about very real things (in this case, the slow death of a relationship). The combination makes him one of the more exciting young acts currently making music.

The Rest

11. “Draft Day” by Drake (Non-Album Single)

12. “Asleep” by Makthaverskan (from II)

13. “Oh, I’m a Wrecker” by A Sunny Day in Glasgow (from Sea When Absent)

14. “April’s Song” by Real Estate (from Atlas)

15. “Rebound” by Dornik (Non-Album Single) 

16. “Tough Love” by Jessie Ware (Single from forthcoming sophomore album)

17. “Danny Glover” by Young Thug (from Black Portland)

18. “Can’t Do Without You” by Caribou (Single from forthcoming album Our Love)

19. “Shame” by Freddie Gibbs & Madlib (from Piñata)

20. “Down It Goes” by White Lung (from Deep Fantasy)

21. “Inauguration” by Hospitality (from Trouble)

22. “Passing Out Pieces” by Mac DeMarco (from Salad Days)

23. “Man of the Year” by ScHoolboy Q (from Oxymoron)

24. “Lonely Richard” by Amen Dunes (from Love)

25. “Match & Tinder” by You Blew It! (from Keep Doing What You’re Doing)

Oh, and here’s the dendrogram for that hierarchical clustering thing I talked about earlier (discussion and code below).


There are some profoundly basic things happening here. For example, we can clearly see a red “Hip Hop and Hip Hop-Influenced” section, so our clustering algorithm was smart enough to pick up on the obvious distinction between my two favorite genres of music.

As is often the case, the really interesting things are a bit more subtle, and require a bit of domain experience. Consider “Danny Glover” by Young Thug. By way of background, it’s hard to visit a hip-hop-minded blog without running into one of his songs, and the general consensus is that he’s one of the freshest voices we’ve heard in rap for a long time. Now, we can prove that this is indeed the case with Data Science: notably, he’s the only hip hop act missing from the red hip hop cluster. Indeed, he seems to have more in common with one of music’s other singular talents, Robyn. 

Moving on to the blue, which is probably being clustered together on the basis of “it’s not hip hop,” we can see some other things going on. If I were forced to assign a concept to what the algorithm is picking up on the most, I would, strangely, have to say “sadness.” “Your Love is Killing Me” and “Carissa” are both devastatingly powerful songs, and as we move up from them, we in turn move from sad to wistful (songs like “April’s Song” and “A Dream of You and Me”). This ultimately gives way to high-energy tracks that bridge the gap between indie rock, hip-hop, and dance music.
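If the dendrogram feels like a black box, a toy version may help. The sketch below is not the analysis from this post: it swaps the songs for made-up sine waves (the names `toy_names` and `toy_data`, the frequencies, and the 0.5 cutoff are all assumptions chosen for illustration) and shows how cosine distance plus a distance cutoff splits signals into groups.

```python
import numpy as np
from scipy.cluster import hierarchy

# Synthetic stand-ins for songs (hypothetical data, not real audio):
# two slow sine waves and two fast ones, each pair differing by a small phase shift.
ts = np.linspace(0.0, 1.0, 2000, endpoint=False)
toy_names = ["slow A", "slow B", "fast A", "fast B"]
toy_data = np.vstack([
    np.sin(2 * np.pi * 5 * ts),
    np.sin(2 * np.pi * 5 * ts + 0.3),
    np.sin(2 * np.pi * 50 * ts),
    np.sin(2 * np.pi * 50 * ts + 0.3),
])

# Single linkage (scipy's default method) on pairwise cosine distance.
toy_link = hierarchy.linkage(toy_data, method="single", metric="cosine")

# Cut the tree at a distance cutoff to get one flat cluster label per row.
toy_labels = hierarchy.fcluster(toy_link, t=0.5, criterion="distance")

for name, label in zip(toy_names, toy_labels):
    print(name + ": cluster " + str(label))
```

Here the two slow waves land in one group and the two fast waves in another. Waves of different frequency are close to orthogonal (cosine distance near 1), while a small phase shift barely moves the distance. The `color_threshold` argument to `hierarchy.dendrogram` plays a similar role when the branches get colored.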

Wrapping Up

This, of course, was not a very rigorous exercise (as you’ll quickly be able to tell by reading my code), but I thought it might be a little bit of fun. If you’re interested in performing a similar analysis, I’ve copied my code below for posterity’s sake. I would recommend doing a bit more preprocessing than I did.

%matplotlib inline 
import pandas as pd 
import numpy as np 
import matplotlib.pyplot as plt 
from scipy.io import wavfile
from scipy.cluster import hierarchy
import seaborn as sns

# I love you, Seaborn

# the names of my songs 
songs = [
         "A Dream Of You And Me",
         "An Ocean In Between The Waves",
         "April's Song",
         "Can't Do Without You",
         "Danny Glover",
         "Do It Again",
         "Down It Goes",
         "Draft Day",
         "Heavenly Father",
         "If It Wasn't True",
         "Lonely Richard",
         "Man Of The Year",
         "Match & Tinder",
         "Move That Dope",
         "Oh I'm A Wrecker",
         "Passing Out Pieces",
         "Tough Love",
         "Your Love Is Killing Me"
]

# go through songs and read in .wav data
song_wav = []

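for song in songs:
    temp_file = "data/wav/" + song + ".wav"
    # assumed step: wavfile.read returns (sample_rate, data); keep only the samples
    wavy = wavfile.read(temp_file)[1]
    song_wav.append(wavy)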

# find minimum length .wav data
min_wav_length = song_wav[0].shape[0]
min_pos = 0

for i, wav_data in enumerate(song_wav):
    if wav_data.shape[0] < min_wav_length:
        min_wav_length = wav_data.shape[0]
        min_pos = i
# report the shortest song (min_pos, not the leftover loop index i)
print(songs[min_pos] + ": " + str(min_wav_length))

# correct our data for minimum .wav data length, and 
# ravel so that we're dealing with 1D arrays for any given song
final_data = np.zeros((len(songs), 2 * min_wav_length))
for i, wav_data in enumerate(song_wav):
    final_data[i, :] = wav_data[:min_wav_length, :].ravel()

# check 
print(final_data.shape)

# make linkage matrix with the cosine distance metric
# (no method is given, so scipy falls back to its default, single linkage)
link_songs = hierarchy.linkage(final_data, metric="cosine")

# plot dendrogram
def llf(leaf_id):
    # map a leaf index back to its song title
    return songs[leaf_id]

dendro_song = hierarchy.dendrogram(link_songs, orientation="right", color_threshold=.9935, leaf_label_func=llf)
plt.gca().set_title("Top 25 Clustering")
plt.gca().set_xlim((1.0, 0.96))
plt.xticks((1.0, .99, .98, .97, .96))
plt.savefig("music_dendro.png", dpi=500)
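On the "a bit more preprocessing" recommendation: one rough idea of what that could look like is sketched below. Treat it as a sketch under stated assumptions, not something that was run for this post. The helper name `preprocess` and its `n_samples` and `block` parameters are hypothetical, and block averaging is only a crude stand-in for proper resampling.

```python
import numpy as np

def preprocess(wav_data, n_samples, block=100):
    """Hypothetical helper: mono mixdown, crude downsampling, and scaling."""
    x = np.asarray(wav_data, dtype=np.float64)
    # average the channels so stereo panning doesn't affect the comparison
    if x.ndim == 2:
        x = x.mean(axis=1)
    # truncate to a common length that is a whole number of blocks
    usable = (min(n_samples, x.shape[0]) // block) * block
    x = x[:usable]
    # crude downsampling: average non-overlapping blocks of samples
    x = x.reshape(-1, block).mean(axis=1)
    # remove any constant offset, then scale to unit length
    x = x - x.mean()
    norm = np.linalg.norm(x)
    return x / norm if norm > 0 else x

# usage on fake stereo data (random noise, standing in for a real .wav array)
fake_wav = (np.random.RandomState(0).randn(1000, 2) * 1000).astype(np.int16)
features = preprocess(fake_wav, n_samples=950, block=100)
print(features.shape)
```

Each song would then become one much shorter row of the matrix passed to `hierarchy.linkage`. Cosine distance already ignores overall loudness, so the unit-length scaling mostly matters if you try a different metric such as Euclidean.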