Do Rappers Have a Bigger Vocabulary Than Shakespeare?: A Data Scientist Maps Out the Answer


Each year brings us a new list of words that, once hip or subcultural, signal their admission into the mainstream by entering the pages (print or online) of the Oxford English Dictionary or Merriam-Webster’s. Many of those come from the world of hip hop. The form is a veritable laboratory of linguistic innovation, spawning dozens of region-specific argots that mutate and evolve beyond the capacity of hip lexicographers to document. One data scientist, Matt Daniels, has nonetheless made an interesting attempt in a project he calls “The Largest Vocabulary in Hip Hop.” Proceeding from the premise that certain rappers might match or best Shakespeare for the title of “largest vocabulary ever,” Daniels used a methodology called “token analysis” to analyze the lyrical content of “the most famous artists in hip hop.” He relied on Rap Genius transcriptions, which are only current to 2012, to produce a sample of 35,000 words per artist (the equivalent of 3–5 studio albums).
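For the curious, here is a rough Python sketch of what counting unique words in a fixed-size sample might look like. It is not Daniels’ code: the tokenizing rule (lowercase everything, keep runs of letters and apostrophes) and the function names `tokenize` and `unique_words_in_sample` are assumptions made purely for illustration.

```python
import re

def tokenize(text):
    """Lowercase the text and extract runs of letters/apostrophes.
    This rule is an assumption; the project's real tokenizer may differ."""
    return re.findall(r"[a-z']+", text.lower())

def unique_words_in_sample(text, sample_size=35_000):
    """Count distinct tokens among the first `sample_size` tokens of `text`."""
    tokens = tokenize(text)[:sample_size]
    return len(set(tokens))

# Tiny demonstration on a single line rather than a full 35,000-word corpus.
line = "To be, or not to be: that is the question."
print(unique_words_in_sample(line))
```

Capping every artist at the same number of tokens is what makes the counts comparable: a longer catalog would otherwise rack up more unique words simply by being longer.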

Topping the list by far, with a total of 7,392 unique words, is rapper Aesop Rock, who, Daniels admits, is somewhat obscure by comparison with Jay‑Z or Snoop Dogg. Better-known artists like Wu-Tang Clan, The Roots, and Outkast also rank highly, but what Daniels discovered is that many of the rappers near the top of the scale are underground or obscure artists who don’t sell millions of records. Occupying the lower end are some top-selling artists and household names like Lil Wayne, Kanye West, and Snoop Dogg (DMX is dead last at #85). King of the hill Jay‑Z, whose 2013 album Magna Carta…Holy Grail sold half a million copies in its first week, ranks somewhere in the middle. Daniels quotes the mega-selling rapper’s “Moment of Clarity,” from The Black Album, in which he plainly admits that he’ll write middlebrow lyrics for million-dollar sales figures: “I dumbed down for my audience to double my dollars” (one wonders how many listeners perceived the slight).

Daniels admits in an NPR interview that this is “not a serious academic study” but a project he undertook for the fun of it. A great many of the “unique words” counted in each rapper’s totals are slang coinages or variants like “pimps, pimp, pimping, and pimpin,” each of which counts separately. Even so, writes Daniels on the project’s site, “it’s still directionally interesting,” and it is sociologically interesting too. And of course, literary writers have been contributing made-up words to the general lexicon for centuries. See Daniels’ site for an interactive visualization (screenshot above) of the rankings of all 85 rappers surveyed.
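To see why those variants inflate the totals, consider this small Python sketch. It assumes that words are compared by their exact spelling after lowercasing and that no stemming or lemmatizing is applied; the helper name `count_unique` is hypothetical and not taken from the project.

```python
def count_unique(tokens):
    """Count distinct surface forms, ignoring case only (assumed rule)."""
    return len({token.lower() for token in tokens})

# The four variants mentioned above are four different spellings,
# so each one adds to the unique-word total.
variants = ["pimps", "pimp", "pimping", "pimpin"]
print(count_unique(variants))

# Only case differences collapse under this rule.
print(count_unique(["Pimp", "pimp"]))
```

A stemmer would likely fold several of these into one root, which would shrink every total. How much the rankings would shift is not something this sketch can answer.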

If you’re wondering who has a bigger vocabulary, Shakespeare or rappers, here’s the quick answer in purely numerical terms. In his sample of 35,000 words per artist, Daniels determined that Aesop Rock used 7,392 unique words (and Wu-Tang Clan, 5,895) against Shakespeare’s 5,000 unique words. And there you have it.

Josh Jones is a writer and musician based in Durham, NC. Follow him at @jdmagness.

Comments (4)
  • Hal says:

    I saw this yesterday (can’t remember where) and the comments were in favor of the rappers as somehow better writers than Shakespeare. If I had not scanned to the bottom of this post I wouldn’t have known that it is “not a serious academic study.”

    Does it really have to be pointed out that sometimes bigger is not better? It is not the size of the vocabulary but how it is put together.

    Many writers have a large vocabulary. James Michener claimed nearly 80,000 words. He would never have compared himself to the Bard.

    What bothers me is that a lot of young people take this seriously. In the words of one commenter:

    “To be, or not to be. What does that even mean?”

    I wonder if Elvis was keener on rocket science than astrophysicist Neil deGrasse Tyson?

  • Lando says:

    An important note with regard to comparing anyone on this chart with Shakespeare:

    “I used the first 5,000 words for 7 of Shakespeare’s works: Hamlet, Romeo and Juliet, Othello, Macbeth, As You Like It, Winter’s Tale, and Troilus and Cressida. For Melville, I used the first 35,000 words of Moby Dick.”

    A direct comparison between the rappers on this chart and Shakespeare or Melville is not reflective of full vocabulary size; it is just meant as a kind of visual reference point.

  • David says:

    This is a nonsense “study” that only proves that one can prove anything with faulty methodology. A person’s vocabulary isn’t limited to a period in their life, so picking a few plays only shows what words were needed for those plays. To really show anyone’s vocabulary, you’d need to count all the words they used in everything. And counting derivations of the same word, like “pimp,” as multiple words is idiocy.

    No doubt there are rappers with large vocabularies, but this doesn’t come close to demonstrating that, let alone showing that they’re beyond Shakespeare.

Open Culture was founded by Dan Colman.