Things are getting dumber. The Flesch-Kincaid grade level for each of the Presidential debates in 1960, 2008, 2012, 2016 and 2020.
Word Analysis of 2012 U.S. Presidential Debates
Obama vs. Romney / Biden vs. Ryan
This analysis explores word usage and lexical content of the 2012
US Presidential and Vice-Presidential debates. It is based on the same
approach I used to analyze the 2008 debates.
The purpose is to explore the structure of speech, as characterized
by the use of nouns, verbs, adjectives and adverbs, pronouns and noun
phrases. The speech patterns of opposing candidates are compared in an
effort to identify priorities, perspectives, characteristic values and personality traits.
I analyze the debates for the following:
- word frequency and distribution for different parts of speech
- words exclusive to a candidate, and those shared by both candidates
- complexity of noun phrases, which relate to independent concepts
- a general measure of complexity and repetition in speech, nicknamed the Windbag Index.
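The word-level comparisons in this list can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the analysis: the tokenizer is a simple regular expression, and the function names (`tokenize`, `compare_vocab`) and the sample sentences are made up for the example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into words (letters, with inner apostrophes or hyphens)."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())

def compare_vocab(text_a, text_b):
    """Return word frequencies for each speaker plus exclusive and shared word sets."""
    freq_a = Counter(tokenize(text_a))
    freq_b = Counter(tokenize(text_b))
    words_a, words_b = set(freq_a), set(freq_b)
    return {
        "freq_a": freq_a,
        "freq_b": freq_b,
        "only_a": words_a - words_b,   # words exclusive to speaker A
        "only_b": words_b - words_a,   # words exclusive to speaker B
        "shared": words_a & words_b,   # words used by both speakers
    }

# Hypothetical snippets, not taken from the debate transcripts
result = compare_vocab(
    "Folks deserve opportunity, and folks deserve jobs.",
    "Middle-income families deserve jobs.",
)
print(result["freq_a"].most_common(2))
print(sorted(result["only_a"]))
print(sorted(result["shared"]))
```

A real analysis would apply the same set operations separately to each part of speech after tagging, rather than to raw tokens.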
A formal debate serves as a great text for this kind of
analysis. The format is somewhat controlled: each speaker is subjected
to the same stimulus (question) and is given the same amount of time
to respond. This reduces the variation that would appear in an analysis of
interviews and other unscripted speech.
The transcript for each debate is parsed to identify the speaker, tag words with their part of speech (tagging), and identify noun phrases (chunking).
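The speaker-identification step can be sketched as follows. This assumes a transcript in which each turn begins with an upper-case speaker label followed by a colon (e.g. `OBAMA:`); the actual transcript format and the parser used for the analysis are not described here, so `split_by_speaker` is a hypothetical helper. Tagging and chunking would then be run on each speaker's text with a natural language processing toolkit.

```python
import re
from collections import defaultdict

# Assumed turn format: "NAME: text", with NAME in capitals at the start of a line
TURN = re.compile(r"^([A-Z][A-Z .'-]*):\s*(.*)$")

def split_by_speaker(transcript):
    """Group transcript lines by speaker; unlabeled lines continue the current turn."""
    turns = defaultdict(list)
    current = None
    for line in transcript.splitlines():
        line = line.strip()
        if not line:
            continue
        match = TURN.match(line)
        if match:
            current, text = match.group(1), match.group(2)
        else:
            text = line
        if current is not None and text:
            turns[current].append(text)
    return dict(turns)

# Made-up lines in the assumed format, not quotes from the debates
sample = """MODERATOR: Good evening.
OBAMA: Thank you very much.
It is an honor to be here.
ROMNEY: Thank you as well."""

for speaker, lines in split_by_speaker(sample).items():
    print(speaker, "->", " ".join(lines))
```

Keeping each speaker's text separate from the start means every later statistic can be computed per candidate without re-reading the transcript.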
The tagged and chunked transcripts are analyzed to determine:
- word frequency distribution for each candidate
- sentence size and proportion of unique words
- words exclusive to a candidate and those shared by both candidates
- frequency of concepts, as defined by part of speech pairings (e.g. noun/verb)
- complexity of noun phrases
- word clouds for a variety of word lists extracted from the transcripts (e.g. all nouns unique to Obama)
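Once words carry part-of-speech tags, counting concept pairs reduces to scanning adjacent tokens. The sketch below assumes the tokens are already tagged as `(word, tag)` tuples with Penn Treebank-style tags (`JJ` for adjectives, `NN`/`NNS` for nouns); the tag set, the hand-tagged sample and the helper names are assumptions for illustration, and the definition of a pair used in the analysis may differ.

```python
from collections import Counter

def pos_pairs(tagged, first_prefix, second_prefix):
    """Count adjacent word pairs whose tags start with the given prefixes."""
    pairs = Counter()
    for (w1, t1), (w2, t2) in zip(tagged, tagged[1:]):
        if t1.startswith(first_prefix) and t2.startswith(second_prefix):
            pairs[(w1.lower(), w2.lower())] += 1
    return pairs

def unique_fraction(tagged):
    """Fraction of distinct words among all words (case-insensitive)."""
    words = [w.lower() for w, _ in tagged]
    return len(set(words)) / len(words) if words else 0.0

# Hand-tagged, made-up sample (not from the transcripts)
tagged = [
    ("small", "JJ"), ("business", "NN"), ("helps", "VBZ"),
    ("middle-class", "JJ"), ("families", "NNS"), ("and", "CC"),
    ("small", "JJ"), ("business", "NN"),
]

print(pos_pairs(tagged, "JJ", "NN").most_common())
print(unique_fraction(tagged))
```

Matching on tag prefixes lets `NN` cover both singular and plural nouns; a noun/verb pairing would use the prefixes `NN` and `VB` in the same way.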
I attempt to quantify the overall complexity of speech by a metric
I call the Windbag Index, which is a product of 8 terms
each measuring uniqueness in different aspects of speech (more about Windbag Index).
A full description of each of the steps in the analysis is
available in the detailed methods section.
The analysis has some limitations.
Results and Commentary
Detailed results and comments are available for each debate.
Analysis of Barack Obama vs Mitt Romney (1st debate)
Analysis of Barack Obama vs Mitt Romney (2nd debate)
Analysis of Barack Obama vs Mitt Romney (3rd debate)
Analysis of Joe Biden vs Paul Ryan
Analysis of Barack Obama vs Mitt Romney (combined debates)
Analysis of Barack Obama (2008 vs 2012)
Each debate analysis report contains a great deal of data. Every report uses exactly the same format, which should make comparisons easier. To start, you may find these elements the most interesting:
Visualizing the Debates
tables & basic word clouds
Word usage tables describe the structural characteristics of speech by frequency of words, sentence size, proportion of unique and exclusive words and breakdown of words by part-of-speech • see example
Word clouds for each candidate, categorized by parts of speech. Obama promises "folks" "opportunity" • see example
Word clouds, categorized by ownership. Romney loves using "middle-income" • see example
Word clouds for concepts based on part-of-speech pairs. Obama focuses on "middle-class families" and "small business", to Romney's "federal tax". • see example
Candidates' Lexical Profiles
Word Usage Summary
Below are two summary tables from the full analysis of the first debate.
The Windbag Index is a compound measure that characterizes the complexity of speech. A low index indicates succinct speech with a low degree of repetition and a large number of independent concepts (details).
Word clouds below are colored by part of speech:
Words exclusive to Barack Obama (not spoken by Romney) in the first debate, colored by part of speech. Note the repeated use of "folks" and "opportunity".
Words exclusive to Mitt Romney (not spoken by Obama) in the first debate, colored by part of speech: "always", "lose" and "hurt". Ouch.
All nouns in debates, colored by contributing speaker (green = Obama, blue = Romney, grey = spoken by both).
All verbs in debates, colored by contributing speaker (green = Obama, blue = Romney, grey = spoken by both).
The content of the word list archive and the data structure syntax are described in the methods section.
Barack Obama vs Mitt Romney (1st debate): transcript, word lists, tag clouds, data structure
Barack Obama vs Mitt Romney (2nd debate): transcript, word lists, tag clouds, data structure
Barack Obama vs Mitt Romney (3rd debate): transcript, word lists, tag clouds, data structure
Joe Biden vs Paul Ryan: transcript, word lists, tag clouds, data structure
Barack Obama vs Mitt Romney (combined debates): transcript, word lists, tag clouds, data structure
Barack Obama (2008 vs 2012): transcript, word lists, tag clouds, data structure