Word Analysis of 2008 U.S. Presidential Debates

Joe Biden vs. Sarah Palin

2 October 2008



Word Statistics

Debate Word Count

Summary Word Count

The summary word count reports the total number of words and the number of unique, non-stop words used by each candidate. Word number is expressed as both absolute and relative values.

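Table 1. Number of all words and unique words used by each speaker.

speaker        words (a)   share (b)   unique (c)   ratio (d)
Joe Biden          7,375       49.2%        1,291       15.9%
Sarah Palin        7,615       50.8%        1,321       15.8%
all               14,990      100.0%        1,999       12.5%

Note: the ratios in (d) correspond to the unique non-stop word counts (1,174, 1,201 and 1,872; see Table 2), whereas column (c) lists unique words of all kinds.

Table 1 Analysis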

Palin is a little bit chattier than Biden, with 7,615 total words compared to his 7,375 (3.3% more). This difference is about half of the difference between Obama and McCain (Obama had 7-8% more words than McCain).

On the other hand, both Biden and Palin had a substantially lower unique word ratio, at 15.9% and 15.8%, respectively (compare 16.5% for Obama and 17.6% for McCain during the first debate).

Table 1 Legend

a c
b d
a :: total number of words
b :: proportion of words in the debate
c :: unique words in (a)
d :: (c) relative to (a)
bar :: proportion of (a-c):c

Stop Word Contribution

In the table below, the candidates' delivery is partitioned into stop and non-stop words. Stop words are frequently-used bridging words (e.g. pronouns and conjunctions) and do not carry inherent meaning. The fraction of words that are stop words is one measure of the complexity of speech.
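The partition described above can be sketched in a few lines of Python. This is only an illustration: the stop-word list below is a tiny hypothetical one (the list used for the analysis is not given here), and the tokenizer is a simple regular expression rather than whatever the original analysis used.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for illustration).
STOP_WORDS = {"the", "a", "an", "and", "or", "but", "i", "we", "you",
              "he", "she", "it", "to", "of", "in", "that", "is", "are"}


def word_stats(text, stop_words=STOP_WORDS):
    """Return total/unique counts for all, stop and non-stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    stop = [w for w in words if w in stop_words]
    non_stop = [w for w in words if w not in stop_words]

    def summarize(group):
        total = len(group)
        unique = len(set(group))
        return {
            "total": total,                                   # (a)
            "share": total / len(words) if words else 0.0,    # (b), relative to all words
            "unique": unique,                                 # (c)
            "unique_ratio": unique / total if total else 0.0, # (d)
        }

    return {"all": summarize(words),
            "stop": summarize(stop),
            "non-stop": summarize(non_stop)}


stats = word_stats("We need change, and we need it now. The change is coming.")
for category, values in stats.items():
    print(category, values)
```

A higher non-stop share and a higher unique ratio both point towards less repetitive speech, which is how the values in the table are read.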

Table 2. Expanded analysis of total, stop and non-stop word count.

speaker        category    words (a)   share (b)   unique (c)   ratio (d)
Joe Biden      all             7,375       49.2%        1,291       17.5%
               stop            3,898       52.9%          117        3.0%
               non-stop        3,477       47.1%        1,174       33.8%
Sarah Palin    all             7,615       50.8%        1,321       17.3%
               stop            4,274       56.1%          120        2.8%
               non-stop        3,341       43.9%        1,201       35.9%
all            all            14,990      100.0%        1,999       13.3%
               stop            8,172       54.5%          127        1.6%
               non-stop        6,818       45.5%        1,872       27.5%

Table 2 Analysis

Biden has a relatively high non-stop word component in his speech, at 47.1%, higher than Palin (43.9%), Obama (43.4% for the first debate) and McCain (44.3% for the first debate). Unfortunately, he also suffers from a low unique word ratio, as mentioned above.

Table 2 Legend

a c
b d
a :: total number of words, for a given category (all, stop, non-stop)
b :: (a) relative to words in the debate if category=all, otherwise relative to words by the candidate
c :: number of unique words with set (a)
d :: (c) relative to (a)
bar :: proportion of (a-c):c

All further analysis uses debate content that has been filtered for stop words.

Word frequency

The word frequency table summarizes the frequency with which words were used. Specifically, it reports the average word frequency and the weighted cumulative frequencies at the 50th and 90th percentiles. The average word frequency indicates how many times, on average, a word is used. For a given fraction of the entire delivery, the weighted cumulative frequency indicates the largest word frequency within this fraction (details about weighted cumulative distribution).
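One way to read the weighted cumulative frequency is sketched below in Python. The interpretation is an assumption, not the documented method: distinct words are sorted by how often they occur, each word is weighted by its own count, and the reported value is the frequency of the word at which the running total first covers the requested fraction of the delivery.

```python
from collections import Counter


def frequency_summary(words, fractions=(0.5, 0.9)):
    """Average word frequency plus weighted cumulative frequencies."""
    if not words:
        raise ValueError("need at least one word")
    counts = Counter(words)
    total = sum(counts.values())           # number of words delivered
    summary = {"average": total / len(counts)}
    frequencies = sorted(counts.values())  # least repeated words first
    for fraction in fractions:
        covered = 0
        for frequency in frequencies:
            covered += frequency           # weight each word by its own count
            if covered >= fraction * total:
                summary[fraction] = frequency
                break
    return summary


# Toy delivery: 10 words, 4 of them distinct.
words = ["tax"] * 5 + ["war"] * 3 + ["energy"] + ["change"]
summary = frequency_summary(words)
print(summary)
```

In the toy delivery, half of the content is covered by words used at most 3 times, while covering 90% requires including the word used 5 times. A large gap between the 50% and 90% values therefore signals a few heavily repeated words.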

Table 3. Average, 50%, and 90% weighted cumulative word frequencies (content filtered for stop words).

speaker        average (a)   50% (b)   90% (c)
Joe Biden             2.96      5.00     26.00
Sarah Palin           2.78      5.00     20.00
all                   3.64      7.00     44.00

Table 3 Analysis

Biden is highly repetitive, with an average word frequency of 2.96 (6.5% larger than Palin's 2.78, 12.5% larger than Obama's 2.63, and 17.9% larger than McCain's 2.51).

Palin's speech is also more repetitive than Obama's and McCain's, with a 2.78 average word frequency (5.7% higher than Obama and 10.8% higher than McCain).

Table 3 Legend
a b c
a :: average word frequency
b :: largest word frequency in 50% of content
c :: largest word frequency in 90% of content
bar :: proportion of a:b:c

Sentence Size

Table 4. Number of words in a sentence, as measured by average number of words, 50% and 90% weighted cumulative values for three word groups (all words, stop words and non-stop words).

speaker        word type   average (a)   50% (b)   90% (c)
Joe Biden      all                16.0      22.0      49.0
               stop                8.6      12.0      27.0
               non-stop            7.6      11.0      23.0
Sarah Palin    all                18.7      26.0      55.0
               stop               10.7      15.0      30.0
               non-stop            8.5      12.0      25.0
all            all                17.2      24.0      53.0
               stop                9.6      14.0      29.0
               non-stop            8.0      11.0      25.0

Table 4 Analysis

Palin's sentences are enormous, by far the longest of all speakers. She comes in at an average sentence length of 8.5 non-stop words (Biden 7.6, Obama 7.7, McCain 7.1). Biden's sentence length is virtually the same as Obama's, though Biden uses fewer stop words than Obama (8.6 vs 10.0 per sentence).

Table 4 Legend
a b c
a :: average sentence size
b :: largest sentence size for 50% of content
c :: largest sentence size for 90% of content
bar :: proportion of a:b:c

Part of Speech Analysis

In this section, word frequency is broken down by part of speech (POS). The four POS groups examined are nouns, verbs, adjectives and adverbs. Conjunctions and prepositions are not considered. The first category (n+v+adj+adv) is composed of all four POS groups.

Part of Speech Count

Table 5. Count of words (total and unique) categorized by part of speech (POS).

speaker        part of speech     words (a)   share (b)   unique (c)   ratio (d)
Joe Biden      n+v+adj+adv            3,294      100.0%        1,139       34.6%
               nouns (n)              1,893       57.5%          640       33.8%
               verbs (v)                829       25.2%          359       43.3%
               adjectives (adj)         399       12.1%          163       40.9%
               adverbs (adv)            173        5.3%           65       37.6%
Sarah Palin    n+v+adj+adv            3,213      100.0%        1,167       36.3%
               nouns (n)              1,767       55.0%          651       36.8%
               verbs (v)                850       26.5%          377       44.4%
               adjectives (adj)         415       12.9%          182       43.9%
               adverbs (adv)            181        5.6%           61       33.7%
all            n+v+adj+adv            6,507      100.0%        1,824       28.0%
               nouns (n)              3,660       56.2%        1,018       27.8%
               verbs (v)              1,679       25.8%          592       35.3%
               adjectives (adj)         814       12.5%          276       33.9%
               adverbs (adv)            354        5.4%          100       28.2%

Table 5 Analysis

Biden and Palin are evenly matched in their proportion of parts of speech. Biden tends to repeat his adverbs less than Palin (adverb uniqueness for Biden is 37.6% vs 33.7% for Palin), but repeats his nouns more (noun uniqueness for Biden is 33.8% vs 36.8% for Palin).

Both Biden and Palin show significantly greater noun and adjective repetition than Obama and McCain. For example, uniqueness for nouns was 39.1% and 41.1% for Obama and McCain, respectively, but drops to 33.8% and 36.8% for Biden and Palin. Overall adjective use by Biden and Palin (12.1% and 12.9%) is lower than Obama and McCain (15.1% and 14.2%).

Table 5 Legend
a c
b d
a :: total number of words for a given POS (all, noun, verb, adjective, adverb)
b :: (a) relative to all words by candidate
c :: unique words in (a)
d :: (c) relative to (a)
bar :: proportion of (a-c):c

Part of Speech Frequency

Table 5. Frequency of words by part of speech (POS).

speaker        part of speech     average (a)   50% (b)   90% (c)
Joe Biden      n+v+adj+adv               2.89       5.0        26
               nouns (n)                 2.96       5.0        35
               verbs (v)                 2.18       3.0        15
               adjectives (adj)          2.45       3.0        12
               adverbs (adv)             2.66       4.0        22
Sarah Palin    n+v+adj+adv               2.75       4.0        20
               nouns (n)                 2.71       5.0        18
               verbs (v)                 2.13       3.0        13
               adjectives (adj)          2.28       3.0        11
               adverbs (adv)             2.97       5.0        27
all            n+v+adj+adv               3.57       7.0        40
               nouns (n)                 3.60       7.0        53
               verbs (v)                 2.67       4.0        28
               adjectives (adj)          2.95       5.0        20
               adverbs (adv)             3.54      10.0        29

Table 5 Analysis

Increased repetition in the speech of Biden and Palin is clearly demonstrated by this table. Average frequency for all parts of speech is increased (except Biden's adverbs), with verbs seeing the smallest increase (2.18/2.13 Biden/Palin vs 2.10/2.06 Obama/McCain) and nouns the largest (2.96/2.71 Biden/Palin vs 2.56/2.44 Obama/McCain). Biden's adverb repetition is nearly as low as McCain's.

In this debate, as in others, verbs are the least repeated words.

Table 5 Legend
a b c
a :: average word frequency
b :: largest word frequency in 50% of content
c :: largest word frequency in 90% of content
bar :: proportion of a:b:c

Part of Speech Pairing

Through word pairing, I attempt to capture the contextual use of parts of speech within a sentence and extract concepts from the text. Specifically, unique pairs of words indicate complexity and inter-relatedness between concepts in a sentence.
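A minimal Python sketch of such pairing is shown below. It assumes that every unordered pair of words within a sentence is counted and that the words have already been tagged by some POS tagger; neither detail is confirmed by the text above, and the tags and sentences are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def pos_pairs(tagged_sentences):
    """Count total and unique word pairs for each POS pairing.

    Each sentence is a list of (word, pos) tuples.
    Returns {"noun/verb": (total_pairs, unique_pairs), ...}.
    """
    totals = Counter()
    unique = {}
    for sentence in tagged_sentences:
        for (w1, p1), (w2, p2) in combinations(sentence, 2):
            # Order by POS so that noun/verb and verb/noun fall in one category.
            first, second = sorted(((p1, w1), (p2, w2)))
            category = f"{first[0]}/{second[0]}"
            totals[category] += 1
            unique.setdefault(category, set()).add((first[1], second[1]))
    return {c: (totals[c], len(unique[c])) for c in totals}


sentences = [
    [("taxes", "noun"), ("raise", "verb"), ("middle", "adjective"), ("class", "noun")],
    [("taxes", "noun"), ("cut", "verb"), ("class", "noun")],
]
pairs = pos_pairs(sentences)
print(pairs)
```

Here the pair taxes/class occurs in both sentences, so the noun/noun category has two pairs in total but only one unique pair. A unique-pair ratio below 100% is what the tables below report as repetition.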

Table 6a (Joe Biden). Word pairs (total and unique) categorized by part of speech (POS) for Joe Biden.

pairing                 pairs (a)   share (b)   unique (c)   ratio (d)
noun/noun                   6,268       30.3%        4,687       74.8%
verb/noun                   5,073       24.5%        4,026       79.4%
verb/verb                     892        4.3%          760       85.2%
adjective/noun              2,734       13.2%        1,994       72.9%
adjective/verb              1,051        5.1%          802       76.3%
adjective/adjective           259        1.3%          187       72.2%
adverb/noun                   928        4.5%          793       85.5%
adverb/verb                   386        1.9%          339       87.8%
adverb/adjective              192        0.9%          156       81.2%
adverb/adverb                  52        0.3%           44       84.6%

Table 6b (Sarah Palin). Word pairs (total and unique) categorized by part of speech (POS) for Sarah Palin.

pairing                 pairs (a)   share (b)   unique (c)   ratio (d)
noun/noun                   5,911       28.5%        4,513       76.3%
verb/noun                   5,210       25.1%        4,314       82.8%
verb/verb                   1,062        5.1%          930       87.6%
adjective/noun              2,769       13.3%        2,239       80.9%
adjective/verb              1,091        5.3%          920       84.3%
adjective/adjective           283        1.4%          244       86.2%
adverb/noun                 1,206        5.8%        1,008       83.6%
adverb/verb                   594        2.9%          512       86.2%
adverb/adjective              294        1.4%          246       83.7%
adverb/adverb                  73        0.4%           63       86.3%

Table 6c (Joe Biden vs Sarah Palin). Word pairs (total and unique) categorized by part of speech (POS) for both candidates.

pairing                 Biden pairs (a)   Palin pairs (c)   Palin/Biden (d)   Biden unique (b)   Palin unique (e)
noun/noun                         6,268             5,911             94.3%              74.8%              76.3%
verb/noun                         5,073             5,210            102.7%              79.4%              82.8%
verb/verb                           892             1,062            119.1%              85.2%              87.6%
adjective/noun                    2,734             2,769            101.3%              72.9%              80.9%
adjective/verb                    1,051             1,091            103.8%              76.3%              84.3%
adjective/adjective                 259               283            109.3%              72.2%              86.2%
adverb/noun                         928             1,206            130.0%              85.5%              83.6%
adverb/verb                         386               594            153.9%              87.8%              86.2%
adverb/adjective                    192               294            153.1%              81.2%              83.7%
adverb/adverb                        52                73            140.4%              84.6%              86.3%

Table 6 Analysis

Because Palin's sentences were longer than Biden's, she is expected to have more word pairings. Indeed, her adverb/* and verb/verb pairing values are about 120-155% those of Biden. Interestingly, Palin had fewer noun/noun pairs than Biden (94.3% of Biden's).

Both candidates here showed greater repetition of pairs when compared to Obama and McCain. For example, Biden/Palin unique noun/noun pairs accounted for 74.8%/76.3%, in comparison to 81.5%/79.7% for Obama/McCain.

However, between Biden and Palin, Palin had a higher ratio of unique pairs than Biden in every category except adverb/noun and adverb/verb.

Table 6a,b Legend
a c
b d
a :: total number of pairs, for a given category (e.g. verb/noun)
b :: (a) relative to all pairs
c :: number of unique pairs within set (a)
d :: (c) relative to (a)
bar :: proportion of (a-c):c
Table 6c Legend
a c
  d
b e
a :: total number of pairs for Joe Biden
b :: relative unique pairs for Joe Biden
c :: total pairs for Sarah Palin
d :: (c) relative to (a) (i.e. Sarah Palin relative to Joe Biden)
e :: relative unique pairs for Sarah Palin
bars :: values of (a), (b), (c) and (e)

Word usage

This section enumerates words that were unique to a candidate (i.e. used by one candidate but not the other). For a given part of speech, the table breaks down the number of words that were spoken by only one of the candidates or by both candidates (intersection). The last row includes all words (union).
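The breakdown maps directly onto set operations. The Python sketch below uses made-up word lists and hypothetical names (`usage_breakdown`, `a_only`, `b_only`); it counts, for each group, both the number of distinct words and the total number of times those words were spoken by either candidate.

```python
def usage_breakdown(words_a, words_b):
    """Split the vocabulary of two speakers into exclusive, shared and all words."""
    set_a, set_b = set(words_a), set(words_b)
    groups = {
        "a_only": set_a - set_b,  # words used only by speaker A
        "b_only": set_b - set_a,  # words used only by speaker B
        "both": set_a & set_b,    # intersection
        "all": set_a | set_b,     # union
    }
    combined = list(words_a) + list(words_b)
    return {
        name: {
            "total": sum(1 for w in combined if w in vocabulary),
            "unique": len(vocabulary),
        }
        for name, vocabulary in groups.items()
    }


speaker_a = ["tax", "tax", "energy", "war", "fair"]
speaker_b = ["energy", "energy", "war", "huge"]
breakdown = usage_breakdown(speaker_a, speaker_b)
print(breakdown)
```

Because the three groups (A only, B only, both) do not overlap, their totals and unique counts add up to the values in the union row, which is also the case in Table 7.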

Table 7. Total and unique words used exclusively by a candidate or by both candidates.

speaker        part of speech     words (a)   share (b)   of all (c)   unique (d)   ratio (e)   of all unique (f)
Joe Biden      n+v+adj+adv            1,041      100.0%        16.0%          657       63.1%               36.0%
               nouns (n)                585       56.2%        16.0%          367       62.7%               36.1%
               verbs (v)                305       29.3%        18.2%          215       70.5%               36.3%
               adjectives (adj)         151       14.5%        18.6%           94       62.3%               34.1%
               adverbs (adv)             59        5.7%        16.7%           39       66.1%               39.0%
Sarah Palin    n+v+adj+adv            1,100      100.0%        16.9%          685       62.3%               37.6%
               nouns (n)                617       56.1%        16.9%          378       61.3%               37.1%
               verbs (v)                337       30.6%        20.1%          233       69.1%               39.4%
               adjectives (adj)         184       16.7%        22.6%          113       61.4%               40.9%
               adverbs (adv)             58        5.3%        16.4%           35       60.3%               35.0%
both           n+v+adj+adv            4,366      100.0%        67.1%          482       11.0%               26.4%
               nouns (n)              2,458       56.3%        67.2%          273       11.1%               26.8%
               verbs (v)              1,037       23.8%        61.8%          144       13.9%               24.3%
               adjectives (adj)         479       11.0%        58.8%           69       14.4%               25.0%
               adverbs (adv)            237        5.4%        66.9%           26       11.0%               26.0%
all            n+v+adj+adv            6,507      100.0%       100.0%        1,824       28.0%              100.0%
               nouns (n)              3,660       56.2%       100.0%        1,018       27.8%              100.0%
               verbs (v)              1,679       25.8%       100.0%          592       35.3%              100.0%
               adjectives (adj)         814       12.5%       100.0%          276       33.9%              100.0%
               adverbs (adv)            354        5.4%       100.0%          100       28.2%              100.0%

Table 7 Analysis

The breakdown of nouns that were exclusive to Biden or Palin, or those that were spoken by both, was nearly the same as for Obama/McCain. Both Biden and Palin each contributed about 36-37% to the unique nouns, with 27% of unique nouns in the debate spoken by both.

The profile of verb, adjective and adverb use is also very similar to the first Obama/McCain debate. When it comes to contribution to unique adjectives, Biden contributed 34.1% of the unique adjectives, which is similar to McCain at 33.2%, and Palin contributed 40.9%, which is similar to Obama at 39.8%.

When all parts of speech are considered, the breakdown of contribution to the unique pool is at 36-38% (very similar to 37% seen for both McCain and Obama).

Table 7c Legend
a d
b e
c f
a :: total number of words unique to a candidate, for a given POS group
b :: (a) relative to all unique words to the candidate
c :: (a) relative to all words
d :: unique words in (a)
e :: (d) relative to (a)
f :: (d) relative to all unique words
bar1 :: normalized ratio of (a-d):d
bar2 :: absolute ratio of (a-d):d for all POS groups (first column) or POS group (other columns)

Noun Phrase Usage

Noun phrases were extracted from the text and analyzed for frequency, word count, unique word count and richness.

Top-level noun phrases are those without a parent noun phrase (a parent phrase is one that is a similar, longer phrase). Derived noun phrases are those with a parent (more details about noun phrase analysis).

The top-level noun phrases can be interpreted as independent concepts. Derived noun phrases can be interpreted as variants on concepts embodied by the top-level phrases.
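The distinction can be illustrated with a short Python sketch. It rests on a simplifying assumption that is not taken from the analysis itself: a phrase counts as a parent of another when it is longer and contains the other phrase as a contiguous run of words. The linked noun phrase details may define "similar" differently.

```python
def contains(longer, shorter):
    """True if `shorter` occurs as a contiguous run of words inside `longer`."""
    n, m = len(longer), len(shorter)
    return m < n and any(longer[i:i + m] == shorter for i in range(n - m + 1))


def classify_phrases(phrases):
    """Split distinct noun phrases into top-level and derived phrases."""
    tokenized = {p: tuple(p.lower().split()) for p in set(phrases)}
    top_level, derived = [], []
    for phrase, words in tokenized.items():
        # Assumption: any longer phrase that contains this one is its parent.
        has_parent = any(contains(other, words) for other in tokenized.values())
        (derived if has_parent else top_level).append(phrase)
    return sorted(top_level), sorted(derived)


phrases = ["tax cut", "middle class tax cut", "energy policy", "tax cut"]
top_level, derived = classify_phrases(phrases)
print("top-level:", top_level)
print("derived:", derived)
```

Under this assumption "tax cut" is derived, because "middle class tax cut" is a longer phrase that contains it, while the other two phrases stand as independent concepts.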

Noun Phrase Count

This table reports the absolute number of noun phrases, which is related to the number of total words (specifically, nouns) delivered. The next table presents the number of phrases relative to the number of nouns.

Table 8. Number of noun phrases.

speaker        noun phrases   phrases (a)   share (b)   unique (c)   ratio (d)
Joe Biden      all                    923      100.0%          773       83.7%
               top-level              369       40.0%          352       95.4%
               derived                554       60.0%          421       76.0%
Sarah Palin    all                    856      100.0%          733       85.6%
               top-level              380       44.4%          373       98.2%
               derived                476       55.6%          360       75.6%

Table 8 Analysis

Biden delivered significantly more noun phrases than Palin, at 923 vs 856 (7.8% more). However, only 40.0% of his phrases were top-level, whereas Palin had a fraction of 44.4%. Both values are lower than for Obama and McCain (who had 46.7% and 45.1%, respectively), and Biden's value was roughly 4 to 7 percentage points lower than those of Obama, McCain and Palin.

Table 8c Legend
a c
b d
a :: number of noun phrases
b :: (a) relative to number of all noun phrases
c :: number of unique phrases
d :: (c) relative to (a)
bar :: normalized ratio of (a-c):c

Noun Phrase Richness

The previous table presented the total number of noun phrases, which can be equated to individual concepts. In this table, this value is shown relative to the number of nouns used. This ratio can be interpreted as richness: how many noun phrases were constructed per noun.
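As a worked example, the richness ratios for all noun phrases can be recomputed from the counts reported elsewhere on this page: the noun counts in Table 5 and the noun phrase counts in Table 8. The function name `richness` is just a label chosen for this sketch.

```python
def richness(phrases, unique_phrases, nouns, unique_nouns):
    """Return (phrases per noun, unique phrases per unique noun)."""
    return phrases / nouns, unique_phrases / unique_nouns


# Counts taken from Table 8 (noun phrases) and Table 5 (nouns).
biden = richness(phrases=923, unique_phrases=773, nouns=1893, unique_nouns=640)
palin = richness(phrases=856, unique_phrases=733, nouns=1767, unique_nouns=651)

print(f"Joe Biden:   {biden[0]:.2f} {biden[1]:.2f}")
print(f"Sarah Palin: {palin[0]:.2f} {palin[1]:.2f}")
```

The same division applied to the top-level and derived counts gives the remaining values in the table.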

Table 9. Number of noun phrases relative to the number of nouns.

speaker        noun phrases   phrases per noun (a)   unique phrases per unique noun (b)
Joe Biden      all                            0.49                                 1.21
               top-level                      0.19                                 0.55
               derived                        0.29                                 0.66
Sarah Palin    all                            0.48                                 1.13
               top-level                      0.22                                 0.57
               derived                        0.27                                 0.55

Table 9 Analysis

It looks like Biden and Palin managed to create a greater diversity of noun phrases with their nouns. Biden's value is the highest at 1.21 (ratio of unique noun phrases to unique nouns). Palin's value, 1.13, is higher than McCain (1.07), but lower than Obama (1.16).

When top-level noun phrases are considered (those without a parent noun phrase), Biden has the lowest value (0.55), compared with Palin (0.57), McCain (0.56) and Obama (0.61). Given that his ratio for all phrases is the highest and his ratio for top-level phrases is the lowest, it can be concluded that he repeats concepts more frequently and recycles his nouns more extensively, to construct similar noun phrases, than the other candidates.

Table 9c Legend
a b
a :: ratio of the number of noun phrases to number of nouns
b :: ratio of the number of unique noun phrases to number of unique nouns
bar :: ratio of a:b

Noun Phrase Frequency and Size

Table 10. Noun phrase frequency, word count and unique word count.

speaker        measure              average (a)   50% (b)   90% (c)
Joe Biden      frequency                   1.19      1.00      3.00
               word count                  2.88      4.00      8.00
               unique word count           2.81      4.00      7.00
Sarah Palin    frequency                   1.17      1.00      3.00
               word count                  2.88      4.00      9.00
               unique word count           2.82      4.00      8.00

Table 10 Analysis

Both frequency and word count for Biden and Palin were nearly identical. Compared to Obama and McCain, both Biden and Palin had slightly longer noun phrases.

Table 10c Legend
a b c
a :: average
b :: 50% weighted cumulative value
c :: 90% weighted cumulative value
bar1 :: normalized ratio of a:b:c

Windbag Index

The Windbag Index is a compound measure that characterizes the complexity of speech. A low index is indicative of succinct speech with a low degree of repetition and a large number of independent concepts.

Table 11. Windbag Index for each speaker. The higher the value, the greater the degree of repetition in the speech.

speaker        index value   relative to other speaker
Joe Biden              606                      +13.4%
Sarah Palin            535                      -11.8%

Index terms (value, with the difference relative to the other speaker on the line below):

speaker            t1       t2       t3       t4       t5       t6       t7       t8       t9
Joe Biden       0.471    0.338    0.338    0.433    0.409    0.376    0.837    0.455    1.208
                +7.5%    -6.1%    -8.2%    -2.4%    -6.8%   +11.5%    -2.2%   -10.5%    +7.3%
Sarah Palin     0.439    0.359    0.368    0.444    0.439    0.337    0.856    0.509    1.126
                -6.9%    +6.5%    +9.0%    +2.4%    +7.4%   -10.3%    +2.2%   +11.7%    -6.8%

Table 11 Analysis

A spectacularly windbaggy showing for Biden, truly earning his nickname. His Windbag Index was 606, 13.4% higher than Palin (535), +43.6% higher than Obama (422) and +64.7% higher than McCain (368).

Biden's index is higher than Palin's due to poor performance in unique non-stop words, nouns, verbs, adjectives, unique noun phrases and top-level noun phrases. In fact, the only places where Biden does better are the non-stop word ratio, adverbs and the ratio of unique noun phrases to unique nouns.

Table 11c Legend
The Windbag Index is 1/(t1*t2*...*t9) where t1,t2,...,t9 are the individual terms. These terms are

t1 :: fraction of words which are non-stop
t2 :: fraction of non-stop words which are unique
t3 :: fraction of nouns which are unique
t4 :: fraction of verbs which are unique
t5 :: fraction of adjectives which are unique
t6 :: fraction of adverbs which are unique
t7 :: fraction of noun phrases which are unique
t8 :: fraction of noun phrases which have no parent
t9 :: ratio of unique noun phrases to unique nouns

Note that large individual terms t1...t9 contribute to a smaller index.

The percentage values below the index and each term are relative differences to the other speaker's corresponding term (i.e. 100*(x-x0)/x0 where x is the value for the present speaker and x0 for the other speaker).
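The formula and the relative difference can be checked with a few lines of Python. The term values below are the unrounded figures behind Table 11 as published on the original page; the function names are chosen for this sketch only.

```python
from math import prod


def windbag_index(terms):
    """Windbag Index: reciprocal of the product of the terms t1..t9."""
    return 1.0 / prod(terms)


def relative_difference(x, x0):
    """Percentage difference of x relative to x0, i.e. 100*(x-x0)/x0."""
    return 100.0 * (x - x0) / x0


# Terms t1..t9 for each speaker (unrounded values behind Table 11).
BIDEN_TERMS = [0.471457627118644, 0.337647397181478, 0.338087691494982,
               0.433051869722557, 0.408521303258145, 0.375722543352601,
               0.837486457204767, 0.455368693402329, 1.2078125]
PALIN_TERMS = [0.438739330269206, 0.359473211613289, 0.368421052631579,
               0.443529411764706, 0.43855421686747, 0.337016574585635,
               0.856308411214953, 0.508867667121419, 1.12596006144393]

biden_index = windbag_index(BIDEN_TERMS)
palin_index = windbag_index(PALIN_TERMS)

print(f"Joe Biden:   {biden_index:.0f} ({relative_difference(biden_index, palin_index):+.1f}%)")
print(f"Sarah Palin: {palin_index:.0f} ({relative_difference(palin_index, biden_index):+.1f}%)")
```

Since the index is a reciprocal of a product, a 10% drop in any single term raises the index by about 11%, regardless of which term it is.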

Tag Clouds

In the tag clouds below, the size of the word is proportional to the number of times it was used by a candidate (tag cloud details).

Not all words from a group used to draw the cloud fit in the image. Specifically, less frequently used words for large word groups fall outside the image.

Debate Tag Clouds for Each Candidate — All Words

Each candidate's debate portion was extracted and frequencies were compiled for each part of speech (noun, verb, adjective, adverb), with words colored by their part of speech category. The words in these tag clouds include words unique to one candidate as well as words used by both candidates. For other tag clouds below, only words unique to a candidate are used.

Keep in mind that the word sizes between tag clouds cannot be directly compared, since the minimum and maximum size of the words in each tag cloud is the same. However, the distribution of sizes within a tag cloud reflects the frequency distribution of words (tag cloud details).
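The reason sizes cannot be compared between clouds is easiest to see in code. The Python sketch below assumes a simple linear scaling between a fixed minimum and maximum font size; the actual scaling used to draw the clouds is described in the tag cloud details and may differ, and the size limits here are arbitrary.

```python
def font_sizes(counts, min_size=10.0, max_size=48.0):
    """Map word counts to font sizes, linearly between min_size and max_size."""
    lowest, highest = min(counts.values()), max(counts.values())
    span = highest - lowest
    if span == 0:
        # All words equally frequent: draw them all at the same size.
        return {word: max_size for word in counts}
    scale = (max_size - min_size) / span
    return {word: min_size + (count - lowest) * scale
            for word, count in counts.items()}


cloud_a = font_sizes({"tax": 21, "energy": 11, "war": 1})
cloud_b = font_sizes({"deal": 200, "plan": 2})
print(cloud_a)
print(cloud_b)
```

In this example "tax" (21 uses) and "deal" (200 uses) are drawn at the same maximum size, because each cloud is scaled to its own range. Only the relative sizes within one cloud carry information.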

Debate Tag Cloud for Joe Biden — all words

Debate tag cloud for Joe Biden

Debate Tag Cloud for Sarah Palin — all words

Debate tag cloud for Sarah Palin
Debate Tag Cloud Analysis

Recall that Obama's tag cloud featured his debate opponent prominently: Obama used both "John" and "McCain" frequently. However, for Biden neither "Sarah" nor "Palin" appears among his frequently used words. Instead, he refers to "Obama", "McCain", "John" and, unlike McCain, to "Barack".

Palin also mentions "John" and "McCain", and "Barack" and "Obama", but neither "Joe" nor "Biden" is within the center of her cloud.

Debate Tag Clouds for Each Candidate — Unique Words

The tag clouds below show only words used exclusively by a candidate. For example, if candidate A used the word "invest" (any number of times), but the other candidate B did not, then the word will appear in the unique word tag cloud for candidate A.

Debate Tag Cloud for Joe Biden — words unique to Joe Biden

Debate tag cloud for Joe Biden

Debate Tag Cloud for Sarah Palin — words unique to Sarah Palin

Debate tag cloud for Sarah Palin
Unique Word Tag Cloud Analysis

Biden repeats his top nouns significantly. There are hardly any small (i.e. infrequently used) words in his noun tag cloud. Compare this to Obama's noun cloud, which is richly populated with small words. Take a look at some curious words exclusive to Biden, such as "heterosexual", "dirty" and "genocide".

Palin's tag cloud morphology is similar to McCain's, with a large component of large words. And, like Biden, Palin's repetition of words drowns out less frequently used words from her tag cloud. Curious words for Palin are "lame", "nice", and "blunders".

Part of Speech Tag Clouds

In these tag clouds, words by both candidates were categorized on the basis of exclusivity to a candidate. Words unique to each candidate are drawn with a different color. Words used by both candidates are shown in grey.

The size of the word is relative to the frequency for the candidate — word sizes between candidates should not be used to indicate difference in absolute frequency.

Words were further categorized by part of speech (noun, verb, adjective, adverb) and individual tag clouds were prepared for each category.

The last tag cloud in this section uses all four parts of speech (noun + verb + adjective + adverb).

Tag Cloud of noun words, by speaker

Noun Tag Cloud Analysis

The noun tag cloud is heavily weighted towards unique words by Palin. Unlike the two primary words in the noun tag cloud for Obama/McCain (which were "Obama" and "John"), we do not see the corresponding pairing (e.g. "Palin"/"Sarah" and "Joe"/"Biden") in this cloud. Both "Palin" and "Sarah" are conspicuously absent from the center of the cloud.

Palin's overwhelming contribution to the center of the tag cloud suggests that she touched on, and repeated, concepts not raised by Biden.

Tag Cloud of verb words, by speaker

Verb Tag Cloud Analysis

Contribution by Biden and Palin to the verb tag cloud is more equal. There is a large number of frequently used verbs by either candidate, or both, and therefore the cloud contains relatively few words (when compared to the corresponding cloud for McCain/Obama).

Tag Cloud of adjective words, by speaker

Adjective Tag Cloud Analysis

Biden presses with "fair" and "wealthy", words not used by Palin, whereas Palin uses "right" and "huge". Biden uses "unpatriotic".

Tag Cloud of adverb words, by speaker

Adverb Tag Cloud Analysis

Like Obama, Biden loves "absolutely"; it is exclusive to him in this debate and is his favourite adverb. Biden also uses "contemporaneously", a great word and one that was likely lost on many people.

Tag Cloud of all words, by speaker

All Tag Cloud Analysis

The cloud of all parts of speech clearly illustrates just how much more exclusive and repeated content was found in Palin's delivery.

The cloud also nicely contrasts with the corresponding cloud for the McCain/Obama debate — the vice-presidential debate in general saw a great deal more repetition.

Word Pair Vignette Tag Clouds for Each Candidate

Tag Cloud of word pairs by Joe Biden

adjective/adjective by Joe Biden
adjective/adverb by Joe Biden
adjective/noun by Joe Biden
adjective/verb by Joe Biden
adverb/adverb by Joe Biden
adverb/noun by Joe Biden
adverb/verb by Joe Biden
noun/noun by Joe Biden
noun/verb by Joe Biden
verb/verb by Joe Biden
Word Pair Tag Cloud Analysis for Joe Biden.

Biden's clouds are similar to Obama's in morphology, except for verb/verb pairs, which Biden repeats frequently, so that cloud has no small text. Biden also has a sparse (i.e. repetitive) cloud for adjective/adverb, suggesting low complexity in combinations of these two modifiers.

Tag Cloud of word pairs by Sarah Palin

adjective/adjective by Sarah Palin
adjective/adverb by Sarah Palin
adjective/noun by Sarah Palin
adjective/verb by Sarah Palin
adverb/adverb by Sarah Palin
adverb/noun by Sarah Palin
adverb/verb by Sarah Palin
noun/noun by Sarah Palin
noun/verb by Sarah Palin
verb/verb by Sarah Palin
Word Pair Tag Cloud Analysis for Sarah Palin.

Palin mongers with "many nuclear" for adjective/adjective, and "nuclear weapons" appears prominently in her adjective/noun cloud. Her verb/verb cloud is very sparsely populated, just like Biden's and McCain's. In fact, only Obama had a relatively complex verb/verb tag cloud.

Downloads

debate transcript (courtesy of CNN).

parsed word lists (analyzed transcript, including words by speaker, by POS, and all POS pairings).

tag cloud images

data structure

Please see the methods section for details about these files.