Dept of Mathematics and Computer Science - Duquesne University
As a social media platform, Twitter (twitter.com) offers opportunities to examine public reactions in what could be described as the world's largest village square. Twitter provides an opportunity not only to participate in public discourse, but even to shape it. However, to fully understand this discourse, it is helpful to understand the people behind it. Stylometric authorship profiling (Argamon et al., 2009) offers one way to do so. People are free to post their opinions, but by analyzing the writing style of the individual tweets, we can infer other attributes of the individual authors.
In this study, we infer personality categories using the well-known Myers-Briggs Type Indicator (MBTI) to analyze whether there are any differences between the supporters (and detractors) of the major party candidates, Hillary Clinton (D) and Donald Trump (R). The MBTI categorizes people along four major axes, as encoded in a four-character summary (for example, ENFJ: Extroverted, iNtuitive, Feeling, Judging). We have shown (Gray and Juola, 2011; Juola et al., 2013) that personality can be inferred with high accuracy from writing, and further that MBTI personality types can be gleaned specifically from Twitter feeds. Using the EthosIO system developed by Juola & Associates (www.ethosio.com), we have applied this (Juola, Vinsick, and Ryan, 2016) to large-scale analyses of the demographics of personality on Twitter, finding substantial differences between the accepted distribution of personality in the general US population and the distribution of personality among active Twitter participants. For example, introverts make up approximately 50% of the general US population, but nearly 80% of active Twitter users. Similarly, nearly 3/5 of the general US population are "sensing" (S) [as opposed to "intuitive" (N)], but half or fewer Twitter users are. Two specific subgroups, INFP and INFJ, are vastly overrepresented on Twitter, being only about 5% of the overall US population, but 30% or more of the samples gleaned from Twitter.
We extend this to analyzing personality differences between politically disparate groups of Twitter participants. As in (Juola, Vinsick, and Ryan, 2016), we harvested a large group of user names from the Twitter sample feed, selecting users whose public tweets included one of several political hashtags. Based on the hashtags seen, we divided participants into four groups: anti-Clinton (identified by one or more of '#NeverHillary', '#CrookedHillary', '#WhichHillary', '#DraftOurDaughters', '#hillary4prison', '#hillary4prison2016', '#StopHillary', '#CrimeWithHer', or '#Killary'); pro-Clinton ('#ImWithHer', '#Clinton', '#ClintonKaine16', '#ClintonKaine2016', '#HillaryClinton', '#ClintonKaine', '#WhyIWantHillary', '#HillarysArmy', or '#Hillary2016'); anti-Trump ('#NeverTrump', '#dumptrump', '#trumptaxreturns', '#dontvotefortrump', '#dumpthetrump', '#boycotttrump', or '#trumpsexism'); and pro-Trump ('#ImWithYou', '#TrumpTrain', '#MakeAmericaGreatAgain', '#TrumpPence16', '#TrumpPence2016', '#Trump', '#AltRight', '#VoteTrump', or '#TeamTrump'). This gave us approximately 600 user names for each of the four groups in our preliminary dataset. These user names were submitted to the EthosIO personality analyzer to produce distributional data for each subgroup.
Our preliminary corpus therefore contained 651 pro-Clinton subjects, 587 pro-Trump subjects, 635 anti-Clinton subjects, and 639 anti-Trump subjects, for a total of 2512 user names, divided across 16 MBTI categories (details in full paper). As expected from previous work, the overall statistics do not match US demographics; for example, types ISFP and INFP are strongly overrepresented in all samples, as are introverts in general. Our interest, however, is in whether political differences also show up as personality differences. In plainer language, does the average Clinton supporter have a different personality than the average Trump supporter?
We tested this hypothesis with a variety of chi-squared tests (df=15 throughout). At the most basic level, we found extremely significant differences (p ~ 10^-9) between Clinton supporters and Trump supporters. We also found significant differences (p ~ 10^-12) between "Democrats" (either pro-Clinton or anti-Trump) and "Republicans" (either pro-Trump or anti-Clinton). Examining cells in detail suggests that ISFJ and ISFP are both overrepresented among Democrats, while ESTJs and INFJs are overrepresented among Republicans.
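For readers unfamiliar with the test, the following is a minimal sketch of a chi-squared test of homogeneity on a table of two groups by 16 MBTI types, which has (2-1)(16-1) = 15 degrees of freedom. It is not the authors' code: the function names are hypothetical, the p-value uses a standard closed-form recurrence for the chi-squared upper tail, and the counts in the demonstration are placeholders, not the study's data (those are deferred to the full paper).

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable, df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df = 2m: exp(-h) * sum_{j < m} h^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= h / j
            total += term
        return math.exp(-h) * total
    # Odd df = 2m + 1: erfc(sqrt(h)) + exp(-h) * sum_{j < m} h^(j+1/2) / Gamma(j+3/2)
    total = math.erfc(math.sqrt(h))
    term = math.sqrt(h) * math.exp(-h) / math.gamma(1.5)
    for j in range(df // 2):
        total += term
        term *= h / (j + 1.5)
    return total


def chi2_homogeneity(table):
    """Pearson chi-squared test on a table of counts (rows = groups).

    Returns (statistic, degrees of freedom, p-value). Assumes every
    column has a nonzero total, so all expected counts are positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df, chi2_sf(stat, df)


if __name__ == "__main__":
    # Placeholder counts over 16 types for two groups; NOT the study's data.
    group_a = [40 + 3 * i for i in range(16)]
    group_b = [85 - 3 * i for i in range(16)]
    stat, df, p = chi2_homogeneity([group_a, group_b])
    print(f"chi2 = {stat:.2f}, df = {df}, p = {p:.3g}")
```

Pooling two samples (for example, pro-Clinton with anti-Trump) amounts to adding their per-type counts before building the table; users who appear in both samples would then need to be counted once, which this sketch does not handle.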
By contrast, there was no significant difference (p > 0.10) between "Anti" and "Pro" subjects, despite the possible participation, for example, of third-party supporters who are opposed to both Trump and Clinton. Similarly, we found no significant difference between pro-Trump and anti-Clinton subjects, or between anti-Trump and pro-Clinton subjects, suggesting that factors other than personality affect whether a person chooses to self-express in favor of a particular candidate or in opposition to that candidate's rival(s).
In this study, there are a number of potential confounding factors, the effects of which have not yet been assessed. The first is simply the presence of active attempts to manipulate the dialogue, for example, through the use of automated 'bots' (Kollanyi, Howard, and Woolley, 2016), or simply through the use of "sock puppets," multiple identities used in an attempt to create an appearance of consensus and of larger margins. A second factor is the issue of overlapping categories. Approximately 20% of our preliminary "anti-Trump" sample also self-identified as "pro-Clinton," and similarly, approximately 20% of the anti-Clinton sample self-identified as pro-Trump. More counterintuitively, approximately 5% of the anti-Clinton sample also identified as anti-Trump, and approximately 5% of the pro-Clinton sample was also pro-Trump. This may be due to a third confounding factor, the inability of simple keyword spotting to identify the use of irony (for example, in posting a link to an article highly critical of Trump's campaign and using the '#Trump' hashtag to draw attention to it). As we continue this analysis, based in part on data to be collected during the final and most intense week of the campaign, we will address these issues and discuss how we addressed them in the final paper.
We have therefore shown, using text analysis of Twitter on a moderately large scale, that there are significant differences between the types of people who self-identify as supporters (and opponents) of one of the major candidates in the 2016 US presidential election. We have also shown that there do not appear to be significant personality-related differences between those who express support for their chosen candidate and those who express opposition to the other one. We have also confirmed the previous results (Juola, Vinsick, and Ryan, 2016) about the general distribution of personality types on Twitter, and hasten to point out that the differences we have identified are still relatively minor and that the overall distributions of personality types in both camps are broadly similar to the distribution of personality types on Twitter in general. However, our results show, first, that, in keeping with prior work, inferring personality type via Twitter is practical and useful. Second, they show that personality may be a factor in the selection of one's chosen candidate.
Finally, the question of "who are Trump's voters?" as against "who are Clinton's voters?" will no doubt interest historians for decades. Our results provide some insight into possible psychological motivations in addition to the more traditional social, political, and economic reasons, and may therefore enrich future discussion and scholarship.
This version of the paper was written approximately one week before the actual 2016 election and will be updated as appropriate.
Hosted at McGill University, Université de Montréal
Aug. 8, 2017 - Aug. 11, 2017
438 works by 962 authors indexed
Conference website: https://dh2017.adho.org/
Series: ADHO (12)