“The rise of the machines in the study of politics: 5 things I learned from studies using #TextAsData”: that is the headline Joshua Tucker gave his Sep. 28, 2013 article for The Washington Post, on the lessons he took from a two-day conference on the use of “text as data” that he had just attended.
What’s the motivation? he asked himself.
Social scientists like to be able to work with quantitative data because it allows us to make more precise estimates (e.g., Romney is losing Ohio by 5 percent vs. Romney seems to be doing poorly in Ohio) and because it allows us to estimate our “uncertainty” in any statements we might make (remember those percentages at Nate Silver’s blog on how likely Obama was to win the election?).
At the same time, however, we know that there is a tremendous amount of information about the world that we can collect and study that comes in the form of words, not numbers. Moreover, the quantity of text that we can now collect has increased dramatically in recent years with the explosion of #BigData, including social media (e.g., tweets, blogs, status updates, etc.) but also the digitization of information that has moved online (newspapers, laws, speeches, house prices, etc.). Thus, not only do we want to be able to quantify text so we can analyze it using statistical methods, but it also turns out that there is so much text, there is no way we can read it all — let alone analyze it — without the help of machines…