Scaling Social Science with Hadoop

“The methods of social science are dear in time and money and getting dearer every day.” — George C. Homans, Social Behavior: Its Elementary Forms, 1974……When Homans — one of my favorite 20th century social scientists — wrote the above, one of the reasons the data needed to do social science was expensive was because collecting it didn’t scale very well…

This insightful blog entry discusses the significance of scientific computing for the social sciences, in particular new forms of data aggregation (involuntary digital traces) and new analysis methods (social network analysis). According to the author, this ‘turn’ might allow the core question of sociology to be answered: how do entire societies work? The opportunity rests on the assumption that involuntary digital traces capture ‘actual behavior’ (what people actually do and think), whereas traditional methods (surveys, interviews) are subject to various kinds of misrepresentation. The author also reports on his experience using a computing cluster running Hadoop (a framework for the distributed processing of large data sets across clusters of computers using a simple programming model) as a scalable way to crunch very large data sets.
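The "simple programming model" behind Hadoop is MapReduce. As a rough illustration only, the sketch below imitates its map, shuffle, and reduce phases in a single Python process to count how many messages each person sent. It is not the author's code and does not use any Hadoop API; the function names and the toy (sender, recipient) records are hypothetical.

```python
from collections import defaultdict


def map_phase(records):
    """Emit a (sender, 1) pair for each interaction record."""
    for sender, _recipient in records:
        yield sender, 1


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the grouped values for each key."""
    return {key: sum(values) for key, values in groups.items()}


def count_messages_sent(records):
    """Run the three phases in sequence on an in-memory list of records."""
    return reduce_phase(shuffle(map_phase(records)))


# Hypothetical toy data: (sender, recipient) pairs standing in for digital traces.
sample = [("ann", "bob"), ("ann", "cat"), ("bob", "ann")]
print(count_messages_sent(sample))
```

On a real cluster the map and reduce steps would run in parallel on many machines, each over its own slice of the data, which is what lets the same small program scale to very large data sets.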

Golder, S. “Scaling Social Science with Hadoop.” Cloudera, 5 April 2010.

Professional Field
Science Studies, Sociology
