Study shows not much racism, but researcher carelessness

Maybe it’s a story about the ever-declining ability of people to think critically. Or perhaps it’s just one where analysis gets overwhelmed by wishful ideological thinking. Regardless, the small mention Louisiana merits in this effort invites rigorous analysis of the work, and that analysis leads to the dismissal of its narrative.

The New Orleans Times-Picayune (or what’s left of it) picked up on a piece, mentioned on several, mostly trendy lefty, media sites, about an investigation by a group of geographers of Twitter microblog communications (“tweets”) in the runup to election day last week that contained language they coded as “racist.” The piece proclaimed that Louisiana was the fifth-highest location of such tweets, at 3.3 times the norm.

The idea found its inspiration in the website Jezebel, not exactly celebrated for either the analytical quality or the intellectual heft of its post-feminist contents and writing (as of this writing, its lead article weighed the question no doubt every intelligent woman of high self-esteem routinely ponders, “Some Things to Consider When You Think You Want to be a Prostitute”), where somebody bored enough decided to collect some post-election tweets with decidedly anti-black language. The geographer crowd at a group called Floatingsheep picked up on it and produced results and extended commentary, pronouncing conclusions that echo a mildly ersatz version of the identity politics/post-Marxism all too prevalent in academia: “Racist behavior, particularly directed at African Americans in the U.S., is all too easy to find both offline and in information space.”

And, “we believe that the concentration of racist tweets in the South is indicative of the persistence of racism in the South, which is correlated with, though not necessarily causally-related to, statewide voting for Mitt Romney.” And, finally, “we hoped to use this exercise to show the persistence of racism.” But their problem is, their analysis does nothing of the sort.

Understand that what they did was not designed as a major research effort but was done more on the fly, so you can be a bit picky about it. Their coding scheme (the words used as triggers to identify a message as racist) was somewhat incomplete and subjective (they questionably imputed non-racist content to at least one slang term generally considered racist). Theirs may not be a representative sample, because the protocol they used to identify the state a tweet came from (“geotagging”) may not be randomly distributed among cell phone users. Nor did they control for the number of messages sent out by each user: very plausibly a true racist would have been more excitable and exercised by the election and would have tweeted out of proportion to other users, whereas they assumed each tweet represented a unique individual. Any or all of these raise questions about the validity of their conclusions.

However, setting these aside, they commit a fundamental error in statistical inference: it’s not the quality of the data that may mislead them, but their understanding of what it means. As a teacher of research methods, I have found decade in and decade out that the most difficult task for students is to translate tests of data into substantive meaning. And while I do not know whether graduate-level geography study requires coursework in research methods, if this group had it, then the mistake they made in logical inference does them no credit.

Even if we did a test of means that showed a significant difference, for example, between Louisiana and the nation in the number of “racist” tweets, consider that they identified 395 tweets coded as “racist” that represented about 0.05 percent of the total reviewed. That is, they collected a sample of roughly 790,000 tweets and found only 395 “racist” ones among them. In other words, about 1 in every 2,000 tweets sent had “racist” content nationally, and in Louisiana the level was around 1 in 600. And they want us to buy the argument that there is “persistence of racism” when 1 out of 600 tweets contains such remarks? You’d have a stronger chance of arguing racism was on its deathbed with a statistic like that, if that’s all the people out there willing to voice racist sentiments (assuming the distribution of “racism” was the same between the tweeting and non-tweeting public).
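The arithmetic behind those figures is easy to check. What follows is only a rough sketch: the 395 count, the 0.05 percent share and the 3.3 multiple are the figures cited above, the function and variable names are made up for illustration, and the Louisiana rate is assumed to be simply 3.3 times the national rate.

```python
def implied_sample_size(flagged, share):
    """Total tweets implied by a flagged count and its share of the total."""
    return flagged / share


def one_in(rate):
    """Express a rate as 'one in N'."""
    return 1 / rate


FLAGGED = 395             # tweets coded as "racist" (figure cited in the text)
SHARE = 0.0005            # 0.05 percent of all tweets reviewed (figure cited in the text)
LOUISIANA_MULTIPLE = 3.3  # Louisiana's rate relative to the norm (figure cited in the text)

total = implied_sample_size(FLAGGED, SHARE)
national_rate = FLAGGED / total
# Assumption: Louisiana's rate is exactly the national rate times the multiple.
louisiana_rate = national_rate * LOUISIANA_MULTIPLE

print(f"implied sample size: about {round(total):,}")
print(f"national: about 1 in {round(one_in(national_rate)):,}")
print(f"Louisiana: about 1 in {round(one_in(louisiana_rate)):,}")
```

The Louisiana figure works out closer to 1 in 606, which rounds to the 1 in 600 used here.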

Suppose I set out for the untutored student a question with a null hypothesis that racism doesn’t exist (that is, the population mean of such remarks is 0) and a sample that found some, and then had that student run means tests comparing regions. With a sample size of n=790,000 for the first test, and the same tweets divided among all the regions for the second, I’ll bet the analyses would turn up statistically significant relationships. Those might tempt the conclusions that racism is alive and well and that it’s more common in the South, leading one to say it could be related to a vote for a white presidential candidate facing a black one. But this has no substantive meaning because the levels are so low. It’s like taking 600 ml of olive oil, adding a single cc of vinegar, and declaring it salad dressing.

It’s not; it’s just very slightly diluted oil. But that’s the rationale behind this declaration of significant racism in America. A minuscule number of tweets relative to a far larger whole simply does not sustain that interpretation; all it tells us is that a few nimrods are out there who in no way represent any constituency of even the tiniest significance in American politics and society. The untutored student might be pleased to have found, in the first instance, that the sample mean differs significantly from zero and, in the second, that the mean varies significantly, even strongly, by region. But these results together do not warrant a declaration that the sample demonstrates “persistence of racism” in the population.
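To see how a huge sample can yield a “significant” result out of a tiny difference, here is a minimal sketch of a pooled two-proportion z-test. Only the totals of 790,000 tweets and 395 flagged come from the figures above. The split between Louisiana and the rest of the country is invented for illustration (chosen so Louisiana sits near 1 in 600), and the function name is hypothetical. The normal approximation is also an assumption and is rough with counts this small.

```python
import math


def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# HYPOTHETICAL split: the study's per-state counts are not given in the text.
LA_TWEETS = 30_000
LA_FLAGGED = 50                        # 50 / 30,000 is 1 in 600
REST_TWEETS = 790_000 - LA_TWEETS      # totals match the figures cited above
REST_FLAGGED = 395 - LA_FLAGGED

z, p_value = two_proportion_z_test(LA_FLAGGED, LA_TWEETS, REST_FLAGGED, REST_TWEETS)
gap_in_points = 100 * (LA_FLAGGED / LA_TWEETS - REST_FLAGGED / REST_TWEETS)

print(f"z = {z:.2f}, two-sided p = {p_value:.2g}")
print(f"gap between the two rates: {gap_in_points:.3f} percentage points")
```

Under these made-up counts the test clears any conventional significance threshold with room to spare, yet the gap between the two rates is only a fraction of one percentage point. That is the difference between statistical and substantive significance.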

However, that does not fit the narrative pushed in academia that a significant portion of American political behavior, particularly among white Southerners, can be explained by “racism” (whether genuine or manufactured by researchers relying on an invalid concept called “symbolic racism” that, for example, says it’s “racist” not to think that racism explains why blacks generally are poorer than whites). The 1991 governor’s contest in Louisiana provided potential fodder for this view, since refuted, and, despite revisionist efforts since, imputed racist motives never have been demonstrated to impact Louisiana voting outcomes in the post-civil rights era, because the proportion of the public that uses racist attitudes in a meaningful way in its voting choice remains so small. Yet the mythology from the academy continues.

With their data, you can argue persuasively that “racist” comments from America get made on Twitter, and you can argue that you see more of them come from Louisiana than typically elsewhere in America. But you can’t make a case that racism is persistent, or even that it is “easy to find,” under these needle-in-a-haystack conditions, unless you’re either a sloppy analyst or one predisposed by your ideological milieu to find it as an article of faith. I don’t care to guess which applies in this case; all I know is, one or both of these conditions apply to the disseminators of this research mini-project, and their conclusions therefore tell us nothing useful.

