Saturday, July 17, 2010

Going Historical on the Pakistan Google Trends Thing

I remember the mini-meme of Pakistan being first on some weird Google Trends results from at least the summer of 2007. Some American on Orkut (yes, an actual white American who used Orkut, shocking eh?) was using these weird Google Trends results to hassle some Pakistanis on a forum.

And then there's NB's hilarious take on this, and the original post from Karachi Metblogs that seemed to start it all. Interestingly, the Metblogs post dates from early 2007, and its author also mentions that he saw this Google Trends assertion on some page that talked about Muslim countries. And that American troll on Orkut was going off on a forum talking about Muslim countries. So maybe the author of the Metblogs piece was on Orkut too?

That's irrelevant. What is relevant is that the comments section of the Metblogs piece seems to go into some sort of detail about the mechanics of how this search trend might be calculated and where statistical errors creep in.
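To make the statistical point concrete, here is a minimal sketch in Python. It assumes, and this is my assumption rather than anything confirmed in the articles above, that a Trends-style ranking orders regions by a term's share of each region's total searches rather than by raw counts. The region names and every number below are made up purely for illustration.

```python
# Hypothetical figures only; none of this is real Google data.
# region: (searches for the term, total searches from that region)
searches = {
    "Region A (small user base)": (50_000, 10_000_000),
    "Region B (large user base)": (2_000_000, 1_000_000_000),
}


def rank_by_raw_count(data):
    """Order regions by the absolute number of searches for the term."""
    return sorted(data, key=lambda region: data[region][0], reverse=True)


def rank_by_share(data):
    """Order regions by the term's share of that region's total searches."""
    return sorted(
        data,
        key=lambda region: data[region][0] / data[region][1],
        reverse=True,
    )


if __name__ == "__main__":
    print("By raw count:", rank_by_raw_count(searches))
    print("By share:    ", rank_by_share(searches))
```

Under that assumption a region with a small user base can top the ranking even though it contributes far fewer searches overall. A small denominator also means that a few thousand misattributed searches move its share much more than they would for a large region.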

If it's statistics you want, here's a bone anyone can use for some inadvertent Dawn bashing. The Dawn counter-article smells. Beyond the lousy editing, it asserts that there are only 8 million Pakistani internet users, quoting the ISP association. XYZ, on the other hand, has already made the case, with the ISPs' reluctance to report or ban porn as one example, that these people have been willing to fudge facts just to make money. It would be reasonable to expect that they under-report their customers to fudge on taxes, but the ISPs are probably right. The reason I am reluctant to trust their numbers is that the World Bank, the CIA Factbook and multiple academic sources put Pakistan's internet user count at somewhere around 18 million. However, there are rumours that those sources swallowed the 18 million figure whole, without verification, because an Islamabad bureaucrat told them this number. So maybe the ISP association number is the closest we have to a verifiable value.

But beyond that, a Google spokesperson actually admitted that the Pakistani search trends were inaccurate. A lot of Pak-clone news blogzines have picked up the Google-admitting-Google-Trends-screws-up story, but the one company that does not show up on my Google search for the story is Google.

An offhand defence I have heard is that Pakistan constitutes only a small portion of the global internet-using public, so topping these rankings is not possible. What is possible is that Pakistan's main undersea internet cables run from Karachi to Mumbai and from Karachi to Dubai, so maybe they're carrying info from India that gets conflated with Pakistan's. India also seemed to be strangely high up in these searches.

Maybe the new cables like SE-ME-WE-4 allow discrete information from Pakistan, but since, to be honest, this is a story that's been kicking around since *AT LEAST* 2007, this might be an aggregation of old data that passed through SE-ME-WE-3 and SE-ME-WE-2.

Speaking of which, I'm trying to find some info on SE-ME-WE-2 but *Google* seems to be failing me on this. Maybe someday I'll need to trawl through the scholarly articles for info on SE-ME-WE-2.

Update: OK I found info on SE-ME-WE-2. Where? On the website for SE-ME-WE-3. Here's what it says:
"Further to the success of the Sea-Me-We 2 ("South East Asia Middle East Western Europe 2") submarine cable project started during the late 80's, Singapore Telecom and France Telecom started in 1993 some preliminary studies for a follow-on high capacity cable linking Europe to the Asia-Pacific region."

So goddamn SE-ME-WE-2 is 20+ years old. God knows how much misdirection must have taken place. And this info was on the website for SE-ME-WE-3, which has itself been replaced over just the last few years by SE-ME-WE-4.

It would be interesting to hear Google analyse how software and hardware interact in their Google Trends app, and how that interaction produces these results.

Postscript: It appears someone at the Daily Times systematically took apart Fox News's research "methodologies" (or lack thereof), and reached a partial conclusion that these searches are erroneous because Pakistanis, with their lack of grammatical skill in English, entered two words with no connecting words in between. Whereas if you search "sex with x" ("x" here being any animal, which I'm not going to type because I don't want to skew my own search results), the United States would generally top that ranking on Google Trends.
