This is an early version of something that I want to make a little more beautiful and rigorous as I have time over the next few weeks, but I thought people might enjoy the work in progress. Don’t go drawing any big conclusions from it; it’s missing some sanity checks.
The red channel is population,* the green channel is economic activity,* and the blue channel is fully geolocated tweets per day (over a sample of just under 18 days). Each channel is on a log2 scale and approximately normalized.
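A minimal sketch of how three gridded layers could be turned into one RGB image in this way. It is not the code behind the map: the function names, the `log2(1 + v)` offset for empty cells, and the divide-by-maximum normalization are all assumptions made for illustration, and the tiny input lists are made-up numbers.

```python
import math

def log2_normalize(values):
    """Put non-negative values on a log2 scale, then rescale to [0, 1]."""
    # ASSUMPTION: log2(1 + v) is used so that empty cells (v == 0) stay finite.
    # The post does not say how zeros were handled.
    logs = [math.log2(1.0 + v) for v in values]
    top = max(logs)
    if top == 0:
        return [0.0 for _ in logs]
    # ASSUMPTION: "approximately normalized" is read here as dividing by the max.
    return [x / top for x in logs]

def compose_rgb(population, economy, tweets):
    """Combine three equally sized layers into a list of 8-bit (R, G, B) pixels."""
    r = log2_normalize(population)
    g = log2_normalize(economy)
    b = log2_normalize(tweets)
    return [tuple(round(255 * c) for c in px) for px in zip(r, g, b)]

# Made-up values for three grid cells, purely to show the shape of the result.
pixels = compose_rgb([0, 1, 3], [0, 7, 7], [0, 0, 1])
print(pixels)
```

With this setup, a cell that is high in all three layers comes out near white, and a cell with people and money but no geolocated tweets comes out yellowish.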
* Data from G-Econ. I had some trouble with this data, and I’m not sure whether the problem is in the original or just arises because I don’t have a foolproof way of turning an .xls into real information. I happened to notice that a few grid cells in and around Rwanda had latitudes of about −1500, for example, which doesn’t make me overly trustful of the accurate-appearing data. (I think there was one \t too few somewhere.) Still, it was the best source I could find for gridded demographics. Population is for 2005, and economic activity is “Gross cell product, 2005 US$ at purchasing power parity exchange rates, 2005”.
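One of the missing sanity checks could look something like the sketch below, which flags any cell whose coordinates cannot exist on Earth. The function name and the (latitude, longitude) row layout are hypothetical, and the sample rows are invented for illustration (only the latitude of about −1500 echoes the problem described above); the real export may be laid out differently.

```python
def out_of_range_cells(rows):
    """Return (row_index, lat, lon) for rows with impossible coordinates."""
    bad = []
    for i, (lat, lon) in enumerate(rows):
        # Valid latitudes are -90..90 degrees, valid longitudes -180..180.
        if not (-90.0 <= lat <= 90.0) or not (-180.0 <= lon <= 180.0):
            bad.append((i, lat, lon))
    return bad

# Invented sample rows: (latitude, longitude) in degrees.
sample = [(-1.0, 29.0), (-1500.0, 30.0), (45.0, -122.0)]
print(out_of_range_cells(sample))
```

A check like this only catches values that are impossible, not ones that are merely wrong, so it is a floor rather than a guarantee.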
Notice that we are not looking at all tweets! Tweets can have no location, city- to neighborhood-level location, or full/GPS-style location, and we’re only looking at the third kind, which is something like 1/200 of all tweets, and presumably strongly biased towards smartphones.
Also, this is not an equal-area projection. It’s plate carrée, which stretches things out east-west near the poles. For example, at ±45° latitude (Portland, Harbin, Coihaique, Christchurch), a cell spans the same number of degrees of longitude as it does at the equator, but covers only cos(45°) = 1/sqrt(2) = 0.7ish as much east-west distance on the ground.
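The 0.7ish figure is just the cosine of the latitude, and the same idea gives the true ground area of a grid cell. A rough sketch, assuming a spherical Earth of radius 6371 km and 1° cells (both assumptions of this example, as are the function names):

```python
import math

def east_west_scale(lat_deg):
    """Ground distance per degree of longitude, relative to the equator."""
    return math.cos(math.radians(lat_deg))

def cell_area_km2(lat_deg, cell_deg=1.0, radius_km=6371.0):
    """Area of a cell_deg x cell_deg cell whose southern edge is at lat_deg.

    ASSUMPTION: spherical Earth; cell size and radius are illustrative defaults.
    """
    lat0 = math.radians(lat_deg)
    lat1 = math.radians(lat_deg + cell_deg)
    dlon = math.radians(cell_deg)
    # Exact area of a latitude-longitude rectangle on a sphere.
    return radius_km ** 2 * dlon * (math.sin(lat1) - math.sin(lat0))

print(east_west_scale(45.0))                     # close to 1/sqrt(2)
print(cell_area_km2(45.0) / cell_area_km2(0.0))  # roughly 0.7
```

Dividing each cell's population, economic activity, or tweet count by its area would turn the counts into densities, which is one way to compare cells at different latitudes on equal terms.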