Search Globe — Global Visualization of Google Searches by Language
Shown here is a globe visualization of worldwide Google searches, categorized by language (21 in total). The visualization was created with the WebGL Globe toolkit and bundled data from Chrome Experiments.
1 · Data annotations — geotagged and ranked
I have annotated the data with geographical information from MaxMind, to include city, region, and country for each search location. The closest city was determined by finding the entry in the MaxMind data set (2.8M cities) with the smallest haversine distance to the coordinates of the search data point. Note that latitude and longitude were provided to 3 decimal places in the original data file but are available to 7 decimal places in the MaxMind set.
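The closest-city lookup described above can be sketched as follows. This is a minimal illustration, not the code used to produce the annotated file: the function names, the record layout (`city`, `lat`, `lon`), the sample coordinates and the Earth radius of 6371 km are all assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def closest_city(lat, lon, cities):
    """Return the city record with the smallest haversine distance to (lat, lon)."""
    return min(cities, key=lambda c: haversine_km(lat, lon, c["lat"], c["lon"]))


# Hypothetical city records with approximate coordinates (illustration only).
cities = [
    {"city": "Paris", "lat": 48.8566, "lon": 2.3522},
    {"city": "London", "lat": 51.5074, "lon": -0.1278},
    {"city": "Berlin", "lat": 52.5200, "lon": 13.4050},
]

nearest = closest_city(48.9, 2.3, cities)
```

A linear scan like this is fine for a handful of cities. For a set of 2.8M cities, a spatial index or a coarse latitude/longitude pre-filter would likely be needed to keep the lookup fast.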
The annotated data file includes the following new fields:
- rank (1-indexed rank of magnitude of search data point)
- cumulative_value (fractional total of all search terms with equal or smaller magnitude)
- language_name (name of the search language)
- city (closest city to latitude/longitude of search data point)
- region (region of closest city)
- country (country of closest city)
- city_latitude, city_longitude (coordinates of closest city)
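As a rough sketch of how the rank and cumulative_value fields could be derived, the snippet below assumes each data point is a dictionary with a hypothetical `value` key holding its magnitude, and that rank 1 corresponds to the largest magnitude. Neither assumption is confirmed by the data file itself.

```python
def annotate(points):
    """Add 'rank' and 'cumulative_value' to each point (a dict with a 'value' key)."""
    total = sum(p["value"] for p in points)

    # Assumption: rank 1 is the point with the largest magnitude.
    by_rank = sorted(points, key=lambda p: p["value"], reverse=True)
    for rank, p in enumerate(by_rank, start=1):
        p["rank"] = rank

    # Walk upward in magnitude; tied points share the same cumulative fraction,
    # which covers all points of equal or smaller magnitude.
    ascending = by_rank[::-1]
    running = 0
    i = 0
    while i < len(ascending):
        j = i
        while j < len(ascending) and ascending[j]["value"] == ascending[i]["value"]:
            running += ascending[j]["value"]
            j += 1
        for p in ascending[i:j]:
            p["cumulative_value"] = running / total
        i = j
    return by_rank


# Hypothetical magnitudes, for illustration only.
points = [{"value": 5}, {"value": 3}, {"value": 2}]
annotated = annotate(points)
```

Under this reading, the largest data point always has a cumulative_value of 1, and the smallest has a value equal to its own share of the total.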
Download geotagged data
Thanks to Evan Applegate from UC Davis for requesting an explanation of the additional fields. They were not obvious.
View all languages or individual data for the following languages:
Arabic
Belgian
Chinese
Dutch
English
Finnish
French
German
Indonesian
Italian
Japanese
Korean
Norwegian
Polish
Portuguese
Romanian
Russian
Spanish
Swedish
Thai
Turkish
View top 5%, 10%, 15% of data.
View top 10, 20, 50, 100 search locations.
View search density.
The color legend was created based on the color scheme used in the original webgl-globe code.
3 · Observations on the data
3.1 · I'm an illegal alien
There are 11 locations in the US with searches in Spanish: Dillard, Douglas, Flint Hill, Floyds Knobs, Great Falls, Orrs Island, Redwood Estates, Simpsonville, Spanish Fork, Spanish Fort, and Washington. Conspicuously, Los Angeles is missing.
▲ Concentration of Spanish searches from continental US. (see results)
The northernmost town in Mexico with a Spanish search is Mexicali (Baja California, lat 32.65 long -115.47).
The Chinese takeover (but not takeout) has been largely overestimated. Only two towns in the US participate in Chinese language searches: Williamsport and Evensville.
▲ Concentration of Chinese searches from continental US. (see results)
3.3 · English around the world
3.3.1 · English in South America
With the exception of Albouystown (Demerara-Mahaica, Guyana) and Paramaribo (Suriname), South America shows no English searches.
▲ Concentration of English searches in South America. (see results)
3.3.2 · English in Asia
Asia shows interesting patterns. Notably, no English searches are seen from China. No doubt, political firewalls are the cause. By country, India leads with 82 searches, followed by Malaysia (64) and Pakistan (11). The full list is India (82), Malaysia (64), Pakistan (11), United Arab Emirates (5), Bangladesh (4), Sri Lanka (3), Philippines (3), Nepal (3), Korea (3), Japan (2), Iran (2), Singapore (1), Papua New Guinea (1), Myanmar (1), Maldives (1), Cambodia (1), Brunei (1), Bhutan (1), Afghanistan (1).
▲ Concentration of English searches in Asia. (see results)
3.3.3 · English in the Far North
There are 25 locations with English language searches at latitude ≥ 60°. There are 15 cities in Alaska with searches (Anchorage, Barrow, Bethel, Cordova, Delta Junction, Eagle River, Fairbanks, Kenai, Nome, North Pole, Palmer, Seward, Soldotna, Valdez, Wasilla), of which Barrow is furthest north (lat 71.29°). The other 10 cities are mostly in Canada:
- Lerwick (Shetland Islands, United Kingdom, lat 60.160°)
- Whitehorse (Yukon Territory, Canada, lat 60.720°)
- Jarstad (Sogn og Fjordane, Norway, lat 61.360°)
- Fort Providence (Northwest Territories, Canada, lat 61.380°)
- Yellowknife (Northwest Territories, Canada, lat 62.450°)
- Frobisher Bay (Nunavut, Canada, lat 63.750°)
- Keflavík (Gullbringusysla, Iceland, lat 64.010°)
- Inuvik (Northwest Territories, Canada, lat 68.340°)
- Gjoa Haven (Nunavut, Canada, lat 68.630°)
- Igloolik (Nunavut, Canada, lat 69.380°)
▲ Concentration of English searches in the Far North. (see results)
3.3.4 · English in the Far South
New Zealand and Australia dominate search locations in the far south. The southernmost English search is from Invercargill (Southland, New Zealand, lat -46.4°); compare this to the northernmost search, from Barrow in Alaska at lat 71.29°. In Australia, the southernmost search is from Davenport (Tasmania, Australia, lat -41.17°). In South Africa, the southernmost search is from Hermanus (Western Cape, South Africa, lat -34.42°).
▲ Concentration of English searches in the Far South. (see results)
3.4 · Most remote locations
What is the most remote search location? Here, I define the distance between locations as the haversine distance.
For each language, I tabulate three types of remote locations by finding:
- most remote, regardless of language of nearest city
- most remote, with nearest city searching in the same language
- most remote, with nearest city searching in a different language
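The three tabulations above can be sketched as one nearest-neighbour pass followed by a split on the neighbour's language. This is only an illustration under assumptions: the record keys (`name`, `language`, `lat`, `lon`), the sample points and the brute-force search are hypothetical and not taken from the actual analysis.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def most_remote_by_language(points):
    """Return {language: (distance_km, point, nearest_point)} for the point in
    each language that is farthest from its nearest neighbour of any language."""
    best = {}
    for p in points:
        others = [q for q in points if q is not p]
        if not others:
            continue
        nearest = min(others, key=lambda q: haversine_km(p["lat"], p["lon"], q["lat"], q["lon"]))
        d = haversine_km(p["lat"], p["lon"], nearest["lat"], nearest["lon"])
        if p["language"] not in best or d > best[p["language"]][0]:
            best[p["language"]] = (d, p, nearest)
    return best


# Hypothetical search locations placed on the equator (illustration only).
points = [
    {"name": "A", "language": "English", "lat": 0.0, "lon": 0.0},
    {"name": "B", "language": "English", "lat": 0.0, "lon": 1.0},
    {"name": "C", "language": "English", "lat": 0.0, "lon": 5.0},
    {"name": "D", "language": "French", "lat": 0.0, "lon": 20.0},
    {"name": "E", "language": "French", "lat": 0.0, "lon": 21.0},
    {"name": "F", "language": "Spanish", "lat": 0.0, "lon": 40.0},
]

remote = most_remote_by_language(points)
# Split by whether the nearest neighbour searches in the same language.
same = {lang: r for lang, r in remote.items() if r[2]["language"] == lang}
different = {lang: r for lang, r in remote.items() if r[2]["language"] != lang}
```

The all-pairs search is quadratic in the number of locations, so for tens of thousands of points a spatial index would be a more practical choice.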
▲ Three of the most remote search locations. (see results)
3.4.1 · Most remote, regardless of language of nearest city
Cities, by language, most distant from their closest city.
The most remote search location of all is Papeete, whose closest search data point, Fusi in American Samoa, is 2,287 km away. Also interesting is the Belgian-speaking Westerschelling in the Netherlands, which has the smallest maximum distance to its nearest city, by language. It is 25 km from Harlingen, Netherlands.
- French Papeete (French Polynesia, lat -17.540° long -149.570°) 2287 km from English Fusi (American Samoa, United States)
- English Mahé (Beau Vallon, Seychelles, lat -4.620° long 55.440°) 1347 km from English Hamar (Banaadir, Somalia)
- Russian Yakutsk (Sakha, Russian Federation, lat 62.040° long 129.750°) 1119 km from Chinese Kuchiku (Heilongjiang, China)
- Dutch Godthaab (Vestgronland, Greenland, lat 64.180° long -51.720°) 818 km from English Frobisher Bay (Nunavut, Canada)
- Portuguese Boa Vista (Roraima, Brazil, lat 2.820° long -60.670°) 522 km from English Albouystown (Demerara-Mahaica, Guyana)
- Indonesian Lette (Indonesia, lat -5.150° long 119.410°) 516 km from Indonesian Balikpapan (Kalimantan Timur, Indonesia)
- Spanish San Juan de Miraflores (Loreto, Peru, lat -3.760° long -73.270°) 458 km from Spanish San Martin (San Martin, Peru)
- Chinese Hotan (Xinjiang, China, lat 37.110° long 79.920°) 431 km from Chinese Kaschgar (Xinjiang, China)
- Arabic Ara`ar (Al Hudud ash Shamaliyah, Saudi Arabia, lat 30.980° long 41.030°) 390 km from Arabic Hael (Ha'il, Saudi Arabia)
- Japanese Nase (Kagoshima, Japan, lat 28.380° long 129.490°) 248 km from Japanese Nago (Okinawa, Japan)
- Thai Amphoe Muang Ranong (Ranong, Thailand, lat 9.970° long 98.640°) 225 km from Thai Amphoe Muang Nakhon Si Thammarat (Nakhon Si Thammarat, Thailand)
- Turkish Thospia (Van, Turkey, lat 38.490° long 43.380°) 177 km from English Sangar-e Beru Khan (Azarbayjan-e Bakhtari, Iran)
- Norwegian Guovdagæidno (Finnmark, Norway, lat 69.010° long 23.040°) 107 km from Norwegian Bosekop (Finnmark, Norway)
- Swedish Lofsdalen (Jamtlands Lan, Sweden, lat 62.120° long 13.270°) 106 km from Norwegian Nybergsund (Hedmark, Norway)
- Finnish Kansela (Oulu, Finland, lat 65.970° long 29.170°) 98 km from Finnish Märkäjärvi (Lapland, Finland)
- Romanian Sisesti (Gorj, Romania, lat 45.060° long 23.300°) 68 km from Romanian Drobeta-Turnu Severin (Mehedinti, Romania)
- Italian Nuoro (Sardegna, Italy, lat 40.320° long 9.330°) 60 km from Italian Santu Lussurgiu (Sardegna, Italy)
- Polish Vlodava (Poland, lat 51.550° long 23.550°) 45 km from Polish Bielawin (Poland)
- Korean Bontoku (Kyongsang-bukto, Korea, lat 36.410° long 129.370°) 43 km from Korean Eijitsu (Kyongsang-bukto, Korea)
- German Monplaisir (Brandenburg, Germany, lat 53.060° long 14.270°) 39 km from German Prenzlau (Brandenburg, Germany)
- Belgian Westerschelling (Friesland, Netherlands, lat 53.360° long 5.220°) 25 km from Belgian Harlingen (Friesland, Netherlands)
3.4.2 · Most remote — nearest city searching in the same language
Cities, by language, most distant from their closest city, in which people speak (i.e. search) in the same language.
English searches are the most spread out on the globe. Across all search languages, Mahé in the Seychelles is the furthest from its nearest same-language location. It is 1,347 km from Hamar in Somalia, where English searches are also found.
- English Mahé (Beau Vallon, Seychelles, lat -4.620° long 55.440°) 1347 km from English Hamar (Banaadir, Somalia)
- Indonesian Lette (Indonesia, lat -5.150° long 119.410°) 516 km from Indonesian Balikpapan (Kalimantan Timur, Indonesia)
- Spanish San Juan de Miraflores (Loreto, Peru, lat -3.760° long -73.270°) 458 km from Spanish San Martin (San Martin, Peru)
- Chinese Hotan (Xinjiang, China, lat 37.110° long 79.920°) 431 km from Chinese Kaschgar (Xinjiang, China)
- Arabic Ara`ar (Al Hudud ash Shamaliyah, Saudi Arabia, lat 30.980° long 41.030°) 390 km from Arabic Hael (Ha'il, Saudi Arabia)
- Japanese Nase (Kagoshima, Japan, lat 28.380° long 129.490°) 248 km from Japanese Nago (Okinawa, Japan)
- Thai Amphoe Muang Ranong (Ranong, Thailand, lat 9.970° long 98.640°) 225 km from Thai Amphoe Muang Nakhon Si Thammarat (Nakhon Si Thammarat, Thailand)
- Norwegian Guovdagæidno (Finnmark, Norway, lat 69.010° long 23.040°) 107 km from Norwegian Bosekop (Finnmark, Norway)
- Finnish Kansela (Oulu, Finland, lat 65.970° long 29.170°) 98 km from Finnish Märkäjärvi (Lapland, Finland)
- Romanian Sisesti (Gorj, Romania, lat 45.060° long 23.300°) 68 km from Romanian Drobeta-Turnu Severin (Mehedinti, Romania)
- Italian Nuoro (Sardegna, Italy, lat 40.320° long 9.330°) 60 km from Italian Santu Lussurgiu (Sardegna, Italy)
- Polish Vlodava (Poland, lat 51.550° long 23.550°) 45 km from Polish Bielawin (Poland)
- Korean Bontoku (Kyongsang-bukto, Korea, lat 36.410° long 129.370°) 43 km from Korean Eijitsu (Kyongsang-bukto, Korea)
- German Monplaisir (Brandenburg, Germany, lat 53.060° long 14.270°) 39 km from German Prenzlau (Brandenburg, Germany)
- Belgian Westerschelling (Friesland, Netherlands, lat 53.360° long 5.220°) 25 km from Belgian Harlingen (Friesland, Netherlands)
3.4.3 · Most remote — nearest city searching in a different language
Cities, by language, most distant from their closest city, which is foreign (i.e. searching in a different language).
- French Papeete (French Polynesia, lat -17.540° long -149.570°) 2287 km from English Fusi (American Samoa, United States)
- Russian Yakutsk (Sakha, Russian Federation, lat 62.040° long 129.750°) 1119 km from Chinese Kuchiku (Heilongjiang, China)
- Dutch Godthaab (Vestgronland, Greenland, lat 64.180° long -51.720°) 818 km from English Frobisher Bay (Nunavut, Canada)
- Portuguese Boa Vista (Roraima, Brazil, lat 2.820° long -60.670°) 522 km from English Albouystown (Demerara-Mahaica, Guyana)
- Turkish Thospia (Van, Turkey, lat 38.490° long 43.380°) 177 km from English Sangar-e Beru Khan (Azarbayjan-e Bakhtari, Iran)
- Swedish Lofsdalen (Jamtlands Lan, Sweden, lat 62.120° long 13.270°) 106 km from Norwegian Nybergsund (Hedmark, Norway)
About 10% of all searches come from the top 10 locations.
- English New York (United States)
- French Paris (France)
- Turkish Istanbul (Turkey)
- English London (United Kingdom)
- Portuguese Sao Paulo (Brazil)
- English Miami (United States)
- German Berlin (Germany)
- Spanish Madrid (Spain)
- Spanish Mexico City (Mexico)
- Thai Bangkok (Thailand)
I am surprised to see Miami here (bored retirees?) as well as Istanbul — I don't have a theory for that one.
38% of all searches come from the top 100 locations (out of 22,826), with English dominating (33/100) followed by Spanish (11/100).
The full breakdown for the top 100 locations by language is English (33), Spanish (11), German (8), Japanese (6), Dutch (6), Portuguese (5), French (5), Turkish (4), Italian (4), Chinese (4), Russian (3), Arabic (3), Polish (2), Thai (1), Swedish (1), Romanian (1), Korean (1), Indonesian (1), Finnish (1).
By country, the top 100 locations fall in the United States (11), Germany (6), India (6), Japan (6), Brazil (5), United Kingdom (5), Italy (4), Turkey (4), Australia (3), France (3), Mexico (3), Russian Federation (3), Canada (2), China (2), Colombia (2), Poland (2), Saudi Arabia (2), Spain (2), Vietnam (2), Algeria (1), Argentina (1), Austria (1), Chile (1), Egypt (1), Finland (1), Greece (1), Hong Kong (1), Hungary (1), Indonesia (1), Ireland (1), Israel (1), Korea (1), Malaysia (1), Peru (1), Philippines (1), Romania (1), Serbia (1), Singapore (1), Sweden (1), Switzerland (1), Taiwan (1), Thailand (1), Tunisia (1), Ukraine (1), United Arab Emirates (1), Venezuela (1).
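The share of searches in the top N locations and the breakdown by language or country can be computed along these lines. This is a minimal sketch with made-up numbers: the record keys (`value`, `language`, `country`) and the function names are hypothetical.

```python
from collections import Counter


def top_share(points, n):
    """Fraction of the total magnitude contributed by the n largest points."""
    ordered = sorted(points, key=lambda p: p["value"], reverse=True)
    total = sum(p["value"] for p in ordered)
    return sum(p["value"] for p in ordered[:n]) / total


def top_breakdown(points, n, field):
    """Count the n largest points by a categorical field, most frequent first."""
    ordered = sorted(points, key=lambda p: p["value"], reverse=True)
    return Counter(p[field] for p in ordered[:n]).most_common()


# Hypothetical data points (illustration only, not from the search data file).
points = [
    {"value": 50, "language": "English", "country": "United States"},
    {"value": 30, "language": "Spanish", "country": "Spain"},
    {"value": 10, "language": "English", "country": "United Kingdom"},
    {"value": 10, "language": "French", "country": "France"},
]

share_top2 = top_share(points, 2)
by_language_top3 = top_breakdown(points, 3, "language")
```

Applied to the real file, `top_share(points, 100)` and `top_breakdown(points, 100, "language")` would correspond to the 38% figure and the per-language counts quoted above, assuming the magnitude field is what defines "top".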
The top 100 locations are:
- English New York (New York, United States)
- French Saint-Merri (Ile-de-France, France)
- Turkish Küçükpazar (Istanbul, Turkey)
- English City of London (Essex, United Kingdom)
- Portuguese Liberdade (Sao Paulo, Brazil)
- English Miami (Florida, United States)
- German Berlin (Berlin, Germany)
- Spanish Entrevías (Madrid, Spain)
- Spanish Ciudad de México (Distrito Federal, Mexico)
- Thai Amphoe Bang Rak (Krung Thep, Thailand)
- Spanish Bogotá (Cundinamarca, Colombia)
- English City of Sydney (New South Wales, Australia)
- Spanish Hacienda Huachipa (Lima, Peru)
- Spanish San Telmo (Distrito Federal, Argentina)
- Italian Roma (Lazio, Italy)
- Polish Powisle (Poland)
- Italian Mailand (Lombardia, Italy)
- English South Melbourne (Victoria, Australia)
- English Los Angeles (California, United States)
- Portuguese São Cristavem (Rio de Janeiro, Brazil)
- Russian Moscou (Moscow City, Russian Federation)
- Turkish Maltepe (Ankara, Turkey)
- Indonesian Pasarmanggis (Jakarta Raya, Indonesia)
- Dutch Ho Chi Minh City (Ho Chi Minh, Vietnam)
- Spanish Barcelona (Catalonia, Spain)
- English Toronto (Ontario, Canada)
- Spanish La Reina (Region Metropolitana, Chile)
- Spanish Los Caobas (Distrito Federal, Venezuela)
- English Chicago (Illinois, United States)
- Russian KievPetrovsky Port (Kyyivs'ka Oblast', Ukraine)
- Arabic Az Zahra' (Ar Riyad, Saudi Arabia)
- Dutch Xóm Trong (Vietnam)
- German München (Bayern, Germany)
- English Connaught Place (Delhi, India)
- Portuguese Venda Nova (Minas Gerais, Brazil)
- Dutch Afini (Attiki, Greece)
- English Bangalore (Karnataka, India)
- English Kampong Haji Abdullah Hukum (Kuala Lumpur, Malaysia)
- German Hamburg (Hamburg, Germany)
- Chinese Beijing (Beijing, China)
- Arabic Rawd al Faraj (Al Qahirah, Egypt)
- English Singapore City (Singapore)
- English Houston (Texas, United States)
- English Paddington (Essex, United Kingdom)
- Turkish Azmir (Izmir, Turkey)
- Japanese Nishi-okubo (Tokyo, Japan)
- English Spring Hill (Victoria, Australia)
- English Bombay Wadala (Maharashtra, India)
- Dutch Hakiriah (Tel Aviv, Israel)
- French Fourvière (Rhone-Alpes, France)
- Chinese Shanghaishih (Shanghai, China)
- Arabic Bani Malik (Makkah, Saudi Arabia)
- English Daira (Dubai, United Arab Emirates)
- Dutch Kiyabo (Manila, Philippines)
- German Inner City (Wien, Austria)
- Italian Naples (Campania, Italy)
- English Montreal (Quebec, Canada)
- English Kilmainham (Dublin, Ireland)
- German Alt-Wiedikon (Zurich, Switzerland)
- Japanese Kyobashi (Osaka, Japan)
- Dutch Buda (Budapest, Hungary)
- Romanian Bucarest (Bucuresti, Romania)
- Chinese Central District (Hong Kong)
- Japanese Sengendai (Kanagawa, Japan)
- Japanese Hibiyakoen (Tokyo, Japan)
- English Thousand Lights (Tamil Nadu, India)
- English San Francisco (California, United States)
- English Farragut Square (District of Columbia, United States)
- English Victoria Park (Manchester, United Kingdom)
- Swedish Norrmalm (Stockholms Lan, Sweden)
- German Frankford-on-Main (Hessen, Germany)
- German Augusta Ubiorum (Nordrhein-Westfalen, Germany)
- Chinese Fantzupo (T'ai-pei, Taiwan)
- Korean Kyedong (Seoul-t'ukpyolsi, Korea)
- English Lambeth (Lambeth, United Kingdom)
- German Stutengarten (Baden-Wurttemberg, Germany)
- Japanese Sarugakucho (Tokyo, Japan)
- English Seattle (Washington, United States)
- Finnish Gloet (Southern Finland, Finland)
- Italian Borgo Po (Piemonte, Italy)
- Spanish Guadalajara (Jalisco, Mexico)
- Spanish Alpujarra (Antioquia, Colombia)
- French Toulouse (Midi-Pyrenees, France)
- English San Diego (California, United States)
- English Dallas (Texas, United States)
- English Denver (Colorado, United States)
- English Dorcol (Serbia)
- English Aston (Essex, United Kingdom)
- English Romanovskiy (Moskva, Russian Federation)
- Polish Kleparz (Poland)
- Russian Aptekarskiy (Leningrad, Russian Federation)
- Spanish Monterrey (Nuevo Leon, Mexico)
- French El Bia (Alger, Algeria)
- French Al `Umran (Tunisia)
- Portuguese Bahia (Bahia, Brazil)
- Portuguese Brasília (Distrito Federal, Brazil)
- Turkish Adana (Adana, Turkey)
- Japanese Edo (Tokyo, Japan)
- English Bhaganagar (Andhra Pradesh, India)
- English Mali and Munjeri (Maharashtra, India)
news
+ thoughts
Mon 16-09-2024
I don’t have good luck in the match points. —Rafael Nadal, Spanish tennis player
In many experimental designs, we need to keep in mind the possibility of confounding variables, which may give rise to bias in the estimate of the treatment effect.
▲ Nature Methods Points of Significance column: Propensity score matching. (read)
If the control and experimental groups aren't matched (or, roughly, similar enough), this bias can arise.
Sometimes this can be dealt with by randomizing, which on average can balance this effect out. When randomization is not possible, propensity score matching is an excellent strategy to match control and experimental groups.
Kurz, C.F., Krzywinski, M. & Altman, N. (2024) Points of significance: Propensity score matching. Nat. Methods 21:1770–1772.
Sat 23-03-2024
We'd like to say a ‘cosmic hello’: mathematics, culture, palaeontology, art and science, and ... human genomes.
▲ SANCTUARY PROJECT | A cosmic hello of art, science, and genomes. (details)
▲ SANCTUARY PROJECT | Benoit Faiveley, founder of the Sanctuary project, gives the Sanctuary disc a visual check at CEA LeQ Grenoble (image: Vincent Thomas). (details)
▲ SANCTUARY PROJECT | Sanctuary team examines the Life disc at INRIA Paris Saclay (image: Benedict Redgrove). (details)
Fri 22-03-2024
All animals are equal, but some animals are more equal than others. —George Orwell
This month, we will illustrate the importance of establishing a baseline performance level.
Baselines are typically generated independently for each dataset using very simple models. Their role is to set the minimum level of acceptable performance and help with comparing relative improvements in performance of other models.
▲ Nature Methods Points of Significance column: Comparing classifier performance with baselines. (read)
Unfortunately, baselines are often overlooked and, in the presence of a class imbalance, must be established with care.
Megahed, F.M., Chen, Y-J., Jones-Farmer, A., Rigdon, S.E., Krzywinski, M. & Altman, N. (2024) Points of significance: Comparing classifier performance with baselines. Nat. Methods 21:546–548.
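To see why class imbalance makes baselines matter, consider one common simple baseline: always predicting the majority class. The sketch below is an illustration only, with made-up labels and a hypothetical model accuracy. It is not necessarily the baseline used in the column.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)


# Hypothetical imbalanced data set: 95 negatives and 5 positives.
labels = [0] * 95 + [1] * 5
baseline = majority_baseline_accuracy(labels)

# Hypothetical accuracy reported for some classifier on the same data.
model_accuracy = 0.93
beats_baseline = model_accuracy > baseline
```

Here an accuracy of 0.93 sounds strong in isolation, yet it falls short of the 0.95 achieved by a model that never predicts the minority class.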
Sat 09-03-2024
Celebrate π Day (March 14th) and dig into the digit garden. Let's grow something.
▲ 2024 π DAY | A garden of 1,000 digits of π. (details)
Thu 18-01-2024
Huge empty areas of the universe called voids could help solve the greatest mysteries in the cosmos.
My graphic accompanying How Analyzing Cosmic Nothing Might Explain Everything in the January 2024 issue of Scientific American depicts the entire Universe in a two-page spread — full of nothing.
▲ How Analyzing Cosmic Nothing Might Explain Everything. Text by Michael Lemonick (editor), art direction by Jen Christiansen (Senior Graphics Editor), source: SDSS
The graphic uses the latest data from SDSS Data Release 12 and is an update to my Superclusters and Voids poster.
Michael Lemonick (editor) explains on the graphic:
“Regions of relatively empty space called cosmic voids are everywhere in the universe, and scientists believe studying their size, shape and spread across the cosmos could help them understand dark matter, dark energy and other big mysteries.
To use voids in this way, astronomers must map these regions in detail—a project that is just beginning.
Shown here are voids discovered by the Sloan Digital Sky Survey (SDSS), along with a selection of 16 previously named voids. Scientists expect voids to be evenly distributed throughout space—the lack of voids in some regions on the globe simply reflects SDSS’s sky coverage.”
voids
Sofia Contarini, Alice Pisani, Nico Hamaus, Federico Marulli, Lauro Moscardini & Marco Baldi (2023) Cosmological Constraints from the BOSS DR12 Void Size Function. Astrophysical Journal 953:46.
Nico Hamaus, Alice Pisani, Jin-Ah Choi, Guilhem Lavaux, Benjamin D. Wandelt & Jochen Weller (2020) Journal of Cosmology and Astroparticle Physics 2020:023.
Sloan Digital Sky Survey Data Release 12
constellation figures
Alan MacRobert (Sky & Telescope), Paulina Rowicka/Martin Krzywinski (revisions & Microscopium)
stars
Hoffleit & Warren Jr. (1991) The Bright Star Catalog, 5th Revised Edition (Preliminary Version).
cosmology
H0 = 67.4 km/(Mpc·s), Ωm = 0.315, Ωv = 0.685. Planck collaboration Planck 2018 results. VI. Cosmological parameters (2018).
Tue 02-01-2024
It is the mark of an educated mind to rest satisfied with the degree of precision that the nature of the subject admits and not to seek exactness where only an approximation is possible. —Aristotle
In regression, the predictors are (typically) assumed to have known values that are measured without error.
Practically, however, predictors are often measured with error. This has a profound (but predictable) effect on the estimates of relationships among variables – the so-called “error in variables” problem.
▲ Nature Methods Points of Significance column: Error in predictor variables. (read)
Error in measuring the predictors is often ignored. In this column, we discuss when ignoring this error is harmless and when it can lead to large bias that can lead us to miss important effects.
Altman, N. & Krzywinski, M. (2024) Points of significance: Error in predictor variables. Nat. Methods 21:4–6.
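The predictable effect of predictor error can be illustrated with a small simulation. The sketch below assumes the classical case of independent, additive error in the predictor, for which the least-squares slope shrinks by the factor var(x) / (var(x) + var(error)). The slope, variances, sample size and seed are arbitrary choices made for this example and are not taken from the column.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000
beta = 2.0  # true slope (arbitrary)

x_true = [rng.gauss(0, 1) for _ in range(n)]           # predictor, variance 1
y = [beta * x + rng.gauss(0, 0.5) for x in x_true]     # response with noise
x_obs = [x + rng.gauss(0, 1) for x in x_true]          # predictor measured with error, variance 1

slope_clean = ols_slope(x_true, y)   # close to beta
slope_noisy = ols_slope(x_obs, y)    # attenuated toward zero
expected_noisy = beta * 1.0 / (1.0 + 1.0)  # beta * var(x) / (var(x) + var(error))
```

With equal predictor and error variances, the estimated slope is roughly halved, which is how a real effect can end up looking small enough to miss.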
Background reading
Altman, N. & Krzywinski, M. (2015) Points of significance: Simple linear regression. Nat. Methods 12:999–1000.
Lever, J., Krzywinski, M. & Altman, N. (2016) Points of significance: Logistic regression. Nat. Methods 13:541–542.
Das, K., Krzywinski, M. & Altman, N. (2019) Points of significance: Quantile regression. Nat. Methods 16:451–452.