The top GitHub projects per country

9 min read Original article ↗

Let’s find the most popular GitHub projects for each country. A simple query can give us that result: Take every star given on GitHub and look for the country of origin of users who have chosen to show their country on their GitHub profile. But we are going to go deeper than that.

Felipe Hoffa

Step 1: The most popular GitHub project per country — naive count

The first pass results are somehow monotone, and not surprising: FreeCodeCamp was the most starred GitHub project by far in 2016:

Press enter or click to view image in full size

The most starred project per country on GitHub during 2016 — naive count.

Step 2: What about the 2nd most popular project for each country?

Press enter or click to view image in full size

The 2nd most starred project per country on GitHub during 2016 — naive count.

The 2nd most starred project for each country shows more variety results:

Let’s skip the top 75 worldwide GitHub projects, and only then choose the most popular project per country:

A list of the most interesting GitHub projects for each country:

Let’s get a list of top projects for each country, this time with some rules:

  • 10 top projects per country
  • Removing projects in the GitHub global top 75.
  • We are looking for projects special to each country — if a project shows up in many countries top 10, we’ll skip it and look for a more “special” one.

And these are our results:

Press enter or click to view image in full size

appears on top of the US — Americans show a lot of attention to compensation. It’s great to see in this list too with his Kubernetes in hard mode. On data — I love data, in this case coming from . And thanks for teaching us “what happens when”, for chat bots, for easy applications, and everyone else listed here and beyond!

Press enter or click to view image in full size

China — they love WeChat — and they have their own version of FreeCodeCamp.

Press enter or click to view image in full size

Germany —

is Canadian who lives in the UK, nevertheless Germans starred his project paperless to the top.

Press enter or click to view image in full size

UK — great to see

and again. And on top of the UK, ’s Svelte.

Press enter or click to view image in full size

India seems to pay a lot of attention to CS educational repositories.

Press enter or click to view image in full size

Brazil — where the top project translates to “love Serenade” and its mission is to give “Artificial Intelligence for social control of public administration”. Tell me more,

at .

Press enter or click to view image in full size

Russia — yandex is big there, as well as YoptaScript (as in Ukraine).

Press enter or click to view image in full size

France seems to be big into PHP. I’d love to learn more.

Press enter or click to view image in full size

Japan — I don’t know much about these projects, but they seem very Japanese oriented.

Press enter or click to view image in full size

Canada — seems not only the us cares about stock options compensation.

Press enter or click to view image in full size

Australia — not many Australian exclusive projects show up, but some of the ones we already saw. Not sure why relax/relax is big here, as their top maintainers live in Portugal (

, for example).

Press enter or click to view image in full size

Taiwan — where the top project mission is to “Make the Taiwan a Better Place to Work”. I like it.

Press enter or click to view image in full size

Ukraine — top project samgozman/YoptaScript says “Скриптовый язык программирования для гопников и реальных пацанов”. Makes sense.

Press enter or click to view image in full size

Spain — Karumi is a Spanish company, and their Android framework is the top project for Spain. Great to see open data as their #2.

Press enter or click to view image in full size

(comments pending)

Press enter or click to view image in full size

Netherlands — They also pay a lot of attention to Max’s big list of naughty strings.

Press enter or click to view image in full size

Press enter or click to view image in full size

(comments pending)

Press enter or click to view image in full size

South Korea — open government on top. Also interesting to see Apache Zeppelin on top of this country ranking, as I met many of its developers in Seoul (hello

!).

Press enter or click to view image in full size

Italy — where you can find the constitutional reform on top of their GitHub projects (and maybe send pull requests?).

Press enter or click to view image in full size

(comments pending)

Press enter or click to view image in full size

(comments pending)

Press enter or click to view image in full size

(comments pending)

Press enter or click to view image in full size

Switzerland — not only Germany has danielquinn/paperless within the top, Switzerland too. But the top project is “an opensource, publicly owned evoting system. It is the result of the collaboration between the Geneva State Chancellery and the Geneva IT Department”. Good.

Press enter or click to view image in full size

(comments pending)

Press enter or click to view image in full size

(comments pending)

Press enter or click to view image in full size

Portugal —

again between the top projects with his big list of naughty strings.

Press enter or click to view image in full size

(comments pending)

Press enter or click to view image in full size

(comments pending)

Press enter or click to view image in full size

(comments pending)

Press enter or click to view image in full size

(comments pending)

Press enter or click to view image in full size

(comments pending)

Press enter or click to view image in full size

(comments pending)

Press enter or click to view image in full size

(comments pending)

Press enter or click to view image in full size

(comments pending)

Press enter or click to view image in full size

(comments pending)

Press enter or click to view image in full size

(comments pending)

Press enter or click to view image in full size

(comments pending)

Press enter or click to view image in full size

Romania — where a list of cool projects made in Romania is on top of the Romanian projects. Thanks

!

Press enter or click to view image in full size

Ireland: Not only the US and Canada care about stock options compensation. Probably due to a big presence of American tech companies, Ireland also shows its interest.

Press enter or click to view image in full size

(comments pending)

Press enter or click to view image in full size

Israel — on top of this list we can see

’s re:dash — my favorite open source dashboarding tool. This project is not exclusive to Israel, but since Arik lives there that apparently makes an awareness impact.

Press enter or click to view image in full size

(comments pending)

Press enter or click to view image in full size

(comments pending)

Press enter or click to view image in full size

Philippines —

lives there, and his project shows on top.

Press enter or click to view image in full size

(comments pending)

Press enter or click to view image in full size

(comments pending)

Press enter or click to view image in full size

Chile — my home country! Great to see projects from

, , , , on top.

More countries? Find the raw list on a spreadsheet.

Queries

Datasets

Most popular GitHub project per country, naive count:

#standardSQL
SELECT REGEXP_EXTRACT(a.name, r'/(.*)'), stars, iso3
FROM (
SELECT * FROM (
SELECT country_code, name, stars, ROW_NUMBER() OVER(PARTITION BY country_code ORDER BY stars DESC) rn
FROM (
SELECT country_code, repo.id repo_id, COUNT(DISTINCT actor.login) stars, APPROX_TOP_COUNT(repo.name, 1)[OFFSET(0)].value name
FROM `githubarchive.year.2016` a
JOIN `ghtorrent-bq.ght_2017_01_19.users` b
ON a.actor.login=b.login
WHERE country_code IS NOT null
AND a.type='WatchEvent'
GROUP BY 1, 2
HAVING stars>5
)
)
WHERE rn=1
) a
JOIN `gdelt-bq.extra.countryinfo` c
ON a.country_code=LOWER(c.iso)
ORDER BY stars DESC

Most popular GitHub projects per country, removing the top 75, special rank:

#standardSQL
SELECT REGEXP_EXTRACT(a.name, r'/(.*)'), stars, iso3
FROM (
SELECT * FROM (
SELECT country_code, name, stars, ROW_NUMBER() OVER(PARTITION BY country_code ORDER BY stars DESC) rn
FROM (
SELECT country_code, repo.id repo_id, COUNT(DISTINCT actor.login) stars, APPROX_TOP_COUNT(repo.name, 1)[OFFSET(0)].value name
FROM `githubarchive.year.2016` a
JOIN `ghtorrent-bq.ght_2017_01_19.users` b
ON a.actor.login=b.login
WHERE country_code IS NOT null
AND repo.id NOT IN (SELECT id FROM (
SELECT repo.id, COUNT(*) c
FROM `githubarchive.year.2016` a
WHERE a.type='WatchEvent'
GROUP BY 1
ORDER BY 2 DESC
LIMIT 75
))
AND a.type='WatchEvent'
GROUP BY 1, 2
HAVING stars>5
)
)
WHERE rn=1
) a
JOIN `gdelt-bq.extra.countryinfo` c
ON a.country_code=LOWER(c.iso)
ORDER BY stars DESC

Most popular GitHub projects per country, removing the top 75:

#standardSQL
SELECT country, a.name, stars, description, stars_country, other_countries_sc
FROM (
SELECT * FROM (
SELECT *, ROW_NUMBER() OVER(PARTITION BY country_code ORDER BY stars DESC) rn2
FROM (
SELECT *, ROUND(SUM(1/(1+ROUND(rn/20,0))) OVER(PARTITION BY repo_id),1) other_countries_sc
FROM (
SELECT * FROM (
SELECT country_code, name, stars, ROW_NUMBER() OVER(PARTITION BY country_code ORDER BY stars DESC) rn, SUM(stars) OVER(PARTITION BY country_code) stars_country, repo_id
FROM (
SELECT country_code, repo.id repo_id, COUNT(DISTINCT actor.login) stars, APPROX_TOP_COUNT(repo.name, 1)[OFFSET(0)].value name
FROM `githubarchive.year.2016` a
JOIN `ghtorrent-bq.ght_2017_01_19.users` b
ON a.actor.login=b.login
WHERE country_code IS NOT null
AND repo.id NOT IN (SELECT id FROM (
SELECT repo.id, COUNT(*) c
FROM `githubarchive.year.2016` a
WHERE a.type='WatchEvent'
GROUP BY 1
ORDER BY 2 DESC
LIMIT 75
))
AND a.type='WatchEvent'
GROUP BY 1, 2
HAVING stars>5
)
)
WHERE rn<300
)
)
WHERE other_countries_sc<3
)
WHERE rn2<11
) a
LEFT JOIN (SELECT url, ANY_VALUE(description) description FROM `ghtorrent-bq.ght_2017_01_19.projects` GROUP BY 1) b
ON a.name=REPLACE(url, 'https://api.github.com/repos/', '')
JOIN `gdelt-bq.extra.countryinfo` c
ON a.country_code=LOWER(c.iso)
ORDER BY stars_country DESC, country_code, stars DESC

Want more?

Want more stories? Check my Medium, follow me on twitter, and subscribe to reddit.com/r/bigquery. And try BigQuery — every month you get a full terabyte of analysis for free.