Swedish Wikipedia just became the world’s second-largest edition of the site, which is a bit of mind-bending insanity best explained by a chart.
Those languages — English, Swedish, Dutch, German and French — are the five largest Wikipedias by article volume. The red bars indicate how many articles are contained in each. And the blue bars? Those show the number of speakers each language has worldwide, which should theoretically correspond to the number of Wikipedia editors in that language, which should theoretically correspond to the size of the edition.
Except … it clearly doesn’t. Somehow, this teeny-tiny Nordic language that has absolutely no use outside the two sparsely populated countries where it’s spoken is holding its own against giants like French and German. Even English Wikipedia isn’t inconceivably far out of Swedish Wikipedia’s reach: The English edition has 4.6 million articles to the Swedes’ 1.8 million, and, for a period last year, the Swedes were adding upwards of 5,000 new articles each day.
But they’re not doing it without a little help.
As the Wall Street Journal chronicled earlier this month, the story of Swedish Wikipedia is really the story of one man, Sverker Johansson, and his quest to make the site “more democratic.” Johansson developed a bot, called Lsjbot, that combs databases, scrapes them for useful information and packages it into Wikipedia articles. The bot has, thus far, contributed nearly 3 million articles to the Swedish and Filipino Wikipedias, many of them about obscure plants, animals or geography. Its next project will be the National Library of Sweden’s catalog of authors, which, according to Johansson, contains a number of names that Wikipedia’s human tenders have unfairly ignored.
And that’s the really cool thing about Wikipedia bots, and about the success of Johansson’s bot in particular. Wikipedia’s vision for an infinite compendium of human knowledge is wonderful, but it’s also impossible, particularly outside the mainstream: The world will simply never produce enough Tagalog speakers who care about obscure species of fish, or enough Swedes who want to comb the obscurest reaches of the Kungliga Biblioteket and report back to Wikipedia. (As of this writing, the Swedish Wikipedia has fewer than 2,500 active editors.)
Bots, on the other hand, can transcend both the size of those language communities and the human biases that tend to keep certain subjects, and languages, off the site. A bot knows no gender … or language, or age. And it certainly doesn’t know that there are only 9 million people around to read its output, versus the 75 million who speak French, or the 78 million who speak German.
“Wikipedia is written by affluent white nerds, in languages only affluent white nerds know,” Johansson said at Sweden’s Free Society Conference last year. “If you know French, you have much better access to ‘the sum of all knowledge’ than if you only know Japanese… In the sum of all human knowledge, we’re still just scratching the surface.”