Social Crawling

Last weekend, having nothing better to do, I went back to the GNU Social topic and implemented and ran a crawler on the network, harvesting some information about users, relationships, and instances. The result is here: a force graph representing the direct connections of the most followed accounts, and a chord diagram of the connections among different instances.
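To give an idea of what feeds a chord diagram like this, here is a minimal sketch of how account-level connections can be collapsed into instance-level counts. It is not the crawler's actual code: the `user@host` handle format, the `instance_of` and `instance_links` helpers, and the sample data are all assumptions made up for illustration.

```python
from collections import Counter

def instance_of(account: str) -> str:
    """Return the host part of a 'user@host' handle (assumed format)."""
    return account.rsplit("@", 1)[1].lower()

def instance_links(follows):
    """Count follow edges between each ordered pair of distinct instances."""
    counts = Counter()
    for follower, followed in follows:
        src, dst = instance_of(follower), instance_of(followed)
        if src != dst:  # connections inside one instance are skipped
            counts[(src, dst)] += 1
    return counts

# Hypothetical sample data: (follower, followed) pairs.
follows = [
    ("alice@a.example", "bob@b.example"),
    ("carol@a.example", "bob@b.example"),
    ("bob@b.example", "alice@a.example"),
    ("alice@a.example", "carol@a.example"),
]
links = instance_links(follows)
```

Each key of `links` is an ordered (source instance, target instance) pair, which is roughly the kind of weighted matrix a chord layout expects.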

The crawler is still running at the moment, and many of the accounts found so far still have to be parsed for more connections (and more following accounts, and more connections, and more following accounts...), so the graphs are still changing slightly hour by hour. Anyway, a few random considerations:

  • within the federated network there are many well-connected Mastodon instances, but fewer than expected, given the great media exposure of the project. Possibly the Mastodon network has grown in a different direction. But this is difficult to measure, due to the lack of a public API: you need an authentication key to query each single instance
  • rulez, both in quantity (number of accounts) and quality (number of connections)
  • manga (and hentai) content is ubiquitous. A widely connected instance is this one, self-described as "a GNU Social instance that cares about cute anime girls" (with an abundance of NSFW posts). This is the kind of media that is usually banned from more traditional social networks

The code of the crawler is here, and I will also publish the collected dataset, update the information every once in a while (with moderation, to avoid DoSing the whole network...), and collect more data for further analysis. Suggestions about more effective visualizations are welcome.
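For readers who do not want to dig into the repository, the general idea of following accounts to their connections, with a pause between requests so as not to hammer the network, can be sketched roughly as below. This is a simplified illustration and not the published crawler: `crawl`, `fetch_connections`, `delay`, and `max_fetches` are hypothetical names, and the network call is replaced by an in-memory dictionary so the example runs on its own.

```python
import time
from collections import deque

def crawl(seeds, fetch_connections, delay=1.0, max_fetches=1000, sleep=time.sleep):
    """Breadth-first walk over accounts, pausing before every fetch.

    fetch_connections(account) is a caller-supplied function (hypothetical
    here) returning the accounts connected to the given one.
    """
    seen = set(seeds)
    queue = deque(seeds)
    edges = []
    fetches = 0
    while queue and fetches < max_fetches:
        account = queue.popleft()
        sleep(delay)  # be moderate: wait before each request
        fetches += 1
        for other in fetch_connections(account):
            edges.append((account, other))
            if other not in seen:  # enqueue each account only once
                seen.add(other)
                queue.append(other)
    return seen, edges

# Stand-in for the real network: a tiny made-up graph.
graph = {
    "a@x.example": ["b@y.example", "c@x.example"],
    "b@y.example": ["a@x.example"],
    "c@x.example": [],
}
seen, edges = crawl(
    ["a@x.example"],
    lambda account: graph.get(account, []),
    delay=0,                     # no waiting for the in-memory demo
    sleep=lambda seconds: None,  # a real crawl would keep time.sleep
)
```

In a real run `fetch_connections` would query the instance hosting the account, and `delay` would stay well above zero; `max_fetches` is there only to keep an experiment bounded.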