The Ultimate Data Source

Photo by Arseny Togulev / Unsplash

"Hey! Two years ago many municipalities in Italy were suppressed, created and merged! I need to update the database for a customer of mine, who couldn't find one of them! Let's look for an updated CSV!"

And, of course, I found nothing. At least nothing usable for my use case: the local dialing code for each municipality was a requirement, and not even the official data from ISTAT (the Italian national institute for statistics) included it. The only option was to buy a dataset maintained by a guy who is monetizing this lack of information (and of official sources) by selling the complete CSV to the many desperate Italian developers looking for a way to correctly populate their databases.

Then another option came to mind: Wikidata.

As Wikipedia is maniacally kept up to date with every change occurring in the real world, including bureaucratic and administrative ones, I tried running a few SPARQL queries against the Wikidata endpoint and found that all the data I needed really was there.

The following PHP code uses EasyRdf to execute a SPARQL query and extract the name, the district code, the local dialing code and the ZIP code of each entity which is "instance of comune of Italy":

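<?php

use EasyRdf\Sparql\Client;

// Assumes EasyRdf was installed with Composer; adjust the path if needed.
require 'vendor/autoload.php';

$client = new Client('https://query.wikidata.org/bigdata/namespace/wdq/sparql');

// P31 = instance of, Q747074 = comune of Italy,
// P131 = located in the administrative territorial entity,
// P395 = licence plate code, P473 = local dialing code, P281 = postal code.
$result = $client->query('SELECT ?itemLabel ?plate ?prefix ?cap WHERE {
  ?item wdt:P31 wd:Q747074 .
  ?item wdt:P131 ?district .
  ?district wdt:P395 ?plate .
  ?item wdt:P473 ?prefix .
  ?item wdt:P281 ?cap .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],it". }
}');

foreach ($result as $res) {
  $row = [];

  // Each field of a result row is an EasyRdf object: cast it to a plain string.
  foreach ($res as $index => $info) {
    $row[$index] = (string) $info;
  }

  // Do what is needed with $row
}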

The data needs a bit of cleanup, as there are a few minor inconsistencies, but it is complete and almost ready to be used to update (once a year?) your local database of municipalities.
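One cleanup step that is likely to be needed: a SPARQL query returns one row per combination of values, so a municipality with more than one postal code or dialing code on Wikidata shows up several times. The following is only a minimal sketch of how such rows could be merged, not the code I actually used. The function name `mergeMunicipalityRows` is hypothetical, and it assumes that each `$row` built in the loop above has been collected into an array, and that name plus plate code is enough to identify a municipality.

```php
<?php

// Hypothetical helper: merges rows that describe the same municipality.
// Each input row is assumed to have the keys itemLabel, plate, prefix and cap,
// as produced by the query above.
function mergeMunicipalityRows(array $rows): array
{
    $merged = [];

    foreach ($rows as $row) {
        // Assumption: name plus plate code identifies a municipality.
        $key = $row['itemLabel'] . '|' . $row['plate'];

        if (!isset($merged[$key])) {
            $merged[$key] = [
                'name'   => $row['itemLabel'],
                'plate'  => $row['plate'],
                'prefix' => [],
                'cap'    => [],
            ];
        }

        // Collect every value seen; duplicates are dropped below.
        $merged[$key]['prefix'][] = trim($row['prefix']);
        $merged[$key]['cap'][] = trim($row['cap']);
    }

    // Collapse the collected values into one ';'-separated string per column.
    return array_map(
        static function (array $m): array {
            $m['prefix'] = implode(';', array_unique($m['prefix']));
            $m['cap'] = implode(';', array_unique($m['cap']));
            return $m;
        },
        array_values($merged)
    );
}
```

The result has one entry per municipality, with all its dialing codes and postal codes kept as strings (so leading zeros survive), ready to be written out with `fputcsv` or inserted into a database.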

Keywords for Italian Google miners: download CSV comuni italiani prefissi telefonici CAP.