Sat, 04 Feb 2012

OpenStreetMap's pseudo XML format

OpenStreetMap's pseudo-XML .osm format is wrong on so many levels that you have to see it for yourself (excerpt from mittelfranken.osm):
 <node id="17192249" lat="49.6323053" lon="10.9829889" version="3" changeset="1173972" 
	user="GeoGrafiker" uid="69127" timestamp="2009-05-13T11:48:59Z"/>
 <node id="17192250" lat="49.6321964" lon="10.9817792" version="4" changeset="1174159" 
	user="GeoGrafiker" uid="69127" timestamp="2009-05-13T12:08:14Z"/>
 <node id="17193023" lat="49.5980682" lon="11.0037364" version="15" changeset="8691470" 
	user="okilimu" uid="212111" timestamp="2011-07-11T08:26:21Z">
  <tag k="is_in" v="Mittelfranken,Bayern,Bundesrepublik Deutschland,Europe" />
  <tag k="is_in:continent" v="Europe" />
  <tag k="name" v="Erlangen" />
  <tag k="name:de" v="Erlangen" />
  <tag k="name:en" v="Erlangen" />
  <tag k="name:ru" v="Эрланген" />
  <tag k="name:sr" v="Ерланген" />
  <tag k="openGeoDB:auto_update" v="population,is_in" />
  <tag k="openGeoDB:community_identification_number" v="09562" />
  <tag k="openGeoDB:is_in" v="Mittelfranken,Bayern,Bundesrepublik Deutschland,Europe" />
  <tag k="openGeoDB:is_in_loc_id" v="164" />
  <tag k="opengeodb:lat" v="49.5978347" />
  <tag k="openGeoDB:layer" v="5" />
  <tag k="openGeoDB:license_plate_code" v="ER" />
  <tag k="openGeoDB:loc_id" v="608" />
  <tag k="opengeodb:lon" v="11.0048476" />
  <tag k="openGeoDB:name" v="Erlangen" />
  <tag k="openGeoDB:sort_name" v="ERLANGEN" />
  <tag k="openGeoDB:type" v="Stadt" />
  <tag k="openGeoDB:version" v=" / 2007-12-04 /" />
  <tag k="place" v="city" />
  <tag k="population" v="105554" />
  <tag k="website" v="" />
  <tag k="wikipedia:de" v="Erlangen" />
 </node>
  1. They have tag tags!!!!
  2. The node tags at the start presumably all refer to the city of Erlangen (since they all have roughly the same lat/lon coordinates), but this can only be inferred by observing that they are placed before a node with actual content naming the city.
  3. The type of a node is hidden in the v attribute of a tag tag, iff that tag's k attribute has the value place; in the case of Erlangen the v is city. This type seems not to be checked in any way: I found place tags with vs of yes, no, Naturflaechendenkmal and various others.
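  4. All the actual content lives in free-form k and v attribute values, which no DTD or schema can constrain. There's no way to automatically verify this mess, so why use XML at all?!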
  5. The embedded openGeoDB content repeats values of other attributes. This almost necessarily results in outdated entries, mismatches, and general headaches.
  6. The v of the is_in tag repeats the contents of the is_in:continent tag.
  7. There is a continent Europe in the OSM database, as are the state, the county, etc., and their nodes surely have ids. So why not use XML's IDREF to refer to them?
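
To make points 3 and 5 concrete, here is a rough Python sketch of what a consumer of this format ends up writing. It is only an illustration under assumptions: the sample is the Erlangen node abridged from the excerpt plus two made-up nodes (ids 1 and 2), and the helper names place_type and opengeodb_duplicates are hypothetical, not part of any OSM tool.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The Erlangen node is abridged from the excerpt; nodes "1" and "2" are made up.
SAMPLE = """<osm>
 <node id="17193023" lat="49.5980682" lon="11.0037364">
  <tag k="is_in" v="Mittelfranken,Bayern,Bundesrepublik Deutschland,Europe" />
  <tag k="name" v="Erlangen" />
  <tag k="openGeoDB:is_in" v="Mittelfranken,Bayern,Bundesrepublik Deutschland,Europe" />
  <tag k="opengeodb:lat" v="49.5978347" />
  <tag k="openGeoDB:name" v="Erlangen" />
  <tag k="place" v="city" />
 </node>
 <node id="1" lat="0" lon="0">
  <tag k="place" v="yes" />
 </node>
 <node id="2" lat="0" lon="0" />
</osm>"""

def place_type(node):
    """Return the v of the node's <tag k="place">, or None if there is none."""
    for tag in node.findall("tag"):
        if tag.get("k") == "place":
            return tag.get("v")
    return None

def opengeodb_duplicates(node):
    """Return (openGeoDB key, plain key, values_agree) for every repeated value."""
    tags = {t.get("k"): t.get("v") for t in node.findall("tag")}
    values = dict(node.attrib)  # lat/lon live in attributes, not in tags
    values.update(tags)
    result = []
    for key, value in tags.items():
        prefix, _, rest = key.partition(":")
        # The prefix is spelled both openGeoDB and opengeodb in the data.
        if prefix.lower() == "opengeodb" and rest in values:
            result.append((key, rest, value == values[rest]))
    return result

root = ET.fromstring(SAMPLE)
# Tally whatever strings show up as a node's "type"; nothing restricts them.
place_counts = Counter(place_type(n) for n in root.iter("node"))
duplicates = opengeodb_duplicates(root.find("node"))
print(place_counts)
print(duplicates)
```

Every check here is hand-written string comparison; the XML parser itself contributes nothing beyond splitting elements and attributes. On the Erlangen node the sketch already flags one mismatch: opengeodb:lat disagrees with the node's own lat attribute.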

