Tuesday, January 4, 2011

PSMA, Sensis or OpenStreetMap: what makes Spatial Data “Authoritative”?

You are probably aware that Google recently switched map data providers in Australia. They ditched the government-owned data provider PSMA in favour of Sensis' Whereis, and this data will (at least in part) be maintained by users themselves.


The switch has created a bit of a stir in the Australian spatial information community. Our industry association (SIBA) sent out a stern word of warning about authoritative data and the risks of dealing with User Generated Content (UGC).

Firstly, let’s recognise that Google Australia is not alone in moving to a model where it’s reducing dependency on 3rd party content providers. Since 2009, Google worldwide has been sourcing its own streetmaps, and making them updateable by its users. And major online mappers such as Mapquest and Microsoft’s Bing Maps have already adopted the free, crowdsourced basemaps from OpenStreetMap.

Now why would they do this? There is of course the driver to reduce commercial dependencies. But more importantly: these companies recognise that crowdsourcing spatial data is not only much cheaper, but also much quicker. You can read blogger James Fee’s excellent analysis on this phenomenon.

The second thing to note here is the implicit assumption about what makes spatial datasets authoritative. In warning about the risks of using UGC, SIBA writes:
“In applications where data integrity/quality is of high importance, spatial data should be current, accurate, the best quality possible and from an authoritative source.”
Actually, there are two assumptions hidden in this statement: (1) crowdsourced data is not authoritative, and (2) only authoritative sources can provide data that is current, accurate and of the highest possible quality. However, neither of these two assumptions really stands up to scrutiny.

Crowdsourcing has given us the concept of ‘decentralised authority’, where the confidence in an (online) information source is derived from the community that maintains it, instead of from the reputation or mandate of a central organisation. The success of Wikipedia over its traditional counterpart the Encyclopaedia Britannica is the best known of many such examples.

What’s more, UGC often provides us with content that is more accurate, and certainly more current, than traditionally sourced data. Where updates such as new roads or developments typically take over 6 months to make it into the PSMA datasets, they are visible in OpenStreetMap almost instantaneously. It’s been reported to me that in Brisbane, for instance, OpenStreetMap is routinely more up to date than other sources.

For many users data currency is a critical component in determining quality and fitness for purpose. To suggest that only so-called ‘authoritative’ providers can give us ‘the best quality possible’ represents a rather limited perspective.

17 comments:

  1. Ah, but Mr Maurits you never answered the obvious question ... what would you use? :)

    Although OSM may be updated more frequently (Brisbane in your example) does it have the required coverage? Would you rather data being updated more frequently, captured more accurately or vetted as being "official"?

    It kinda irks me when UGC turns into a frequency debate as if that's a determining factor.

    Frequency != currency != quality.

  2. C'mon Mr. Chris, you've been out of Landgate long enough now :) :).

    Of course I agree. What I tried to get across is that while currency != quality, at the same time for many users 'vetted but slow & expensive' (aka authoritative) != quality either.

    To answer your opening question: In 99% of the cases, I'm happy to use OSM, and fix it where I encounter errors.

  3. The one thing I haven't read about in any of these posts is fitness for purpose. Someone in the other blog (by James Fee) mentioned emergency services using OSM, but what about telcos, defence and intelligence organisations, etc.? Are they going to use OpenStreetMap or commercial data or their own? These organisations are the backbone of our daily lives; has anyone asked them for their view on 'authoritative' data (I really dislike that term)? Work done in Haiti provides a good starting point, looking at Ushahidi, Sahana, etc.

    Also what about the investments already made in data from Tele Atlas and Navteq, there are production systems with their data embedded, what's the cost to swap that out and the associated data integration and data maintenance challenges? I guess not that high vs the perceived long-term gains from what you have written.

    Finally, before everyone proclaims Navteq and Tele Atlas as finished, shouldn't we actually look at some real numbers? How many users do they have worldwide? How much investment have they made in systems using those data? Maybe ask those users for their view on authoritative data, their concerns may actually be more related to long-term sustainability and other issues. Both of these organisations are now part of much larger entities, notably Nokia, so there will be a whole host of people using their data in mobile services who don't actually care, they just want fit for purpose data for their application. It's then up to Nokia and TomTom to recoup their investments somehow.

  4. Some related discussions going on in the Northern Hemisphere: http://knowwhereconsulting.co.uk/it-is-that-time-of-year/comment-page-1/#comment-295

  5. Thanks all for the comments and the debate. In the 24 hours since the post, I've come across numerous other blogs & discussions on this topic, including data quality comparisons such as: http://povesham.wordpress.com/2009/07/15/openstreetmap-and-ordnance-survey-master-map-%E2%80%93-beyond-good-enough/

    I see strong parallels with the Open Source Software evolution. It took Open Source many years to be seen as a viable enterprise option, but now Linux, GeoServer, Postgres, MapGuide, and others are reputable, mainstream products underpinning mission critical production systems.

    OSM will go the same way. Give it some time.

  6. The one thing that nobody seems to be discussing here is that the data that has been changed by Google is the cadastral framework of property boundaries. That has nothing to do with the location of houses, fences, roads and all the other things that the crowd can see and locate on the ground and, therefore, update on a map.

    In Australia, we have a Torrens Titles system that relies on very accurate cadastral boundaries measured by registered surveyors who are actually quasi-legal officers when they reinstate and establish those boundaries.

    It gets back to fitness for purpose as somebody said above. If you only use the map for the location of physical objects, then there is no need to have an "authoritative" source of the cadastral boundaries. But if that is important to you then take care because the crowd cannot update that data.

    And that, I believe, is what the SIBA note was all about.

  7. In the LinkedIn group OpenCadastremap and related groups, the question is raised about authoritativeness of crowd-sourced cadastral data. For me the two owners at both sides of a boundary are part of the crowd and are perfectly able to define the property boundary. In well organised countries such as Australia, cadastral surveyors measure the will of both parties and enter the information following legal procedures in openly accessible systems. But well organised countries are an exception in this world, so what happens if this official surveyor just doesn't show up and people start doing it themselves? That is a social phenomenon that can happen, given the technology that is available.
    Authoritativeness can come step by step: what if banks take account of this data when considering mortgages? I don't know about Australia's land history but I cannot imagine the occupation of land and formalisation of rights in the past was a completely smooth process, given examples in other countries.
    That can happen in other countries also.

    Of course there is not always agreement about boundaries, and conflicts have to be solved by governments or at least a conflict resolution framework has to be installed. But that is also the case for topographic info. That is not as value free as one might expect. See the edit-wars about placenames between Greek and Turkish OSM-editors and the non-appearance of Palestinian settlements in Israel maps (see Atlas of the Conflict).


    Peter Laarakker

  8. In this light, it's interesting to see how users expect to know if a datasource is 'authoritative'. If you have access to it: see Tony Gill's LinkedIn Poll on this subject: http://linkd.in/gR3kuX

    (Very) Preliminary result: reputation or provider claims win over certification.

  9. I'm very interested in this debate. I noticed very quickly when Google changed their data source to Sensis, and made a couple of mark-ups on the Melbourne map as a way of not only identifying issues, but also seeing how long they take to update. You can see my mark-ups here
    http://maps.google.com.au/maps/ms?ie=UTF8&hl=en&msa=0&msid=210198005946966740057.0004963cbb58a18b55b68&t=h&z=15

    Maurits you note in your post that "...these companies recognise that crowdsourcing spatial data is not only much cheaper, but also much quicker"....
    I created the markups on 29 Nov 2010, and 2 weeks later received an email from Google indicating that the map would be updated by end of January...3 weeks to go, let's see how long it takes them....

  10. Good points Laura, and great to see that you are taking the effort to report anomalies for Google (or Sensis) to correct. Love your comments about the 'short pier' :)

    Even if it takes a few weeks for the updates to get through (resulting from some moderation process, OSM is of course much quicker), it's still a lot faster than the 6+ month cycle through the state mapping agencies and then PSMA.

    In all fairness: DSE in Victoria have deployed an online Notification and Editing facility (NES: http://bit.ly/eI2YyS), though that is currently only available to registered users.

  11. Sensis has now come out of the woodwork to comment on this issue.

    Their view is that Google chose them because they provide better turn-by-turn information - which is required for accurate directions.

    Note that Google only bought a snapshot of the data, not continuous updates. Which casts doubt on the longevity of the relationship.

    Read the full article in Spatial Source online: http://www.spatialsource.com.au/2011/01/25/article/Sensis-comments-on-Google-development/YLZVBBGXGV

  12. It is interesting to note that 'Google only bought a snapshot of the data'. With regard to the cadastre - who did they actually buy it from? We know it wasn't PSMA, nor the individual jurisdictions! Any cadastral data Sensis acquired by virtue of transactions with PSMA (under licence) in the past would not permit its use in the current guise. That data would have been licensed for internal purposes only and certainly wouldn't allow them to commercialise it as their own now! This needs some exploring, people. I smell a breach of licence here. Fess up, Sensis.

  13. Interesting point andre. The Whereis website shows parcels and addresses, but doesn't mention PSMA as a source. Hard to imagine they've collected cadastral data themselves.

    From http://about.sensis.com.au/Terms-of-use/:
    " Copyright in the Whereis® website and the digital map data presented on the website is jointly owned by Telstra, Sensis and others. This website also incorporates data which sourced from the Department of Treasury and Finance, Geographic Data Victoria, Geoscience Australia and Queensland Department of Transport and Main Roads."

  14. It is my understanding that Sensis is a subsidiary of Telstra. Based on my looking around the NT I am of the belief that Telstra are using the survey plans provided to them for comment as part of the development consent process to compile a version of the cadastre (interesting how a range of still proposed parcels seem to be included in their version of the cadastre in small towns throughout the NT!). Telstra are then providing this to Sensis.

    I would be VERY interested to see if this was somehow a breach of confidentiality of the development consent process that service authorities like Telstra and the power companies are part of.

  15. Two points:
    1. Is this not the old trail of public offices (mandate driven) versus private (commerce driven)? We know UGC can only take on authoritative sources if big companies are the backbone. This is like open source versus proprietary. Both systems have their flaws and one cannot substitute for the other. We would do better to make UGC complementary rather than a better solution.
    2. There are some other silent issues that still make the conclusion a dilemma. For instance, different uses of the data have different requirements for data quality. Will a person seeking legal rights to his landed property agree to use UGC rather than authoritative data?
    ..... so before we conclude, we should consider several cases and scenarios over again

  16. Hello everyone.
    We were using the PSMA data for a long time and recently switched to Navteq.
    Both providers have a lot of inconsistencies with the real world.
    I would prefer using OpenStreetMap if it had the full cadastral info.

  17. That's really interesting. Would be good to know in what areas you find PSMA and NAVTEQ/HERE wanting, especially compared to OpenStreetMap. Accuracy, currency, completeness, other?
