Behind the Data: Earning Trust with Provenance

From Inference Web

Alvaro Graves, Tim Lebo, and Jim McCusker developed this demonstration for Emerging Trends in Semantic Technologies. See their May 2010 report.



James Michaelis created a compelling and informative web application called Comparing US (USAID) and UK (DFID) Global Foreign Aid to demonstrate how Semantic Web techniques can be used to mash up government data. Although the application's data were gathered from a variety of authoritative government sources, they are presented with no means of justifying the claims put forth. For example, positioning the cursor over the map shows that Brazil received $13,602,693 in aid, but how would a curious -- or suspicious -- human viewer verify this claim? Even though the data sources are authoritative, audiences may doubt the fidelity of the results because the application is hosted by an independent organization: an "unknown" university intermediary. Fortunately, James responsibly cites his sources in his "About" section and provides pointers to the data used by the application, which serve as breadcrumbs to begin an investigation:

   USAID: An RDF Dataset based on U.S. Overseas Loans and Grants (Greenbook).
   DFID:  An RDF Dataset based on Statistics on International Development 2004-2009.

Can't we do better? How can we get a specific answer regarding the $13,602,693 to Brazil? How do we know it is true without spending a few hours plunging into the mass of data and homepages that we've been pointed to? Behind the Data: Earning Trust with Provenance supplements James' demonstration by providing direct access to the authoritative source of each financial claim.

Here is a screenshot of James' original Comparing US (USAID) and UK (DFID) Global Foreign Aid demonstration:

Check it out

Behind the Data: Earning Trust with Provenance is being hosted at two locations (when clicking around, note the IOU):

We include a screenshot of the Behind the Data: Earning Trust with Provenance demonstration here:

Getting started

Comparing the two applications, we see some similarities and differences. Both show foreign aid, but for simplicity Behind the Data: Earning Trust with Provenance shows only aid from the UK. Both show maps giving the total aid, and both show pie charts breaking down the total according to its intended purposes. James' application also provides supplemental information from the New York Times (News) and DBPedia (Country Facts); we've omitted these extras to focus on justifying the financial claims.

Digging in

The table shown below the map in Behind the Data: Earning Trust with Provenance distinguishes this demonstration from Comparing US (USAID) and UK (DFID) Global Foreign Aid. Clicking on the map or a pie slice adds a row to the table that reiterates the visual's claim. The row also offers an explanation in the form of a clickable "oh yeah?". The materials used to generate the explanation are provided as well: the raw PML encoding, plus a link to Alvaro Graves' POMELo Online PML Editor, which will render the raw PML.

Clicking the offer of explanation:

oh yeah?

fetches the explanation pointing to the original data source and describing some intermediary processing. For example (IOU):

Bilateral-exp-recipient-country-sector-america.xls [row 217, col 14] x 1.96867 x 1000.0 (by csv2rdf4lod, Tim Lebo)
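
The explanation above can be read as a small calculation: one spreadsheet cell, multiplied by a chain of factors, yields the figure shown in the application. The following Python snippet is a minimal sketch of that reading, not the demonstration's actual implementation. The `CellCitation` class and its methods are hypothetical names introduced here for illustration, and the raw cell value is an assumed placeholder because this page does not state what row 217, col 14 contains. The page also does not say what the two factors represent; a currency conversion and a thousands scale would be a plausible guess, but that is an assumption.

```python
from dataclasses import dataclass
from functools import reduce
from operator import mul


@dataclass(frozen=True)
class CellCitation:
    """Hypothetical record: one spreadsheet cell plus the factors applied to it."""
    source_file: str
    row: int
    col: int
    factors: tuple  # multipliers applied, in order, to the raw cell value

    def derive(self, raw_value: float) -> float:
        """Multiply the raw cell value by every factor to get the displayed figure."""
        return reduce(mul, self.factors, raw_value)

    def describe(self) -> str:
        """Render the citation in the same shape as the explanation text."""
        steps = "".join(f" x {f}" for f in self.factors)
        return f"{self.source_file} [row {self.row}, col {self.col}]{steps}"


# File name, cell position, and factors are taken from the example explanation.
citation = CellCitation(
    source_file="Bilateral-exp-recipient-country-sector-america.xls",
    row=217,
    col=14,
    factors=(1.96867, 1000.0),
)

# Assumed placeholder; the real value of the cited cell is not given on this page.
raw = 100.0

print(citation.describe())
print(round(citation.derive(raw), 2))
```

A reader who opens the cited spreadsheet could substitute the real cell value for `raw` and check whether the result matches the figure shown on the map, which is the kind of verification the demonstration is meant to make easy.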

Getting to the bottom of things

The added advantage of this demonstration is the ability to provide a direct and granular citation for a specific claim made by an application constructed by a third party aggregating authoritative sources. This permits curious -- or suspicious -- human observers to access and inspect justifications for specific details provided by an application, which can lead to increased trust in the intermediary application or more efficient access to originating sources and subsequent analysis.

Technical details

Behind the Data: Earning Trust with Provenance - technical details contains more technical details about how the demonstration is constructed. If you have any direct questions, please feel free to contact Tim Lebo.


At some point in the last 14 months, while we weren't looking, the explanation query started experiencing problems. As a result, the application's promises such as "(gathering explanation with query)" are currently empty. We'll be looking into it and fleshing it out in Behind the Data: Earning Trust with Provenance - technical details.