Semantic Water Quality Portal
From Inference Web
This page focuses on how the Semantic Water Quality Portal uses provenance.
Demo
Overview
The Semantic Water Quality Portal aims to connect water data sources and let a broad class of users query them conveniently, so that they can identify polluted water sources and polluting facilities in geographic areas of their choice. It also incorporates multiple federal and state regulations, so that users can review water data against a variety of regulations. This demonstration briefly introduces what the portal can do and highlights the provenance aspects of the effort.
A sample static demonstration follows. The live demonstration can be accessed at [TWC Semantic Water Quality Portal], and the Tetherless World project page on this effort is available at [SemantAQUA].
Static Demo:
Assume one is interested in water data for a particular zip code. The screenshot shows the results of entering the zip code 02888 with the facets selected as shown and clicking “Go!”.
The portal visualizes the results on a Google map using different icons to distinguish between clean and polluted water sources and facilities.
To get the pop up window, click the polluted water source.
Switch regulation
With the “Regulation” box, the user can select the regulation to apply, e.g. by checking “CA Regulation”. Under the CA Regulation, the portal discovers more polluted water sites and polluting facilities.
Retrieval by characteristic
Unselect “No Filter”, click “select” in the next row, and select one or more characteristics from the pop up window. Because only phosphorus_total_as_P is selected here, the portal detects fewer polluting facilities and no polluted water sources: every polluting facility displayed is releasing more than the permitted limit of phosphorus_total_as_P.
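The effect of switching regulations and filtering by characteristic can be illustrated with a minimal sketch. It is not the portal's implementation (the portal encodes regulations as OWL2 ontologies); the `Measurement` class, the `violations` function, the site names, and all limit values below are hypothetical placeholders chosen only to show the logic.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Measurement:
    site: str            # hypothetical site or facility identifier
    characteristic: str  # e.g. "phosphorus_total_as_P"
    value: float         # assumed to use the same unit as the limit


# Hypothetical limits per regulation. These numbers are placeholders,
# NOT real regulatory thresholds.
LIMITS = {
    "EPA Regulation": {"phosphorus_total_as_P": 1.0, "arsenic": 0.01},
    "CA Regulation": {"phosphorus_total_as_P": 0.5, "arsenic": 0.005},
}


def violations(
    measurements: Iterable[Measurement],
    regulation: str,
    characteristics: Optional[Set[str]] = None,
) -> List[Measurement]:
    """Return measurements that exceed the limit of the chosen regulation.

    If `characteristics` is given, only those characteristics are checked,
    mirroring the portal's characteristic filter.
    """
    limits = LIMITS[regulation]
    found = []
    for m in measurements:
        if characteristics is not None and m.characteristic not in characteristics:
            continue
        limit = limits.get(m.characteristic)
        if limit is not None and m.value > limit:
            found.append(m)
    return found


data = [
    Measurement("facility_A", "phosphorus_total_as_P", 0.8),
    Measurement("facility_B", "arsenic", 0.02),
    Measurement("site_C", "phosphorus_total_as_P", 0.3),
]
under_epa = violations(data, "EPA Regulation")
under_ca = violations(data, "CA Regulation")
only_phosphorus = violations(data, "CA Regulation", {"phosphorus_total_as_P"})
```

With these placeholder limits, the stricter regulation flags more sites, and restricting the check to one characteristic narrows the result again.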
Pop up window for pollution details
The user can access more details about a site by clicking on its icon. The information provided in the pop up window includes: the contaminant names, the measurement values, the limit values, and the measurement times.
Provenance of water data
Click the “?” near the “measurement value” in the pop up window for pollution details. The provenance provided in the pop up window includes: the PML file, the RDF file and the original source file.
Provenance of water regulations
Click the “?” near the “limit value” in the pop up window for pollution details. The provenance provided in the pop up window includes: the PML file, the RDF file and the original source file.
Trend Visualization
Click “Visualize Characteristics” at the bottom of the pop up window for pollution details, then select the permit for the facility (one facility can have multiple permits), the characteristic, and the test type. The page displays a trend graph of the water quality over time with violations highlighted. When the mouse moves near a data point, the measurement time and value appear.
Data Sources
| Data Type | Data Source |
|---|---|
| Water Quality Data | [EPA Enforcement & Compliance History Online (ECHO) Database] |
| Water Quality Data | [USGS National Water Information System (NWIS) Water-Quality Web Services] |
| Water Quality Regulation | [EPA (National Water Regulation)] |
| Water Quality Regulation | [California Code of Regulations] |
| Water Quality Regulation | [Massachusetts Department of Environmental Protection] |
| Water Quality Regulation | [New York Department of Health] |
| Water Quality Regulation | [State of Rhode Island Department of Environmental Management] |
Technical Design
Provenance capture
Water data provenance capture
| Integration Stage | Provenance | Script |
|---|---|---|
| Retrieval | source URL, modification time, inference engine, inference rule, involved actor | purl.sh |
| Adjust | antecedent data, modification time, inference engine, inference rule, involved actor | punzip.sh, justify.sh |
| Convert | antecedent data, invocation time, inference engine, interpretation rule | convert*.sh (conversion trigger) |
| Publish | dump file URL, publish time, involved actor | publish.sh |
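The table above can be read as a chain in which each stage records what it consumed, who ran it, and when. The following minimal sketch shows one way such a chain could be logged and traced back to the source. It is an illustration only: the actual scripts emit PML, and `StageRecord`, `record_stage`, `trace`, the file names, the actor, and the example URL are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class StageRecord:
    stage: str       # Retrieval, Adjust, Convert, or Publish
    script: str      # script named in the table above
    antecedent: str  # source URL, or the output of the previous stage
    output: str      # hypothetical name of the artifact produced
    actor: str       # involved actor
    time: str        # ISO 8601 timestamp


def record_stage(
    log: List[StageRecord],
    stage: str,
    script: str,
    output: str,
    actor: str,
    source: Optional[str] = None,
    when: Optional[datetime] = None,
) -> StageRecord:
    """Append a record whose antecedent is the previous stage's output."""
    if log:
        antecedent = log[-1].output
    elif source is not None:
        antecedent = source
    else:
        raise ValueError("the first stage needs a source URL")
    when = when or datetime.now(timezone.utc)
    rec = StageRecord(stage, script, antecedent, output, actor, when.isoformat())
    log.append(rec)
    return rec


def trace(log: List[StageRecord]) -> List[str]:
    """Return the chain from the latest artifact back to the original source."""
    return [log[-1].output] + [rec.antecedent for rec in reversed(log)]


log: List[StageRecord] = []
# The URL, file names, and actor below are made up for illustration.
record_stage(log, "Retrieval", "purl.sh", "data.zip", "alice",
             source="http://example.org/data.zip")
record_stage(log, "Adjust", "punzip.sh", "data.csv", "alice")
record_stage(log, "Convert", "convert*.sh", "data.ttl", "alice")
chain = trace(log)
```

Linking each record to its antecedent is what later allows a user to walk from a published value back to the original source file.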
Water regulation provenance capture
The water quality regulations are converted to OWL2 ontologies with our ad hoc regulation converter. Regulation provenance is currently captured manually; we plan to automate its capture within the regulation converter.
Provenance usage
Data source widget
| Aspect | Description |
|---|---|
| Input | URL of the SPARQL endpoint, an optional list of its named graphs, and the URL of the SimpleNamedGraphSourceGraph instance |
| Output | SimpleNamedGraphSourceGraph instance filled with simple descriptions of the source organizations responsible for the data |
| Processing | Walks the full provenance graph of each named graph and abstracts it into triples of the form <data_1> dct:source <source_1>, one per source organization |
An example instance, SWQPSimpleNamedGraphSourceGraph, is shown below:
#apps do not hard code this graph
graph <SWQPSimpleNamedGraphSourceGraph> {
<SWQPSimpleNamedGraphSourceGraph> rdf:type :SimpleNamedGraphSourceGraph .
<rhode_island_data_1> dct:source <http://health.tw.rpi.edu/source/epa-gov> .
<rhode_island_data_2> dct:source <http://health.tw.rpi.edu/source/epa-gov> .
# 2 sources is OK
<rhode_island_data_3> dct:source <http://health.tw.rpi.edu/source/usgs-gov> .
<rhode_island_data_3> dct:source <http://health.tw.rpi.edu/source/epa-gov> .
}
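The processing step described above, walking a provenance graph and reducing it to dct:source statements, can be sketched as a plain graph traversal. This is an assumption-laden illustration, not the widget's code: `abstract_sources`, the `derived_from` and `source_of` dictionaries, and the intermediate file names are hypothetical stand-ins for the real PML provenance.

```python
from typing import Dict, List, Tuple


def abstract_sources(
    named_graph: str,
    derived_from: Dict[str, List[str]],
    source_of: Dict[str, str],
) -> List[Tuple[str, str, str]]:
    """Collapse a provenance graph into (graph, dct:source, organization) triples.

    derived_from maps an artifact to the artifacts it was derived from.
    source_of maps an originally retrieved artifact to its source organization.
    """
    seen = set()
    stack = [named_graph]
    sources = set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # guard against revisiting shared ancestors
        seen.add(node)
        if node in source_of:
            sources.add(source_of[node])
        stack.extend(derived_from.get(node, []))
    return sorted((named_graph, "dct:source", s) for s in sources)


# Hypothetical provenance for one named graph built from two downloads.
derived_from = {
    "rhode_island_data_3": ["usgs.ttl", "epa.ttl"],
    "usgs.ttl": ["usgs.csv"],
    "epa.ttl": ["epa.csv"],
}
source_of = {
    "usgs.csv": "http://health.tw.rpi.edu/source/usgs-gov",
    "epa.csv": "http://health.tw.rpi.edu/source/epa-gov",
}
triples = abstract_sources("rhode_island_data_3", derived_from, source_of)
```

The result contains one triple per source organization, which matches the case of a named graph with two sources in the example instance.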
Usage:
- Presentation of the data sources on the interface
- Source based data retrieval
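Both usages reduce to an index from source organization to named graphs. The sketch below builds that index from the dct:source statements in the example instance; the function name `graphs_by_source` and the tuple representation of triples are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

DCT_SOURCE = "http://purl.org/dc/terms/source"
EPA = "http://health.tw.rpi.edu/source/epa-gov"
USGS = "http://health.tw.rpi.edu/source/usgs-gov"


def graphs_by_source(
    triples: Iterable[Tuple[str, str, str]],
) -> Dict[str, List[str]]:
    """Map each source organization to the named graphs it is responsible for."""
    index = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == DCT_SOURCE:
            index[obj].add(subject)
    return {source: sorted(graphs) for source, graphs in index.items()}


# The dct:source statements from the example instance.
example = [
    ("rhode_island_data_1", DCT_SOURCE, EPA),
    ("rhode_island_data_2", DCT_SOURCE, EPA),
    ("rhode_island_data_3", DCT_SOURCE, USGS),
    ("rhode_island_data_3", DCT_SOURCE, EPA),
]
index = graphs_by_source(example)
```

The keys of the index can be listed on the interface as data sources, and the values tell a query which named graphs to read when the user restricts retrieval to one source.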
Advantage:
- Web applications only need to know the OWL class :SimpleNamedGraphSourceGraph. At an arbitrary endpoint, they can find the named graph that is typed to this class.
- Lightweight provenance: it is recorded at the holistic level of named graphs. When the provenance of the specific triples returned matters, the application can still ask where those triples came from.
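Discovery by class can be expressed as a single SPARQL query. The sketch below only builds the query text; the helper `discovery_query` is hypothetical, and the class IRI must be supplied by the caller because the namespace behind the `:` prefix is not given on this page (the example IRI is a placeholder).

```python
def discovery_query(class_iri: str) -> str:
    """Build a SPARQL query that finds source graphs typed to the given class
    and lists the dct:source statements they contain."""
    return f"""PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?sourceGraph ?data ?source WHERE {{
  GRAPH ?sourceGraph {{
    ?sourceGraph a <{class_iri}> .
    ?data dct:source ?source .
  }}
}}"""


# Placeholder IRI: substitute the real IRI of :SimpleNamedGraphSourceGraph.
query = discovery_query("http://example.org/ns#SimpleNamedGraphSourceGraph")
```

Because the query matches on the class and not on a graph name, applications do not need to hard-code the name of the source graph.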
Disadvantage of this simplification:
- All organizations identified share all of the responsibility for all of the data in the named graph. Only more granular provenance can tell which triple came from which source.
Provenance Visualization
Some users may not trust the responses provided by the water portal unless they have the option to examine how those responses were obtained. The following provenance visualization example shows the process trace of the USGS data when the input zip code is in Bristol County, RI. Because the visualization of the full provenance trace is not readable for users, we are working on supporting both full and abstracted provenance traces.
Technical details
For more discussion about technical details, please refer to Semantic Water Quality Portal - technical details.
Data details
For more discussion about data details, please refer to Semantic Water Quality Portal - data details.
Publication
- Wang, P., Fu, L., Patton, E.W., McGuinness, D.L., Dein, F.J., and Bristol, R.S. 2012. Towards Semantically-enabled Exploration and Analysis of Environmental Ecosystems. In Proceedings of 8th IEEE International Conference on eScience (October 8-12 2012, Chicago, IL).
- Wang, P., Zheng, J., Fu, L., Patton, E., Lebo, T., Ding, L., Liu, Q., Luciano, J.S., and McGuinness, D.L. 2011. TWC-SWQP: A Semantic Portal for Next Generation Environmental Monitoring. (Technical Report).
- Wang, P., Zheng, J., Fu, L., Patton, E., Lebo, T., Ding, L., Liu, Q., Luciano, J.S., and McGuinness, D.L. 2011. [http://tw.rpi.edu/web/doc/iswc2011_swqp TWC-SWQP: A Semantic Portal for Next Generation Environmental Monitoring]. In Proceedings of 10th International Semantic Web Conference (October 23-27 2011, Bonn, Germany).
- Zheng, J., Wang, P., Patton, E., Lebo, T., Luciano, J.S., and McGuinness, D.L. 2011. [http://tw.rpi.edu/web/doc/eim2011_swqp A Semantically-Enabled Provenance-Aware Water Quality Portal]. In Proceedings of EIM 2011 (September 28-29 2011, Santa Barbara, CA, USA).
Presentation
Related Work
Source Code Repository
Source Code: http://code.google.com/p/swqp/


