IW Meeting 2012-12-13

Meeting Information



  • Tim

Meeting Preparation

Around the room

* Add a section for yourself 2 hours before the meeting.
* Mark any discussion point that you would like to raise during the meeting (with DURING MEETING).
* Otherwise, assume that others will read the rest before the meeting.
* Also, please be considerate and read others' discussion points before the meeting starts.


  • Wrote description of provenance in healthdata.tw.rpi.edu at
  • Met with Nick re: visko PML code several times over the past week. Current plan:
    • Nick: added queries/*.vkq, and new Main class to read *.vkq and output .pml (instructions to run here)
    • Tim: fork and rework the code to assert PROV
    • Nick: Look at Tim's PROV. Point of interest: how to capture the query? (Activity is blasé.)
      • for the parts that are bad -> We discuss and fix it. We bring it to IW for discussion.
      • for the parts that are good -> We report it to IW as a win and get final approval.
    • Once we're ALL good + IW says all good => pull Tim's code in and let users toggle PML 2 and PROV/PML 3 on and off (no reason to REMOVE what was done).
    • PROV Implementation Report submission.
  • Branched a fork of Nick's visko github repo and took first cut at PROV'ing his PipelineExecutions.
  • Tried to dig into Unified Digital Formats Registry (notes), but they did a pretty bad job of overview and accessibility.
  • Started a PROV-XML -> PROV-O converter that will be in github; it's discussed here.
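
A minimal sketch of the kind of mapping a PROV-XML -> PROV-O converter performs, not the converter mentioned above. It assumes PROV-XML documents that use `prov:id` on node elements and `prov:ref` on the children of relation elements, covers only a handful of relations, and leaves identifiers as the prefixed names found in the XML rather than expanding them to full URIs. The names `provxml_to_provo`, `CLASSES`, `RELATIONS`, and the `ex:` sample are hypothetical.

```python
import xml.etree.ElementTree as ET

PROV_NS = "http://www.w3.org/ns/prov#"

# PROV-XML node element -> PROV-O class.
CLASSES = {
    "entity": "prov:Entity",
    "activity": "prov:Activity",
    "agent": "prov:Agent",
}

# PROV-XML relation element -> (child holding the subject, child holding the object).
RELATIONS = {
    "used": ("activity", "entity"),
    "wasGeneratedBy": ("entity", "activity"),
    "wasDerivedFrom": ("generatedEntity", "usedEntity"),
    "wasAssociatedWith": ("activity", "agent"),
    "wasAttributedTo": ("entity", "agent"),
}


def q(local):
    """Return the ElementTree-qualified name for a prov: element or attribute."""
    return "{%s}%s" % (PROV_NS, local)


def provxml_to_provo(xml_text):
    """Convert a PROV-XML string into a list of (subject, predicate, object) triples."""
    triples = []
    for element in ET.fromstring(xml_text):
        if not element.tag.startswith("{%s}" % PROV_NS):
            continue  # skip elements outside the PROV namespace
        local = element.tag.split("}", 1)[1]
        if local in CLASSES and element.get(q("id")):
            # Node element: emit a type assertion.
            triples.append((element.get(q("id")), "rdf:type", CLASSES[local]))
        elif local in RELATIONS:
            # Relation element: read subject and object from its two children.
            subject_tag, object_tag = RELATIONS[local]
            subject = element.find(q(subject_tag))
            obj = element.find(q(object_tag))
            if subject is not None and obj is not None:
                triples.append(
                    (subject.get(q("ref")), "prov:" + local, obj.get(q("ref")))
                )
    return triples


# Hypothetical sample document (identifiers are illustrative only).
SAMPLE = """
<prov:document xmlns:prov="http://www.w3.org/ns/prov#" xmlns:ex="http://example.org/">
  <prov:entity prov:id="ex:png"/>
  <prov:activity prov:id="ex:ps2png"/>
  <prov:wasGeneratedBy>
    <prov:entity prov:ref="ex:png"/>
    <prov:activity prov:ref="ex:ps2png"/>
  </prov:wasGeneratedBy>
</prov:document>
"""

if __name__ == "__main__":
    for triple in provxml_to_provo(SAMPLE):
        print(" ".join(triple) + " .")
```

Keeping the relation table as data makes it easy to add further PROV-XML relations without touching the traversal logic.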

Below is a portion of the visko redesign. It only uses pml:Query and pml:hasAnswer.

<query/43c9b8da77c96c0c2b513a90641f805> a pml:Query , prov:Entity ;
   dcterms:identifier "43c9b8da77c96c0c2b513a90641f805" ;
   prov:value """
PREFIX views https://
""" .

<query/43c9b8da77c96c0c2b513a90641f805> pml:hasAnswer <http://dummy-domain.edu/dummy-data-027593291809351883.data> .
<query/43c9b8da77c96c0c2b513a90641f805> pml:hasAnswer <http://dummy-domain.edu/dummy-data-07440605876865989.data> .
<query/43c9b8da77c96c0c2b513a90641f805> pml:hasAnswer <http://dummy-domain.edu/dummy-data-01102675142505809.data> .

   a dcat:Dataset , prov:Entity ;
   prov:wasDerivedFrom <http://dummy-domain.edu/dummy-data-0370396405731899.data> ;
   prov:wasAttributedTo <https://raw.github.com/nicholasdelrio/visko-packages-rdf/master/package_ghostscript.owl#ps2png> ;  ### WRONG
   dcterms:format <https://raw.github.com/nicholasdelrio/visko/master/resources/formats/PNG.owl#PNG> .

<http://dummy-domain.edu/dummy-data-0370396405731899.data> a dcat:Dataset , prov:Entity ;
   prov:wasDerivedFrom <http://dummy-domain.edu/dummy-data-07515358272300815.data> ;
   prov:wasAttributedTo <https://raw.github.com/nicholasdelrio/visko-packages-rdf/master/package_gmt.owl#psxy> ;  # WRONG
   dcterms:format <https://raw.github.com/nicholasdelrio/visko/master/resources/formats/POSTSCRIPT.owl#POSTSCRIPT> .

<http://dummy-domain.edu/dummy-data-07515358272300815.data> a dcat:Dataset , prov:Entity ;
   prov:wasDerivedFrom <http://iw.cs.utep.edu/visko-web/test-data/gravity/gravityDataset.txt> ;
   prov:wasAttributedTo <https://raw.github.com/nicholasdelrio/visko-packages-rdf/master/package_custom.owl#GravityDataFieldFilter> ; #### WRONG
   dcterms:format <https://raw.github.com/nicholasdelrio/visko/master/resources/formats/SPACESEPARATEDVALUES.owl#SPACESEPARATEDVALUES> .

<http://iw.cs.utep.edu/visko-web/test-data/gravity/gravityDataset.txt> a dcat:Dataset , prov:Entity .

   a prov:Activity , hartigprov:DataCreation ;
   prov:wasAssociatedWith <https://raw.github.com/nicholasdelrio/visko-packages-rdf/master/package_custom.owl#GravityDataFieldFilter> ;  ### WRONG
      <query/43c9b8da77c96c0c2b513a90641f805/submission/067c4e96-2a77-4097-aa96-a4b9157d3746/pipeline/1/call/1/indexOfZ> , 
      <query/43c9b8da77c96c0c2b513a90641f805/submission/067c4e96-2a77-4097-aa96-a4b9157d3746/pipeline/1/call/1/indexOfY> , 
      <query/43c9b8da77c96c0c2b513a90641f805/submission/067c4e96-2a77-4097-aa96-a4b9157d3746/pipeline/1/call/1/url> , 
      <query/43c9b8da77c96c0c2b513a90641f805/submission/067c4e96-2a77-4097-aa96-a4b9157d3746/pipeline/1/call/1/indexOfX> .

<query/43c9b8da77c96c0c2b513a90641f805/submission/067c4e96-2a77-4097-aa96-a4b9157d3746/pipeline/1/call/1/indexOfZ> a prov:Entity ;
   prov:specializationOf <https://raw.github.com/nicholasdelrio/visko-packages-rdf/master/GravityDataFieldFilter.owl#indexOfZ> ;
   prov:value "2" .

<query/43c9b8da77c96c0c2b513a90641f805/submission/067c4e96-2a77-4097-aa96-a4b9157d3746/pipeline/1/call/1/indexOfY> a prov:Entity ;
   prov:specializationOf <https://raw.github.com/nicholasdelrio/visko-packages-rdf/master/GravityDataFieldFilter.owl#indexOfY> ;
   prov:value "1" .

### TODO: check that the output is connected to the activity.
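
One way the TODO above could be checked, sketched under assumptions: the graph is represented as plain (subject, predicate, object) tuples, and an output counts as "connected" when it has a `prov:wasGeneratedBy` link. The function name `unconnected_outputs` and the activity identifier `ex:call1` are hypothetical; the dataset URIs are taken from the snippet above.

```python
DERIVED_FROM = "prov:wasDerivedFrom"
GENERATED_BY = "prov:wasGeneratedBy"


def unconnected_outputs(triples):
    """Return derived entities that have no prov:wasGeneratedBy link to an activity."""
    derived = {s for s, p, o in triples if p == DERIVED_FROM}
    generated = {s for s, p, o in triples if p == GENERATED_BY}
    return sorted(derived - generated)


# Dataset URIs from the snippet above; "ex:call1" is a hypothetical activity identifier.
PS = "http://dummy-domain.edu/dummy-data-0370396405731899.data"
SSV = "http://dummy-domain.edu/dummy-data-07515358272300815.data"
RAW = "http://iw.cs.utep.edu/visko-web/test-data/gravity/gravityDataset.txt"

SAMPLE_TRIPLES = [
    (PS, DERIVED_FROM, SSV),
    (SSV, DERIVED_FROM, RAW),
    (SSV, GENERATED_BY, "ex:call1"),  # only this output is linked to an activity
]

if __name__ == "__main__":
    for entity in unconnected_outputs(SAMPLE_TRIPLES):
        print("not connected to an activity:", entity)
```

In this sample, only the PostScript dataset is reported, because it is derived from another dataset but has no generating activity asserted.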


  • finished WWW submission
  • working on healthdata again
  • working on grant for using healthdata infrastructure in melagrid
  • Maybe submit DILS short paper on melagrid and healthdata vision?
    • TODO from Deborah
    • send links for starting points of provenance in health data, health data infrastructure, Kristine's writeup, and the updated version of your CKAN stuff with Michael that leverages the health data infrastructure



  • New fuse feature: dynamically generated indicators of challenge questions per metric.



  • Fuse Phase 2
    • will require explanations by rows of data and by cell. Very d3-ish. May provide some interesting use cases.
    • will have an updated rubric for evaluation of explanations (this is not available yet)
  • Community science is in - yeah!
  • AGU last week
  • NIH follow-up with Jim M - will motivate a write-up of the health data challenge infrastructure

Outstanding Items


