IW Meeting 2010-07-15

Meeting Info

2010-07-15 IW Meeting Attendees:

 Li Ding
 Paulo Pinheiro da Silva
 Tim Lebo
 Cynthia Chang
 Aida Gandara

IW Overall Objectives

     1) PML ontology issues (OWL level only) potential non-monotonic  changes (Li) 
     2) PML practice  
         2.1) Use cases and examples (Tim) 
         2.2) RDF level issues (Cynthia) 
         2.3) PML Tools / API issues (Cynthia) 
         2.4) PML primer (Tim) 
     3) PML and OPM mapping (Paulo) 
     4) IW group management (TBD)
        4.1) inference web infrastructure and research plan
        4.2) technologies for engagement 

Today's agenda

items below were not discussed during this meeting
  • more notes from email: http://groups.google.com/group/inference-web
  • IW group management
    • assign an associate's name to each IW Overall Objective
    • use of technology
      • issue tracking tool for code/tasks (google code?)
      • mailing list discussion tool (google group)
      • chat (skype)
      • collaborative online editing (titanpad)
    • documentation (on wiki?) on how a new person joins the group
      • enrolling in google groups, skype chat, titanpad, etc.
      • having access to easy to understand publications about pml, inference web, use cases
  • PML example from Tim Lebo
    • URL download (figure on email list)
  • PML issue from Jim M
    • bnode NodeSet vs
    • giving tool URL of a document with a NodeSet
  • Review PML ontology issues PML Ontology (Project)


reviewed last week's notes


ISSUE: upload papers to web

ACTION (below)

ACTION: everyone to add their publications to the wiki (http://inference-web.org/wiki/Publications)

(citations arms race: workflow papers that cite theirs)

(collection of tags for provenance on Mendeley) ACTION: Paulo to send IW a link to the tags that are defined in Mendeley.

ACTION: IW group to tag PML papers on Mendeley.

Paulo has a big bibtex file about provenance. Does Mendeley accept bibtex?

ACTION: Patricia to find out if Mendeley [1] will accept Paulo's bibtex file.

(where is the wiki page listing our articles)

The wiki page we use to list our publications: http://inference-web.org/wiki/Publications (we should not change the Selected References section of the page for now)

Another URL would be (why useful?) : http://tw.rpi.edu/portal/Inference_web

ISSUE: unzip method rule

Tim asked about it. Cynthia can add one, but should we?

Longer term issue: If we already have one, how can we learn about it? How do we address this general problem?

"PML-P search"? - what is this? -- http://tw1.tw.rpi.edu/pmlpsearch/

WDOit mints URIs for method rules.

Paulo gave a URL to a sparql endpoint. (http://trust.utep.edu/sparql-pml/query/index) Anyone can query the method rules they are registering.
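A minimal sketch of how someone might query that endpoint for registered method rules. The endpoint URL is the one given above; the PML-P namespace, the pmlp:MethodRule class, the pmlp:hasName property, and the "query" GET parameter are all assumptions that should be checked against the ontology and the server before use. The sketch only builds the request URL and does not contact the server.

```python
from urllib.parse import urlencode

# Endpoint URL as given in the notes; whether it accepts a "query" GET
# parameter is an assumption.
ENDPOINT = "http://trust.utep.edu/sparql-pml/query/index"

# Assumed PML-P namespace; verify against the ontology actually in use.
PMLP_NS = "http://inference-web.org/2.0/pml-provenance.owl#"


def method_rule_query(limit=50):
    """Return a SPARQL query listing resources typed as pmlp:MethodRule."""
    return (
        f"PREFIX pmlp: <{PMLP_NS}>\n"
        "SELECT ?rule ?name WHERE {\n"
        "  ?rule a pmlp:MethodRule .\n"
        "  OPTIONAL { ?rule pmlp:hasName ?name }\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def query_url(endpoint, sparql):
    """Encode the query as a GET request URL (SPARQL protocol style)."""
    return endpoint + "?" + urlencode({"query": sparql})


print(query_url(ENDPOINT, method_rule_query(limit=10)))
```

Searching for an existing rule (for example one whose name mentions "unzip") before minting a new one would be a matter of adding a FILTER on ?name to the same query.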

Q: Is this sparql endpoint part of the CI Server? A: currently, yes. In future, it should not be.

When there are multiple endpoints, do we need a master sparql endpoint?

IW Search is another method to find Method Rules.

Ways to find URIs for Method Rules:

"IW Registry" vs PML instance data.

not all registries should incorporate all content.

"hiding" PML-P data in a triple store vs. providing as linked data. (google analogy: the triple store is searching google. walking linked data is walking all web documents...)

Q: does this triple store have the 1000 formats that Nick mentioned during IPAW? A: Yes.

Paulo: PML instance data is embedded in Drupal?

MIME tables

Sources of PML instance data:

  • drupal
  • mime tables
  • bibtex files
  • ontologies, e.g., WDOs, VSTO, SAWs
  • scientific processes, e.g., sensor tables, data repositories

Challenge is not to generate, but to use.

ACTION: build a collection of pointers to where PML instance data is available. Starting point: list of URLs for PML data http://inference-web.org/wiki/PML_Datasets

Q: Can WDOit register a new Method Rule into the sparql endpoint? A crawler crawls certain domains and puts what it finds into the triple store. What should it put into the triple store? CI Server DONE: added pointer to CI Server. http://rio.cs.utep.edu/ciserver/

We try to abstract unzipping. We don't record provenance for everything, only what is relevant to the scientists. Unzipping is not relevant: it is essential to get the final answer but has no scientific value.

DONE: add link to PC-2011 (provenance challenge) http://twiki.ipaw.info/bin/view/Challenge/FourthProvenanceChallengeCFSP (Tim referenced during "zipping and unzipping is unimportant" discussions. One of the proposals is centered around zipping and unzipping).

Tim: need to express prov of unzip just to link two connected components of the provenance. User needs to query across

zipping: scalability or unit of files?

Design considerations: not encoding provenance of things that the scientific user cares about (but

SAW is supposed to be abstract. It is used to generate provenance and is at more abstract level. There is a mismatch between low-level provenance capture and higher-level provenance capture.

DESIGN SUGGESTION: look for more abstract method that aggregates zip and processing that occurs.

ACTION: Tim to consider how meeting notes can be more consumable by those who did not attend.

PNNL: ARM process. Prov of every little step. They need it b/c things can go wrong during processing at that level. Don't want to keep all fine-grained provenance. Want more coarse-grained provenance that reflects what is scientifically relevant. e.g. Peter's CHIP process, monitoring the sun. They capture registration (p-angle): how much to rotate the sun image so that up is north. Every solar physicist knows about it but does not care about that level b/c nothing goes wrong. HOWEVER, calibration of the image is modeled heavily b/c things can go wrong.

DESIGN SUGGESTION: only model at level at which things can go wrong. (unzipping does not go wrong; it is highly trusted)
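One way to reconcile this suggestion with Tim's point (the unzip step is needed to keep the provenance connected) is to drop trusted steps while rewiring their neighbours. The following is only a sketch under simplifying assumptions: the Step record, the TRUSTED_RULES set, and the file names are hypothetical, the chain is linear, and nothing here uses PML classes or IW tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One processing step: consumes `source`, produces `output` via `rule`."""
    source: str
    output: str
    rule: str


# Hypothetical set of rules considered too trusted to be worth recording.
TRUSTED_RULES = {"unzip", "zip"}


def collapse_trusted(steps, trusted=TRUSTED_RULES):
    """Drop trusted steps from a linear chain while keeping it connected.

    Each surviving step is rewired so that its source is the input of the
    run of trusted steps immediately before it. Trusted steps at the very
    end of the chain are simply dropped.
    """
    collapsed = []
    carried_source = None  # input of the current run of trusted steps
    for step in steps:
        if step.rule in trusted:
            if carried_source is None:
                carried_source = step.source
            continue
        source = carried_source if carried_source is not None else step.source
        collapsed.append(Step(source, step.output, step.rule))
        carried_source = None
    return collapsed


chain = [
    Step("archive.zip", "data.csv", "unzip"),
    Step("data.csv", "calibrated.csv", "calibrate"),
]
print(collapse_trusted(chain))
```

In the collapsed chain the calibration step reads directly from the archive, so a query can still walk from the final product back to the original download without an explicit unzip record.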

terminology: instance (vs "constant")? instances <=> proofs constants <=> e.g. #Told RETURN.

ACTION: Tim to find out whether IW tools exist to create an Unzip Method Rule. http://registrar.inference-web.org/iwregistrar/ Tim needs an account. Cynthia to give Tim an account.

1. We discussed two tools
 1.1 a web-based interface for creating a PMLP instance, currently we have IWRegistry
 1.2. a search service for reusing published PMLP instances, currently we have PMLP search
2. We also discussed best practices of creating a method rule, which level of granularity should be captured

(Undiscussed) ISSUE: Jim's example

Jim's example

http://granite.med.yale.edu:2020/sparql?query=describe <http://data-gov.tw.rpi.edu/source/data-gov/dataset/1554/version/2010-Feb-13/thing_10> <http://data-gov.tw.rpi.edu/source/data-gov/dataset/1554/vocab/enhancement/1/fy1999>

[] a j:NodeSet ;

             [ a        pmlp:Information , rdf:Statement , rdfs:Resource ;
               rdf:object 2615000  ;
               rdf:predicate <http://data-gov.tw.rpi.edu/source/data-gov/dataset/1554/vocab/enhancement/1/fy1999> ;
               rdf:subject  ds1554:thing_10
             ] ;
             [ a        rdfs:Resource , j:InferenceStep ;
                        [ a       j:NodeSet ;
                                  [ a       pmlp:Information , rdf:Statement , rdfs:Resource ;
                                    rdf:predicate <http://data-gov.tw.rpi.edu/source/data-gov/dataset/1554/vocab/raw/fy1999> ;
                                    rdf:subject ds1554:thing_10 ;
                                            [ a       pmlp:SourceUsage ;
                                                      [ a       pmlp:DocumentFragmentByRowCol ;
                                                        pmlp:hasDocument <http://www.data.gov/download/1554/csv> ;
                                                        pmlp:hasFromCol 57 ;
                                                        pmlp:hasFromRow 11 ;
                                                        pmlp:hasToCol 57 ;
                                                        pmlp:hasToRow 11
                                                      ] ;
                       ] ;
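To illustrate what the pmlp:DocumentFragmentByRowCol in Jim's example points at, here is a small sketch that pulls the referenced cells out of CSV text. The function name is hypothetical, and the indexing convention (1-based, inclusive on both ends) is an assumption; the notes do not say which convention PML uses, and whether a header row counts as row 1 is likewise unknown.

```python
import csv
import io


def cells_in_fragment(csv_text, from_row, from_col, to_row, to_col):
    """Return the cells covered by a row/column fragment, row by row.

    Assumes 1-based, inclusive indices (an assumption, not confirmed
    by the notes).
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    return [row[from_col - 1:to_col] for row in rows[from_row - 1:to_row]]


sample = "a,b,c\nd,e,f\ng,h,i\n"
print(cells_in_fragment(sample, 2, 3, 2, 3))
```

Under that assumption, the fragment in the example (row 11, column 57 for both the "from" and "to" bounds) would select a single cell of the downloaded CSV.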

Tabled ISSUEs

Paulo: OPM does not have notion of Source, and (Source should not be called Source).

Action Items

* everyone to add publications at http://inference-web.org/wiki/Publications; do not update the Selected References section.
* Patricia to find out if Mendeley will accept Paulo's bibtex file.
* Patricia will put papers at http://inference-web.org/wiki/Publications onto Mendeley [2]
* everyone to tag PML papers on Mendeley. Patricia will guide.
