IW Meeting 2011-03-04

From Inference Web



Meeting info



Leo's addition of WDo description

Leo sent this to Tim via email:

WDOs (Workflow-Driven Ontologies) are OWL ontologies that capture concepts related to process. The main classification of process-related concepts is into "Data" and "Method". The latest version of the WDO ontology aligns with PML in the following way: the previously existing concept wdo:Data has been replaced by pmlp:Information, and wdo:Method has been replaced by pmlp:MethodRule. Notice that the intended meaning of these concepts varies a little depending on whether you are talking about "process" or "provenance". For example, in "process", pmlp:Information refers to types of information that can be used as input to, or be the output of, some activity. In "provenance", pmlp:Information refers to types of information that resulted from actually carrying out some activity, or that justify the existence of some other resource. In other words, process concerns are with respect to planning, and provenance concerns are with respect to recording. However, there is a natural precedence between process and provenance. For systematic tasks, process may lead to scoping provenance. For ad-hoc tasks, recording provenance may lead to generalizing process.

SAWs (Semantic Abstract Workflows) are instantiations of WDO concepts (i.e., instantiations of pmlp:Information and pmlp:MethodRule). These instances have values for properties such as wdo:isInputTo and wdo:hasOutput, among others. So, for example, I can have a WDO:

      wdo1:dataX   rdfs:subClassOf pmlp:Information .
      wdo1:dataY   rdfs:subClassOf pmlp:Information .
      wdo1:methodZ rdfs:subClassOf pmlp:MethodRule .

Then I can have a SAW that "instantiates" these concepts to capture process:

      saw1:instanceX rdf:type wdo1:dataX .
      saw1:instanceY rdf:type wdo1:dataY .
      saw1:instanceZ rdf:type wdo1:methodZ .
      saw1:instanceX wdo:isInputTo saw1:instanceZ .
      saw1:instanceZ wdo:hasOutput saw1:instanceY .

In addition, since I do not want data in a process that comes from or goes to nowhere, the pmlp:Source concept is used to "ground" process data, like this:

      saw1:instanceS1 rdf:type pmlp:Source .
      saw1:instanceS2 rdf:type pmlp:Source .
      saw1:instanceS1 wdo:hasOutput saw1:instanceX .
      saw1:instanceY  wdo:isInputTo saw1:instanceS2 .

In previous versions of our tool we enforced the constraint that pmlp:Source instances could be used either as a "process source" or as a "process sink", but not both, meaning that any instance of pmlp:Source was only allowed at the beginning and ending points of a process, never in between. However, the current version removes this constraint. The use case for doing this is a process where, at some intermediate step, there is an "archival" step. The new interpretation of pmlp:Source for process concerns is that it represents a container of information.
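For example, an intermediate archival step could be sketched as follows, reusing the saw1 instances above; saw1:archiveA and saw1:instanceX2 are hypothetical names added for illustration:

      saw1:archiveA   rdf:type      pmlp:Source .     # a container of information
      saw1:instanceX  wdo:isInputTo saw1:archiveA .   # archive an intermediate result
      saw1:archiveA   wdo:hasOutput saw1:instanceX2 . # retrieval feeds the next step
      saw1:instanceX2 wdo:isInputTo saw1:instanceZ .

Here the pmlp:Source sits in the middle of the process, acting as both sink and source, which the old constraint would have disallowed.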

We are currently re-evaluating the use of instances at the SAW level because of limitations to modeling process. That is, once we instantiate a concept we are committing to a physical model, which is good for provenance but not necessarily good for process. Nonetheless, I do not expect the way we interpret and use PML concepts in WDOs and SAWs to change, and so this description should be adequate for the purposes of planning your PML-layering efforts.


recording vs. planning.

"type" of information - class of information.

Peter's conceptual, logical, and physical modeling, applied to workflow:

  • conceptual modeling: capturing the whiteboard-level description of the workflow
  • logical modeling: creating the actual process by connecting the things that work together (but still planning)
  • physical modeling: actually carrying it out (provenance/PML)

WDO is used to encode SAWs (currently transitioning). WDOs are OWL ontologies and have class definitions. SAWs instantiate the ontologies and connect the instances with properties (hasOutput, isOutputOf).

Leo: no middle layer. CONCEPTUAL: OWL classes. LOGICAL: (none). PHYSICAL: instances.

Patrick: the "whiteboard" is conceptual: not specific to a language, an interlingua. The logical level is implemented in some way (RDFS/object model, OWL, UML, etc.); the logical level is the ontology/UML diagram. Is the physical level instances of the logical level?
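Using the names from Leo's example above, the logical/physical split can be sketched as:

      # Logical level: the ontology (class definitions)
      wdo1:dataX   rdfs:subClassOf pmlp:Information .
      wdo1:methodZ rdfs:subClassOf pmlp:MethodRule .

      # Physical level: instances of those classes
      saw1:instanceX rdf:type wdo1:dataX .
      saw1:instanceZ rdf:type wdo1:methodZ .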

done: get a pointer to Jim's conceptual modeling ontology to Leo (previous meeting notes)

Leo returning: modeling process with OWL instances restricts further usage when capturing provenance, because you have already instantiated it. How do you create provenance of a SAW? Instances of instances. So we are moving away from creating SAWs with instances toward creating SAWs with classes.
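One possible class-based sketch uses an OWL restriction; saw1:StepZ is a hypothetical name, and this modeling choice was not settled in the meeting:

      # Model the step as a subclass of the method, constrained to produce dataY,
      # instead of asserting saw1:instanceZ rdf:type wdo1:methodZ.
      saw1:StepZ rdfs:subClassOf wdo1:methodZ ;
                 rdfs:subClassOf [ rdf:type owl:Restriction ;
                                   owl:onProperty wdo:hasOutput ;
                                   owl:someValuesFrom wdo1:dataY ] .
      # Actual runs can later instantiate saw1:StepZ for provenance,
      # avoiding "instances of instances".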


transition to the QUAD

implicit action item: present PML in graphical form.


for each tool: what PML constructs the tool recognizes

want to do: link descriptions to the action the tool performs and what use case requires the action.

      [] rdf:type          rdf:Statement ;
         rdf:subject       :Leo ;
         rdf:predicate     foaf:knows ;
         rdf:object        :Nick ;
         dcterms:isPartOf  :TimsLayer .

######   the statement above reifies:   :Leo foaf:knows :Nick .

      CONSTRUCT { ?s ?p ?o }
      WHERE {
            ?st rdf:type          rdf:Statement ;
                rdf:subject       ?s ;
                rdf:predicate     ?p ;
                rdf:object        ?o ;
                dcterms:isPartOf  :TimsLayer .
      }
      # yields:  :Leo foaf:knows :Nick .

      [] rdf:type               owl:Axiom ;
         owl:annotatedSource    tim:HTTPHeader ;
         owl:annotatedProperty  rdfs:subClassOf ;
         owl:annotatedTarget    pmlp:Information .

      [] rdf:type pmlp:InferenceRule ;
         pmlp:hasRawString """
             @prefix tim:  <> .
             @prefix rdfs: <> .
             @prefix pmlp: <> .
             tim:HTTPHeader rdfs:subClassOf pmlp:Information .
             """ ;
         pmlp:hasLanguage  :Turtle ;
         dcterms:isPartOf  :TimsLayer ;
         XXXX:toolAction   [] ;
         XXXX:useCaseNeed  [] .


What does UTEP need with OPeNDAP?

UTEP is looking to use OPeNDAP to serve data to their scientists. Separate actually hosting the data from hosting the data's metadata (including PML).

How do we capture provenance with OPeNDAP? OPeNDAP uses auxiliary data to describe the data. Could one of those point to PML for that data? (Patrick agrees that OPeNDAP supports this.)

What is http://inference-web.org/wiki/OPeNDAP_Example ?

Paulo: wants to use NCML

Paulo: can the operations happen at the server end? He knows how to do it at the desktop client side.

Patrick: the backend server can serve an NCML file. Paulo: can I specify aggregation on the server side?

example, please? Patrick: yes. http://test.opendap.org/dap/data/ncml/agg/joinNew_grid.ncml.rdf and here's the ncml file: http://test.opendap.org/dap/data/ncml/agg/joinNew_grid.ncml

The issue seems to be, how do we capture the provenance of the OPeNDAP BES doing this aggregation?

Can't it be modeled on the client side? "Getting into" what the server is doing may not matter.

e.g., 100 sensors, and you find out that 2 are off. If you don't know the provenance, then you have to disregard all of it.

ESG community: climate models, petabytes big.

data from sensors on an engine carrying 350 passengers: justifies every resource needed to handle it.

Patrick: provenance at the variable level vs. provenance at the variable-value level. (The first makes sense; the value level is challenging.)

Patrick's steps:

  • first approach: provenance of the server doing aggregation/constraint/subsetting. Dataset level provenance (easy to do) (what is opendap doing?)
  • next, tougher: get variable-level provenance - where does each variable come from?
  • third step: variable VALUE level provenance - (scary)
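Patrick's first step (dataset-level provenance of the aggregation) might look like the following PML sketch. The node, step, and rule names are made up, and the pmlj property names are assumptions based on PML 2 justifications, not something agreed in the meeting:

      :aggNS   rdf:type pmlj:NodeSet ;
               pmlj:hasConclusion :joinNew_grid_Dataset ;   # the aggregated dataset
               pmlj:isConsequentOf :aggStep .
      :aggStep rdf:type pmlj:InferenceStep ;
               pmlj:hasInferenceRule :joinNewAggregation ;  # what the BES did
               pmlj:hasAntecedentList ( :srcNS1 :srcNS2 ) . # the source granules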

Paulo example: two stacks of cards, mixed together. From which original stack does each card come? Want to be able to put them back as they were before the mix happened; want to undo the aggregation. Patrick: but the originals already exist. Paulo: the point is to know that THOSE datasets were the source. One stack good, one stack bad; you are happy until you realize one stack is bad. How do you get rid of the bad cards?

Paulo closing request: will review whatever use case we just discussed. Patrick is leading the writeup on the example he posted (.ncml and .ncml.rdf).

What does UTEP want: how to map operations to a workflow; how to generate NCML specs for OPeNDAP?
