IW Meeting 2012-05-24

From Inference Web



Meeting Information


  • Administrative
  • Around the room - please post online in advance
  • Outstanding Items
  • Focus: please propose here before meeting


Meeting Preparation

Around the room

(fill this out before meeting)

In addition, things that you would like to get out of the inference web meeting and any proposals for a structure that supports that.


  • a.  I would like to keep up quickly with what people are doing and where any issues arise. I think if we get better at the around-the-room updates on the wiki, much of this can be done there. In general, though, I may want the updates to be in the night before so I can review and comment then.
  • b.  I would like the meeting to be more about the next-generation provenance and explanation environment that we all want to use. Some of that will be driven by our current sponsored projects, like the request for rationale encoding from USGS or the extra privacy and access concerns of FUSE that are impacting our tools. Some will be driven by our own and others' research. This leads us to issues like the connection to PROV and other focus areas and extensions.
  • c.  I would like to have more time for technical discussion. I think we can offload much of the around-the-room time and have more technical presentations, and probably have a couple of questions that everyone should answer in their presentation: why should this group care (in one or at most two sentences), how does this potentially impact our work on the future of explanation and provenance environments, and possibly any requests from the group.

Jim McCusker

  • I think we need to spend some time on the following issues, which are not well factored and are interrelated:
    • How do we plan to migrate our existing projects and tooling to PROV?
    • How do we plan to create extensions to PROV (TWPROV)?
    • How do we plan to migrate PML to align with PROV?
  • I think that we need to hew very very closely to the rules for daily scrums: http://en.wikipedia.org/wiki/Scrum_(development)#Daily_Scrum
  • I've found the paper discussions very useful. We have actually gotten quite a few papers out of it.
  • The FUSE discussions have been less useful to me and I think they need to either focus more closely on provenance-specific discussions or be put into a separate meeting.
  • The meeting is in dire need of being timeboxed, and agenda items that go over time should either be placed back into an "old business" agendum or only be extended with the consensus of the people in the meeting.
  • What I did:
    • Wrote an outline for the HICSS paper.
    • Installed CKAN into a VM on my laptop
    • Started to learn how to make CKAN extensions
    • Did a little more work for IPAW paper: 
  • What I'm going to do:
    • write a CKAN extension that will use awwation and my datcube browser to create provenance-enabled data visualizations that take advantage of without having to code.
  • What has blocked me:
    • A paper submission at Yale has been taking up an immense amount of my time. The paper is now submitted, so I will have more time for this project.
    • Finish submission to IPAW


What I've been doing:

  • Yet More PROV-O
    • Internal draft due 1 June.
  • Updated IPAW paper, submitted for final version. I can accept any suggestions for change until 4 Jun.
  • Worked on DataFAQs epoch execution engine.
    • Added "Augmenter" SADI services that provide computations about datasets about to be evaluated
    • Adding augmenters is a work around for CKAN losing resolvable URIs.
    • Muddles the distinction for an evaluation service (which was a filter/computer)
    • Ready to turn to exposing evaluations at http://aquarius.tw.rpi.edu/projects/datafaqs/home and tweaking modeling to suit.
    • ISWC looks beyond reach.

What I'll do next week:

My impediments:

  • TODO(Deborah) Need to hear from Deborah about IPAW.

What I want (and how I can get it)

  • More collaborations with Paulo and PNNL.
  • A "non-dependent", native IW effort; claiming that different projects "overlap" (i.e., shoehorn) as IW doesn't seem adequate. IW should have its own agenda.
  • Less project management. Let the technology, discussions, and ideas drive the efforts we do between meetings.
  • If it's not working, let it die?
  • Distinguish PML from IW. It should be the objectives, not the approaches.

James Michaelis

  • In upcoming meetings, I would like to discuss development work being conducted on FUSE's explanation component, as well as the user profile collections across the GILA, SPCDIS and FUSE projects.
  • In line with what Tim and Jim have suggested, it may be helpful to re-frame the IW telecon series as an Explanation/Provenance telecon, which would have a broader focus than Inference Web and PML.  Particularly, since we are now moving to new provenance models (PROV) and supporting technologies.
  • I think setting aside between 20-30 minutes each telecon to focus on a particular research topic/project - presented by a given lab member - would be a helpful way for everyone to get group feedback on their work.  We have been doing this previously with the PROV and FUSE work - as time has permitted.  However, I think we should make this more of an official part of the meeting (similar to status updates).


  • Starting outline for ISWC paper on profile modeling.
  • Cleaned up Profile Narratives for FUSE/GILA projects
    • Will be sending SPCDIS profiles to Stephan Zednik shortly for review.
  • Conducted review of technical requirements for FUSE explanation interface
    • One important focus: Generation of evidence summary views (e.g., presenting a few pieces of highly relevant evidence at a time)
    • This will tie-in with the profile modeling work for FUSE.
    • Will be contacting Dominic/Amar shortly, to discuss LOD-based explanation augmentation we can do with PubMed data from BAE.


  • Fuse demo modifications
  • Fuse issues
  • Backend process for Fuse

Outstanding Items

Today's discussions

Around the rooms vs. Scrum organization

Jim: Scrum.   3 points - what done, what will do before next meeting, impediments

Deborah: But would like to capture the "return to"s in the wiki in advance (or just type it into titanpad) for follow up.

The new IW meeting

Working towards IW.

Individuals give a technical presentation, starting the talk with "why am I talking about this?"

~20 minutes, starting with why they are presenting and what they want to get out of it.

  • IW vis a vis PROV
  • Collaborations: PNNL and RPI.

TODO: start wiki page with potential topics. 

  • e.g. PROV - updating tools
  • e.g. PNNL tools - should/can we use them?
  • e.g., what do we do with the FRBR work
  • e.g., shall we push the sameas effort, the DataFAQs work, the data notebook, etc...

ARM - radiation measurements.

Paulo - we need expertise that others do not have.

  • we used to be provenance, but plenty of others now have that.

Where is the essence?

  • declaration
  • extensible toolkit.
  • the newbies: DataFAQs, Data Notebook?

Are there other things that inference web could be....   

  • one is explanation?
  • fundamental problem: how to make provenance ubiquitous
  • Deborah: making provenance useful and used. Where are the provenance-aware services?
  • Making provenance a demand.
  • What can we not do without provenance?

Paulo: we do not use provenance to help ourselves...

"ping back provenance"  

Tim: IEEE journal tradeoff: splitting data from metadata: hashing the data to know it changed vs. letting the metadata and provenance ride along for free.

  • deborah: auto-config based on use case depending on the tradeoff that you want to make.
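One side of the tradeoff Tim describes, keeping metadata separate and hashing the data to notice changes, could be sketched roughly as follows. This is only an illustration: the record layout, the `hypothetical-sensor` source name, and the choice of SHA-256 are assumptions, not anything agreed in the meeting.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Return a SHA-256 hex digest of the raw data bytes."""
    return hashlib.sha256(data).hexdigest()


def has_changed(data: bytes, recorded_digest: str) -> bool:
    """True if the data no longer matches the digest stored in its metadata."""
    return content_digest(data) != recorded_digest


# Metadata is kept apart from the data; the digest is the only link between them.
# The record fields here are hypothetical.
original = b"temp,12.5\n"
record = {"source": "hypothetical-sensor", "sha256": content_digest(original)}

modified = b"temp,13.0\n"
unchanged = not has_changed(original, record["sha256"])
detected = has_changed(modified, record["sha256"])
```

The alternative in the tradeoff, letting metadata and provenance travel inside the data, avoids the digest check but means any annotation alters the artifact itself.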

Tim: ping-back provenance : walking down the chain (into the future, not the traditional backwards).

  • next gen h-factor
  • NSF tracking down their impact
  • Consumers to explore down the derivation trace.
  • James: potential insights b/c one dataset is meshed with a new one.
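The forward walk Tim describes can be sketched as a traversal over inverted derivation links. This is a minimal sketch under stated assumptions: the `(derived, source)` pairs and all entity names are made up for illustration, and a real system would read such links from provenance records rather than a hard-coded list.

```python
from collections import defaultdict, deque


def build_forward_index(derivations):
    """Invert (derived, source) pairs so each source lists what was derived from it."""
    forward = defaultdict(list)
    for derived, source in derivations:
        forward[source].append(derived)
    return forward


def downstream(entity, forward):
    """Breadth-first walk 'into the future': everything derived from entity."""
    found = []
    visited = {entity}
    queue = deque([entity])
    while queue:
        current = queue.popleft()
        for child in forward.get(current, []):
            if child not in visited:
                visited.add(child)
                found.append(child)
                queue.append(child)
    return found


# Hypothetical derivation links: (derived thing, what it was derived from).
DERIVATIONS = [
    ("figure1", "dataset_a"),
    ("paper1", "figure1"),
    ("dataset_b", "dataset_a"),
    ("paper2", "dataset_b"),
]

index = build_forward_index(DERIVATIONS)
uses_of_dataset_a = downstream("dataset_a", index)
```

Counting or ranking the entries in `uses_of_dataset_a` is one way the "next gen h-factor" and impact-tracking ideas above might be approached.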

Paulo: make it easier to capture day-to-day provenance

Finding more uses of provenance.

The take aways:

  • "Freebies" exposed
  • ping-back
  • explanation

DataFAQs....  data is dead unless it is being looked at....   can breathe life into data in different ways...

"sneaking provenance into points of consumption": often a point of consumption is a figure in a paper

  • in biomedical, it's a figure in paper.
  • being able to reproduce the figure. I think the reproduction aspect is one thing that distinguishes this line from what, for example, you see from image retrieval on Google.
  • reproduction vs. modifying to a new direction (not from scratch).
  • bitly link, QR code, (watermark)

notion of a validated dataset that I do not need to rejustify, since people with credentials have already done xyz to it...

Yale's "Image ontology" - classifying figures from pubmed articles

Tim: creating a new paper by forming a query over other papers.

Deborah: a paper that shows same data in different ways (depending on audience?)

Jim: science can't communicate to the general public. Should make a blog that simplifies the rigor of the paper.

  • reinterpret the data, tied to original form.

Deborah: different figures. Plotting dollars. As a manager vs. technical audience.

Jim: Propublica might be interested in working to adopt some of these ideas (they are about transparency).

Deborah: h-index and Elsevier rating.

A few other companies of interest: Linguastat (http://www.linguastat.com/), Automated Insights (http://automatedinsights.com/products_and_solutions), and Narrative Science (http://www.theatlantic.com/entertainment/archive/2012/04/can-the-computers-at-narrative-science-replace-paid-writers/255631/).

Deborah: selecting sources based on those that an audience may inherently accept.

Tim: (this touches back on explanation)

When you pull a fact in, what aspects of it make it more likely to be accepted? One is source, one is recency. Are there seminal papers in semantic publishing?

TODO: Tim has the lead on starting with PROV and doing a mapping to PML.

Around the room

Outstanding Items 

(handled above)


Today's focus

Facts about IW Meeting 2012-05-24
Date: 24 May 2012