Challenge.ThirdProvenanceChallenge

Third Provenance Challenge

The top-level page for the Third Provenance Challenge.

Current Status

The Third Provenance Challenge Workshop was a success. It resulted in several proposals for changes in the OPM specification, additional profiles for OPM, a governance model, a CFP for a journal paper based on PC3 results, and thoughts on a future fourth provenance challenge.

Workshop Details

Workshop details can be found at LocalDetailsPC3.

Participating Teams

Pages for each participating team can be found at the ParticipatingTeams3 page. If you are participating, please add a link to your team's page there. You can use the Test Team page as a template for what a team page should include.

Schedule

1. Review of code and provenance query proposals (to Feb 27)

March 2 - PC3 Starts

2. Make the workflow work with each team's system [Mar 2 - Mar 30]

3. Generate provenance for the challenge workflow & run queries on it [Mar 30 - Apr 13]

4. Export OPM Graphs and import from others [Apr 13 - May 4]

5. Run queries on imported OPM graph [May 4 - Jun 1]

6. Prepare slides for challenge [Jun 1 - Jun 8]

PC3 Workshop June 10 - 11 held in Amsterdam

Challenge Goals

1. identify weaknesses and strengths of the OPM specification

2. encourage the development of concrete bindings for OPM in a variety of languages

3. determine how well OPM can represent provenance for a variety of technologies (scientific workflow, databases, etc.)

4. demonstrate that a complex data product's provenance can be constructed from provenance documentation produced by multiple combinations of heterogeneous applications

5. bring together the community to further discuss the interoperability of provenance systems.
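As a rough illustration of what a concrete OPM binding (goal 2) might look like, the sketch below models an OPM graph in Python. It uses the node kinds (artifact, process, agent) and causal edge types named in the OPM specification. The class name `OPMGraph`, the `ancestors` helper, and the node names `csv_batch`, `load_csv`, and `db_table` are hypothetical and chosen only for this example; they are not taken from any team's implementation or from the PC3 workflow code.

```python
from dataclasses import dataclass, field

# Causal edge types from the OPM specification. Each edge points from an
# effect to its cause.
EDGE_TYPES = {
    "used",
    "wasGeneratedBy",
    "wasControlledBy",
    "wasTriggeredBy",
    "wasDerivedFrom",
}


@dataclass
class OPMGraph:
    """Hypothetical in-memory OPM graph; not an official binding."""

    artifacts: set = field(default_factory=set)
    processes: set = field(default_factory=set)
    agents: set = field(default_factory=set)
    edges: list = field(default_factory=list)  # (edge_type, effect, cause)

    def add_edge(self, edge_type, effect, cause):
        if edge_type not in EDGE_TYPES:
            raise ValueError(f"unknown OPM edge type: {edge_type}")
        self.edges.append((edge_type, effect, cause))

    def ancestors(self, node):
        """Return every node reachable from `node` by following edges toward causes."""
        seen = set()
        stack = [node]
        while stack:
            current = stack.pop()
            for _, effect, cause in self.edges:
                if effect == current and cause not in seen:
                    seen.add(cause)
                    stack.append(cause)
        return seen


# Illustrative graph: a load process reads a CSV batch and writes a table.
graph = OPMGraph()
graph.artifacts.update({"csv_batch", "db_table"})
graph.processes.add("load_csv")
graph.add_edge("used", "load_csv", "csv_batch")
graph.add_edge("wasGeneratedBy", "db_table", "load_csv")

# A simple lineage query: what did db_table depend on?
lineage = graph.ancestors("db_table")
```

Here `lineage` contains both `load_csv` and `csv_batch`. A query of this kind, run against a graph imported from another team, is the sort of interoperability check the challenge schedule describes.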

Provenance Questions

Please list possible provenance queries for the Challenge here. If a query requires any additions to the workflow, please describe them as well.

Provenance Challenge Workflow

The PC3 workflow and its software implementations in .NET, Java, and shell scripts can be found at the ThirdPCWorkflow page. The background of the workflow is given below.

Background

The Pan-STARRS project is building and operating a next-generation sky survey that can continuously scan the visible sky once a week and build a time series of data. This helps detect moving objects that may impact Earth, in addition to building a massive catalog of the solar system and 99% of the visible stars in the northern hemisphere. The collaboration is led by the University of Hawai'i, which operates the telescope and image pipeline, while Johns Hopkins University is building the object data management (ODM) framework that is exposed to astronomers. The load workflow used in PC3 sits at the handoff between the image pipeline and the ODM, and uses the Trident workbench to ingest incoming CSV files into SQL Server databases.

Acknowledgement

Jim Heasley (University of Hawai'i)

Alex Szalay (Johns Hopkins University)
