Provenance Challenge



New Provenance Queries

At the first provenance challenge workshop, it was suggested that a wider range of provenance queries be devised than was present in the first challenge. The aim of this exercise is to distinguish better between approaches and to determine what is important to, and within the scope of, provenance systems. During the workshop, a set of general query topics was discussed. Here, we attempt to make them more concrete by placing them in the terms of the fMRI workflow used in the first provenance challenge. The details of this workflow can be found on the first provenance challenge page.

We encourage participants to edit this page in the following ways:

Please make it clear on the page who you are when providing each of the above.

The contents of this page will be discussed at the second provenance challenge workshop.

Long-Term Use of Provenance

The fMRI workflow was run by user X on 14 September 2006. Five years later, the workflow has been adapted and run many times, the AIR suite and its data file formats have changed substantially, GIFs are no longer in common use, and user X has left research to become a farmer.

Causal Information Outside the Workflow

The new brain images that were inputs to the challenge workflow, Anatomy Images 1 to 4, were produced by four other workflow executions, run independently some time before the challenge workflow. The preceding workflow takes as input some configuration files, Configuration 1 to 4, for the runs generating each image.
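As a minimal sketch of what answering such a query might involve, the Python below links the record of the challenge run to records of the preceding runs by matching data item names. The record layout, the run labels and the helper `upstream_inputs` are hypothetical and chosen only for illustration; the only facts taken from the scenario are that run i takes Configuration i and produces Anatomy Image i.

```python
# Hypothetical records of the four preceding runs: run i reads
# Configuration i and writes Anatomy Image i, as in the scenario above.
PRECEDING_RUNS = [
    {
        "run": f"preceding run {i}",
        "inputs": [f"Configuration {i}"],
        "outputs": [f"Anatomy Image {i}"],
    }
    for i in range(1, 5)
]

# Hypothetical record of the challenge run. Only the inputs named in the
# scenario and one of its outputs are listed.
CHALLENGE_RUN = {
    "run": "challenge run",
    "inputs": [f"Anatomy Image {i}" for i in range(1, 5)],
    "outputs": ["Atlas X Graphic"],
}


def upstream_inputs(run, earlier_runs):
    """Map each input of `run` to the inputs of the earlier run that produced it."""
    producers = {out: r for r in earlier_runs for out in r["outputs"]}
    return {
        item: producers[item]["inputs"]
        for item in run["inputs"]
        if item in producers
    }


links = upstream_inputs(CHALLENGE_RUN, PRECEDING_RUNS)
for image, configs in links.items():
    print(image, "<-", configs)
```

The sketch assumes that both sets of records name data items consistently. Where the two workflows were recorded by different systems, that matching step is likely to be the hard part.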

Non-Workflow Processes

The new brain images that were inputs to the challenge workflow, Anatomy Images 1 to 4, were emailed in an archive (.tar.gz) file by another (remote) user before being extracted and the workflow run.

Find the process that led to Atlas X Graphic, where this should include the identity of who emailed the archive from which the workflow inputs were taken, and the time at which those images were added to the archive by the remote user.
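One way to picture this query is as a backwards traversal over "derived from" records that cover both workflow steps and the steps outside the workflow. The Python sketch below follows a single image for brevity. The stage and data names not mentioned on this page (slicer, convert, Atlas X Slice, Resliced Image 1) are assumed from the first challenge workflow description, and the non-workflow record names, the annotation fields and the helper `trace` are hypothetical.

```python
from collections import deque

# Hypothetical provenance records: each entry maps an item or step to the
# items or steps it was directly derived from.
DERIVED_FROM = {
    "Atlas X Graphic": ["convert"],
    "convert": ["Atlas X Slice"],
    "Atlas X Slice": ["slicer"],
    "slicer": ["Atlas Image"],
    "Atlas Image": ["softmean"],
    "softmean": ["Resliced Image 1"],
    "Resliced Image 1": ["reslice"],
    "reslice": ["Warp Params 1"],
    "Warp Params 1": ["align_warp"],
    "align_warp": ["Anatomy Image 1"],
    # Steps outside the workflow (record names are assumptions).
    "Anatomy Image 1": ["extract archive"],
    "extract archive": ["archive.tar.gz"],
    "archive.tar.gz": ["email from remote user", "add images to archive"],
}

# Placeholder annotations: the actual identity and time are not given here.
ANNOTATIONS = {
    "email from remote user": {"sender": "<identity of remote user>"},
    "add images to archive": {"time": "<time images were added>"},
}


def trace(item, derived_from):
    """Return every ancestor of `item`, breadth-first, without repeats."""
    seen, order, queue = set(), [], deque([item])
    while queue:
        current = queue.popleft()
        for parent in derived_from.get(current, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


history = trace("Atlas X Graphic", DERIVED_FROM)
for step in history:
    print(step, ANNOTATIONS.get(step, ""))
```

The traversal itself is the same whether or not a step was executed by the workflow engine. The sketch simply assumes that the emailing and archiving steps were recorded somewhere, which is the open issue the query is meant to probe.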

Multiple Levels of Granularity

The challenge workflow can be abstracted away from its original form, to be seen as composed of only three stages:

The average procedure actually involves executing the original three stages of the challenge workflow.

One Process Affecting Another

During the enactment of the challenge workflow, another concurrently executing workflow, the corruption workflow, accidentally affects the file Warp Params 1 between its production by align_warp and its use by reslice.
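As an illustration only, one possible way for recorded provenance to expose such an interference is to log a content digest when a file is written and again when it is read, then compare the two. The log layout and the helpers `record` and `altered_between` below are hypothetical, and the file contents are stand-in byte strings.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def record(log, actor, action, item, data):
    """Append a hypothetical provenance entry carrying a content digest."""
    log.append(
        {"actor": actor, "action": action, "item": item, "digest": digest(data)}
    )


def find_entry(log, item, actor, action):
    """Return the first log entry matching item, actor and action."""
    return next(
        e for e in log
        if e["item"] == item and e["actor"] == actor and e["action"] == action
    )


def altered_between(log, item, producer, consumer):
    """True if the content the consumer read differs from what the producer wrote."""
    written = find_entry(log, item, producer, "wrote")
    read = find_entry(log, item, consumer, "read")
    return written["digest"] != read["digest"]


log = []
record(log, "align_warp", "wrote", "Warp Params 1", b"original parameters")
# Stand-in for the file after the other workflow has changed it.
record(log, "reslice", "read", "Warp Params 1", b"corrupted parameters")

changed = altered_between(log, "Warp Params 1", "align_warp", "reslice")
print("Warp Params 1 altered between align_warp and reslice:", changed)
```

A comparison like this only shows that the file changed between production and use. Attributing the change to the corruption workflow would additionally need that workflow's own records of what it wrote.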

Effects of External Events

During the enactment of the challenge workflow, the hard disk becomes full, meaning that the output of softmean, Atlas Image, is an empty file, and the workflow then crashes.

Unwritten Queries

There are two aspects of provenance discussed in the first challenge workshop for which I am unsure how to write practical queries such as those below. It would be good for those who understand the problem being suggested to add queries for these aspects, preferably tied to the challenge workflow.

-- SimonMiles - 04 Oct 2006

Copyright © 1999-2012 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.