This page is in progress.
Participating Team
- Short team name: MINDSWAP
- Participant names: Jennifer Golbeck
- Project URL: http://provenance.mindswap.org
- Project Overview: Using Semantic Web Technologies for Provenance
- Provenance-specific Overview:
- Relevant Publications:
Jennifer Golbeck. 2006. Combining Provenance with Trust in Social Networks for Semantic Web Content Filtering. Proceedings of the International Provenance and Annotation Workshop (IPAW). Chicago, Illinois, May 2006. (http://trust.mindswap.org/papers/IPAW_Trust.pdf)
Christian Halaschek-Wiener, Jennifer Golbeck, Andrew Schain, Michael Grove, Bijan Parsia, and Jim Hendler. 2006. Annotation and Provenance Tracking in Semantic Web Photo Libraries. Proceedings of the International Provenance and Annotation Workshop (IPAW). Chicago, Illinois, May 2006. (http://www.mindswap.org/papers/2006/IPAW_PhotoStuff.pdf)
Workflow Representation
The workflow has been encoded using an OWL ontology at
http://provenance.mindswap.org/provenance.owl
Provenance Trace
The structure of the provenance trace can be extracted by following the logical and rule-based connections between instances of the ontology. Our ontology, instance data, and rules are all expressed in OWL. They can be browsed at
http://provenance.mindswap.org (the site is still under construction, so there may be errors). On each page, a visualization shows some of the connections of that instance to a certain depth. Transitive roles in the ontology allow provenance to be traced all the way back to the original files. SPARQL queries are used to retrieve more complex information.
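To illustrate how a transitive role lets a trace reach the original files, here is a minimal Python sketch. It is not the MINDSWAP implementation: the file names and the `direct_ancestors` mapping are hypothetical, and in the real system an OWL reasoner would infer the same closure from the transitive property declarations in the ontology.

```python
from collections import deque

# Hypothetical direct "produced from" links between files (illustrative names only).
direct_ancestors = {
    "atlas-x.gif": ["atlas-x.pgm"],
    "atlas-x.pgm": ["atlas.img"],
    "atlas.img": ["resliced1.img", "resliced2.img"],
    "resliced1.img": ["anatomy1.img"],
    "resliced2.img": ["anatomy2.img"],
}

def all_ancestors(item, links):
    """Return every ancestor reachable from item by following links transitively."""
    seen = set()
    queue = deque(links.get(item, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited; also guards against cycles
        seen.add(current)
        queue.extend(links.get(current, []))
    return seen

ancestors = all_ancestors("atlas-x.gif", direct_ancestors)
print(sorted(ancestors))
```

Only the direct links are stored; the full ancestry is derived on demand, which is the effect a transitive role gives a reasoner.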
Provenance Queries Matrix
Teams | Queries |
Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Q7 | Q8 | Q9 |
MINDSWAP team | | | | | | | | | |
Provenance Queries
Queries can be run in real time at
http://www.mindswap.org/~golbeck/queries.html
Write your own queries at
http://provenance.mindswap.org/query/
We use SPARQL for all of our queries. Below is the syntax we use to answer each of the queries in the challenge. In some cases (for example, query 1), we give a full URI that refers to a specific instance of a graphic or Service Execution. Such a URI can be replaced with the URI of any other instance to run the same query about that instance, or with a variable to query for the data related to every instance.
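The difference between fixing a URI and using a variable in its place can be sketched with a toy triple-pattern matcher in Python. This is an illustration only, not a SPARQL engine: the `ex:` identifiers and the sample triples are hypothetical, while the two `prov:` property names are taken from the queries on this page.

```python
# Hypothetical triples (subject, predicate, object); the ex: names are illustrative only.
triples = [
    ("ex:atlasXGraphic", "prov:producedFromServiceExecution", "ex:convert1"),
    ("ex:atlasYGraphic", "prov:producedFromServiceExecution", "ex:convert2"),
    ("ex:convert1", "prov:stage", "5"),
]

def match(pattern, data):
    """Yield variable bindings for one triple pattern; terms starting with '?' are variables."""
    for triple in data:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield bindings

# Fixed subject: asks about one specific instance.
fixed = list(match(("ex:atlasXGraphic", "prov:producedFromServiceExecution", "?x"), triples))
# Variable subject: asks the same question about every instance.
general = list(match(("?g", "prov:producedFromServiceExecution", "?x"), triples))
print(fixed)
print(general)
```

With the subject fixed, only the triple about that one graphic matches; with `?g` in its place, every graphic that has the property is returned along with its binding.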
- Find the process that led to Atlas X Graphic / everything that caused Atlas X
Graphic to be as it is. This should tell us the new brain images from which the
averaged atlas was generated, the warping performed etc.
PREFIX rdf:
PREFIX prov:
PREFIX rdfs:
SELECT DISTINCT ?prop ?x WHERE {
?prop ?x.}
- Find the process that led to Atlas X Graphic, excluding everything prior to the
averaging of images with softmean.
PREFIX rdf:
PREFIX prov:
PREFIX rdfs:
SELECT DISTINCT ?x ?y ?z WHERE {
?x ?y ?z.
?prop ?x FILTER ( ?prop = prov:hasServiceExecutionAncestor ||
?prop = prov:producedFromServiceExecution).
?x prov:hasOutputFile ?f.
?f ?prop2 ?serv FILTER ( ?prop2 = prov:hasServiceExecutionAncestor ||
?prop2 = prov:producedFromServiceExecution).
?serv prov:serviceUsed
}
- Find the Stage 3, 4 and 5 details of the process that led to Atlas X Graphic.
PREFIX rdf:
PREFIX prov:
PREFIX rdfs:
SELECT DISTINCT ?x ?propX ?obX ?y ?propY ?obY ?z ?propZ WHERE {
prov:hasServiceExecutionAncestor ?x;
prov:hasServiceExecutionAncestor ?y;
prov:producedFromServiceExecution ?z.
?x prov:stage "3";
?propX ?obX.
?y prov:stage "4";
?propY ?obY.
?z prov:stage "5";
?propZ ?obZ.
}
- Find all invocations of procedure align_warp using a twelfth order nonlinear 1365 parameter model (see model menu describing possible values of parameter "-m 12" of align_warp) that ran on a Monday.
PREFIX rdf:
PREFIX prov:
PREFIX rdfs:
SELECT DISTINCT ?x WHERE {
?z rdfs:label ?x;
prov:dayOfWeekRun "Mon";
prov:hasTextInputParameters " -m -12 -q";
prov:serviceUsed .
}
- Find all Atlas Graphic images outputted from workflows where at least one of the
input Anatomy Headers had an entry global maximum=4095. The contents of a header file
can be extracted as text using the scanheader AIR utility.
PREFIX rdf:
PREFIX prov:
PREFIX rdfs:
SELECT DISTINCT ?z WHERE {
?x rdf:type ;
prov:hasFileAncestor ?f;
rdfs:label ?z.
?f prov:annotation "maximum=4095".
}
- Find all output averaged images of softmean (average) procedures, where the warped
images taken as input were align_warped using a twelfth order nonlinear 1365 parameter
model, i.e. "where softmean was preceded in the workflow, directly or indirectly, by
an align_warp procedure with argument -m 12."
PREFIX rdf:
PREFIX prov:
PREFIX rdfs:
SELECT DISTINCT ?q WHERE {
?q prov:producedFromServiceExecution ?m;
prov:hasFileAncestor ?y.
?m prov:serviceUsed .
?y prov:producedFromServiceExecution ?z.
?z
prov:hasTextInputParameters " -m -12 -q";
prov:serviceUsed .
}
- A user has run the workflow twice, in the second instance replacing each procedure
(convert) in the final stage with two procedures: pgmtoppm, then pnmtojpeg. Find the
differences between the two workflow runs. The exact level of detail in the difference
that is detected by a system is up to each participant.
PREFIX rdf:
PREFIX prov:
PREFIX rdfs:
SELECT DISTINCT ?z1 ?z2 WHERE {
?p1 ?x.
?x ?p2 ?z1.
?p1 ?y.
?y ?p2 ?z2.
FILTER (?z1 = ?z2)
}
Since we reference files by URI, the results for the two runs will be very different: each output file will differ, as will the services used and the inputs.
- A user has annotated some anatomy images with a key-value pair center=UChicago.
Find the outputs of align_warp where the inputs are annotated with center=UChicago.
PREFIX rdf:
PREFIX prov:
PREFIX rdfs:
SELECT DISTINCT ?y WHERE {
?x prov:annotation "center=UChicago".
?y prov:hasInputFile ?x;
prov:serviceUsed .
}
- A user has annotated some atlas graphics with key-value pair where the key is
studyModality. Find all the graphical atlas sets that have metadata annotation
studyModality with values speech, visual or audio, and return all other annotations to
these files.
PREFIX rdf:
PREFIX prov:
PREFIX rdfs:
SELECT DISTINCT ?xl ?z WHERE {
?x rdf:type prov:Graphic;
rdfs:label ?xl.
?x prov:annotation ?z.
?x prov:annotation ?a
FILTER ( ?a = "studyModality=speech" || ?a = "studyModality=audio" || ?a = "studyModality=visual" ).
}
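The last query above stores each annotation as a single "key=value" string and compares whole strings in its FILTER. The same selection logic can be sketched in plain Python; the labels and annotation values below are hypothetical sample data, not results from the live system.

```python
# Hypothetical annotation data: graphic label -> list of "key=value" strings.
annotations = {
    "atlas-x.gif": ["studyModality=speech", "center=UChicago"],
    "atlas-y.gif": ["studyModality=mri"],
    "atlas-z.gif": ["studyModality=visual", "reviewed=yes"],
}

# The three annotation strings tested in the query's FILTER.
WANTED = {"studyModality=speech", "studyModality=visual", "studyModality=audio"}

def matching_annotations(data, wanted):
    """Return (label, annotation) pairs for graphics that carry any wanted annotation."""
    rows = []
    for label, values in data.items():
        if wanted.intersection(values):
            # Like ?z in the query, this lists every annotation of the graphic,
            # including the studyModality one that made it match.
            rows.extend((label, value) for value in values)
    return rows

for label, value in matching_annotations(annotations, WANTED):
    print(label, value)
```

As in the SPARQL version, the matching studyModality annotation appears in the output alongside the other annotations; dropping it would take one additional comparison.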
Suggested Workflow Variants
Suggest variants of the workflow that can exhibit capabilities that your system supports.
Suggested Queries
Suggest significant queries that your system can support and are not in the proposed list of queries, and how you have implemented/would implement them. These queries may be with regards to a variant of the workflow suggested above.
Categorisation of queries
Depending on your provenance approach, you may be able to provide a categorisation of queries. Can you elaborate on the categorisation and its rationale?
Live systems
The system can be accessed and tested in its current form at
http://provenance.mindswap.org
Further Comments
Provide here further comments.
Conclusions
Provide here your conclusions on the challenge, and issues that you would like to see discussed at a face-to-face meeting.
--
LucMoreau - 31 May 2006
--
JenniferGolbeck - 28 Jun 2006