Open Provenance Model (OPM) XML Schema Specification

Working draft: 12 October 2010

This version:
http://openprovenance.org/model/opmx-20101012
Latest version:
http://openprovenance.org/model/opmx
Editors:
Luc Moreau (University of Southampton)
Paul Groth (Vrije Universiteit Amsterdam)
Contributors in alphabetical order (see acknowledgements):
Ben Clifford
Simon Miles (King's College London)

Abstract

The Open Provenance Model is a model of provenance that is designed to meet the following requirements: (1) To allow provenance information to be exchanged between systems, by means of a compatibility layer based on a shared provenance model. (2) To allow developers to build and share tools that operate on such a provenance model. (3) To define provenance in a precise, technology-agnostic manner. (4) To support a digital representation of provenance for any 'thing', whether produced by computer systems or not. (5) To allow multiple levels of description to coexist. (6) To define a core set of rules that identify the valid inferences that can be made on provenance representation.

This document presents an XML Schema for the Open Provenance Model (v1.1) [OPM V1.1].

Table of Contents

  1. Introduction
  2. OPMX Schema at a glance
  3. OPMX Schema overview
    1. Simple Example
  4. Cross-reference for OPMX types
  5. References
  6. Acknowledgements

1. Introduction

The purpose of this document is to define an XML Schema that captures the concepts of the Open Provenance Model [OPM V1.1]. Valid inferences are not captured by this specification; instead, we refer the reader to the OPMO ontology [OPMO].

A design goal of this XML Schema is that the XML serialization should be convertible into RDF (as per the OPMO ontology [OPMO]) and, conversely, that the RDF representation should be convertible into XML. The OWL ontology and the XML Schema were co-evolved to ensure this convertibility.

We adopt the following XML prefixes and namespaces:

opmx: http://openprovenance.org/model/opmx#
opmo: http://openprovenance.org/model/opmo#
opmv: http://purl.org/net/opmv/ns#
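
The cross-reference in Section 4 gives each type name in the form {namespace}local-name. As a minimal sketch, the following Python shows how a prefixed name expands into that form using the bindings above. The names NAMESPACES and expand are illustrative and are not defined by the schema.

```python
# Prefix-to-namespace bindings as listed in this section.
NAMESPACES = {
    "opmx": "http://openprovenance.org/model/opmx#",
    "opmo": "http://openprovenance.org/model/opmo#",
    "opmv": "http://purl.org/net/opmv/ns#",
}


def expand(prefixed_name):
    """Expand 'prefix:local' into '{namespace}local' (hypothetical helper)."""
    prefix, local = prefixed_name.split(":", 1)
    return "{%s}%s" % (NAMESPACES[prefix], local)


print(expand("opmx:Account"))
```

The same {namespace}local form is what Python's xml.etree.ElementTree uses for the tags of namespace-qualified elements.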

The xsd file is accessible from [OPMX].

2. OPMX Schema at a glance

An alphabetical index of OPMX complex types.

Account, AccountRef, Accounts, Agent, AgentRef, Agents, Annotation, AnnotationRef, Annotations, Artifact, ArtifactRef, Artifacts, Dependencies, DependencyRef, EmbeddedAnnotation, Label, OPMGraph, OPMGraphRef, OTime, Overlaps, PName, Process, ProcessRef, Processes, Profile, Property, Role, RoleRef, Type, Used, UsedStar, Value, WasControlledBy, WasDerivedFrom, WasDerivedFromStar, WasGeneratedBy, WasGeneratedByStar, WasTriggeredBy, WasTriggeredByStar

3. OPMX Schema overview

OPM defines a notion of graph. There are three kinds of nodes: Artifacts, Agents and Processes. (Note that the schema does not define the type Node.)

[Figure: OPM Nodes]

Five kinds of edges are supported: Used, WasGeneratedBy (WGB), WasDerivedFrom (WDF), WasControlledBy (WCB) and WasTriggeredBy (WTB). (Note that the schema does not define the type Edge.)

[Figure: OPM Edges]

Edges have a specific source (effect) and a specific destination (cause). Used has a Process as an effect and an Artifact as a cause; WasGeneratedBy (WGB) has an Artifact as an effect and a Process as a cause; WasDerivedFrom (WDF) has Artifacts as cause and effect; WasControlledBy (WCB) has a Process as an effect and an Agent as a cause; WasTriggeredBy (WTB) has Processes as cause and effect. Used, WasGeneratedBy and WasControlledBy carry a Role; all five edge types may carry Time information.

[Figure: OPM Dependencies]
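
The effect and cause of each edge type can be read off the element types declared in the corresponding complex types of Section 4 (for example, Used declares its effect as a ProcessRef and its cause as an ArtifactRef). The following Python sketch records that table and checks a candidate edge against it. The names EDGE_ENDPOINTS and edge_is_well_typed are illustrative only.

```python
# (effect kind, cause kind) for each edge type, following the
# effect/cause element types of the complex types in Section 4.
EDGE_ENDPOINTS = {
    "Used": ("Process", "Artifact"),
    "WasGeneratedBy": ("Artifact", "Process"),
    "WasDerivedFrom": ("Artifact", "Artifact"),
    "WasControlledBy": ("Process", "Agent"),
    "WasTriggeredBy": ("Process", "Process"),
}


def edge_is_well_typed(edge_type, effect_kind, cause_kind):
    """Return True if the node kinds match the table for this edge type."""
    return EDGE_ENDPOINTS[edge_type] == (effect_kind, cause_kind)


print(edge_is_well_typed("Used", "Process", "Artifact"))
print(edge_is_well_typed("WasControlledBy", "Agent", "Process"))
```

A check of this kind is useful because xsd:IDREF alone does not constrain which kind of node a reference points to.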

Nodes, edges, and annotations can belong to Accounts. A graph enumerates the nodes, edges, annotations and accounts it contains.

[Figure: OPM Accounts]
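
Account membership is expressed by account child elements carrying a ref attribute. As a rough sketch, the Python below groups identified elements by the accounts they reference. It assumes a root element named opmGraph and omits namespaces for brevity; neither assumption is stated by this document, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def local(tag):
    """Drop a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def members_by_account(root):
    """Map each referenced account id to the ids of its member elements."""
    members = defaultdict(list)
    for el in root.iter():
        if el.get("id") is None:
            continue
        for child in el:
            if local(child.tag) == "account" and child.get("ref") is not None:
                members[child.get("ref")].append(el.get("id"))
    return dict(members)


# Illustrative instance; the root element name is an assumption.
graph = ET.fromstring("""
<opmGraph>
  <accounts><account id="acc1"/><account id="acc2"/></accounts>
  <processes>
    <process id="p1"><account ref="acc1"/></process>
  </processes>
  <artifacts>
    <artifact id="a1"><account ref="acc1"/><account ref="acc2"/></artifact>
  </artifacts>
</opmGraph>
""")
print(members_by_account(graph))
```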

Annotable entities can be associated with Annotations. (Note that the schema does not define the type Annotable.)

[Figure: OPM Annotations]

The OPMX XML schema uses xsd:ID to identify nodes, edges and accounts in an OPM graph, and xsd:IDREF to refer to them.
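
A validating parser checks that every xsd:IDREF matches some xsd:ID in the same document. The Python sketch below performs the equivalent check by hand for ref attributes. The sample instance is illustrative: the root element name opmGraph and the absence of namespaces are assumptions, and dangling_refs is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def dangling_refs(root):
    """Return the sorted 'ref' values that match no 'id' in the document."""
    ids = {el.get("id") for el in root.iter() if el.get("id") is not None}
    refs = {el.get("ref") for el in root.iter() if el.get("ref") is not None}
    return sorted(refs - ids)


# Illustrative instance: the cause of the used edge points at an
# artifact id ("a2") that is never declared.
doc = ET.fromstring("""
<opmGraph>
  <processes><process id="p1"/></processes>
  <artifacts><artifact id="a1"/></artifacts>
  <dependencies>
    <used>
      <effect ref="p1"/>
      <role value="in"/>
      <cause ref="a2"/>
    </used>
  </dependencies>
</opmGraph>
""")
print(dangling_refs(doc))
```

This sketch only inspects ref attributes; the localSubject element of Annotation is also typed xsd:IDREF and would need a separate check on element content.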

3.1. Simple Example

Here is a simple OPM graph, inspired by the First Provenance Challenge workflow. Using the OPM graphical notation, we have the following OPM graph:

[Figure: PC1]

Two representations of this OPM graph have been produced. The first maps to RDF, according to the OPMO ontology, and is expressed in the N3 notation. The second is an XML serialization compatible with the OPMX Schema.

  • Representation in RDF N3 notation
  • Representation in XML

4. Cross-reference for OPMX types

    Account

    QName: {http://openprovenance.org/model/opmx#}Account

    <xs:complexType name="Account">
      <xs:sequence>
        <xs:element ref="opmx:annotation" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:ID"/>
    </xs:complexType>

    AccountRef

    QName: {http://openprovenance.org/model/opmx#}AccountRef

    <xs:complexType name="AccountRef">
      <xs:attribute name="ref" type="xs:IDREF"/>
    </xs:complexType>

    Accounts

    QName: {http://openprovenance.org/model/opmx#}Accounts

    <xs:complexType name="Accounts">
      <xs:sequence>
        <xs:element name="account" type="opmx:Account" minOccurs="0" maxOccurs="unbounded"/>
        <xs:element name="overlaps" type="opmx:Overlaps" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>

    Agent

    QName: {http://openprovenance.org/model/opmx#}Agent

    <xs:complexType name="Agent">
      <xs:sequence>
        <xs:element name="account" minOccurs="0" maxOccurs="unbounded" type="opmx:AccountRef"/>
        <xs:element ref="opmx:annotation" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:ID"/>
    </xs:complexType>

    AgentRef

    QName: {http://openprovenance.org/model/opmx#}AgentRef

    <xs:complexType name="AgentRef">
      <xs:attribute name="ref" type="xs:IDREF"/>
    </xs:complexType>

    Agents

    QName: {http://openprovenance.org/model/opmx#}Agents

    <xs:complexType name="Agents">
      <xs:sequence>
        <xs:element name="agent" type="opmx:Agent" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>

    Annotation

    QName: {http://openprovenance.org/model/opmx#}Annotation

    <xs:complexType name="Annotation">
      <xs:complexContent>
        <xs:extension base="opmx:EmbeddedAnnotation">
          <xs:sequence>
            <xs:choice minOccurs="0" maxOccurs="1">
              <xs:element name="externalSubject" type="xs:anyURI"/>
              <xs:element name="localSubject" type="xs:IDREF"/>
            </xs:choice>
          </xs:sequence>
        </xs:extension>
      </xs:complexContent>
    </xs:complexType>

    AnnotationRef

    QName: {http://openprovenance.org/model/opmx#}AnnotationRef

    <xs:complexType name="AnnotationRef">
      <xs:attribute name="ref" type="xs:IDREF"/>
    </xs:complexType>

    Annotations

    QName: {http://openprovenance.org/model/opmx#}Annotations

    <xs:complexType name="Annotations">
      <xs:sequence>
        <xs:element name="annotation" type="opmx:Annotation" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>

    Artifact

    QName: {http://openprovenance.org/model/opmx#}Artifact

    <xs:complexType name="Artifact">
      <xs:sequence>
        <xs:element name="account" minOccurs="0" maxOccurs="unbounded" type="opmx:AccountRef"/>
        <xs:element ref="opmx:annotation" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:ID"/>
    </xs:complexType>

    ArtifactRef

    QName: {http://openprovenance.org/model/opmx#}ArtifactRef

    <xs:complexType name="ArtifactRef">
      <xs:attribute name="ref" type="xs:IDREF"/>
    </xs:complexType>

    Artifacts

    QName: {http://openprovenance.org/model/opmx#}Artifacts

    <xs:complexType name="Artifacts">
      <xs:sequence>
        <xs:element name="artifact" type="opmx:Artifact" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>

    Dependencies

    QName: {http://openprovenance.org/model/opmx#}Dependencies

    <xs:complexType name="Dependencies">
      <xs:sequence>
        <xs:choice minOccurs="0" maxOccurs="unbounded">
          <xs:element name="used" type="opmx:Used"/>
          <xs:element name="wasGeneratedBy" type="opmx:WasGeneratedBy"/>
          <xs:element name="wasTriggeredBy" type="opmx:WasTriggeredBy"/>
          <xs:element name="wasDerivedFrom" type="opmx:WasDerivedFrom"/>
          <xs:element name="wasControlledBy" type="opmx:WasControlledBy"/>
          <xs:element name="usedStar" type="opmx:UsedStar"/>
          <xs:element name="wasGeneratedByStar" type="opmx:WasGeneratedByStar"/>
          <xs:element name="wasDerivedFromStar" type="opmx:WasDerivedFromStar"/>
        </xs:choice>
      </xs:sequence>
    </xs:complexType>

    DependencyRef

    QName: {http://openprovenance.org/model/opmx#}DependencyRef

    <xs:complexType name="DependencyRef">
      <xs:attribute name="ref" type="xs:IDREF"/>
    </xs:complexType>

    EmbeddedAnnotation

    QName: {http://openprovenance.org/model/opmx#}EmbeddedAnnotation

    <xs:complexType name="EmbeddedAnnotation">
      <xs:sequence>
        <xs:element name="property" type="opmx:Property" minOccurs="1" maxOccurs="unbounded"/>
        <xs:element name="account" type="opmx:AccountRef" minOccurs="0" maxOccurs="unbounded"/>
        <xs:element ref="opmx:annotation" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:ID"/>
    </xs:complexType>

    Label

    QName: {http://openprovenance.org/model/opmx#}Label

    <xs:complexType name="Label">
      <xs:complexContent>
        <xs:extension base="opmx:EmbeddedAnnotation">
          <xs:attribute name="value" type="xs:string"/>
        </xs:extension>
      </xs:complexContent>
    </xs:complexType>

    OPMGraph

    QName: {http://openprovenance.org/model/opmx#}OPMGraph

    <xs:complexType name="OPMGraph">
      <xs:annotation>
        <xs:appinfo>
          <jxb:class>
            <jxb:javadoc>
              Java class for OPMGraph complex type. See <A href="http://twiki.ipaw.info/bin/view/Challenge/OPM1-01Review">OPMGraph</A>.
            </jxb:javadoc>
          </jxb:class>
        </xs:appinfo>
      </xs:annotation>
      <xs:sequence>
        <xs:element name="accounts" type="opmx:Accounts" minOccurs="0"/>
        <xs:element name="processes" type="opmx:Processes" minOccurs="0"/>
        <xs:element name="artifacts" type="opmx:Artifacts" minOccurs="0"/>
        <xs:element name="agents" type="opmx:Agents" minOccurs="0"/>
        <xs:element name="dependencies" type="opmx:Dependencies" minOccurs="0"/>
        <xs:element name="annotations" type="opmx:Annotations" minOccurs="0"/>
        <xs:element ref="opmx:annotation" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:ID"/>
    </xs:complexType>
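
Because OPMGraph is declared with xs:sequence, its optional sections must appear in the declared order, each at most once. The following Python sketch checks that ordering for a parsed graph element. SECTION_ORDER and sections_in_order are illustrative names, and the root element name used in the usage lines is an assumption; trailing embedded annotations are ignored by this check.

```python
import xml.etree.ElementTree as ET

# Section order fixed by the xs:sequence of the OPMGraph complex type.
SECTION_ORDER = ["accounts", "processes", "artifacts",
                 "agents", "dependencies", "annotations"]


def sections_in_order(graph):
    """True if the section children appear in schema order, at most once each."""
    names = [child.tag.rsplit("}", 1)[-1] for child in graph]
    positions = [SECTION_ORDER.index(n) for n in names if n in SECTION_ORDER]
    # Strictly increasing positions mean correct order and no repeats.
    return positions == sorted(set(positions))


good = ET.fromstring("<opmGraph><processes/><artifacts/><dependencies/></opmGraph>")
bad = ET.fromstring("<opmGraph><artifacts/><processes/></opmGraph>")
print(sections_in_order(good), sections_in_order(bad))
```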

    OPMGraphRef

    QName: {http://openprovenance.org/model/opmx#}OPMGraphRef

    <xs:complexType name="OPMGraphRef">
      <xs:attribute name="ref" type="xs:IDREF"/>
    </xs:complexType>

    OTime

    QName: {http://openprovenance.org/model/opmx#}OTime

    <xs:complexType name="OTime">
      <xs:annotation>
        <xs:documentation>
          Observed Time allows for an interval of observation, where an event is said to occur no earlier than a given time t1 and no later than a given time t2. When the event is observed to occur at a specific time, it is not convenient to use an interval. Instead, one can use the alternate exactlyAt attribute. We note that exactlyAt is disjoint from noEarlierThan and noLaterThan.
        </xs:documentation>
      </xs:annotation>
      <xs:sequence> </xs:sequence>
      <xs:attribute name="noEarlierThan" type="xs:dateTime"/>
      <xs:attribute name="noLaterThan" type="xs:dateTime"/>
      <xs:attribute name="exactlyAt" type="xs:dateTime"/>
    </xs:complexType>
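
The three OTime attributes are declared independently, so the disjointness of exactlyAt from the interval bounds noted in the documentation is not enforced by the type itself. A minimal Python sketch of an application-level check follows. The helper name otime_problems is hypothetical, and the second check (the lower bound not exceeding the upper bound) is an assumption drawn from the wording of the documentation rather than a stated rule.

```python
import xml.etree.ElementTree as ET
from datetime import datetime


def otime_problems(time_el):
    """Return a list of problems found in an OTime-typed element."""
    problems = []
    exactly = time_el.get("exactlyAt")
    lower = time_el.get("noEarlierThan")
    upper = time_el.get("noLaterThan")
    # Stated in the schema documentation: exactlyAt excludes interval bounds.
    if exactly is not None and (lower is not None or upper is not None):
        problems.append("exactlyAt combined with an interval bound")
    # Assumption: an interval should not be empty.
    if lower is not None and upper is not None:
        if datetime.fromisoformat(lower) > datetime.fromisoformat(upper):
            problems.append("noEarlierThan is after noLaterThan")
    return problems


mixed = ET.fromstring(
    '<time exactlyAt="2010-10-12T10:00:00" noLaterThan="2010-10-12T11:00:00"/>')
print(otime_problems(mixed))
```

Note that datetime.fromisoformat accepts only a subset of xs:dateTime lexical forms on older Python versions (for example, a trailing Z is rejected before Python 3.11), so a production check would need a fuller parser.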

    Overlaps

    QName: {http://openprovenance.org/model/opmx#}Overlaps

    <xs:complexType name="Overlaps">
      <xs:sequence>
        <xs:element name="account" type="opmx:AccountRef" minOccurs="2" maxOccurs="2"/>
      </xs:sequence>
    </xs:complexType>

    PName

    QName: {http://openprovenance.org/model/opmx#}PName

    <xs:complexType name="PName">
      <xs:complexContent>
        <xs:extension base="opmx:EmbeddedAnnotation">
          <xs:attribute name="value" type="xs:anyURI"/>
        </xs:extension>
      </xs:complexContent>
    </xs:complexType>

    Process

    QName: {http://openprovenance.org/model/opmx#}Process

    <xs:complexType name="Process">
      <xs:sequence>
        <xs:element name="account" minOccurs="0" maxOccurs="unbounded" type="opmx:AccountRef"/>
        <xs:element ref="opmx:annotation" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:ID"/>
    </xs:complexType>

    ProcessRef

    QName: {http://openprovenance.org/model/opmx#}ProcessRef

    <xs:complexType name="ProcessRef">
      <xs:attribute name="ref" type="xs:IDREF"/>
    </xs:complexType>

    Processes

    QName: {http://openprovenance.org/model/opmx#}Processes

    <xs:complexType name="Processes">
      <xs:sequence>
        <xs:element name="process" type="opmx:Process" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>

    Profile

    QName: {http://openprovenance.org/model/opmx#}Profile

    <xs:complexType name="Profile">
      <xs:complexContent>
        <xs:extension base="opmx:EmbeddedAnnotation">
          <xs:attribute name="value" type="xs:anyURI"/>
        </xs:extension>
      </xs:complexContent>
    </xs:complexType>

    Property

    QName: {http://openprovenance.org/model/opmx#}Property

    <xs:complexType name="Property">
      <xs:sequence>
        <xs:element name="value" type="xs:anyType"/>
      </xs:sequence>
      <xs:attribute name="key" type="xs:anyURI"/>
    </xs:complexType>
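
An annotation therefore carries a list of key/value properties, with keys given as URIs. As an illustration, the Python sketch below collects the properties of one annotation element into a dictionary. The helper name properties and the example key are hypothetical, namespaces are omitted, and reading only the text of value is a simplification, since value is typed xs:anyType and may hold structured content.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Drop a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def properties(annotation):
    """Map each property key to the text of its value (last key wins)."""
    result = {}
    for child in annotation:
        if local(child.tag) != "property":
            continue
        value = next((c for c in child if local(c.tag) == "value"), None)
        result[child.get("key")] = None if value is None else (value.text or "")
    return result


# Illustrative annotation; the key URI is made up for the example.
annotation = ET.fromstring("""
<annotation id="an1">
  <property key="http://example.org/quality"><value>high</value></property>
  <account ref="acc1"/>
</annotation>
""")
print(properties(annotation))
```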

    Role

    QName: {http://openprovenance.org/model/opmx#}Role

    <xs:complexType name="Role">
      <xs:sequence>
        <xs:element ref="opmx:annotation" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="value" type="xs:string"/>
      <xs:attribute name="id" type="xs:ID"/>
    </xs:complexType>

    RoleRef

    QName: {http://openprovenance.org/model/opmx#}RoleRef

    <xs:complexType name="RoleRef">
      <xs:attribute name="ref" type="xs:IDREF"/>
    </xs:complexType>

    Type

    QName: {http://openprovenance.org/model/opmx#}Type

    <xs:complexType name="Type">
      <xs:complexContent>
        <xs:extension base="opmx:EmbeddedAnnotation">
          <xs:attribute name="value" type="xs:anyURI"/>
        </xs:extension>
      </xs:complexContent>
    </xs:complexType>

    Used

    QName: {http://openprovenance.org/model/opmx#}Used

    <xs:complexType name="Used">
      <xs:sequence>
        <xs:element name="effect" type="opmx:ProcessRef"/>
        <xs:element name="role" type="opmx:Role"/>
        <xs:element name="cause" type="opmx:ArtifactRef"/>
        <xs:element name="account" minOccurs="0" maxOccurs="unbounded" type="opmx:AccountRef"/>
        <xs:element name="time" type="opmx:OTime" minOccurs="0"/>
        <xs:element ref="opmx:annotation" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:ID"/>
    </xs:complexType>
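
A Used edge thus requires effect, role and cause children, in that order, before any optional content. The following Python sketch builds such an element with the standard library. The function name make_used and the identifiers in the usage line are hypothetical, and namespace qualification is omitted as a simplifying assumption.

```python
import xml.etree.ElementTree as ET


def make_used(edge_id, process_id, role, artifact_id):
    """Build a 'used' element with its required children in schema order."""
    used = ET.Element("used", {"id": edge_id})
    ET.SubElement(used, "effect", {"ref": process_id})   # the using process
    ET.SubElement(used, "role", {"value": role})
    ET.SubElement(used, "cause", {"ref": artifact_id})   # the artifact used
    return used


used = make_used("u1", "p1", "in", "a1")
print(ET.tostring(used, encoding="unicode"))
```

The other edge types follow the same pattern, differing in the reference types of effect and cause and in whether a role is present.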

    UsedStar

    QName: {http://openprovenance.org/model/opmx#}UsedStar

    <xs:complexType name="UsedStar">
      <xs:sequence>
        <xs:element name="effect" type="opmx:ProcessRef"/>
        <xs:element name="cause" type="opmx:ArtifactRef"/>
        <xs:element name="account" minOccurs="0" maxOccurs="unbounded" type="opmx:AccountRef"/>
        <xs:element ref="opmx:annotation" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:ID"/>
    </xs:complexType>

    Value

    QName: {http://openprovenance.org/model/opmx#}Value

    <xs:complexType name="Value">
      <xs:complexContent>
        <xs:extension base="opmx:EmbeddedAnnotation">
          <xs:sequence>
            <xs:element name="content" type="xs:anyType" minOccurs="0"/>
          </xs:sequence>
          <xs:attribute name="encoding" type="xs:anyURI"/>
        </xs:extension>
      </xs:complexContent>
    </xs:complexType>

    WasControlledBy

    QName: {http://openprovenance.org/model/opmx#}WasControlledBy

    <xs:complexType name="WasControlledBy">
      <xs:sequence>
        <xs:element name="effect" type="opmx:ProcessRef"/>
        <xs:element name="role" type="opmx:Role"/>
        <xs:element name="cause" type="opmx:AgentRef"/>
        <xs:element name="account" minOccurs="0" maxOccurs="unbounded" type="opmx:AccountRef"/>
        <xs:element name="startTime" type="opmx:OTime" minOccurs="0"/>
        <xs:element name="endTime" type="opmx:OTime" minOccurs="0"/>
        <xs:element ref="opmx:annotation" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:ID"/>
    </xs:complexType>

    WasDerivedFrom

    QName: {http://openprovenance.org/model/opmx#}WasDerivedFrom

    <xs:complexType name="WasDerivedFrom">
      <xs:sequence>
        <xs:element name="effect" type="opmx:ArtifactRef"/>
        <xs:element name="cause" type="opmx:ArtifactRef"/>
        <xs:element name="account" minOccurs="0" maxOccurs="unbounded" type="opmx:AccountRef"/>
        <xs:element name="time" type="opmx:OTime" minOccurs="0"/>
        <xs:element ref="opmx:annotation" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:ID"/>
    </xs:complexType>

    WasDerivedFromStar

    QName: {http://openprovenance.org/model/opmx#}WasDerivedFromStar

    <xs:complexType name="WasDerivedFromStar">
      <xs:sequence>
        <xs:element name="effect" type="opmx:ArtifactRef"/>
        <xs:element name="cause" type="opmx:ArtifactRef"/>
        <xs:element name="account" minOccurs="0" maxOccurs="unbounded" type="opmx:AccountRef"/>
        <xs:element ref="opmx:annotation" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:ID"/>
    </xs:complexType>

    WasGeneratedBy

    QName: {http://openprovenance.org/model/opmx#}WasGeneratedBy

    <xs:complexType name="WasGeneratedBy">
      <xs:sequence>
        <xs:element name="effect" type="opmx:ArtifactRef"/>
        <xs:element name="role" type="opmx:Role"/>
        <xs:element name="cause" type="opmx:ProcessRef"/>
        <xs:element name="account" minOccurs="0" maxOccurs="unbounded" type="opmx:AccountRef"/>
        <xs:element name="time" type="opmx:OTime" minOccurs="0"/>
        <xs:element ref="opmx:annotation" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:ID"/>
    </xs:complexType>

    WasGeneratedByStar

    QName: {http://openprovenance.org/model/opmx#}WasGeneratedByStar

    <xs:complexType name="WasGeneratedByStar">
      <xs:sequence>
        <xs:element name="effect" type="opmx:ArtifactRef"/>
        <xs:element name="cause" type="opmx:ProcessRef"/>
        <xs:element name="account" minOccurs="0" maxOccurs="unbounded" type="opmx:AccountRef"/>
        <xs:element ref="opmx:annotation" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:ID"/>
    </xs:complexType>

    WasTriggeredBy

    QName: {http://openprovenance.org/model/opmx#}WasTriggeredBy

    <xs:complexType name="WasTriggeredBy">
      <xs:sequence>
        <xs:element name="effect" type="opmx:ProcessRef"/>
        <xs:element name="cause" type="opmx:ProcessRef"/>
        <xs:element name="account" minOccurs="0" maxOccurs="unbounded" type="opmx:AccountRef"/>
        <xs:element name="time" type="opmx:OTime" minOccurs="0"/>
        <xs:element ref="opmx:annotation" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:ID"/>
    </xs:complexType>

    WasTriggeredByStar

    QName: {http://openprovenance.org/model/opmx#}WasTriggeredByStar

    <xs:complexType name="WasTriggeredByStar">
      <xs:sequence>
        <xs:element name="effect" type="opmx:ProcessRef"/>
        <xs:element name="cause" type="opmx:ProcessRef"/>
        <xs:element name="account" minOccurs="0" maxOccurs="unbounded" type="opmx:AccountRef"/>
        <xs:element ref="opmx:annotation" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:ID"/>
    </xs:complexType>

5. References

6. Acknowledgements

Ben Clifford and Simon Miles commented on previous versions of the schema.