OPM Tutorial

This tutorial on the Open Provenance Model was delivered by Luc Moreau (University of Southampton), Paul Groth (Free University of Amsterdam) and Jun Zhao (University of Oxford), at the Future Internet Symposium, Berlin, 20 September 2010.

Abstract

Where was a document found? How was this data set produced? Were all facts included in this decision? Were all the latest figures included in this diagram? Can this scientific experiment be reproduced? Provenance matters! Provenance, which is an explicit representation of the origin of data, is important for users to be able to put their trust in data. The Open Provenance Model (OPM) is a community-driven model for provenance, which originates from the Provenance Challenge series, allowing provenance to be exchanged between systems. In this tutorial, we will introduce the rationale for OPM, its concepts and theoretical underpinnings, its concrete bindings to XML Schema and OWL, and emerging profiles. We will run through a series of small case studies exploiting OPM provenance, and we will organize a hands-on session, where a small application is OPM-enabled, provenance is generated, visualized and exploited. All examples covered by this tutorial will exploit the Java-based OPMtoolbox, the OPM XML schema, the OPM OWL ontology, or the OPM Vocabulary.

Contents

  • Tutorial introduction slides (ppt).
  • Session 1: Background on Provenance and OPM (0.5 hour) slides (ppt).

    In session 1, you will learn about the notion of provenance and the "Open Provenance Vision", an architectural vision in which the provenance of individual components can be represented uniformly, connected in a coherent fashion, and queried seamlessly. These ideas were developed in the context of the Provenance Challenge series, a community activity that led to the development of the Open Provenance Model.

  • Session 2: OPM Overview, Specification and Formalisation (1 hour) slides (ppt).

    In session 2, you will learn about the core constituents of the OPM data model, its inference rules, and existing efforts to give OPM a formal semantics.

  • Session 3: OPM Bindings (0.5 hour) slides (ppt).

    In session 3, you will learn how to serialize an OPM Graph in a concrete serialization format. Specifically, this session will cover the XML Schema for OPM, the OWL ontology for OPM, and the challenges in implementing OPM inferences with Semantic Web technologies.

  • Session 4: Use Cases (1 hour) use cases slides (ppt), data.gov.uk slides (ppt).

    In session 4, you will learn about use cases that demonstrate the need for provenance, drawn from the W3C Provenance Incubator Group and the data.gov.uk application. You will also learn how to identify and organize the provenance issues in a use case, and how to determine where OPM can help. To foster interaction, participants are invited to think about and bring their own use cases.

  • Session 5: Emerging Profiles (1 hour) slides (ppt).

    In session 5, you will learn how to extend OPM through profiles, what a profile contains, which four profiles are currently emerging for OPM, and how to get involved with a profile of your own.

  • Session 6: OPM and Inter-Operability (0.5 hour) slides (ppt).

    In session 6, you will learn about the steps taken so far towards interoperability, the challenges that remain, and the next steps towards achieving it.

  • Session 7: Provenance Vocabulary (1 hour) slides (ppt).

    In session 7, you will learn about the Open Provenance Model Vocabulary (OPMV), a Semantic Web approach to provenance that aims to be compatible with OPM, and about its use in the data.gov.uk application.

  • Session 8: Hands-on Session (2 hours) slides (ppt).

    Bring your own laptop!

    In session 8, you will learn how to design an OPM graph and either generate it from Java (with the OPM toolbox), craft it by hand (in Protege), or both; you will then learn how to embed provenance inside a document using ProvenanceJS.

    Ahead of the tutorial, you may want to install Maven if you plan to produce OPM graphs from Java, or Protege if you plan to craft them by hand.

    With the OPM toolbox slides, you will learn how to install the OPM toolbox; how to write a Java program that generates an OPM graph; how to serialize an OPM graph to XML or RDF; how to convert and pretty-print OPM graphs from the command line; and how to contribute to the development of the OPM toolbox.

  • Downloadable Material

    All the material can be downloaded in a single archive, which includes all the slides, the OPM specification, the OPM toolbox, and a simple program using the OPM toolbox (opmexample).
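
    For readers without the archive to hand, the following is a minimal sketch of the kind of graph discussed in sessions 2 and 8. It does not use the OPM toolbox API: the class OpmSketch, its methods, and the cake-baking identifiers are all hypothetical and chosen only for illustration. It models the three kinds of OPM node (artifact, process, agent) and the five OPM dependencies, and applies one inference as we understand it from the OPM specification: if a process used an artifact that was generated by another process, the first process was triggered by the second.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, self-contained model of an OPM graph. This is NOT the OPM toolbox API.
public class OpmSketch {

    // The three kinds of node in OPM.
    enum Kind { ARTIFACT, PROCESS, AGENT }

    // The five causal dependencies in OPM; every edge points from effect to cause.
    enum Relation { USED, WAS_GENERATED_BY, WAS_CONTROLLED_BY, WAS_TRIGGERED_BY, WAS_DERIVED_FROM }

    static final class Edge {
        final String effect;
        final Relation relation;
        final String cause;

        Edge(String effect, Relation relation, String cause) {
            this.effect = effect;
            this.relation = relation;
            this.cause = cause;
        }
    }

    final Map<String, Kind> nodes = new LinkedHashMap<>();
    final List<Edge> edges = new ArrayList<>();

    void node(String id, Kind kind) {
        nodes.put(id, kind);
    }

    void edge(String effect, Relation relation, String cause) {
        // Reject edges that mention nodes which were never declared.
        if (!nodes.containsKey(effect) || !nodes.containsKey(cause)) {
            throw new IllegalArgumentException("unknown node in edge " + effect + " -> " + cause);
        }
        edges.add(new Edge(effect, relation, cause));
    }

    // One OPM inference: if process p2 used artifact a, and a was generated by
    // process p1, then p2 was triggered by p1.
    List<String> inferTriggers() {
        List<String> inferred = new ArrayList<>();
        for (Edge use : edges) {
            if (use.relation != Relation.USED) continue;
            for (Edge gen : edges) {
                if (gen.relation == Relation.WAS_GENERATED_BY && gen.effect.equals(use.cause)) {
                    inferred.add(use.effect + " wasTriggeredBy " + gen.cause);
                }
            }
        }
        return inferred;
    }

    // Builds a small illustrative graph; all identifiers are made up.
    static OpmSketch example() {
        OpmSketch g = new OpmSketch();
        g.node("flour", Kind.ARTIFACT);
        g.node("batter", Kind.ARTIFACT);
        g.node("cake", Kind.ARTIFACT);
        g.node("mix", Kind.PROCESS);
        g.node("bake", Kind.PROCESS);
        g.node("baker", Kind.AGENT);
        g.edge("mix", Relation.USED, "flour");
        g.edge("batter", Relation.WAS_GENERATED_BY, "mix");
        g.edge("bake", Relation.USED, "batter");
        g.edge("cake", Relation.WAS_GENERATED_BY, "bake");
        g.edge("bake", Relation.WAS_CONTROLLED_BY, "baker");
        g.edge("cake", Relation.WAS_DERIVED_FROM, "batter");
        return g;
    }

    public static void main(String[] args) {
        OpmSketch g = example();
        System.out.println(g.edges.size() + " asserted edges");
        for (String s : g.inferTriggers()) {
            System.out.println("inferred: " + s);
        }
    }
}
```

    The wasDerivedFrom edge is asserted explicitly here rather than inferred: a process that used one artifact and generated another does not, on its own, establish that the second artifact was derived from the first. The opmexample program in the archive shows how the same kind of graph is built with the actual toolbox classes.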

    License

    The Open Provenance Tutorial at FIS'10, Berlin, 2010 by Luc Moreau, Paul Groth, Jun Zhao is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.