
Provenance Challenge

Challenge
Challenge.OPM1-01Review-FormalModelWithTime

Open Provenance Model Contents
  1. Introduction
  2. Basics
  3. Overlapping and Hierarchical Descriptions
  4. Provenance Graph Definition
  5. Timeless Formal Model
  6. Inferences
  7. Formal Model and Time Annotations
  8. Time Constraints and Inferences
  9. Support for Collections
  10. Example of Representation
  11. Conclusion
  12. Best Practice on the Use of Agents
  13. References

7 Formal Model and Time Annotations

The Open Provenance Model allows causality graphs to be annotated with time information. In this model, time is not intended to be used for deriving causality: if causal dependencies exist, they need to be made explicit with the appropriate edges. However, time may have been observed during the course of a process, and we would expect such time information to be compatible with causal dependencies: the time of an effect should be greater than the time of its cause (when both are measured by the same clock). Hence, time is useful in validating causality claims.
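As an illustration of using time to validate, rather than derive, causality claims, the following is a minimal sketch. It assumes that each occurrence carries a single reading from one shared clock, represented as a number; the function name `find_time_violations`, the occurrence labels, and the edge representation are hypothetical and not part of the model.

```python
def find_time_violations(edges, observed):
    """Return the causal edges whose observed times contradict them.

    edges    -- iterable of (effect, cause) pairs; causality is asserted
                explicitly here and is never inferred from time.
    observed -- dict mapping an occurrence label to a reading of one
                shared clock (assumption: plain numbers).
    """
    violations = []
    for effect, cause in edges:
        # Time annotations are optional: skip edges that lack either reading.
        if effect not in observed or cause not in observed:
            continue
        # The time of an effect should be greater than that of its cause.
        if observed[effect] <= observed[cause]:
            violations.append((effect, cause))
    return violations


# Hypothetical example: the generation of a2 is claimed to be caused by
# the start of p1, but the clock readings say otherwise.
claims = [("a2_generated", "p1_started")]
readings = {"p1_started": 3.0, "a2_generated": 2.0}
print(find_time_violations(claims, readings))
```

An empty result does not prove the causality claims; it only shows that the observed times do not contradict them.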

In the Open Provenance Model, time may be associated with instantaneous occurrences in a process. We currently recognize four instantaneous occurrences, which have a reasonable shared understanding in real life and computer systems. Two of them pertain to artifacts, whereas the other two relate to processes. For artifacts, we consider the occurrences of creation and use, whereas for processes, we consider their starting and ending.

The rationale for choosing instantaneous time for OPM is the same as for adopting artifacts as immutable pieces of state. At a specific time, an object we consider will be in a specific state, which we refer to as an artifact, and for which we can express the causality path that led to the object being in such a state.

In some scenarios, occurrences of use or creation of objects and occurrences of starting or ending of processes may not be instantaneous. To capture such scenarios, detailed processes and artifacts, and their respective causal dependencies, need to be made explicit in order to be expressible in OPM. For instance, the starting of a nuclear power plant is not usefully modelled as an instantaneous occurrence when one tries to understand failures that occurred during this activity; hence, this whole starting occurrence must be modelled by one process (or possibly several), which in turn have instantaneous beginnings and endings.

In the Open Provenance Model, time information is expected to be obtained by observing a clock when an occurrence takes place. Given that time is observed, time accuracy is limited by the granularity of the clock and the granularity of the observer's activities. Hence, while the notion of time we consider is instantaneous, the model allows for an interval of accuracy to support granularity of clocks and observers. In OPM, an instantaneous occurrence happening at time t is annotated by two observation times tm and tM, such that the occurrence is known to have occurred no earlier than tm and no later than tM. Hence, t ∈ [tm, tM].
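The observation interval can be sketched as a small value type. This is an illustration only: the class name `ObservedTime`, its field names, and the method `may_contain` are hypothetical, and the use of Python `datetime` values in UTC is an assumption about how a serialization might represent time.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ObservedTime:
    """Observation interval [tm, tM] for one instantaneous occurrence."""
    no_earlier_than: datetime  # tm: the occurrence happened at or after this
    no_later_than: datetime    # tM: the occurrence happened at or before this

    def __post_init__(self):
        # An interval with tm > tM cannot contain any instant t.
        if self.no_earlier_than > self.no_later_than:
            raise ValueError("no_earlier_than must not exceed no_later_than")

    def may_contain(self, t: datetime) -> bool:
        """True if instant t is consistent with the observation, i.e. t in [tm, tM]."""
        return self.no_earlier_than <= t <= self.no_later_than


# Hypothetical observation with a one-minute interval of accuracy.
observed = ObservedTime(
    datetime(2008, 7, 31, 3, 10, tzinfo=timezone.utc),
    datetime(2008, 7, 31, 3, 11, tzinfo=timezone.utc),
)
print(observed.may_contain(datetime(2008, 7, 31, 3, 10, 30, tzinfo=timezone.utc)))
```

A perfectly precise observation is the degenerate case in which both bounds are equal.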

Figure 13: Causality Graph Data Model and Time Annotations

Concretely, for an artifact, we will be able to state that it was used (or generated) no earlier than time t1 and no later than time t2. For a process, we will be able to state that it was started (or terminated) no earlier than time t1 and no later than time t2.

In Figure 13, we revisit our formal model, examining where time annotations are permitted. We first introduce a new primitive set Time, for which a given serialization will specify a format (such as Coordinated Universal Time, UTC). We then introduce Observed Time as a pair of time values (whose set is OTime). All time annotations are optional, which we denote by OTime0 in the definitions.

Edges involve OTime in their Cartesian product. WasGeneratedBy and Used edges can each be annotated with an optional timestamp, marking when the associated artifact was known to be generated or used (expressed as an observation interval).

For WasControlledBy, we allow two optional timestamps marking when the process was known to be started or terminated, respectively.

For WasDerivedFrom, we also allow one optional timestamp. Given Figure 9 and associated inferences, for a given edge < a1,a2,acc > ∈ WasDerivedFrom, there is an implicit process that generated a1 and that consumed a2. The time annotation indicates when the artifact a1 was generated.

Likewise, for WasTriggeredBy, we also allow one optional timestamp. Given Figure 9 and associated inferences, for a given edge < p1,p2,acc > ∈ WasTriggeredBy, there is an implicit artifact that was used by p1 and generated by p2. The time annotation indicates the time when the artifact was used by p1.
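The placement of optional time annotations on the five edge types can be sketched as follows. This is a rough illustration, not a transcription of Figure 13: the field names, the `role` and `agent` components, the use of strings for identifiers, and the use of numbers for Time are all assumptions made for the example. Only the number and meaning of the optional timestamps per edge follow the text above.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

Time = float                   # assumption: a serialization would fix a real format, e.g. UTC
OTime = Tuple[Time, Time]      # observed time: (no earlier than, no later than)
OTime0 = Optional[OTime]       # every time annotation is optional


@dataclass(frozen=True)
class Used:
    process: str
    role: str
    artifact: str
    accounts: FrozenSet[str]
    time: OTime0 = None        # when the artifact was known to be used


@dataclass(frozen=True)
class WasGeneratedBy:
    artifact: str
    role: str
    process: str
    accounts: FrozenSet[str]
    time: OTime0 = None        # when the artifact was known to be generated


@dataclass(frozen=True)
class WasControlledBy:
    process: str
    role: str
    agent: str
    accounts: FrozenSet[str]
    start_time: OTime0 = None  # when the process was known to be started
    end_time: OTime0 = None    # when the process was known to be terminated


@dataclass(frozen=True)
class WasDerivedFrom:
    effect: str                # a1
    cause: str                 # a2
    accounts: FrozenSet[str]
    time: OTime0 = None        # when the derived artifact was generated


@dataclass(frozen=True)
class WasTriggeredBy:
    effect: str                # p1
    cause: str                 # p2
    accounts: FrozenSet[str]
    time: OTime0 = None        # when the implicit artifact was used by p1


# Hypothetical edge: a1 was derived from a2 and generated between clock readings 10 and 12.
edge = WasDerivedFrom("a1", "a2", frozenset({"acc"}), time=(10.0, 12.0))
print(edge)
```

Defaulting each annotation to `None` mirrors the optionality of OTime0: an edge without time information remains a valid causal assertion.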



Attachment: fig13.jpg (165.6 K, 31 Jul 2008 - 03:10, PaulGroth)


Copyright © 1999-2012 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.