<<O>>  Difference Topic OntoGrid (r1.13 - 28 Jun 2007 - JoseManuel)

META TOPICPARENT ParticipatingTeams

Second Provenance Challenge Template

Line: 10 to 10

Changed:
<
<
  • KOPE-2ndProvenanceChallenge.ppt: Slides on the knowledge-oriented provenance environment used during the presentation at the Second Provenance Challenge (June 26th 2007, Monterey Ca)
>
>
  • Slides on the knowledge-oriented provenance environment, used during the Second Provenance Challenge workshop (June 26th 2007, Monterey Ca): KOPE-2ndProvenanceChallenge.ppt

OntoGrid Approach Overview

 <<O>>  Difference Topic OntoGrid (r1.12 - 28 Jun 2007 - JoseManuel)

META TOPICPARENT ParticipatingTeams

Second Provenance Challenge Template

Line: 10 to 10

Added:
>
>
  • KOPE-2ndProvenanceChallenge.ppt: Slides on the knowledge-oriented provenance environment used during the presentation at the Second Provenance Challenge (June 26th 2007, Monterey Ca)

OntoGrid Approach Overview

According to the myGrid provenance pyramid, provenance information can be structured into four main levels: Data, Organization, Process, and Knowledge. The first three levels have been widely addressed in the available literature and in the first edition of the Challenge. Thus, our focus is primarily on the Knowledge level. Knowledge provenance, which sits at the top of the provenance pyramid, is concerned with interpreting the information registered by the other three levels.

Line: 108 to 111

META FILEATTACHMENT Stage1-2.xml attr="" comment="Part 1: align_warp and reslice (stages 1 and 2)" date="1180363901" path="Stage1-2.xml" size="44234" user="JoseManuel" version="1.1"
META FILEATTACHMENT Stage3.xml attr="" comment="Part 2: softmean (stage 3)" date="1180363943" path="Stage3.xml" size="31894" user="JoseManuel" version="1.1"
META FILEATTACHMENT Stage4-5.xml attr="" comment="Part 3: slicer and convert (stages 4 and 5)" date="1180363974" path="Stage4-5.xml" size="30277" user="JoseManuel" version="1.1"
Added:
>
>
META FILEATTACHMENT KOPE-2ndProvenanceChallenge.ppt attr="" comment="Slides on the knowledge-oriented provenance environment used during the presentation at the Second Provenance Challenge (June 26th 2007, Monterey Ca)" date="1183045753" path="KOPE-2nd Provenance Challenge.ppt" size="947200" user="JoseManuel" version="1.1"
 <<O>>  Difference Topic OntoGrid (r1.11 - 22 Jun 2007 - JoseManuel)

META TOPICPARENT ParticipatingTeams

Second Provenance Challenge Template

Line: 7 to 7

Participating Team

Changed:
<
<
  • Participant names: Jose Manuel Gomez-Perez, Francisco Javier García (iSOCO), Chris Van Aart (Y'All)
>
>
  • Participant names: Jose Manuel Gomez-Perez, Francisco Javier García (iSOCO), Rafael González-Cabero (UPM), Chris Van Aart (Y'All)

OntoGrid Approach Overview

 <<O>>  Difference Topic OntoGrid (r1.10 - 28 May 2007 - JoseManuel)

META TOPICPARENT ParticipatingTeams

Second Provenance Challenge Template

Line: 60 to 60

The original process documentation, compliant with the Southampton datamodel, for the three parts of the original workflow can be found at http://www.insurancegrid.org/bpm/. Select "Provenance Challenge13:48:13,496 ,1502" from the list of available Virtual Organizations, then select the "sniffer" option on the left and, in the "Actors" tab, select the subprocess of the brain atlas workflow in which you are interested. Clicking on "Filter" will return the process documentation relevant to those actors. A file containing this provenance information can be retrieved by selecting "p-struct export".

Changed:
<
<
Links to the semantically enhanced individual logs for each workflow stage will be provided shortly.
>
>
The following links point to the semantically enhanced individual logs for each workflow stage:

  • Stage1-2.xml: Part 1: align_warp and reslice (stages 1 and 2)


Model Integration Results

Line: 86 to 93

-- SimonMiles - 11 Dec 2006

Changed:
<
<
-- JoseManuel - 27 May 2007
>
>
-- JoseManuel - 28 May 2007

META FILEATTACHMENT psmontology.rdfs attr="" comment="PSM meta-model" date="1180099730" path="psmontology.rdfs" size="5746" user="JoseManuel" version="1.1"
META FILEATTACHMENT catalogation_task-method_decomposition.jpg attr="" comment="Task-Method decomposition of a generic catalogation task" date="1180268906" path="catalogation_task-method_decomposition.jpg" size="27169" user="JoseManuel" version="1.1"
Line: 98 to 105

META FILEATTACHMENT ChallengeLog.xml attr="" comment="Semantically-enhanced provenance log for the population-based brain atlas workflow" date="1180288489" path="ChallengeLog.xml" size="89749" user="JoseManuel" version="1.1"
META FILEATTACHMENT AlgorithmResult.xml attr="" comment="XML structure representing the PSM hierarchy and their enriched roles" date="1180340422" path="AlgorithmResult.xml" size="5293" user="JoseManuel" version="1.2"
META FILEATTACHMENT ChallengeProvenanceLog.xml attr="" comment="Semantically-enhanced provenance log for the population-based brain atlas workflow" date="1180340350" path="ChallengeProvenanceLog.xml" size="89769" user="JoseManuel" version="1.1"
Added:
>
>
META FILEATTACHMENT Stage1-2.xml attr="" comment="Part 1: align_warp and reslice (stages 1 and 2)" date="1180363901" path="Stage1-2.xml" size="44234" user="JoseManuel" version="1.1"
META FILEATTACHMENT Stage3.xml attr="" comment="Part 2: softmean (stage 3)" date="1180363943" path="Stage3.xml" size="31894" user="JoseManuel" version="1.1"
META FILEATTACHMENT Stage4-5.xml attr="" comment="Part 3: slicer and convert (stages 4 and 5)" date="1180363974" path="Stage4-5.xml" size="30277" user="JoseManuel" version="1.1"
 <<O>>  Difference Topic OntoGrid (r1.9 - 28 May 2007 - JoseManuel)

META TOPICPARENT ParticipatingTeams

Second Provenance Challenge Template

Line: 37 to 37

Explicit representation of the domain and the PSM meta-model as ontologies allows bridging domain and PSM entities, and hence using domain-independent PSM to interpret domain-specific provenance. We have exploited the flexibility of the content tag of the Southampton datamodel to include, in the interaction p-assertions of the process documentation for the challenge workflow, semantic annotations on the data exchanged by the services involved. These annotations allow us to relate domain entities, e.g. brain atlas images, with the input and output roles of the PSM category, e.g. initial observation, used to interpret the provenance log.

Changed:
<
<
  • ChallengeLog.xml: Semantically-enhanced provenance log for the population-based brain atlas workflow
>
>

Roles serve two purposes: first, they act as containers for domain concepts and, second, they act as pointers to the types of domain concepts that can play the role. For example, brain image can be the input role of an initial observation during the catalogation process. Roles can be either static or dynamic. Static roles contain concepts that are persistent across the process. Dynamic roles contain concepts that change during the reasoning process. Dynamic roles characterize the process because they are constantly manipulated by the PSM in which they occur.
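
The two purposes of a role described above can be illustrated with a minimal Python sketch. It is not taken from the OntoGrid implementation: the names Role, RoleKind, allowed_type and fill, as well as the file name "anatomy1.img" and the type label "BrainImage", are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class RoleKind(Enum):
    STATIC = "static"    # concepts persist across the whole process
    DYNAMIC = "dynamic"  # concepts change during the reasoning process


@dataclass
class Role:
    name: str          # e.g. "initial observation"
    kind: RoleKind
    allowed_type: str  # pointer to the type of domain concept that can play this role
    concepts: list = field(default_factory=list)  # container for domain concepts

    def fill(self, concept: str, concept_type: str) -> None:
        """Store a domain concept after checking that its type may play this role."""
        if concept_type != self.allowed_type:
            raise TypeError(f"{concept_type!r} cannot play role {self.name!r}")
        self.concepts.append(concept)


# Hypothetical usage: a brain image plays the input role of the catalogation task.
observation = Role("initial observation", RoleKind.DYNAMIC, "BrainImage")
observation.fill("anatomy1.img", "BrainImage")
```

Keeping the allowed type next to the container is one way to let a domain-independent PSM accept only the domain concepts that the annotations declare as suitable for the role.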

Line: 96 to 96

META FILEATTACHMENT psmontologyModel.rdf attr="" comment="Catalogation PSM library" date="1180282228" path="psmontologyModel.rdf" size="34792" user="JoseManuel" version="1.1"
META FILEATTACHMENT psmontologyModel.jpg attr="" comment="PSM meta-model" date="1180283076" path="psmontologyModel.jpg" size="46247" user="JoseManuel" version="1.1"
META FILEATTACHMENT ChallengeLog.xml attr="" comment="Semantically-enhanced provenance log for the population-based brain atlas workflow" date="1180288489" path="ChallengeLog.xml" size="89749" user="JoseManuel" version="1.1"
Changed:
<
<
META FILEATTACHMENT AlgorithmResult.xml attr="" comment="XML structure representing the PSM hierarchy and their enriched roles" date="1180292437" path="AlgorithmResult.xml" size="9347" user="JoseManuel" version="1.1"
>
>
META FILEATTACHMENT AlgorithmResult.xml attr="" comment="XML structure representing the PSM hierarchy and their enriched roles" date="1180340422" path="AlgorithmResult.xml" size="5293" user="JoseManuel" version="1.2"
META FILEATTACHMENT ChallengeProvenanceLog.xml attr="" comment="Semantically-enhanced provenance log for the population-based brain atlas workflow" date="1180340350" path="ChallengeProvenanceLog.xml" size="89769" user="JoseManuel" version="1.1"
 <<O>>  Difference Topic OntoGrid (r1.8 - 27 May 2007 - JoseManuel)

META TOPICPARENT ParticipatingTeams

Second Provenance Challenge Template

Line: 48 to 48

Task-Method decomposition of a generic catalogation task

Changed:
<
<
At the top level, the brain atlas creation process is described as an overall catalogation task whose input role is initial observation and whose output role is information representation. Applying the prime catalogue method produces a refinement of this task, as shown in the next figure. This refinement shows the subtasks into which the method decomposes the current task and how they are connected by means of intermediate roles.
>
>
At the top level, the brain atlas creation process is described as an overall catalogation task whose input role is initial observation and whose output role is information representation. Applying the prime catalogue method produces a refinement of this task, as shown in the next figure. This refinement shows the subtasks (ellipses) into which the method decomposes the current task and how they are connected by means of roles (squares). Subsequent refinements of each of these subtasks are provided by other PSM.

Prime catalogue method

Changed:
<
<
During this iterative refinement process, we match, for each level of the task-method decomposition, the data flow of the current PSM with the provenance DAG (p-DAG) by means of detecting paths in the p-DAG between data sets which correspond to the data flow between the input and output roles of the PSM. The hierarchical structure of PSM i) guides the search process and ii) allows producing interpretations of provenance at different levels of detail.
>
>
During this iterative refinement process, we match, for each level of the task-method decomposition, the data flow of the current PSM against the provenance DAG (the p-DAG: a directed acyclic graph whose nodes are the data items exchanged by the services that make up the distributed application and whose edges represent the precedence relation between them). The matching detects paths (in fact, twigs) in the p-DAG between data sets that correspond to the data flow between the input and output roles of the PSM. The hierarchical structure of PSM i) guides the search process within the p-DAG and ii) allows producing interpretations of provenance at different levels of detail. As a result, the roles of the succeeding PSM are enriched with domain information extracted from the p-DAG nodes against which they are successfully matched. The output of the process is an XML structure representing the PSM hierarchy and their enriched roles. This structure is later used to produce a graphic representation of the provenance interpretation.
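
As a rough illustration of the matching step, the following Python sketch searches a p-DAG for paths that link data items annotated with the input role of a PSM to data items annotated with its output role. It is a simplification under stated assumptions: it handles plain paths only (not twigs), the adjacency-dictionary encoding of the p-DAG is assumed, and the function names and node names (anatomy_image, warp_params, and so on) are hypothetical.

```python
from collections import deque


def find_path(p_dag, source, target):
    """Breadth-first search; return one path from source to target, or None."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for successor in p_dag.get(node, ()):
            if successor not in seen:
                seen.add(successor)
                queue.append(path + [successor])
    return None


def match_psm(p_dag, annotations, input_role, output_role):
    """Collect p-DAG paths from items playing input_role to items playing output_role."""
    inputs = [n for n, role in annotations.items() if role == input_role]
    outputs = [n for n, role in annotations.items() if role == output_role]
    matches = []
    for src in inputs:
        for dst in outputs:
            path = find_path(p_dag, src, dst)
            if path is not None:
                matches.append(path)
    return matches


# Hypothetical p-DAG: each key precedes the data items listed as its value.
p_dag = {
    "anatomy_image": ["warp_params"],
    "warp_params": ["resliced_image"],
    "resliced_image": ["atlas_image"],
    "atlas_image": [],
}
# Hypothetical semantic annotations relating data items to PSM roles.
annotations = {
    "anatomy_image": "initial observation",
    "atlas_image": "information representation",
}
paths = match_psm(p_dag, annotations, "initial observation", "information representation")
```

A match at one level of the decomposition would then restrict the part of the p-DAG searched when matching the subtasks of the next level, which is how the PSM hierarchy guides the search.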

Changed:
<
<
The original process documentation, according to the Southampton datamodel, for the three parts of the original workflow can be found at http://www.insurancegrid.org/bpm/. Select "Provenance Challenge13:48:13,496 ,1502" from the list of available Virtual Organizations, then select the "sniffer" option on the left and, in the "Actors" tab, select the subprocess of the brain atlas workflow in which you are interested. Clicking on "Filter" will return the process documentation relevant to those actors. A file containing this provenance information can be retrieved by selecting "p-struct export".
>
>
  • AlgorithmResult.xml: XML structure representing the PSM hierarchy and their enriched roles resulting from the provenance interpretation process using PSM.

Changed:
<
<
Links to the individual logs for each workflow stage will be provided shortly.
>
>
The original process documentation, compliant with the Southampton datamodel, for the three parts of the original workflow can be found at http://www.insurancegrid.org/bpm/. Select "Provenance Challenge13:48:13,496 ,1502" from the list of available Virtual Organizations, then select the "sniffer" option on the left and, in the "Actors" tab, select the subprocess of the brain atlas workflow in which you are interested. Clicking on "Filter" will return the process documentation relevant to those actors. A file containing this provenance information can be retrieved by selecting "p-struct export".

Added:
>
>
Links to the semantically enhanced individual logs for each workflow stage will be provided shortly.

Model Integration Results

Changed:
<
<
The provenance interpretation approach uses provenance logs produced according to the provenance datamodel of the University of Southampton.
>
>
The provenance interpretation approach uses process documentation produced according to the provenance datamodel of the University of Southampton. The provenance to be interpreted with our PSM libraries is obtained by means of provenance queries, using the pre-existing querying facilities.

Translation Details

Added:
>
>
The semantically-enhanced provenance log of the brain atlas workflow has been processed in order to produce an explicit representation of the p-DAG. This pre-processing eases the matching process between (in this case) the Catalogation PSM library and the Brain Atlas p-DAG, which is the core of the provenance interpretation based on PSM. The resulting structure simplifies creating XPath queries against the p-DAG XML, which relate PSM input and output role sets.
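
To give an idea of what such queries might look like, here is a minimal Python sketch using the XPath subset supported by the standard xml.etree.ElementTree module. The XML layout (pdag, node and edge elements with id, role, from and to attributes) is an assumption made for this example and does not reproduce the actual structure of the p-DAG file; the helper names are hypothetical as well.

```python
import xml.etree.ElementTree as ET

# Hypothetical explicit p-DAG: annotated data items plus precedence edges.
PDAG_XML = """
<pdag>
  <node id="d1" role="initial observation"/>
  <node id="d2" role="information representation"/>
  <node id="d3" role="information representation"/>
  <edge from="d1" to="d2"/>
  <edge from="d2" to="d3"/>
</pdag>
"""


def ids_for_role(root, role):
    """Return the ids of all nodes annotated with the given PSM role."""
    return {n.get("id") for n in root.findall(f".//node[@role='{role}']")}


def direct_links(root, input_role, output_role):
    """Return edges that lead directly from an input-role node to an output-role node."""
    sources = ids_for_role(root, input_role)
    targets = ids_for_role(root, output_role)
    return [
        (e.get("from"), e.get("to"))
        for e in root.findall(".//edge")
        if e.get("from") in sources and e.get("to") in targets
    ]


root = ET.fromstring(PDAG_XML)
links = direct_links(root, "initial observation", "information representation")
```

This sketch only relates role sets joined by a single edge; relating role sets that are several steps apart would require following edges transitively on top of queries like these.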

Benchmarks

Describe your proposed benchmark queries, how the comparable quantities are determined, and the results of applying the benchmark to your own system

Added:
>
>
Our approach to provenance interpretation is not focused on querying provenance but rather on interpreting, i.e. explaining, distributed process executions from the provenance information returned by pre-existing provenance query facilities. It is therefore orthogonal to any system compliant with the University of Southampton datamodel. Consequently, benchmarking cannot appeal to its querying capabilities (which are those of the underlying provenance infrastructure) but rather to the quantity of the interpretations produced (does the system detect all the expected occurrences of the available PSM in the provenance logs?) and to their quality (do they contribute to a better understanding of process executions by users without provenance expertise?).
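
One possible way to quantify the first criterion is the fraction of expected PSM occurrences that the system actually detects (a recall measure). The Python sketch below is only a suggestion under that assumption; the function name and the occurrence labels are hypothetical, and no measured results are implied.

```python
def detection_recall(expected, detected):
    """Fraction of expected PSM occurrences that were detected (1.0 if none were expected)."""
    expected = set(expected)
    detected = set(detected)
    if not expected:
        return 1.0
    return len(expected & detected) / len(expected)


# Hypothetical labels for PSM occurrences in a provenance log.
expected_occurrences = {"occurrence-a", "occurrence-b", "occurrence-c", "occurrence-d"}
detected_occurrences = {"occurrence-a", "occurrence-b", "occurrence-x"}
recall = detection_recall(expected_occurrences, detected_occurrences)
```

The quality criterion concerns how well users understand the interpretations and would need a user study rather than a formula.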

Further Comments

Provide here further comments.

Line: 82 to 86

-- SimonMiles - 11 Dec 2006

Changed:
<
<
-- JoseManuel - 19 Feb 2007
>
>
-- JoseManuel - 27 May 2007

META FILEATTACHMENT psmontology.rdfs attr="" comment="PSM meta-model" date="1180099730" path="psmontology.rdfs" size="5746" user="JoseManuel" version="1.1"
META FILEATTACHMENT catalogation_task-method_decomposition.jpg attr="" comment="Task-Method decomposition of a generic catalogation task" date="1180268906" path="catalogation_task-method_decomposition.jpg" size="27169" user="JoseManuel" version="1.1"
Line: 94 to 96

META FILEATTACHMENT psmontologyModel.rdf attr="" comment="Catalogation PSM library" date="1180282228" path="psmontologyModel.rdf" size="34792" user="JoseManuel" version="1.1"
META FILEATTACHMENT psmontologyModel.jpg attr="" comment="PSM meta-model" date="1180283076" path="psmontologyModel.jpg" size="46247" user="JoseManuel" version="1.1"
META FILEATTACHMENT ChallengeLog.xml attr="" comment="Semantically-enhanced provenance log for the population-based brain atlas workflow" date="1180288489" path="ChallengeLog.xml" size="89749" user="JoseManuel" version="1.1"
Added:
>
>
META FILEATTACHMENT AlgorithmResult.xml attr="" comment="XML structure representing the PSM hierarchy and their enriched roles" date="1180292437" path="AlgorithmResult.xml" size="9347" user="JoseManuel" version="1.1"
 <<O>>  Difference Topic OntoGrid (r1.7 - 27 May 2007 - JoseManuel)

META TOPICPARENT ParticipatingTeams

Second Provenance Challenge Template

Line: 10 to 10

Changed:
<
<

Differences from First Challenge

>
>

OntoGrid Approach Overview


Changed:
<
<
This is the first participation of OntoGrid in the Provenance Challenge. According to the myGrid provenance pyramid, provenance information can be structured into four main levels: Data, Organization, Process, and Knowledge. In our experience, the first three levels have been widely addressed in the available literature and in the first edition of the Challenge. Our focus, however, is on the Knowledge level. Knowledge provenance, which sits at the top of the provenance pyramid, is concerned with interpreting the information registered by the other three levels. Hence, from a pragmatic viewpoint, we aim to interpret the process documentation recorded by third-party provenance systems. Currently, our provenance interpretation system is compliant with the provenance datamodel produced by the University of Southampton.
>
>
According to the myGrid provenance pyramid, provenance information can be structured into four main levels: Data, Organization, Process, and Knowledge. The first three levels have been widely addressed in the available literature and in the first edition of the Challenge. Thus, our focus is primarily on the Knowledge level. Knowledge provenance, which sits at the top of the provenance pyramid, is concerned with interpreting the information registered by the other three levels.

Changed:
<
<
OntoGrid's process documentation technology, in accordance with the S-OGSA architecture, is the Business Process Monitor (BPM). BPM produces process documentation in a format compliant with the Southampton datamodel. We intend to use BPM as the infrastructure for documenting processes in distributed environments. BPM also provides a series of inspection methods which allow querying such process documentation. More information about BPM can be found at http://www.insurancegrid.org/bpm/ (click on "About").
>
>
This is the first participation of OntoGrid in the Provenance Challenge. We aim at semantically interpreting the process documentation recorded by third party provenance systems by means of knowledge representation techniques from the field of Semantic Web that help users unacquainted with provenance systems to understand the results of a provenance query and therefore the execution of a distributed process.

Changed:
<
<
Our approach to analysing past processes through their preexisting provenance logs is based on ProblemSolvingMethods (PSMs). A PSM is defined as a sequence of actions that accomplish a task in a specific domain. In this regard, PSM have been traditionally used in AI as an analytic tool to model, establish, and control the sequence of actions required to solve domain-specific tasks by means of decomposing them into simpler subtasks down to the level of primitive actions.
>
>
Our provenance interpretation system is compliant with the provenance datamodel of the University of Southampton. In accordance with the S-OGSA architecture, OntoGrid's distributed process documentation technology, the Business Process Monitor (BPM), records process documentation in a format compliant with this datamodel. We intend to use BPM as the infrastructure for documenting processes in distributed environments. BPM also provides a series of inspection methods which allow querying such process documentation. More information about BPM can be found at http://www.insurancegrid.org/bpm/ (click on "About").

Changed:
<
<
PSM can be also used to analyse computational problem-solving behavior or reasoning, beyond invocation and execution of a reasoning process implemented as a chain of services. In a service-oriented scenario, PSM can be used as high-level, domain-independent, knowledge templates that support interpretation of process execution. Our aim is to interpret provenance data (logs) at the Knowledge Level, using the PSM analytic approach, providing a process analysis component that detects occurrences of PSM amidst process documentation as abstractions of the methods applied to execute a particular task.
>
>
Our approach to analysing past processes through their provenance logs is based on ProblemSolvingMethods (PSMs). A PSM is defined as a sequence of actions that accomplish a task in a specific domain. PSM have been traditionally used in Artificial Intelligence as a tool to model, establish, and control the sequence of actions required to solve domain-specific tasks by means of decomposing them into simpler subtasks down to the level of primitive actions.

Changed:
<
<
On the one hand, different PSM categories (like those identified in CommonKADS) can produce interpretations of process documentation under different perspectives, e.g. validation, monitoring, or diagnostics. On the other hand, more domain-oriented PSM categories can be built which are focused on related applications with the goal of validating their compliance with their original specification.
>
>
Beyond execution of a reasoning process, PSM can also be used to analyze computational problem-solving behavior or reasoning in a service-oriented scenario. In this regard, PSM are high-level, domain-independent, knowledge templates that explain the underlying reasoning going on within a process execution. Our aim is to interpret provenance data (logs) at the knowledge level of the provenance pyramid, using the PSM analytic approach. Our provenance system detects occurrences of PSM amidst process documentation resulting from the execution of a particular task.

There are two different approaches towards using PSM for knowledge provenance. On the one hand, domain-independent PSM describing different reasoning processes, like those identified in CommonKADS, e.g. validation, monitoring, or diagnostics, can be used to come up with a description, in terms of the particular domain of application, of the reasoning process which has taken place during execution. On the other hand, domain-level (instead of reasoning-level), process-oriented PSM can be used to specify the knowledge flow of distributed applications and validate their execution by means of interpreting their provenance logs against such specification.

In the case of the Challenge, we have followed the first approach in its simplest form: we use a single, domain-independent PSM category, which describes a Catalogation reasoning process, to interpret, in terms of the population-based brain atlases domain, the execution of the Challenge workflow as recorded by the preexisting process documentation infrastructure.


Provenance Data for Workflow Parts

Changed:
<
<
Our approach to provenance interpretation consists of a two-staged process:
>
>
Interpretation of provenance in a given domain requires a number of knowledge resources which, in the case of the Challenge, are the following:

  1. The OntoGrid PSM meta-model (psmontology.rdfs). It describes the basic entities of a PSM and how they relate to each other. These key components can be found in the ProblemSolvingMethods section and are represented as follows:
PSM meta-model
  2. The catalogation PSM library (psmontologyModel.rdf), which provides a hierarchy of methods describing strategies to solve catalogation tasks. This PSM library is an instance of the OntoGrid PSM meta-model.
  3. The Brain Atlas ontology (BrainAtlasDomainOntology.rdfs, BrainAtlasDomainOntology.rdf). It describes the domain to be interpreted by the PSM library.

Changed:
<
<
  1. The first stage consists of documenting process execution using a pre-existing provenance infrastructure such as BPM. This provenance information is compliant with the University of Southampton Provenance datamodel.
>
>
Explicit representation of the domain and the PSM meta-model as ontologies allows bridging domain and PSM entities, and hence using domain-independent PSM to interpret domain-specific provenance. We have exploited the flexibility of the content tag of the Southampton datamodel to include, in the interaction p-assertions of the process documentation for the challenge workflow, semantic annotations on the data exchanged by the services involved. These annotations allow us to relate domain entities, e.g. brain atlas images, with the input and output roles of the PSM category, e.g. initial observation, used to interpret the provenance log.

Changed:
<
<
  1. The second stage, i.e. proper interpretation of the previously produced process documentation, requires a PSM library to interpret such documentation. For the Challenge, we have developed a domain-independent PSM library, which provides a hierarchy of methods describing strategies to solve catalogation tasks, of which the creation of population-based brain atlas can be seen as an instance. This catalogation PSM library is an instance of the RDF PSM meta-model developed in OntoGrid.
>
>
  • ChallengeLog.xml: Semantically-enhanced provenance log for the population-based brain atlas workflow

Deleted:
<
<

Changed:
<
<
>
>
Roles serve two purposes: first, they act as containers for domain concepts and, second, they act as pointers to the types of domain concepts that can play the role. For example, brain image can be the input role of an initial observation during the catalogation process. Roles can be either static or dynamic. Static roles contain concepts that are persistent across the process. Dynamic roles contain concepts that change during the reasoning process. Dynamic roles characterize the process because they are constantly manipulated by the PSM in which they occur.

Changed:
<
<
We have used the Challenge workflow to show how the abstraction capabilities of PSM can help users without a background in provenance systems to more intuitively understand the execution of a distributed process. The prime catalogue method of our PSM library can be used to decompose the overall task represented in the population-based brain atlas creation task into simpler subtasks, as shown in the next figure. Ellipses represent tasks and rectangles are methods that can solve those tasks. Tasks achievable by several different methods are linked to them with dashed lines.
>
>
The prime catalogue method of our PSM library decomposes the overall population-based brain atlas creation task into simpler subtasks, as shown in the next figure, where ellipses represent tasks and rectangles are methods that can solve those tasks. Tasks achievable by several different methods are linked to them with dashed lines.

Task-Method decomposition of a generic catalogation task

Changed:
<
<
At the top level, the brain atlas creation process is described as an overall catalogation task whose input is an "initial observation" and whose output is an "information representation". Applying the prime catalogue method produces a refinement of this task, as shown in the next figure:
>
>
At the top level, the brain atlas creation process is described as an overall catalogation task whose input role is initial observation and whose output role is information representation. Applying the prime catalogue method produces a refinement of this task, as shown in the next figure. This refinement shows the subtasks into which the method decomposes the current task and how they are connected by means of intermediate roles.

Prime catalogue method

Added:
>
>
During this iterative refinement process, we match, for each level of the task-method decomposition, the data flow of the current PSM with the provenance DAG (p-DAG) by means of detecting paths in the p-DAG between data sets which correspond to the data flow between the input and output roles of the PSM. The hierarchical structure of PSM i) guides the search process and ii) allows producing interpretations of provenance at different levels of detail.

The original process documentation, according to the Southampton datamodel, for the three parts of the original workflow can be found at http://www.insurancegrid.org/bpm/. Select "Provenance Challenge13:48:13,496 ,1502" from the list of available Virtual Organizations, then select the "sniffer" option on the left and, in the "Actors" tab, select the subprocess of the brain atlas workflow in which you are interested. Clicking on "Filter" will return the process documentation relevant to those actors. A file containing this provenance information can be retrieved by selecting "p-struct export".
Changed:
<
<
Links to the semantically enhanced logs used for provenance interpretation using PSM will be provided shortly.
>
>
Links to the individual logs for each workflow stage will be provided shortly.

Model Integration Results

Added:
>
>
The provenance interpretation approach uses provenance logs produced according to the provenance datamodel of the University of Southampton.

Translation Details

Deleted:
<
<
We have exploited the flexibility of the content tag of the Southampton datamodel to include in the process documentation, corresponding to the challenge workflow, semantic annotations on the data exchanged by the services involved. These annotations allow us to relate domain entities, e.g. brain atlas images, with the roles of the PSM used to interpret the provenance log.

Benchmarks

Line: 74 to 84

-- JoseManuel - 19 Feb 2007

Added:
>
>


META FILEATTACHMENT psmontology.rdfs attr="" comment="PSM meta-model" date="1180099730" path="psmontology.rdfs" size="5746" user="JoseManuel" version="1.1"
Deleted:
<
<
META FILEATTACHMENT psmCatalogation.rdf attr="" comment="Catalogation PSM library" date="1180100003" path="psmCatalogation.rdf" size="24001" user="JoseManuel" version="1.1"

META FILEATTACHMENT catalogation_task-method_decomposition.jpg attr="" comment="Task-Method decomposition of a generic catalogation task" date="1180268906" path="catalogation_task-method_decomposition.jpg" size="27169" user="JoseManuel" version="1.1"
META FILEATTACHMENT primecataloguemethod.jpg attr="" comment="Prime catalogue method" date="1180270070" path="prime catalogue method.jpg" size="10019" user="JoseManuel" version="1.1"
Added:
>
>
META FILEATTACHMENT BrainAtlasDomainOntology.rdfs attr="" comment="Brain Atlas domain ontology schema" date="1180280843" path="BrainAtlasDomainOntology.rdfs" size="3952" user="JoseManuel" version="1.1"
META FILEATTACHMENT BrainAtlasDomainOntology.rdf attr="" comment="Brain Atlas domain ontology" date="1180280885" path="BrainAtlasDomainOntology.rdf" size="11991" user="JoseManuel" version="1.1"
META FILEATTACHMENT psmontologyModel.rdf attr="" comment="Catalogation PSM library" date="1180282228" path="psmontologyModel.rdf" size="34792" user="JoseManuel" version="1.1"
META FILEATTACHMENT psmontologyModel.jpg attr="" comment="PSM meta-model" date="1180283076" path="psmontologyModel.jpg" size="46247" user="JoseManuel" version="1.1"
META FILEATTACHMENT ChallengeLog.xml attr="" comment="Semantically-enhanced provenance log for the population-based brain atlas workflow" date="1180288489" path="ChallengeLog.xml" size="89749" user="JoseManuel" version="1.1"
 <<O>>  Difference Topic OntoGrid (r1.6 - 27 May 2007 - JoseManuel)

META TOPICPARENT ParticipatingTeams

Second Provenance Challenge Template

Line: 18 to 18

Our approach to analysing past processes through their preexisting provenance logs is based on ProblemSolvingMethods (PSMs). A PSM is defined as a sequence of actions that accomplish a task in a specific domain. In this regard, PSM have been traditionally used in AI as an analytic tool to model, establish, and control the sequence of actions required to solve domain-specific tasks by means of decomposing them into simpler subtasks down to the level of primitive actions.

Changed:
<
<
Semantic Web Services are an incarnation of PSM in the SoA scenario. Indeed, the constituting ingredients of both are very similar from a conceptual point of view. A major difference is that PSM can be also used to analyse computational problem-solving behavior or reasoning, while Semantic Web Services are focused on invocation and execution of a reasoning process (implemented as a chain of services). In a service-oriented scenario, PSM can be used as high-level, domain-independent, knowledge templates that support interpretation of process execution. Our aim is to interpret provenance data (logs) at the Knowledge Level, using the PSM analytic approach. Our goal is to offer a process analysis component that detects occurrences of PSM amidst process documentation as abstractions of the methods applied to execute a particular task.
>
>
PSMs can also be used to analyse computational problem-solving behavior or reasoning, beyond the invocation and execution of a reasoning process implemented as a chain of services. In a service-oriented scenario, PSMs can serve as high-level, domain-independent knowledge templates that support the interpretation of process execution. Our aim is to interpret provenance data (logs) at the Knowledge Level using the PSM analytic approach, providing a process analysis component that detects occurrences of PSMs in process documentation as abstractions of the methods applied to execute a particular task.

On the one hand, different PSM categories (like those identified in CommonKADS) can produce interpretations of process documentation under different perspectives, e.g. validation, monitoring, or diagnostics. On the other hand, more domain-oriented PSM categories can be built which are focused on related applications with the goal of validating their compliance with their original specification.

Line: 22 to 22

On the one hand, different PSM categories (like those identified in CommonKADS) can produce interpretations of process documentation under different perspectives, e.g. validation, monitoring, or diagnostics. On the other hand, more domain-oriented PSM categories can be built which are focused on related applications with the goal of validating their compliance with their original specification.

Deleted:
<
<

Provenance Data for Workflow Parts

Changed:
<
<
The original process documentation, according to the Southampton datamodel, for the three parts of the original workflow can be found at http://www.insurancegrid.org/bpm/. Select "Provenance Challenge13:48:13,496 ,1502" from the list of available Virtual Organizations, then select the "sniffer" option on the left and, in the "Actors" tab, select the subprocess of the brain atlas workflow in which you are interested. Clicking on "Filter" will return the process documentation relevant to those actors. A file containing this provenance information can be retrieved by selecting "p-struct export".
>
>
Our approach to provenance interpretation consists of a two-staged process:

Changed:
<
<
Links to the semantically enhanced logs used for provenance interpretation using PSMs will be provided shortly.
>
>
  1. The first stage consists of documenting process execution using pre-existing provenance infrastructure such as BPM. This provenance information is compliant with the University of Southampton Provenance datamodel.

Added:
>
>
  1. The second stage, i.e. the actual interpretation of the previously produced process documentation, requires a PSM library. For the Challenge, we have developed a domain-independent PSM library, which provides a hierarchy of methods describing strategies to solve catalogation tasks, of which the creation of a population-based brain atlas can be seen as an instance. This catalogation PSM library is an instance of the RDF PSM meta-model developed in OntoGrid.

Changed:
<
<

Model Integration Results

>
>

Changed:
<
<
The first stage of the provenance interpretation process in our approach consists of documenting process execution using pre-existing provenance infrastructure such as BPM. This provenance information follows the Provenance datamodel produced by the University of Southampton in the first Challenge.
>
>

Changed:
<
<
The second stage, i.e. the actual interpretation of the previously produced process documentation, requires a PSM library. For the Challenge, we have developed a generic PSM library which provides a hierarchy of methods to solve an abstract catalogation task like the one occurring in the Challenge workflow. This catalogation PSM library is an instance of the RDF PSM meta-model developed in OntoGrid.
>
>
We have used the Challenge workflow to show how the abstraction capabilities of PSMs can help users without a background in provenance systems to understand the execution of a distributed process more intuitively. The prime catalogue method of our PSM library can be used to decompose the overall task of creating the population-based brain atlas into simpler subtasks, as shown in the next figure. Ellipses represent tasks and rectangles represent methods that can solve those tasks. Tasks achievable by several different methods are linked to them with dashed lines.

Changed:
<
<
>
>

Task-Method decomposition of a generic catalogation task

At the top level, the brain atlas creation process is described as an overall catalogation task whose input is an "initial observation" and whose output is an "information representation". Applying the prime catalogue method produces a refinement of this task, as shown in the next figure:

Prime catalogue method
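
The task-method decomposition described above can be sketched as a small data structure. This is only an illustrative sketch, not the OntoGrid PSM library or its RDF meta-model: the class names, the function primitive_tasks, and all subtask and method names other than "catalogation" and "prime catalogue" are hypothetical and were chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    """A task (an ellipse in the figure): something to be achieved."""
    name: str
    # Alternative methods that can solve this task (hypothetical field name).
    methods: List["Method"] = field(default_factory=list)


@dataclass
class Method:
    """A method (a rectangle in the figure): one way of solving a task."""
    name: str
    # Simpler subtasks into which the method decomposes its task.
    subtasks: List[Task] = field(default_factory=list)


def primitive_tasks(task: Task) -> List[str]:
    """Return the names of tasks that are not decomposed any further.

    Traversal is depth first. If a task has several alternative methods,
    the leaves of all alternatives are collected.
    """
    if not task.methods:
        return [task.name]
    leaves: List[str] = []
    for method in task.methods:
        for subtask in method.subtasks:
            leaves.extend(primitive_tasks(subtask))
    return leaves


# Hypothetical decomposition: the subtask names are placeholders only.
catalogation = Task("catalogation", [
    Method("prime catalogue", [
        Task("subtask A"),
        Task("subtask B", [Method("method B1", [Task("subtask B1.1")])]),
    ]),
])

print(primitive_tasks(catalogation))
```

Walking the structure from the top-level task down to its leaves mirrors the decomposition "down to the level of primitive actions" that PSMs provide.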

The original process documentation, according to the Southampton datamodel, for the three parts of the original workflow can be found at http://www.insurancegrid.org/bpm/. Select "Provenance Challenge13:48:13,496 ,1502" from the list of available Virtual Organizations, then select the "sniffer" option on the left and, in the "Actors" tab, select the subprocess of the brain atlas workflow in which you are interested. Clicking on "Filter" will return the process documentation relevant to those actors. A file containing this provenance information can be retrieved by selecting "p-struct export".

Links to the semantically enhanced logs used for provenance interpretation using PSMs will be provided shortly.

Model Integration Results


Deleted:
<
<

Translation Details

Line: 58 to 72

-- SimonMiles - 11 Dec 2006

Deleted:
<
<


-- JoseManuel - 19 Feb 2007

Deleted:
<
<

META FILEATTACHMENT psmontology.rdfs attr="" comment="PSM meta-model" date="1180099730" path="psmontology.rdfs" size="5746" user="JoseManuel" version="1.1"
META FILEATTACHMENT psmCatalogation.rdf attr="" comment="Catalogation PSM library" date="1180100003" path="psmCatalogation.rdf" size="24001" user="JoseManuel" version="1.1"
Added:
>
>
META FILEATTACHMENT catalogation_task-method_decomposition.jpg attr="" comment="Task-Method decomposition of a generic catalogation task" date="1180268906" path="catalogation_task-method_decomposition.jpg" size="27169" user="JoseManuel" version="1.1"
META FILEATTACHMENT primecataloguemethod.jpg attr="" comment="Prime catalogue method" date="1180270070" path="prime catalogue method.jpg" size="10019" user="JoseManuel" version="1.1"
 <<O>>  Difference Topic OntoGrid (r1.5 - 25 May 2007 - JoseManuel)

META TOPICPARENT ParticipatingTeams

Second Provenance Challenge Template

Line: 42 to 42

Translation Details

Changed:
<
<
We have exploited the flexibility of the tag of the Southampton datamodel to include in the process documentation, corresponding to the challenge workflow, semantic annotations on the data exchanged by the services involved. These annotations allow us to relate domain entities, e.g. brain atlas images, with the roles of the PSM used to interpret the provenance log.
>
>
We have exploited the flexibility of the content tag of the Southampton datamodel to include, in the process documentation corresponding to the challenge workflow, semantic annotations on the data exchanged by the services involved. These annotations allow us to relate domain entities, e.g. brain atlas images, to the roles of the PSM used to interpret the provenance log.
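
As a rough illustration of how such annotations relate domain entities to PSM roles, the following sketch groups annotated data items by role. It does not reproduce the Southampton datamodel or the format of our logs: the mapping, the function items_by_role, and the file names are hypothetical. Only the two role names, "initial observation" and "information representation", are taken from the catalogation task described on this page.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical annotations: data item exchanged by a service -> PSM role.
annotations: Dict[str, str] = {
    "anatomy1.img": "initial observation",
    "anatomy2.img": "initial observation",
    "atlas-x.gif": "information representation",
}


def items_by_role(item_roles: Dict[str, str]) -> Dict[str, List[str]]:
    """Group data items by the PSM role they were annotated with."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for item, role in item_roles.items():
        grouped[role].append(item)
    return dict(grouped)


roles = items_by_role(annotations)
print(roles["initial observation"])
```

Grouping by role is what lets an interpreter ask which logged data items played the input or the output of a given method, independently of the domain they come from.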

Benchmarks

 <<O>>  Difference Topic OntoGrid (r1.4 - 25 May 2007 - JoseManuel)

META TOPICPARENT ParticipatingTeams

Second Provenance Challenge Template

Line: 12 to 12

Differences from First Challenge

Changed:
<
<
OntoGrid did not participate in the First Provenance Challenge. The following is a description of our approach towards provenance.
>
>
This is OntoGrid's first participation in the Provenance Challenge. According to the myGrid provenance pyramid, provenance information can be structured into four main levels: Data, Organization, Process, and Knowledge. In our experience, the first three levels have been widely addressed in the available literature and in the first edition of the Challenge. Our focus, however, is on the Knowledge level. Knowledge provenance, which sits at the top of the provenance pyramid, concerns the interpretation of the information registered by the other three levels. Hence, from a pragmatic viewpoint, we aim to interpret the process documentation recorded by third-party provenance systems. Currently, our provenance interpretation system is compliant with the provenance datamodel produced by the University of Southampton.

Changed:
<
<
The Business Process Monitor (BPM) is a tool that can automatically inspect systems at the knowledge level, e.g. why something changed in the system, rather than only at the typical symbol level, e.g. what has changed in the system.
>
>
OntoGrid's process documentation technology, in accordance with the S-OGSA architecture, is the Business Process Monitor (BPM). BPM produces process documentation in a format compliant with the Southampton datamodel. We intend to use BPM as the infrastructure for documenting processes in distributed environments. BPM also provides a series of inspection methods for querying such process documentation. More information about BPM is available at http://www.insurancegrid.org/bpm/ (click on "About").

Changed:
<
<
Basic functionalities of BPM are:
>
>
Our approach to analysing past processes through their preexisting provenance logs is based on Problem Solving Methods (PSMs). A PSM is defined as a sequence of actions that accomplish a task in a specific domain. PSMs have traditionally been used in AI as an analytic tool to model, establish, and control the sequence of actions required to solve domain-specific tasks, by decomposing them into simpler subtasks down to the level of primitive actions.

Changed:
<
<
  • Graphically inspect and explain individual node reasoning.
  • Help to understand communication and collaboration between nodes, such as negotiation and coordination, and compare this with the original design.
  • Graphically inspect the overall system reasoning, such as problem solving.
  • Inspect overall system behaviour on the basis of a Knowledge Base.
>
>
Semantic Web Services are an incarnation of PSMs in the SOA scenario. Indeed, the constituent ingredients of both are very similar from a conceptual point of view. A major difference is that PSMs can also be used to analyse computational problem-solving behavior or reasoning, while Semantic Web Services focus on the invocation and execution of a reasoning process (implemented as a chain of services). In a service-oriented scenario, PSMs can serve as high-level, domain-independent knowledge templates that support the interpretation of process execution. Our aim is to interpret provenance data (logs) at the Knowledge Level, using the PSM analytic approach. Our goal is to offer a process analysis component that detects occurrences of PSMs in process documentation as abstractions of the methods applied to execute a particular task.

Changed:
<
<
Our approach uses BPM to document the execution of a given process. BPM provides a series of inspection methods for querying such process documentation and retrieving information about provenance. More information about BPM is available at http://www.insurancegrid.org/bpm/ (click on "About").
>
>
On the one hand, different PSM categories (like those identified in CommonKADS) can produce interpretations of process documentation under different perspectives, e.g. validation, monitoring, or diagnostics. On the other hand, more domain-oriented PSM categories can be built which are focused on related applications with the goal of validating their compliance with their original specification.

Deleted:
<
<
We intend to use BPM as the infrastructure for documenting processes in a distributed environment. Our final goal is to use Problem Solving Methods (PSMs) to interpret the provenance of such information. Provenance information is structured into four main levels: Data, Organization, Process, and Knowledge provenance. Knowledge provenance sits at the top of the pyramid and focuses on interpreting the information registered by the other three levels. PSMs can be used as knowledge modules whose high level of abstraction facilitates the interpretation of provenance information, addressing the Knowledge level. On the one hand, different existing PSM hierarchies can produce interpretations of provenance information under different perspectives, e.g. validation, monitoring, or diagnostics. On the other hand, more specific PSM hierarchies can be built that focus on related applications, such as the validation of medical praxis. As a final result of the Challenge, we will produce semantic interpretations of the provenance information using these knowledge patterns.

Provenance Data for Workflow Parts

Changed:
<
<
The process documentation for the three parts of the original workflow can be found at http://www.insurancegrid.org/bpm/. Select "Provenance Challenge13:48:13,496 ,1502" from the list of available Virtual Organizations, then select the "sniffer" option on the left and, in the "Actors" tab, select the subprocess of the brain atlas workflow in which you are interested. Clicking on "Filter" will return the process documentation relevant to those actors.
>
>
The original process documentation, according to the Southampton datamodel, for the three parts of the original workflow can be found at http://www.insurancegrid.org/bpm/. Select "Provenance Challenge13:48:13,496 ,1502" from the list of available Virtual Organizations, then select the "sniffer" option on the left and, in the "Actors" tab, select the subprocess of the brain atlas workflow in which you are interested. Clicking on "Filter" will return the process documentation relevant to those actors. A file containing this provenance information can be retrieved by selecting "p-struct export".

Links to the semantically enhanced logs used for provenance interpretation using PSMs will be provided shortly.


Deleted:
<
<
A file containing this provenance information can be retrieved either in a format compliant with the EU-Provenance model ("p-struct export") or in RDF ("RDF export"). For information on the first option, see http://twiki.ipaw.info/bin/view/Challenge/Southampton2. The RDF export follows the schema contained in .

Model Integration Results

Changed:
<
<
State here which combinations of teams' models you have managed to perform the provenance query over
>
>
The first stage of the provenance interpretation process in our approach consists of documenting process execution using pre-existing provenance infrastructure such as BPM. This provenance information follows the Provenance datamodel produced by the University of Southampton in the first Challenge.

The second stage, i.e. the actual interpretation of the previously produced process documentation, requires a PSM library. For the Challenge, we have developed a generic PSM library which provides a hierarchy of methods to solve an abstract catalogation task like the one occurring in the Challenge workflow. This catalogation PSM library is an instance of the RDF PSM meta-model developed in OntoGrid.


Translation Details

Changed:
<
<
Describe details regarding how data models were translated (or otherwise used to answer the query following the team's approach), any data which was absent from a downloaded model, and whether this affected the possibility of translation or successful provenance query, and any data which was excluded in translation from a downloaded model because it was extraneous
>
>
We have exploited the flexibility of the tag of the Southampton datamodel to include in the process documentation, corresponding to the challenge workflow, semantic annotations on the data exchanged by the services involved. These annotations allow us to relate domain entities, e.g. brain atlas images, with the roles of the PSM used to interpret the provenance log.

Benchmarks

Line: 59 to 62

-- JoseManuel - 19 Feb 2007

Added:
>
>

META FILEATTACHMENT psmontology.rdfs attr="" comment="PSM meta-model" date="1180099730" path="psmontology.rdfs" size="5746" user="JoseManuel" version="1.1"
META FILEATTACHMENT psmCatalogation.rdf attr="" comment="Catalogation PSM library" date="1180100003" path="psmCatalogation.rdf" size="24001" user="JoseManuel" version="1.1"
 <<O>>  Difference Topic OntoGrid (r1.3 - 23 May 2007 - JoseManuel)

META TOPICPARENT ParticipatingTeams

Second Provenance Challenge Template

Line: 7 to 7

Participating Team

Changed:
<
<
  • Participant names: Jose Manuel Gomez-Perez (iSOCO), Chris Van Aart (Y'All)
>
>
  • Participant names: Jose Manuel Gomez-Perez, Francisco Javier García (iSOCO), Chris Van Aart (Y'All)

Differences from First Challenge

 <<O>>  Difference Topic OntoGrid (r1.2 - 19 Feb 2007 - JoseManuel)

META TOPICPARENT ParticipatingTeams

Second Provenance Challenge Template

Line: 12 to 12

Differences from First Challenge

Changed:
<
<
OntoGrid did not participate in the First Provenance Challenge.
>
>
OntoGrid did not participate in the First Provenance Challenge. The following is a description of our approach towards provenance.

Changed:
<
<

Provenance Data for Workflow Parts

>
>
The Business Process Monitor (BPM) is a tool that can automatically inspect systems at the knowledge level, e.g. why something changed in the system, rather than only at the typical symbol level, e.g. what has changed in the system.

Changed:
<
<
The process documentation for the three parts of the original workflow can be found at http://www.insurancegrid.org/bpm/. Select "Provenance Challenge13:48:13,496 ,1502" from the list of available Virtual Organizations, then select the "sniffer" option on the left and, in the "Actors" tab, select the subprocess of the brain atlas workflow in which you are interested. Clicking on "Filter" will return the process documentation relevant to those actors.
>
>
Basic functionalities of BPM are:

Changed:
<
<
A file containing this provenance information can be retrieved either in a format compliant with the EU-Provenance model ("p-struct export") or in RDF ("RDF export"). For information on the first option, see http://twiki.ipaw.info/bin/view/Challenge/Southampton2. The RDF export follows the schema contained in .
>
>
  • Graphically inspect and explain individual node reasoning.
  • Help to understand communication and collaboration between nodes, such as negotiation and coordination, and compare this with the original design.
  • Graphically inspect the overall system reasoning, such as problem solving.
  • Inspect overall system behaviour on the basis of a Knowledge Base.

Our approach uses BPM to document the execution of a given process. BPM provides a series of inspection methods for querying such process documentation and retrieving information about provenance. More information about BPM is available at http://www.insurancegrid.org/bpm/ (click on "About").


Changed:
<
<
The provenance for the three parts of the modified workflow (as per provenance query 7) are here:
>
>
We intend to use BPM as the infrastructure for documenting processes in a distributed environment. Our final goal is to use Problem Solving Methods (PSMs) to interpret the provenance of such information. Provenance information is structured into four main levels: Data, Organization, Process, and Knowledge provenance. Knowledge provenance sits at the top of the pyramid and focuses on interpreting the information registered by the other three levels. PSMs can be used as knowledge modules whose high level of abstraction facilitates the interpretation of provenance information, addressing the Knowledge level. On the one hand, different existing PSM hierarchies can produce interpretations of provenance information under different perspectives, e.g. validation, monitoring, or diagnostics. On the other hand, more specific PSM hierarchies can be built that focus on related applications, such as the validation of medical praxis. As a final result of the Challenge, we will produce semantic interpretations of the provenance information using these knowledge patterns.

Changed:
<
<
These document the same process except with convert replaced by three new procedures: pgmtogif, pgmtojpeg and jpegtogif (the first calls the latter two in order to convert from PGM to GIF in two stages). The difference is documented on our first challenge results page.
>
>

Provenance Data for Workflow Parts

The process documentation for the three parts of the original workflow can be found at http://www.insurancegrid.org/bpm/. Select "Provenance Challenge13:48:13,496 ,1502" from the list of available Virtual Organizations, then select the "sniffer" option on the left and, in the "Actors" tab, select the subprocess of the brain atlas workflow in which you are interested. Clicking on "Filter" will return the process documentation relevant to those actors.


Added:
>
>
A file containing this provenance information can be retrieved either in a format compliant with the EU-Provenance model ("p-struct export") or in RDF ("RDF export"). For information on the first option, see http://twiki.ipaw.info/bin/view/Challenge/Southampton2. The RDF export follows the schema contained in .

Model Integration Results

 <<O>>  Difference Topic OntoGrid (r1.1 - 19 Feb 2007 - JoseManuel)
Line: 1 to 1
Added:
>
>
META TOPICPARENT ParticipatingTeams

Second Provenance Challenge Template

Participating Team

Differences from First Challenge

OntoGrid did not participate in the First Provenance Challenge.

Provenance Data for Workflow Parts

The process documentation for the three parts of the original workflow can be found at http://www.insurancegrid.org/bpm/. Select "Provenance Challenge13:48:13,496 ,1502" from the list of available Virtual Organizations, then select the "sniffer" option on the left and, in the "Actors" tab, select the subprocess of the brain atlas workflow in which you are interested. Clicking on "Filter" will return the process documentation relevant to those actors.

A file containing this provenance information can be retrieved either in a format compliant with the EU-Provenance model ("p-struct export") or in RDF ("RDF export"). For information on the first option, see http://twiki.ipaw.info/bin/view/Challenge/Southampton2. The RDF export follows the schema contained in .

The provenance for the three parts of the modified workflow (as per provenance query 7) are here:

These document the same process except with convert replaced by three new procedures: pgmtogif, pgmtojpeg and jpegtogif (the first calls the latter two in order to convert from PGM to GIF in two stages). The difference is documented on our first challenge results page.

Model Integration Results

State here which combinations of teams' models you have managed to perform the provenance query over

Translation Details

Describe details regarding how data models were translated (or otherwise used to answer the query following the team's approach), any data which was absent from a downloaded model, and whether this affected the possibility of translation or successful provenance query, and any data which was excluded in translation from a downloaded model because it was extraneous

Benchmarks

Describe your proposed benchmark queries, how the comparable quantities are determined, and the results of applying the benchmark to your own system

Further Comments

Provide here further comments.

Conclusions

Provide here your conclusions on the challenge, and issues that you like to see discussed at a face to face meeting.

-- SimonMiles - 11 Dec 2006

-- JoseManuel - 19 Feb 2007

Revision r1.1 - 19 Feb 2007 - 15:40 - JoseManuel
Revision r1.13 - 28 Jun 2007 - 17:57 - JoseManuel