Second Provenance Challenge
Motivation | ||||||||
| Line: 14 to 14 | ||||||||
|---|---|---|---|---|---|---|---|---|
| ||||||||
| Changed: | ||||||||
| < < |
The challenge is split into two phases. Each team should create a TWiki page for their second challenge results, and inform others when they complete each phase. | |||||||
| > > |
The challenge is split into two phases. Each team should create a TWiki page for their second challenge results, linked from ParticipatingTeams, and inform others when they complete each phase. | |||||||
| Once the challenge is complete, we will make all provenance data, translation programs and queries available on our Web site. Our goal is to create a repository with multiple scenarios and query workloads that can serve as benchmarks for provenance systems. | ||||||||
| Line: 72 to 72 | ||||||||
| ||||||||
| Changed: | ||||||||
| < < |
Each team should, as before, construct a results page on the TWiki and upload the results above onto it. The template to follow for this challenge is given here. | |||||||
| > > |
Each team should, as before, construct a results page on the TWiki, linked from ParticipatingTeams, and upload the results above onto it. The template to follow for this challenge is given here. | |||||||
Workflow Parts | ||||||||
Second Provenance Challenge
Motivation | ||||||||
| Line: 76 to 76 | ||||||||
|---|---|---|---|---|---|---|---|---|
Workflow Parts | ||||||||
| Changed: | ||||||||
| < < |
The details of the workflow are given on the first provenance challenge page. However, the workflow is now split into three parts, depicted graphically here. Click on workflow images for high resolution versions. | |||||||
| > > |
This challenge is based on the same workflow definition as in the first challenge. However, the workflow definition is now split into three parts, depicted graphically here. As shown below, the workflow is comprised of procedures exchanging data. We do not want to restrict the execution environment for workflow runs, as the challenge is about provenance, and so define only the essentials of the workflow: the roles of data in the workflow, the types of procedure performed and where the output of one procedure becomes input for another. We do not otherwise prescribe how workflows are instantiated or run. The details of the workflow definition are given on the first provenance challenge page, as are pre-computed data for each of the items in the workflow for those who wish to simulate their workflow runs. Click on workflow images for high resolution versions. | |||||||
Part 1 | ||||||||
Second Provenance Challenge
Motivation | ||||||||||
| Changed: | ||||||||||
| < < |
The first provenance challenge was established following the IPAW 2006 workshop, where a range of papers were presented that described provenance models and systems. The challenge was proposed as a means for the disparate groups to gain a better understanding of the similarities, differences, core concepts and common issues across systems. The challenge consisted of running a workflow for an fMRI application and answering a set of queries over the provenance derived. The participants all executed this same workflow and performed the same set of queries over the data collected regarding the execution. At the conclusion of the challenge, the participants met and discussed their results, the commonality of the problem being a point around which comparison could take place. | |||||||||
| > > |
The first provenance challenge was established following the IPAW 2006 workshop, where a range of papers were presented that described provenance models and systems. The challenge was proposed as a means for the disparate groups to gain a better understanding of the similarities, differences, core concepts and common issues across systems. The challenge consisted of running a workflow for an fMRI application and answering a set of queries over the provenance derived. The participants all executed this same workflow and performed the same set of queries over the data collected regarding the execution. At the conclusion of the challenge, the participants met and discussed their results, the commonality of the problem being a point around which comparison could take place. | |||||||||
| While the first challenge had a large number of participants and led to valuable discussion about the aspects of provenance which were fundamental to all approaches, the queries and their expected results were weakly specified, and so interpreted differently by different groups. There was, therefore, no systematic way to compare capabilities of systems, including representations of provenance data. It was decided that a second challenge, based on the first, was desirable. With the first challenge showing that there are multiple levels of granularity/types of provenance that may be relevant at different times or in different parts of a process, understanding interoperability becomes a key issue, and so this will be the focus of the second challenge. | ||||||||||
| Line: 41 to 31 | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|
Communicable Data Model | ||||||||||
| Changed: | ||||||||||
| < < |
The first phase of the challenge is to make available provenance data/process documentation for a workflow. As the focus of the challenge is to share and combine data models, we need to make it easy for other teams to parse the data. Therefore, the data should be exported in a well documented format, and a reference given to a free parser for that format. For consistency and availability of parsers, we strongly encourage (but do not require) teams to export their data in XML. The schema should be provided and adequately described for others to use, either in the schema document, on the TWiki or in a referenced document. | |||||||||
| > > |
The first phase of the challenge is to make available provenance data/process documentation for adequate workflow runs to answer the provenance queries. As the focus of the challenge is to share and combine data models, we need to make it easy for other teams to parse the data. Therefore, the data should be exported in a well documented format, and a reference given to a free parser for that format. For consistency and availability of parsers, we strongly encourage (but do not require) teams to export their data in XML. The schema should be provided and adequately described for others to use, either in the schema document, on the TWiki or in a referenced document. | |||||||||
The second challenge is based on the same workflow as the first. However, it is now divided into three parts:
| ||||||||||
| Changed: | ||||||||||
| < < |
Each part is considered a workflow in its own right with regards to provenance data, so there will be three distinct sets of provenance data uploaded to the TWiki for each team. To aid translation, adequate supporting information should be given on how the elements of the workflow are apparent in the provenance, e.g. how the challenge data files are identified in the provenance. Whether gathering this data is best done by running three separate workflows or by running one and splitting the provenance data afterwards is left to each team to decide. | |||||||||
| > > |
Each part is considered a workflow in its own right with regards to provenance data, so there will be three distinct sets of provenance data uploaded to the TWiki for each team. To aid translation, adequate supporting information should be given on how the elements of the workflow are apparent in the provenance, e.g. how the challenge data files are identified in the provenance. Whether gathering this data is best done by running three separate workflows or by running one and splitting the provenance data afterwards is left to each team to decide. | |||||||||
| The details of the workflow parts are given at the bottom of this page, but do not differ from the first challenge. | ||||||||||
| Added: | ||||||||||
| > > |
Specifically, the exported data should contain:
| |||||||||
Cross-Model Provenance Query
The second phase of the challenge is for each team to use their approach to combine provenance data produced in the first phase by multiple other teams and use their approach to query over it. Each team should download data for each of the three workflow parts, each part from a different originating team. | ||||||||||
| Changed: | ||||||||||
| < < |
They must then perform the queries from the first provenance challenge over the combined three parts, as if they had captured the provenance data themselves. This is likely to involve, for most teams, a first step of translating the downloaded data into their own models. The queries are listed on the first provenance challenge page. | |||||||||
| > > |
They must then perform the queries from the first provenance challenge over the combined three parts, as if they had captured the provenance data themselves. This is likely to involve, for most teams, a first step of translating the downloaded data into their own models. The queries are listed in the Provenance Queries section below. | |||||||||
| Teams should perform queries over as many combinations as they feel is adequate for fulfilling the challenge goals above, but must query over at least one other team's data to have completed the challenge. Additional credit goes to teams whose data has been successfully imported and queried over by others! | ||||||||||
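The combine-then-query step described above can be pictured with a minimal sketch. It assumes each downloaded part has already been translated into a plain list of (output, source) pairs. That representation, the function names `merge_parts` and `ancestors`, and every item name other than Atlas X Graphic are hypothetical and are not taken from any team's data model or from the challenge data.

```python
from collections import defaultdict


def merge_parts(*parts):
    """Union the (output, source) pairs of several provenance sets into one graph."""
    graph = defaultdict(set)
    for edges in parts:
        for output, source in edges:
            graph[output].add(source)
    return graph


def ancestors(graph, item):
    """Return every item that the given item was directly or indirectly derived from."""
    seen = set()
    stack = [item]
    while stack:
        current = stack.pop()
        # .get avoids creating empty entries in the defaultdict for leaf items
        for source in graph.get(current, ()):
            if source not in seen:
                seen.add(source)
                stack.append(source)
    return seen


# Hypothetical data: one list per workflow part, each from a different team,
# already translated into the querying team's own (here trivial) model.
part1 = [
    ("Warp Params 1", "Anatomy Image 1"),
    ("Warp Params 1", "Reference Image"),
    ("Resliced Image 1", "Warp Params 1"),
]
part2 = [("Atlas Image", "Resliced Image 1")]
part3 = [
    ("Atlas X Slice", "Atlas Image"),
    ("Atlas X Graphic", "Atlas X Slice"),
]

combined = merge_parts(part1, part2, part3)
lineage = ancestors(combined, "Atlas X Graphic")
print(sorted(lineage))
```

The sketch only works because the three parts share identifiers for the data items that cross part boundaries (here "Resliced Image 1" and "Atlas Image"). In practice, agreeing on how the challenge data files are identified in each team's provenance is likely to be the hard part of the translation.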
| Line: 98 to 96 | ||||||||||
![]() | ||||||||||
| Added: | ||||||||||
| > > |
Provenance Queries
These are the same queries as in the first challenge, but more tightly specified to remove ambiguities.
| |||||||||
| ||||||||||
Second Provenance Challenge
Motivation | ||||||||
| Line: 55 to 55 | ||||||||
|---|---|---|---|---|---|---|---|---|
Cross-Model Provenance Query | ||||||||
| Changed: | ||||||||
| < < |
The second phase of the challenge is for each team to use their approach to combine provenance data produced in the first phase by multiple other teams and use their approach to query over it. A team should download data for each of the three workflow parts. | |||||||
| > > |
The second phase of the challenge is for each team to use their approach to combine provenance data produced in the first phase by multiple other teams and use their approach to query over it. Each team should download data for each of the three workflow parts, each part from a different originating team. | |||||||
| They must then perform the queries from the first provenance challenge over the combined three parts, as if they had captured the provenance data themselves. This is likely to involve, for most teams, a first step of translating the downloaded data into their own models. The queries are listed on the first provenance challenge page. | ||||||||
Second Provenance Challenge
Motivation | ||||||||
| Line: 74 to 74 | ||||||||
|---|---|---|---|---|---|---|---|---|
| ||||||||
| Changed: | ||||||||
| < < |
Each team should, as before, construct a results page on the TWiki and upload the results above onto it. The template to follow for this challenge is given SecondChallengeTemplate?. | |||||||
| > > |
Each team should, as before, construct a results page on the TWiki and upload the results above onto it. The template to follow for this challenge is given here. | |||||||
Workflow Parts | ||||||||
Second Provenance Challenge
Motivation | ||||||||
| Line: 16 to 16 | ||||||||
|---|---|---|---|---|---|---|---|---|
| participants met and discussed their results, the commonality of the problem being a point around which comparison could take place. | ||||||||
| Changed: | ||||||||
| < < |
While the first challenge had a large number of participants and led to valuable discussion about the aspects of provenance which were fundamental to all approaches, the queries and their expected results were weakly specified, and so interpreted differently by different groups. There was, therefore, no systematic way to compare capabilities of systems, including representations of provenance data. It was decided that a second challenge, based on the first, was desirable. To better understand the commonalities and differences across different approaches to provenance, the focus of second challenge will be on interoperability. | |||||||
| > > |
While the first challenge had a large number of participants and led to valuable discussion about the aspects of provenance which were fundamental to all approaches, the queries and their expected results were weakly specified, and so interpreted differently by different groups. There was, therefore, no systematic way to compare capabilities of systems, including representations of provenance data. It was decided that a second challenge, based on the first, was desirable. With the first challenge showing that there are multiple levels of granularity/types of provenance that may be relevant at different times or in different parts of a process, understanding interoperability becomes a key issue, and so this will be the focus of the second challenge. | |||||||
| Changed: | ||||||||
| < < |
One way in which the interoperability of the approaches could be tested would be to compose the workflow execution systems, each system executing a part of the workflow, and run provenance queries over the results. However, this would primarily present a challenge to workflow systems rather than approaches to provenance. Instead, we propose that teams share provenance data produced by their different systems, and then perform provenance queries over compositions of data from other teams, as if it had been produced by their own system. | |||||||
| > > |
One way in which the interoperability of the approaches could be tested would be to compose the workflow execution systems, each system executing a part of the workflow, and run provenance queries over the results. However, this would primarily present a challenge to workflow systems rather than approaches to provenance: the scientific value of provenance will come from tracking provenance across workflow runs, through manual processes, and across workflow/provenance systems. Instead, we propose that teams share provenance data produced by their different systems, and then perform provenance queries over compositions of data from other teams, as if it had been produced by their own system. | |||||||
Through this approach, the second provenance challenge should encourage systematic conversions of data between systems, and so a reliable basis of comparison. Specifically, we hope to achieve the following goals.
| ||||||||
| Line: 43 to 41 | ||||||||
Communicable Data Model | ||||||||
| Changed: | ||||||||
| < < |
The first phase of the challenge is to make available provenance data/process documentation for a workflow. As the focus of the challenge is share and combine data models, we need to make it easy for other teams to parse the data. Therefore, the data should be exported in XML. The schema should be provided and adequately described for others to use, either in the schema document, on the TWiki or in a referenced document. | |||||||
| > > |
The first phase of the challenge is to make available provenance data/process documentation for a workflow. As the focus of the challenge is to share and combine data models, we need to make it easy for other teams to parse the data. Therefore, the data should be exported in a well documented format, and a reference given to a free parser for that format. For consistency and availability of parsers, we strongly encourage (but do not require) teams to export their data in XML. The schema should be provided and adequately described for others to use, either in the schema document, on the TWiki or in a referenced document. | |||||||
The second challenge is based on the same workflow as the first. However, it is now divided into three parts:
| ||||||||
| Changed: | ||||||||
| < < |
Each part is considered a workflow in its own right with regards to provenance data, so there will be three distinct sets of provenance data uploaded to the TWiki for each team. Whether gathering this data is best done by running three separate workflows or by running one and splitting the provenance data afterwards is left to each team to decide. | |||||||
| > > |
Each part is considered a workflow in its own right with regards to provenance data, so there will be three distinct sets of provenance data uploaded to the TWiki for each team. To aid translation, adequate supporting information should be given on how the elements of the workflow are apparent in the provenance, e.g. how the challenge data files are identified in the provenance. Whether gathering this data is best done by running three separate workflows or by running one and splitting the provenance data afterwards is left to each team to decide. | |||||||
| The details of the workflow parts are given at the bottom of this page, but do not differ from the first challenge. | ||||||||
Second Provenance Challenge
Motivation | ||||||||||
| Changed: | ||||||||||
| < < |
The first provenance challenge was established following the IPAW 2006 workshop, where a range of papers were presented on the topic of provenance, and systems to support provenance being determined in application environments. The challenge was proposed as a means for the disparate groups to gain a better understanding of the similarities, differences, core concepts and common issues between systems. It took the form of a sample experiment, a workflow for an fMRI application, and a set of queries regarding the provenance of the experiment's results (provenance queries). The particpants all executed this same workflow and performed the same set of queries over the data collected regarding the execution. At the conclusion of the challenge, the participants met and discussed their results, the commonality of the problem being a point around which comparison could take place. While the first challenge had a large number of participants and led to valuable discussion about the aspects of provenance which were fundamental to all approaches, the queries and their expected results were weakly specified, and so interpreted differently by different groups. There was, therefore, no systematic way to compare capabilities of systems, including representations of provenance data. It was decided that a second challenge, based on the first, was desirable. | |||||||||
| > > |
The first provenance challenge was established following the IPAW 2006 workshop, where a range of papers were presented that described provenance models and systems. The challenge was proposed as a means for the disparate groups to gain a better understanding of the similarities, differences, core concepts and common issues across systems. The challenge consisted of running a workflow for an fMRI application and answering a set of queries over the provenance derived. The participants all executed this same workflow and performed the same set of queries over the data collected regarding the execution. At the conclusion of the challenge, the participants met and discussed their results, the commonality of the problem being a point around which comparison could take place. While the first challenge had a large number of participants and led to valuable discussion about the aspects of provenance which were fundamental to all approaches, the queries and their expected results were weakly specified, and so interpreted differently by different groups. There was, therefore, no systematic way to compare capabilities of systems, including representations of provenance data. It was decided that a second challenge, based on the first, was desirable. To better understand the commonalities and differences across different approaches to provenance, the focus of second challenge will be on interoperability. | |||||||||
| One way in which the interoperability of the approaches could be tested would be to compose the workflow execution systems, each system executing a part of the workflow, and run provenance queries over the results. However, this would primarily present a challenge to workflow systems rather than approaches to provenance. Instead, we propose that teams share provenance data produced by their different systems, and then perform provenance queries over compositions of data from other teams, as if it had been produced by their own system. Through this approach, the second provenance challenge should encourage systematic conversions of data between systems, and so a reliable basis of comparison. Specifically, we hope to achieve the following goals. | ||||||||||
| Changed: | ||||||||||
| < < |
| |||||||||
| > > |
| |||||||||
| Changed: | ||||||||||
| < < |
The challenge is split into three phases, with the last two running in parallel. Each team should create a TWiki page for their second challenge results, and inform others when they complete each phase. | |||||||||
| > > |
The challenge is split into two phases. Each team should create a TWiki page for their second challenge results, and inform others when they complete each phase. Once the challenge is complete, we will make all provenance data, translation programs and queries available on our Web site. Our goal is to create a repository with multiple scenarios and query workloads that can serve as benchmarks for provenance systems. | |||||||||
Timetable
The timeline for the second provenance challenge is as follows: | ||||||||||
| Changed: | ||||||||||
| < < |
| |||||||||
| > > |
| |||||||||
| ||||||||||
| Line: 28 to 43 | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|
Communicable Data Model | ||||||||||
| Changed: | ||||||||||
| < < |
The first phase of the challenge is to make available provenance data/process documentation for a workflow. The format in which it is made available can be whatever is deemed most suitable but it should be possible to upload it to the TWiki and for others to download and parse it. The format should be adequately described for others to use, either on the TWiki or in a referenced document. | |||||||||
| > > |
The first phase of the challenge is to make available provenance data/process documentation for a workflow. As the focus of the challenge is share and combine data models, we need to make it easy for other teams to parse the data. Therefore, the data should be exported in XML. The schema should be provided and adequately described for others to use, either in the schema document, on the TWiki or in a referenced document. | |||||||||
The second challenge is based on the same workflow as the first. However, it is now divided into three parts:
| ||||||||||
| Line: 43 to 58 | ||||||||||
| The second phase of the challenge is for each team to use their approach to combine provenance data produced in the first phase by multiple other teams and use their approach to query over it. A team should download data for each of the three workflow parts. | ||||||||||
| Changed: | ||||||||||
| < < |
They must then perform the queries from the first provenance challenge over the combined three parts, as if they had captured the provenance data themselves. This is likely to involve, for most teams, a first step of translating the downloaded data into their own models. The query is the following: Find the process that led to Atlas X Graphic / everything that caused Atlas X Graphic to be as it is. This should tell us the new brain images from which the averaged atlas was generated, the warping performed etc. | |||||||||
| > > |
They must then perform the queries from the first provenance challenge over the combined three parts, as if they had captured the provenance data themselves. This is likely to involve, for most teams, a first step of translating the downloaded data into their own models. The queries are listed on the first provenance challenge page. | |||||||||
Teams should perform queries over as many combinations as they feel is adequate for fulfilling the challenge goals above, but must query over at least one other team's data to have completed the challenge. Additional credit goes to teams whose data has been successfully imported and queried over by others!
Benchmarks for Provenance Systems | ||||||||||
| Changed: | ||||||||||
| < < |
The third phase of the challenge, running in parallel with the second, is to propose, and demonstrate on your own system, queries with measurable qualities, so that provenance systems can be benchmarked against these queries. | |||||||||
| > > |
We aim to build a repository with different scenarios and queries, where each scenario/query combination exercises different features of provenance systems. The goal would be to classify the different systems with respect to their ability to handle the different scenarios. This exercise will be informed by the results of the second provenance challenge, where significant differences in approach should become apparent through attempting to translate data models. It will be discussed at the second challenge workshop and we hope it will continue from there. | |||||||||
Results | ||||||||||
| Line: 86 to 99 | ||||||||||
![]() | ||||||||||
| Deleted: | ||||||||||
| < < |
-- SimonMiles - 26 Oct 2006 | |||||||||
| ||||||||||
Second Provenance Challenge
Motivation | ||||||||
| Changed: | ||||||||
| < < |
In the first provenance challenge, many participants performed provenance queries over the same workflow, and this led to valuable discussion about the aspects of provenance which were fundamental to all approaches. However, the queries and their expected results were weakly specified, and so interpreted differently by different groups. There was, therefore, no systematic way to compare capabilities of systems, including representations of provenance data. | |||||||
| > > |
The first provenance challenge was established following the IPAW 2006 workshop, where a range of papers were presented on the topic of provenance, and systems to support provenance being determined in application environments. The challenge was proposed as a means for the disparate groups to gain a better understanding of the similarities, differences, core concepts and common issues between systems. It took the form of a sample experiment, a workflow for an fMRI application, and a set of queries regarding the provenance of the experiment's results (provenance queries). The particpants all executed this same workflow and performed the same set of queries over the data collected regarding the execution. At the conclusion of the challenge, the participants met and discussed their results, the commonality of the problem being a point around which comparison could take place. While the first challenge had a large number of participants and led to valuable discussion about the aspects of provenance which were fundamental to all approaches, the queries and their expected results were weakly specified, and so interpreted differently by different groups. There was, therefore, no systematic way to compare capabilities of systems, including representations of provenance data. It was decided that a second challenge, based on the first, was desirable. | |||||||
| One way in which the interoperability of the approaches could be tested would be to compose the workflow execution systems, each system executing a part of the workflow, and run provenance queries over the results. However, this would primarily present a challenge to workflow systems rather than approaches to provenance. Instead, we propose that teams share provenance data produced by their different systems, and then perform provenance queries over compositions of data from other teams, as if it had been produced by their own system. | ||||||||
Second Provenance Challenge | ||||||||||
| Changed: | ||||||||||
| < < |
The second provenance challenge builds on the first and focusses on the interoperability of the myriad existing approaches to provenance. Through exchanging and combining provenance produced by each others' systems, we hope to achieve the following goals. | |||||||||
| > > |
Motivation
In the first provenance challenge, many participants performed provenance queries over the same workflow, and this led to valuable discussion about the aspects of provenance which were fundamental to all approaches. However, the queries and their expected results were weakly specified, and so interpreted differently by different groups. There was, therefore, no systematic way to compare capabilities of systems, including representations of provenance data. One way in which the interoperability of the approaches could be tested would be to compose the workflow execution systems, each system executing a part of the workflow, and run provenance queries over the results. However, this would primarily present a challenge to workflow systems rather than approaches to provenance. Instead, we propose that teams share provenance data produced by their different systems, and then perform provenance queries over compositions of data from other teams, as if it had been produced by their own system. Through this approach, the second provenance challenge should encourage systematic conversions of data between systems, and so a reliable basis of comparison. Specifically, we hope to achieve the following goals. | |||||||||
| ||||||||||
| Added: | ||||||||||
| > > |
| |||||||||
| Changed: | ||||||||||
| < < |
The challenge is split into two phases. Each team should create a TWiki page for their second challenge results, and inform others when they complete each phase. | |||||||||
| > > |
The challenge is split into three phases, with the last two running in parallel. Each team should create a TWiki page for their second challenge results, and inform others when they complete each phase.
Timetable | |||||||||
The timeline for the second provenance challenge is as follows:
| ||||||||||
| Changed: | ||||||||||
| < < |
| |||||||||
| > > |
| |||||||||
| ||||||||||
| Changed: | ||||||||||
| < < |
| |||||||||
| > > |
| |||||||||
Communicable Data Model
The first phase of the challenge is to make available provenance data/process documentation for a workflow. The format in which it is made available can be whatever is deemed most suitable but it should be possible to upload it to the TWiki and for others to download and parse it. The format should be adequately described for others to use, either on the TWiki or in a referenced document. The second challenge is based on the same workflow as the first. However, it is now divided into three parts: | ||||||||||
| Changed: | ||||||||||
| < < |
| |||||||||
| > > |
| |||||||||
| Changed: | ||||||||||
| < < |
Each part is considered a workflow in its own right with regards to provenance data, so there will be three sets of provenance data uploaded to the TWiki for each team. Whether gathering this data is best done by running three separate workflows or by running one and splitting the provenance afterwards is left to each team to decide. | |||||||||
| > > |
Each part is considered a workflow in its own right with regards to provenance data, so there will be three distinct sets of provenance data uploaded to the TWiki for each team. Whether gathering this data is best done by running three separate workflows or by running one and splitting the provenance data afterwards is left to each team to decide. | |||||||||
The details of the workflow parts are given at the bottom of this page, but do not differ from the first challenge.
Cross-Model Provenance Query | ||||||||||
| Changed: | ||||||||||
| < < |
The second phase of the challenge is for each team to use their approach to combine provenance data produced in the first phase by multiple other teams and use their approach to query over it. A team should download data for each of the three workflow parts, possibly including their own for one or two parts in the first instance. | |||||||||
| > > |
The second phase of the challenge is for each team to use their approach to combine provenance data produced in the first phase by multiple other teams and use their approach to query over it. A team should download data for each of the three workflow parts. | |||||||||
| Changed: | ||||||||||
| < < |
They must then perform Query 1 from the first provenance challenge over the combined three parts, as if they had captured the provenance data themsevles. This is likely to involve, for most teams, a first step of translating the downloaded data into their own models. The query is the following: | |||||||||
| > > |
They must then perform the queries from the first provenance challenge over the combined three parts, as if they had captured the provenance data themselves. This is likely to involve, for most teams, a first step of translating the downloaded data into their own models. The query is the following: | |||||||||
| Find the process that led to Atlas X Graphic / everything that caused Atlas X Graphic to be as it is. This should tell us the new brain images from which the averaged atlas was generated, the warping performed etc. | ||||||||||
| Changed: | ||||||||||
| < < |
Teams should perform queries over as many combinations as they feel is adequate for fulfilling the challenge goals above, but must query over at least one other team's data to have completed the challenge. | |||||||||
| > > |
Teams should perform queries over as many combinations as they feel is adequate for fulfilling the challenge goals above, but must query over at least one other team's data to have completed the challenge. Additional credit goes to teams whose data has been successfully imported and queried over by others!
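As an illustration only, the query above can be read as a backward traversal over the combined provenance of the three workflow parts. The following is a minimal Python sketch under stated assumptions: the record layout `(output, procedure, inputs)`, the data item names (`anatomy_image_1`, `atlas_x_graphic`, and so on) and the helper names `combine` and `lineage` are all hypothetical and are not prescribed by the challenge. Only the procedure names (align_warp, reslice, softmean, slicer, convert) come from the workflow definition. A real submission would first translate each team's downloaded data into a common form such as this.

```python
# Hypothetical, simplified provenance records for each workflow part,
# as if already translated from three different teams' formats.
# Each record is (output item, procedure, list of input items).
PART1 = [
    ("warp_params_1", "align_warp", ["anatomy_image_1", "reference_image"]),
    ("resliced_1", "reslice", ["warp_params_1"]),
]
PART2 = [("atlas_image", "softmean", ["resliced_1"])]
PART3 = [
    ("atlas_x_slice", "slicer", ["atlas_image"]),
    ("atlas_x_graphic", "convert", ["atlas_x_slice"]),
]


def combine(*parts):
    """Merge the parts into one index: output item -> (procedure, inputs)."""
    produced_by = {}
    for part in parts:
        for output, procedure, inputs in part:
            produced_by[output] = (procedure, list(inputs))
    return produced_by


def lineage(produced_by, item):
    """Walk backwards from item, collecting (procedure, output) steps."""
    steps, seen, stack = [], set(), [item]
    while stack:
        current = stack.pop()
        # Skip items already visited and raw inputs with no recorded producer.
        if current in seen or current not in produced_by:
            continue
        seen.add(current)
        procedure, inputs = produced_by[current]
        steps.append((procedure, current))
        stack.extend(inputs)
    return steps


if __name__ == "__main__":
    combined = combine(PART1, PART2, PART3)
    for procedure, output in lineage(combined, "atlas_x_graphic"):
        print(f"{procedure} -> {output}")
```

The traversal only works across parts if the item produced at the end of one part carries the same identifier as the item consumed at the start of the next, which is exactly the kind of matching a team may need to perform when combining data from different models.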
Benchmarks for Provenance Systems
The third phase of the challenge, running in parallel with the second, is to propose, and demonstrate on your own system, queries with measurable qualities, so that provenance systems can be benchmarked against these queries. | |||||||||
Results | ||||||||||
| Line: 43 to 58 | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| ||||||||||
| Added: | ||||||||||
| > > |
| |||||||||
Workflow Parts | ||||||||||
| Line: 66 to 84 | ||||||||||
![]() | ||||||||||
| Added: | ||||||||||
| > > |
-- SimonMiles - 26 Oct 2006 | |||||||||
| ||||||||||
Second Provenance Challenge | |||||||||||||||||||
| Changed: | |||||||||||||||||||
| < < |
The second provenance challenge builds on the first and focusses on the interoperability of the myriad existing approaches to provenance. Through exchanging and combining provenance produced by each others' systems, we hope to achieve the following. | ||||||||||||||||||
| > > |
The second provenance challenge builds on the first and focusses on the interoperability of the myriad existing approaches to provenance. Through exchanging and combining provenance produced by each others' systems, we hope to achieve the following goals. | ||||||||||||||||||
| |||||||||||||||||||
| Changed: | |||||||||||||||||||
| < < |
The challenge is split into two phase. Each team should create a TWiki page for their second challenge results, and inform others when they complete each phase. | ||||||||||||||||||
| > > |
The challenge is split into two phases. Each team should create a TWiki page for their second challenge results, and inform others when they complete each phase.
The timeline for the second provenance challenge is as follows:
| ||||||||||||||||||
Communicable Data Model | |||||||||||||||||||
| Line: 18 to 24 | |||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Each part is considered a workflow in its own right with regards to provenance data, so there will be three sets of provenance data uploaded to the TWiki for each team. Whether gathering this data is best done by running three separate workflows or by running one and splitting the provenance afterwards is left to each team to decide. | |||||||||||||||||||
| Changed: | |||||||||||||||||||
| < < |
Across-Model Provenance Query | ||||||||||||||||||
| > > |
The details of the workflow parts is given at the bottom of this page, but do not differ from the first challenge.
Cross-Model Provenance Query | ||||||||||||||||||
| The second phase of the challenge is for each team to use their approach to combine provenance data produced in the first phase by multiple other teams and use their approach to query over it. A team should download data for each of the three workflow parts, possibly including their own for one or two parts in the first instance. | |||||||||||||||||||
| Changed: | |||||||||||||||||||
| < < |
They must then perform Query 1 from the first provenance challenge over the combined three parts, as if they had captured the provenance data themsevles. This is likely to involve, for most teams, a first step of translating the downloaded data into their own models. | ||||||||||||||||||
| > > |
They must then perform Query 1 from the first provenance challenge over the combined three parts, as if they had captured the provenance data themsevles. This is likely to involve, for most teams, a first step of translating the downloaded data into their own models. The query is the following: Find the process that led to Atlas X Graphic / everything that caused Atlas X Graphic to be as it is. This should tell us the new brain images from which the averaged atlas was generated, the warping performed etc. Teams should perform queries over as many combinations as they feel is adequate for fulfilling the challenge goals above, but must query over at least one other team's data to have completed the challenge. | ||||||||||||||||||
Results | |||||||||||||||||||
| Line: 30 to 42 | |||||||||||||||||||
| |||||||||||||||||||
| Changed: | |||||||||||||||||||
| < < |
| ||||||||||||||||||
| > > |
Workflow Parts
The details of the workflow are given on the first provenance challenge page. However, the workflow is now split into three parts, depicted graphically here. Click on workflow images for high resolution versions.
Part 1
This part is the reslicing of images to fit one reference image. It includes two stages: align_warp and reslice.
![]()
Part 2
This part is the averaging of brain images into one. It includes one stage: softmean.
![]()
Part 3
This part is the conversion of the averaged image into three graphics files showing slices of that brain. It includes two stages: slicer and convert. | ||||||||||||||||||
| Changed: | |||||||||||||||||||
| < < |
-- SimonMiles - 03 Oct 2006 | ||||||||||||||||||
| > > |
![]() | ||||||||||||||||||
| Added: | |||||||||||||||||||
| > > |
| ||||||||||||||||||
| Changed: | ||||||||
| < < |
Second Provenance Challenge | |||||||
| > > |
Second Provenance Challenge | |||||||
| Added: | ||||||||
| > > |
The second provenance challenge builds on the first and focusses on the interoperability of the myriad existing approaches to provenance. Through exchanging and combining provenance produced by each others' systems, we hope to achieve the following.
| |||||||
| Changed: | ||||||||
| < < |
| |||||||
| > > |
The challenge is split into two phase. Each team should create a TWiki page for their second challenge results, and inform others when they complete each phase. | |||||||
| Changed: | ||||||||
| < < |
-- LucMoreau - 14 Sep 2006 | |||||||
| > > |
Communicable Data Model
The first phase of the challenge is to make available provenance data/process documentation for a workflow. The format in which it is made available can be whatever is deemed most suitable but it should be possible to upload it to the TWiki and for others to download and parse it. The format should be adequately described for others to use, either on the TWiki or in a referenced document.
The second challenge is based on the same workflow as the first. However, it is now divided into three parts:
Across-Model Provenance Query
The second phase of the challenge is for each team to use their approach to combine provenance data produced in the first phase by multiple other teams and use their approach to query over it. A team should download data for each of the three workflow parts, possibly including their own for one or two parts in the first instance. They must then perform Query 1 from the first provenance challenge over the combined three parts, as if they had captured the provenance data themsevles. This is likely to involve, for most teams, a first step of translating the downloaded data into their own models.
Results
Teams should report the following results for the challenge:
| |||||||
| Line: 1 to 1 | ||||||||
|---|---|---|---|---|---|---|---|---|
| Added: | ||||||||
| > > |
Second Provenance Challenge
| |||||||