<<O>>  Difference Topic FirstProvenanceChallenge (r1.18 - 10 Nov 2006 - SimonMiles)

META TOPICPARENT WebHome

First Provenance Challenge

 <<O>>  Difference Topic FirstProvenanceChallenge (r1.17 - 28 Aug 2006 - SimonMiles)

META TOPICPARENT WebHome

First Provenance Challenge

Line: 37 to 37

Changed:
<
<
It is comprised of procedures, shown as orange ovals, and data items flowing between them, shown as rectangles. It can be seen as five stages, where each stage is depicted as a horizontal row of the same procedure in the figure. Note that the term stage is introduced only to help description of the workflow, and we do not dictate how it is apparent in a concrete implementation. The procedures employ the AIR (automated image registration) suite to create an averaged brain from a collection of high resolution anatomical data, and the FSL suite to create 2D images across each sliced dimension of the brain. In addition to the data items shown in the figure, there are other inputs to procedures (constant string options), defined below.
>
>
It is comprised of procedures, shown as orange ovals, and data items flowing between them, shown as rectangles. It can be seen as five stages, where each stage is depicted as a horizontal row of the same procedure in the figure. Note that the term stage is introduced only to help description of the workflow, and we do not dictate how it is apparent in a concrete implementation. The procedures employ the AIR (automated image registration) suite to create an averaged brain from a collection of high resolution anatomical data, and the FSL suite to create 2D images across each sliced dimension of the brain. In addition to the data items shown in the figure, there are other inputs to procedures (constant string options), defined below.

The inputs to the workflow are a set of new brain images (Anatomy Image 1 to 4) and a single reference brain image (Reference Image). All input images are 3D scans of a brain at varying resolutions, so that different features are evident. For each image, there is the actual image and the metadata for that image (Anatomy Header 1 to 4). The image data was published with the article Frontal-Hippocampal Double Dissociation Between Normal Aging and Alzheimer's Disease by Head, D, Snyder, AZ, Girton, LE, Morris, JC, Buckner, RL in the fMRI Data Center, Accession Number: 2-2004-1168X.

The stages of the workflow are as follows.

Changed:
<
<
  1. For each new brain image, align_warp compares it with the reference image to determine how the new image should be warped, i.e. how its position and shape should be adjusted, to match the reference brain. The output of each procedure in the stage is a _warp parameter set_ defining the spatial transformation to be performed (Warp Params 1 to 4).
  2. For each warp parameter set, the actual transformation of the image is done by reslice, which creates a new version of the original new brain image with the configuration defined in the warp parameter set. The output is a resliced image.
  3. All the resliced images are averaged into one single image using softmean.
>
>
  1. For each new brain image, align_warp compares it with the reference image to determine how the new image should be warped, i.e. how its position and shape should be adjusted, to match the reference brain. The output of each procedure in the stage is a _warp parameter set_ defining the spatial transformation to be performed (Warp Params 1 to 4).
  2. For each warp parameter set, the actual transformation of the image is done by reslice, which creates a new version of the original new brain image with the configuration defined in the warp parameter set. The output is a resliced image.
  3. All the resliced images are averaged into one single image using softmean.

  1. For each dimension (x, y and z), the averaged image is sliced with slicer to give a 2D atlas along a plane in that dimension, taken through the centre of the 3D image. The output is an atlas data set. slicer can be downloaded as part of the FSL suite, available at http://www.fmrib.ox.ac.uk/fsl/.
  2. Each atlas data set is converted into a graphical atlas image using convert (an ImageMagick utility).
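The five stages above can be sketched as a list of procedure invocations. This is only an illustrative encoding: the data item names and parameter strings follow the table in this topic, while the dictionary layout and the function name build_workflow are assumptions, not part of the challenge definition.

```python
def build_workflow(n_images=4, dims=("X", "Y", "Z")):
    """Return the procedure invocations of the example workflow, in step order."""
    steps = []
    # Stage 1: one align_warp per new brain image, against the reference image.
    for i in range(1, n_images + 1):
        steps.append({
            "procedure": "align_warp",
            "inputs": [f"Anatomy Image {i}", f"Anatomy Header {i}",
                       "Reference Image", "Reference Header"],
            "outputs": [f"Warp Parameters {i}"],
            "parameters": "-m 12 -q",
        })
    # Stage 2: one reslice per warp parameter set.
    for i in range(1, n_images + 1):
        steps.append({
            "procedure": "reslice",
            "inputs": [f"Warp Parameters {i}"],
            "outputs": [f"Resliced Image {i}", f"Resliced Header {i}"],
            "parameters": "",
        })
    # Stage 3: a single softmean averaging all resliced images.
    resliced = []
    for i in range(1, n_images + 1):
        resliced += [f"Resliced Image {i}", f"Resliced Header {i}"]
    steps.append({
        "procedure": "softmean",
        "inputs": resliced,
        "outputs": ["Atlas Image", "Atlas Header"],
        "parameters": "y null",
    })
    # Stage 4: one slicer per dimension, through the centre of the atlas.
    for d in dims:
        steps.append({
            "procedure": "slicer",
            "inputs": ["Atlas Image", "Atlas Header"],
            "outputs": [f"Atlas {d} Slice"],
            "parameters": f"-{d.lower()} .5",
        })
    # Stage 5: one convert per slice, producing the graphical atlas image.
    for d in dims:
        steps.append({
            "procedure": "convert",
            "inputs": [f"Atlas {d} Slice"],
            "outputs": [f"Atlas {d} Graphic"],
            "parameters": "",
        })
    # Number the steps 1..15 to match the Step column of the table.
    for number, step in enumerate(steps, start=1):
        step["step"] = number
    return steps


WORKFLOW = build_workflow()
```

A structure like this could serve as input to "dummy" procedures or as the skeleton from which a participant's own workflow representation is generated.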

The full steps, procedures, data and parameters are enumerated in the table below. The procedure names are linked to the manual pages for those utilities, and the input and output names to the actual data exchanged between procedures.

Step Procedure Data Role Item 1 Item 2 Item 3 Item 4
Changed:
<
<
1 align_warp Inputs Anatomy Image 1 Anatomy Header 1 Reference Image Reference Header
>
>
1 align_warp Inputs Anatomy Image 1 Anatomy Header 1 Reference Image Reference Header

Outputs Warp Parameters 1      
Parameters -m 12 -q
Changed:
<
<
2 align_warp Inputs Anatomy Image 2 Anatomy Header 2 Reference Image Reference Header
>
>
2 align_warp Inputs Anatomy Image 2 Anatomy Header 2 Reference Image Reference Header

Outputs Warp Parameters 2      
Parameters -m 12 -q
Changed:
<
<
3 align_warp Inputs Anatomy Image 3 Anatomy Header 3 Reference Image Reference Header
>
>
3 align_warp Inputs Anatomy Image 3 Anatomy Header 3 Reference Image Reference Header

Outputs Warp Parameters 3      
Parameters -m 12 -q
Changed:
<
<
4 align_warp Inputs Anatomy Image 4 Anatomy Header 4 Reference Image Reference Header
>
>
4 align_warp Inputs Anatomy Image 4 Anatomy Header 4 Reference Image Reference Header

Outputs Warp Parameters 4      
Parameters -m 12 -q
Changed:
<
<
5 reslice Inputs Warp Parameters 1      
>
>
5 reslice Inputs Warp Parameters 1      

Outputs Resliced Image 1 Resliced Header 1    
Parameters
Changed:
<
<
6 reslice Inputs Warp Parameters 2      
>
>
6 reslice Inputs Warp Parameters 2      

Outputs Resliced Image 2 Resliced Header 2    
Parameters
Changed:
<
<
7 reslice Inputs Warp Parameters 3      
>
>
7 reslice Inputs Warp Parameters 3      

Outputs Resliced Image 3 Resliced Header 3    
Parameters
Changed:
<
<
8 reslice Inputs Warp Parameters 4      
>
>
8 reslice Inputs Warp Parameters 4      

Outputs Resliced Image 4 Resliced Header 4    
Parameters
Changed:
<
<
9 softmean Inputs Resliced Image 1 Resliced Header 1 Resliced Image 2 Resliced Header 2
>
>
9 softmean Inputs Resliced Image 1 Resliced Header 1 Resliced Image 2 Resliced Header 2

Inputs Resliced Image 3 Resliced Header 3 Resliced Image 4 Resliced Header 4
Outputs Atlas Image Atlas Header    
Parameters y null
 <<O>>  Difference Topic FirstProvenanceChallenge (r1.16 - 21 Aug 2006 - SimonMiles)

META TOPICPARENT WebHome

First Provenance Challenge

Line: 135 to 135

Timetable

  • 2006-June: Challenge finalised, participants start!
Changed:
<
<
  • 2006-September: Deadline for challenge results to be uploaded
  • 2006-September: Face-to-face meeting at which results are discussed
>
>
  • 2006-September-13: Deadline for challenge results to be uploaded
  • 2006-September-13 and 2006-September-14: Face-to-face meeting at which results are discussed

  • 2006-October-15: Comparisons performed, minutes of discussion, proposed next steps uploaded

Changed:
<
<
-- SimonMiles - 19 Jun 2006
>
>
-- SimonMiles - 21 Aug 2006

META FILEATTACHMENT BrainAtlas.png attr="" comment="Brain Atlas workflow (original vdt display)" date="1147791744" moveby="LucMoreau" movedto="Challenge.FirstProvenanceChallenge" movedwhen="1149621057" movefrom="Challenge.WebHome" path="BrainAtlas.png" size="5220" user="SimonMiles" version="1.1"
META FILEATTACHMENT BrainAtlas.pdf attr="" comment="Brain Atlas workflow (hi-res)" date="1149007254" path="BrainAtlas.pdf" size="121603" user="SimonMiles" version="1.2"
 <<O>>  Difference Topic FirstProvenanceChallenge (r1.15 - 18 Jul 2006 - PassProject)

META TOPICPARENT WebHome

First Provenance Challenge

Line: 125 to 125

  • For any data given above, each team should provide a link to an explanation of the representation used so that other participants can interpret it.
Changed:
<
<

Sample Worflow Implementations

>
>

Sample Workflow Implementations


As it may be useful to some, we provide sample implementations of the workflow here. This should not preclude the use of any other technology. The implementations assume that the executables referenced above are all installed; they are provided by the two packages AIR (automated image registration) suite and ImageMagick.

Added:
>
>
Minor caution - this is a DOS text file, and if run on Unix the extra carriage returns at the ends of lines make their way into the filenames and cause everything to break. Strip the CRs with tr before running...
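The note above suggests tr for this (typically `tr -d '\r'` with the script redirected in and out). For anyone who prefers to do the same cleanup from Python, a minimal sketch follows; the function names and any file names passed to them are hypothetical.

```python
def strip_carriage_returns(data: bytes) -> bytes:
    # Remove every carriage return byte, which is what tr -d '\r' does.
    return data.replace(b"\r", b"")


def convert_file(src: str, dst: str) -> None:
    # Read and write in binary mode so Python does not translate newlines itself.
    with open(src, "rb") as infile:
        data = infile.read()
    with open(dst, "wb") as outfile:
        outfile.write(strip_carriage_returns(data))
```

Writing to a separate destination file keeps the original download intact in case the conversion needs to be repeated.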

Timetable

  • 2006-June: Challenge finalised, participants start!
 <<O>>  Difference Topic FirstProvenanceChallenge (r1.14 - 30 Jun 2006 - SimonMiles)

META TOPICPARENT WebHome

First Provenance Challenge

Line: 37 to 37

Changed:
<
<
It is comprised of procedures, shown as orange ovals, and data items flowing between them, shown as rectangles. It can be seen as five stages, where each stage is depicted as a horizontal row of the same procedure in the figure. Note that the term stage is introduced only to help description of the workflow, and we do not dictate how it is apparent in a concrete implementation. The procedures employ the AIR (automated image registration) suite to create an averaged brain from a collection of high resolution anatomical data. In addition to the data items shown in the figure, there are other inputs to procedures (constant string options), defined below.
>
>
It is comprised of procedures, shown as orange ovals, and data items flowing between them, shown as rectangles. It can be seen as five stages, where each stage is depicted as a horizontal row of the same procedure in the figure. Note that the term stage is introduced only to help description of the workflow, and we do not dictate how it is apparent in a concrete implementation. The procedures employ the AIR (automated image registration) suite to create an averaged brain from a collection of high resolution anatomical data, and the FSL suite to create 2D images across each sliced dimension of the brain. In addition to the data items shown in the figure, there are other inputs to procedures (constant string options), defined below.

The inputs to the workflow are a set of new brain images (Anatomy Image 1 to 4) and a single reference brain image (Reference Image). All input images are 3D scans of a brain at varying resolutions, so that different features are evident. For each image, there is the actual image and the metadata for that image (Anatomy Header 1 to 4). The image data was published with the article Frontal-Hippocampal Double Dissociation Between Normal Aging and Alzheimer's Disease by Head, D, Snyder, AZ, Girton, LE, Morris, JC, Buckner, RL in the fMRI Data Center, Accession Number: 2-2004-1168X.

Line: 45 to 45

  1. For each new brain image, align_warp compares it with the reference image to determine how the new image should be warped, i.e. how its position and shape should be adjusted, to match the reference brain. The output of each procedure in the stage is a _warp parameter set_ defining the spatial transformation to be performed (Warp Params 1 to 4).
  2. For each warp parameter set, the actual transformation of the image is done by reslice, which creates a new version of the original new brain image with the configuration defined in the warp parameter set. The output is a resliced image.
  3. All the resliced images are averaged into one single image using softmean.
Changed:
<
<
  1. For each dimension (x, y and z), the averaged image is sliced with slicer to give a 2D atlas along a plane in that dimension, taken through the centre of the 3D image. The output is an atlas data set.
>
>
  1. For each dimension (x, y and z), the averaged image is sliced with slicer to give a 2D atlas along a plane in that dimension, taken through the centre of the 3D image. The output is an atlas data set. slicer can be downloaded as part of the FSL suite, available at http://www.fmrib.ox.ac.uk/fsl/.

  1. Each atlas data set is converted into a graphical atlas image using convert (an ImageMagick utility).

The full steps, procedures, data and parameters are enumerated in the table below. The procedure names are linked to the manual pages for those utilities, and the input and output names to the actual data exchanged between procedures.

Line: 79 to 79

Inputs Resliced Image 3 Resliced Header 3 Resliced Image 4 Resliced Header 4
Outputs Atlas Image Atlas Header    
Parameters y null
Changed:
<
<
10 slicer Inputs Atlas Image Atlas Header    
>
>
10 slicer (download) Inputs Atlas Image Atlas Header    

Outputs Atlas X Slice      
Parameters -x .5
Changed:
<
<
11 slicer Inputs Atlas Image Atlas Header    
>
>
11 slicer (download) Inputs Atlas Image Atlas Header    

Outputs Atlas Y Slice      
Parameters -y .5
Changed:
<
<
12 slicer Inputs Atlas Image Atlas Header    
>
>
12 slicer (download) Inputs Atlas Image Atlas Header    

Outputs Atlas Z Slice      
Parameters -z .5
13 convert Inputs Atlas X Slice      
 <<O>>  Difference Topic FirstProvenanceChallenge (r1.13 - 19 Jun 2006 - SimonMiles)

META TOPICPARENT WebHome

First Provenance Challenge

Line: 108 to 108

  1. Find all invocations of procedure align_warp using a twelfth order nonlinear 1365 parameter model (see model menu describing possible values of parameter "-m 12" of align_warp) that ran on a Monday.
  2. Find all Atlas Graphic images output from workflows where at least one of the input Anatomy Headers had an entry global maximum=4095. The contents of a header file can be extracted as text using the scanheader AIR utility.
  3. Find all output averaged images of softmean (average) procedures, where the warped images taken as input were align_warped using a twelfth order nonlinear 1365 parameter model, i.e. "where softmean was preceded in the workflow, directly or indirectly, by an align_warp procedure with argument -m 12."
Changed:
<
<
  1. A user has run the workflow twice, in the second instance replacing each procedure (convert) in the final stage with two procedures: pgmtoppm, then pnmtojpeg. Find the differences between the two workflow runs.
>
>
  1. A user has run the workflow twice, in the second instance replacing each procedure (convert) in the final stage with two procedures: pgmtoppm, then pnmtojpeg. Find the differences between the two workflow runs. The exact level of detail in the difference that is detected by a system is up to each participant.

  1. A user has annotated some anatomy images with a key-value pair center=UChicago. Find the outputs of align_warp where the inputs are annotated with center=UChicago.
  2. A user has annotated some atlas graphics with a key-value pair whose key is studyModality. Find all the graphical atlas sets that have the metadata annotation studyModality with values speech, visual or audio, and return all other annotations on these files.
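Two of the queries above (align_warp invocations that ran on a Monday, and align_warp outputs whose inputs carry center=UChicago) can be sketched against a simple in-memory record of invocations. The record layout, the timestamps and the annotation store below are all made up for illustration; each participating system will have its own representation.

```python
from datetime import datetime

# Hypothetical invocation records; field names and start times are assumptions.
INVOCATIONS = [
    {"procedure": "align_warp", "parameters": {"-m": "12"},
     "started": datetime(2006, 6, 19, 10, 0),   # a Monday
     "inputs": ["Anatomy Image 1"], "outputs": ["Warp Parameters 1"]},
    {"procedure": "align_warp", "parameters": {"-m": "12"},
     "started": datetime(2006, 6, 20, 10, 0),   # a Tuesday
     "inputs": ["Anatomy Image 2"], "outputs": ["Warp Parameters 2"]},
    {"procedure": "reslice", "parameters": {},
     "started": datetime(2006, 6, 19, 11, 0),
     "inputs": ["Warp Parameters 1"], "outputs": ["Resliced Image 1"]},
]

# Hypothetical user annotations, keyed by data item name.
ANNOTATIONS = {"Anatomy Image 2": {"center": "UChicago"}}


def align_warps_on_monday(invocations):
    """Invocations of align_warp with -m 12 that started on a Monday."""
    return [inv for inv in invocations
            if inv["procedure"] == "align_warp"
            and inv["parameters"].get("-m") == "12"
            and inv["started"].weekday() == 0]   # weekday() is 0 for Monday


def align_warp_outputs_for(invocations, annotations, key, value):
    """Outputs of align_warp where at least one input is annotated key=value."""
    results = []
    for inv in invocations:
        if inv["procedure"] != "align_warp":
            continue
        if any(annotations.get(item, {}).get(key) == value
               for item in inv["inputs"]):
            results.extend(inv["outputs"])
    return results
```

Here the first function would select only the invocation producing Warp Parameters 1, and the second only Warp Parameters 2.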
Line: 132 to 132

Timetable

Changed:
<
<
  • 2006-June-7: First full draft of challenge, asking potential participants to help finalise
    • Fix provenance challenge proposal TWiki page, with aims, objectives, workflow, sample queries, timetable
    • Upload input, output, intermediate data used in workflow
    • Send provenance challenge call to other possible participants, including:
      • Aims, objectives, workflow, sample queries, timetable
      • Request for comments on the above
      • Request to suggest what should be core queries
  • 2006-June-15: Challenge finalised, participants start!
>
>
  • 2006-June: Challenge finalised, participants start!

  • 2006-September: Deadline for challenge results to be uploaded
  • 2006-September: Face-to-face meeting at which results are discussed
  • 2006-October-15: Comparisons performed, minutes of discussion, proposed next steps uploaded

Changed:
<
<
-- SimonMiles - 7 June 2006
>
>
-- SimonMiles - 19 Jun 2006

META FILEATTACHMENT BrainAtlas.png attr="" comment="Brain Atlas workflow (original vdt display)" date="1147791744" moveby="LucMoreau" movedto="Challenge.FirstProvenanceChallenge" movedwhen="1149621057" movefrom="Challenge.WebHome" path="BrainAtlas.png" size="5220" user="SimonMiles" version="1.1"
META FILEATTACHMENT BrainAtlas.pdf attr="" comment="Brain Atlas workflow (hi-res)" date="1149007254" path="BrainAtlas.pdf" size="121603" user="SimonMiles" version="1.2"
 <<O>>  Difference Topic FirstProvenanceChallenge (r1.12 - 07 Jun 2006 - LucMoreau)

META TOPICPARENT WebHome

First Provenance Challenge

Line: 132 to 132

Timetable

Changed:
<
<
  • 2006-May-31: First full draft of challenge, asking potential participants to help finalise
>
>
  • 2006-June-7: First full draft of challenge, asking potential participants to help finalise

    • Fix provenance challenge proposal TWiki page, with aims, objectives, workflow, sample queries, timetable
    • Upload input, output, intermediate data used in workflow
    • Send provenance challenge call to other possible participants, including:
Line: 146 to 146

-- SimonMiles - 7 June 2006

Changed:
<
<
META FILEATTACHMENT BrainAtlas.png attr="" comment="Brain Atlas workflow" date="1147791744" moveby="LucMoreau" movedto="Challenge.FirstProvenanceChallenge" movedwhen="1149621057" movefrom="Challenge.WebHome" path="BrainAtlas.png" size="5220" user="SimonMiles" version="1.1"
>
>
META FILEATTACHMENT BrainAtlas.png attr="" comment="Brain Atlas workflow (original vdt display)" date="1147791744" moveby="LucMoreau" movedto="Challenge.FirstProvenanceChallenge" movedwhen="1149621057" movefrom="Challenge.WebHome" path="BrainAtlas.png" size="5220" user="SimonMiles" version="1.1"

META FILEATTACHMENT BrainAtlas.pdf attr="" comment="Brain Atlas workflow (hi-res)" date="1149007254" path="BrainAtlas.pdf" size="121603" user="SimonMiles" version="1.2"
META FILEATTACHMENT workflow.sh attr="" comment="Shell script version of workflow" date="1149088025" path="workflow.sh" size="798" user="SimonMiles" version="1.2"
META FILEATTACHMENT BrainAtlas.gif attr="" comment="" date="1149615566" path="BrainAtlas.gif" size="41095" user="LucMoreau" version="1.1"
 <<O>>  Difference Topic FirstProvenanceChallenge (r1.11 - 07 Jun 2006 - SimonMiles)

META TOPICPARENT WebHome

First Provenance Challenge

Line: 24 to 24

  • Extensions to the example workflow to best illustrate the unique aspects of their system
  • Any categorisation of queries that the project considers to have practical value
Added:
>
>
Participants should not be too concerned about whether extensions to the workflow are scientifically realistic: they are explicitly contrived to demonstrate aspects of the participant's system.

Example Workflow

We propose an example workflow for creating population-based "brain atlases" from the fMRI Data Center's archive of high resolution anatomical data. The workflow is shown below (click for a pdf version of the image).

 <<O>>  Difference Topic FirstProvenanceChallenge (r1.10 - 07 Jun 2006 - SimonMiles)

META TOPICPARENT WebHome

First Provenance Challenge

Line: 103 to 103

  1. Find the process that led to Atlas X Graphic / everything that caused Atlas X Graphic to be as it is. This should tell us the new brain images from which the averaged atlas was generated, the warping performed etc.
  2. Find the process that led to Atlas X Graphic, excluding everything prior to the averaging of images with softmean.
  3. Find the Stage 3, 4 and 5 details of the process that led to Atlas X Graphic.
Changed:
<
<
  1. Find all invocations of procedure align_warp using a twelfth order nonlinear 1365 parameter model (see model menu describing possible values of parameter "-m 12" of align_warp) that ran in less than 30 minutes on non-ia64 processors.
>
>
  1. Find all invocations of procedure align_warp using a twelfth order nonlinear 1365 parameter model (see model menu describing possible values of parameter "-m 12" of align_warp) that ran on a Monday.
  2. Find all Atlas Graphic images output from workflows where at least one of the input Anatomy Headers had an entry global maximum=4095. The contents of a header file can be extracted as text using the scanheader AIR utility.

  1. Find all output averaged images of softmean (average) procedures, where the warped images taken as input were align_warped using a twelfth order nonlinear 1365 parameter model, i.e. "where softmean was preceded in the workflow, directly or indirectly, by an align_warp procedure with argument -m 12."
  2. A user has run the workflow twice, in the second instance replacing each procedure (convert) in the final stage with two procedures: pgmtoppm, then pnmtojpeg. Find the differences between the two workflow runs.
  3. A user has annotated some anatomy images with a key-value pair center=UChicago. Find the outputs of align_warp where the inputs are annotated with center=UChicago.
Line: 141 to 142

  • 2006-September: Face-to-face meeting at which results are discussed
  • 2006-October-15: Comparisons performed, minutes of discussion, proposed next steps uploaded

Changed:
<
<
-- SimonMiles - 6 June 2006
>
>
-- SimonMiles - 7 June 2006

META FILEATTACHMENT BrainAtlas.png attr="" comment="Brain Atlas workflow" date="1147791744" moveby="LucMoreau" movedto="Challenge.FirstProvenanceChallenge" movedwhen="1149621057" movefrom="Challenge.WebHome" path="BrainAtlas.png" size="5220" user="SimonMiles" version="1.1"
META FILEATTACHMENT BrainAtlas.pdf attr="" comment="Brain Atlas workflow (hi-res)" date="1149007254" path="BrainAtlas.pdf" size="121603" user="SimonMiles" version="1.2"
 <<O>>  Difference Topic FirstProvenanceChallenge (r1.9 - 06 Jun 2006 - LucMoreau)

META TOPICPARENT WebHome
Changed:
<
<

Provenance Challenge

>
>

First Provenance Challenge


Aims

Line: 9 to 9

  • The capabilities of each system in answering provenance-related queries
  • What each system considers to be within scope of the topic of provenance (regardless of whether the system can yet solve all problems in that scope)
Changed:
<
<
To help achieve the aims, we define a single example workflow that forms the basis of the challenge. It is inspired by a real experiment, in the area of Functional Magnetic Resonance Imaging (fMRI). Here, we use the term workflow to denote a series of procedures being performed in a system, each taking some data as input and producing other data as output. We do not assume that these procedures must use some particular form of technology (EXE files, Web Services etc.) or that the workflow is explicitly defined in a workflow technology (BPEL, compiled executable, Scufl, batch file etc.), but individual participants will adopt their technology of choice.
>
>
To help achieve the aims, we define a simple example workflow that forms the basis of the challenge. It is inspired by a real experiment, in the area of Functional Magnetic Resonance Imaging (fMRI). Here, we use the term workflow to denote a series of procedures being performed in a system, each taking some data as input and producing other data as output. We do not assume that these procedures must use some particular form of technology (EXE files, Web Services etc.) or that the workflow is explicitly defined in a workflow technology (BPEL, compiled executable, Scufl, batch file etc.), but individual participants will adopt their technology of choice.

Changed:
<
<
Our focus in this challenge is on provenance and not on running the experiment. Hence, to facilitate take-up, while based on a real experiment, the procedures can be implemented as "dummies", i.e. we provide the input, output and intermediate data and participants can use fake procedures that take the right input and produce the right output. Alternatively, if participants wish to enact the real workflow, this is also possible. In addition to this, we define a set of core queries that all participants should show how they address, so we can compare systems.
>
>
Our focus in this challenge is on provenance and not on running the experiment. Hence, to facilitate take-up, while based on a real experiment, the procedures can be implemented as "dummies", i.e. we provide the input, output and intermediate data and participants can use fake procedures that take the right input and produce the right output. Alternatively, participants can actually execute the real workflow after installing the necessary libraries. In addition to this, we define a set of core queries that all participants should show how they address, so we can compare systems.

Changed:
<
<
Each participant in the challenge will have its own page on this TWiki, following the ChallengeTemplate, where they can inform the other participants of their efforts in meeting the challenge. During the provenance challenge, we expect the participants to upload the following to their page, to then allow comparison.
>
>
Each participant in the challenge will have their own page on this TWiki, following the ChallengeTemplate, where they can inform the other participants of their efforts in meeting the challenge. During the provenance challenge, we expect the participants to upload the following to their page, to then allow comparison.
  • Representations of the workflow in their system

  • Representations of provenance for the example workflow
  • Representations of the result of the core (and other) queries
Changed:
<
<
  • Contributions to a matrix of queries vs systems, indicating for each that: (1) the query can be answered by the system, (2) the system cannot answer the query now but considers it relevant, (3) the query is not relevant to the project.
>
>
  • Contributions to a matrix of queries vs systems, indicating for each that: (1) the query can be answered by the system, (2) the system cannot answer the query now but considers it relevant, (3) the query is not relevant to the project.

Optionally, the participants may like to contribute the following.

  • Additional queries (beyond the core queries) that illustrate the scope of their system
Deleted:
<
<
  • Representations of the workflow in their system

  • Extensions to the example workflow to best illustrate the unique aspects of their system
  • Any categorisation of queries that the project considers to have practical value

Example Workflow

Changed:
<
<
We propose an example workflow for creating population-based "brain atlases" from the fMRI Data Center's archive of high resolution anatomical data. The workflow is shown below (click for larger image).
>
>
We propose an example workflow for creating population-based "brain atlases" from the fMRI Data Center's archive of high resolution anatomical data. The workflow is shown below (click for a pdf version of the image).

BrainAtlas.gif

Line: 35 to 35

Changed:
<
<
It is comprised of procedures, shown as orange ovals, and data items flowing between them, shown as rectangles. It can be seen as five stages, where each stage is depicted as a horizontal row of the same procedure in the figure (the term stage is introduced only to help description of the workflow, and we do not dictate how it is apparent in a concrete implementation). The procedures employ the AIR (automated image registration) suite to create an averaged brain from a collection of high resolution anatomical data. In addition to the data items shown in the figure, there are other inputs to procedures (constant string options), defined below.
>
>
It is comprised of procedures, shown as orange ovals, and data items flowing between them, shown as rectangles. It can be seen as five stages, where each stage is depicted as a horizontal row of the same procedure in the figure. Note that the term stage is introduced only to help description of the workflow, and we do not dictate how it is apparent in a concrete implementation. The procedures employ the AIR (automated image registration) suite to create an averaged brain from a collection of high resolution anatomical data. In addition to the data items shown in the figure, there are other inputs to procedures (constant string options), defined below.

Changed:
<
<
The inputs to the workflow are a set of new brain images (Anatomy Image 1 to 4) and a single reference brain image (Reference Image). All input images are 3D scans of a brain at varying resolutions, so that different features are evident. For each image, there is the actual image and the metadata for that image (Anatomy Header 1 to 4). The image data was published with the article Accession Number: 2-2004-1168X Frontal-Hippocampal Double Dissociation Between Normal Aging and Alzheimer's Disease by Head, D, Snyder, AZ, Girton, LE, Morris, JC, Buckner, RL in the fMRI Data Center.
>
>
The inputs to the workflow are a set of new brain images (Anatomy Image 1 to 4) and a single reference brain image (Reference Image). All input images are 3D scans of a brain at varying resolutions, so that different features are evident. For each image, there is the actual image and the metadata for that image (Anatomy Header 1 to 4). The image data was published with the article Frontal-Hippocampal Double Dissociation Between Normal Aging and Alzheimer's Disease by Head, D, Snyder, AZ, Girton, LE, Morris, JC, Buckner, RL in the fMRI Data Center, Accession Number: 2-2004-1168X.

The stages of the workflow are as follows.

  1. For each new brain image, align_warp compares it with the reference image to determine how the new image should be warped, i.e. how its position and shape should be adjusted, to match the reference brain. The output of each procedure in the stage is a _warp parameter set_ defining the spatial transformation to be performed (Warp Params 1 to 4).
Line: 50 to 50

Step Procedure Data Role Item 1 Item 2 Item 3 Item 4
1 align_warp Inputs Anatomy Image 1 Anatomy Header 1 Reference Image Reference Header
Changed:
<
<
    Outputs Warp Parameters 1      
    Parameters -m 12 -q      
>
>
Outputs Warp Parameters 1      
Parameters -m 12 -q

2 align_warp Inputs Anatomy Image 2 Anatomy Header 2 Reference Image Reference Header
Changed:
<
<
    Outputs Warp Parameters 2      
    Parameters -m 12 -q      
>
>
Outputs Warp Parameters 2      
Parameters -m 12 -q

3 align_warp Inputs Anatomy Image 3 Anatomy Header 3 Reference Image Reference Header
Changed:
<
<
    Outputs Warp Parameters 3      
    Parameters -m 12 -q      
>
>
Outputs Warp Parameters 3      
Parameters -m 12 -q

4 align_warp Inputs Anatomy Image 4 Anatomy Header 4 Reference Image Reference Header
Changed:
<
<
    Outputs Warp Parameters 4      
    Parameters -m 12 -q      
>
>
Outputs Warp Parameters 4      
Parameters -m 12 -q

5 reslice Inputs Warp Parameters 1      
Changed:
<
<
    Outputs Resliced Image 1 Resliced Header 1    
    Parameters        
>
>
Outputs Resliced Image 1 Resliced Header 1    
Parameters

6 reslice Inputs Warp Parameters 2      
Changed:
<
<
    Outputs Resliced Image 2 Resliced Header 2    
    Parameters        
>
>
Outputs Resliced Image 2 Resliced Header 2    
Parameters

7 reslice Inputs Warp Parameters 3      
Changed:
<
<
    Outputs Resliced Image 3 Resliced Header 3    
    Parameters        
>
>
Outputs Resliced Image 3 Resliced Header 3    
Parameters

8 reslice Inputs Warp Parameters 4      
Changed:
<
<
    Outputs Resliced Image 4 Resliced Header 4    
    Parameters        
>
>
Outputs Resliced Image 4 Resliced Header 4    
Parameters

9 softmean Inputs Resliced Image 1 Resliced Header 1 Resliced Image 2 Resliced Header 2
Changed:
<
<
    Inputs Resliced Image 3 Resliced Header 3 Resliced Image 4 Resliced Header 4
    Outputs Atlas Image Atlas Header    
    Parameters y null      
>
>
Inputs Resliced Image 3 Resliced Header 3 Resliced Image 4 Resliced Header 4
Outputs Atlas Image Atlas Header    
Parameters y null

10 slicer Inputs Atlas Image Atlas Header    
Changed:
<
<
    Outputs Atlas X Slice      
    Parameters -x .5      
>
>
Outputs Atlas X Slice      
Parameters -x .5

11 slicer Inputs Atlas Image Atlas Header    
Changed:
<
<
    Outputs Atlas Y Slice      
    Parameters -y .5      
>
>
Outputs Atlas Y Slice      
Parameters -y .5

12 slicer Inputs Atlas Image Atlas Header    
Changed:
<
<
    Outputs Atlas Z Slice      
    Parameters -z .5      
>
>
Outputs Atlas Z Slice      
Parameters -z .5

13 convert Inputs Atlas X Slice      
Changed:
<
<
    Outputs Atlas X Graphic      
    Parameters        
>
>
Outputs Atlas X Graphic      
Parameters

14 convert Inputs Atlas Y Slice      
Changed:
<
<
    Outputs Atlas Y Graphic      
    Parameters        
>
>
Outputs Atlas Y Graphic      
Parameters

15 convert Inputs Atlas Z Slice      
Changed:
<
<
    Outputs Atlas Z Graphic      
    Parameters        
>
>
Outputs Atlas Z Graphic      
Parameters
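
The fifteen steps in the table above can also be written down as data, which may be convenient for teams that simulate the workflow rather than enact it. The following is only a minimal sketch in Python: the `Step` type and the `build_workflow` function are hypothetical names introduced here for illustration, and the item names and parameter strings are copied from the table.

```python
from typing import NamedTuple


class Step(NamedTuple):
    """One row group of the workflow table (hypothetical helper type)."""
    number: int
    procedure: str
    inputs: tuple
    outputs: tuple
    parameters: str


def build_workflow():
    """Return the 15 workflow steps, in table order."""
    steps = []
    # Stage 1: one align_warp per new brain image.
    for i in range(1, 5):
        steps.append(Step(
            i, "align_warp",
            (f"Anatomy Image {i}", f"Anatomy Header {i}",
             "Reference Image", "Reference Header"),
            (f"Warp Parameters {i}",),
            "-m 12 -q"))
    # Stage 2: one reslice per warp parameter set.
    for i in range(1, 5):
        steps.append(Step(
            4 + i, "reslice",
            (f"Warp Parameters {i}",),
            (f"Resliced Image {i}", f"Resliced Header {i}"),
            ""))
    # Stage 3: softmean averages all resliced images.
    resliced = tuple(
        name
        for i in range(1, 5)
        for name in (f"Resliced Image {i}", f"Resliced Header {i}"))
    steps.append(Step(9, "softmean", resliced,
                      ("Atlas Image", "Atlas Header"), "y null"))
    # Stage 4: one slicer per dimension.
    for offset, axis in enumerate("xyz"):
        steps.append(Step(
            10 + offset, "slicer",
            ("Atlas Image", "Atlas Header"),
            (f"Atlas {axis.upper()} Slice",),
            f"-{axis} .5"))
    # Stage 5: one convert per slice.
    for offset, axis in enumerate("XYZ"):
        steps.append(Step(
            13 + offset, "convert",
            (f"Atlas {axis} Slice",),
            (f"Atlas {axis} Graphic",),
            ""))
    return steps


workflow = build_workflow()
```

Keeping the steps in one list makes it straightforward to derive a provenance graph from it, but nothing in the challenge requires this representation.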

Core Provenance Queries

Line: 103 to 103

  1. Find the process that led to Atlas X Graphic / everything that caused Atlas X Graphic to be as it is. This should tell us the new brain images from which the averaged atlas was generated, the warping performed etc.
  2. Find the process that led to Atlas X Graphic, excluding everything prior to the averaging of images with softmean.
  3. Find the Stage 3, 4 and 5 details of the process that led to Atlas X Graphic.
Changed:
<
<
  1. Find all invocations of procedure align_warp using a twelfth order nonlinear 1365 parameter model (this is apparent as a "-m 12" argument to the procedure) that ran in less than 30 minutes on non-ia64 processors.
>
>
  1. Find all invocations of procedure align_warp using a twelfth order nonlinear 1365 parameter model (see model menu describing possible values of parameter "-m 12" of align_warp) that ran in less than 30 minutes on non-ia64 processors.

  1. Find all output averaged images of softmean (average) procedures, where the warped images taken as input were align_warped using a twelfth order nonlinear 1365 parameter model, i.e. "where softmean was preceded in the workflow, directly or indirectly, by an align_warp procedure with argument -m 12."
  2. A user has run the workflow twice, in the second instance replacing each procedure (convert) in the final stage with two procedures: pgmtoppm, then pnmtojpeg. Find the differences between the two workflow runs.
  3. A user has annotated some anatomy images with a key-value pair center=UChicago. Find the outputs of align_warp where the inputs are annotated with center=UChicago.
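
The first two core queries ask for the process that led to Atlas X Graphic, in full and then cut off at softmean. One possible reading, sketched below in Python, is a backwards traversal of the data dependencies. The names `build_edges` and `lineage` and the step labels such as "align_warp 1" are assumptions made for this illustration, and treating "excluding everything prior to softmean" as "do not follow the inputs of softmean" is only one interpretation of the query.

```python
def build_edges():
    """Return {step label: (inputs, outputs)} for the 15 workflow steps."""
    steps = {}
    for i in range(1, 5):
        steps[f"align_warp {i}"] = (
            [f"Anatomy Image {i}", f"Anatomy Header {i}",
             "Reference Image", "Reference Header"],
            [f"Warp Parameters {i}"])
        steps[f"reslice {i}"] = (
            [f"Warp Parameters {i}"],
            [f"Resliced Image {i}", f"Resliced Header {i}"])
    steps["softmean"] = (
        [f"Resliced {kind} {i}"
         for i in range(1, 5) for kind in ("Image", "Header")],
        ["Atlas Image", "Atlas Header"])
    for axis in "XYZ":
        steps[f"slicer {axis}"] = (
            ["Atlas Image", "Atlas Header"], [f"Atlas {axis} Slice"])
        steps[f"convert {axis}"] = (
            [f"Atlas {axis} Slice"], [f"Atlas {axis} Graphic"])
    return steps


def lineage(steps, item, stop_at=()):
    """Return (step labels, data items) that led to `item`.

    The traversal records the inputs of every step it reaches, but it
    does not continue past the inputs of a step listed in `stop_at`.
    """
    producer = {out: name
                for name, (_, outs) in steps.items() for out in outs}
    seen_steps, seen_items = set(), set()
    stack = [item]
    while stack:
        current = stack.pop()
        name = producer.get(current)
        if name is None or name in seen_steps:
            continue  # a workflow input, or a step already visited
        seen_steps.add(name)
        inputs = steps[name][0]
        seen_items.update(inputs)
        if name not in stop_at:
            stack.extend(inputs)
    return seen_steps, seen_items


edges = build_edges()
# Query 1: everything that caused Atlas X Graphic to be as it is.
all_steps, all_items = lineage(edges, "Atlas X Graphic")
# Query 2: the same, excluding everything prior to softmean.
recent_steps, recent_items = lineage(
    edges, "Atlas X Graphic", stop_at={"softmean"})
```

A real provenance system would answer these queries over recorded executions rather than over the static workflow definition used here.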
Line: 124 to 124

Sample Workflow Implementations

Changed:
<
<
As it may be useful to some, we provide sample implementations of the workflow here. This should not preclude the use of any other technology. The implementations assume that the executables referenced above are all installed.
>
>
As it may be useful to some, we provide sample implementations of the workflow here. This should not preclude the use of any other technology. The implementations assume that the executables referenced above are all installed; they are provided by the two packages AIR (automated image registration) suite and ImageMagick.

Timetable

Line: 143 to 143

-- SimonMiles - 6 June 2006

Changed:
<
<
META FILEATTACHMENT BrainAtlas?.png attr="" comment="Brain Atlas workflow" date="1149007829" path="BrainAtlas.png" size="16079" user="SimonMiles" version="1.7"
>
>
META FILEATTACHMENT BrainAtlas?.png attr="" comment="Brain Atlas workflow" date="1147791744" moveby="LucMoreau" movedto="Challenge.FirstProvenanceChallenge" movedwhen="1149621057" movefrom="Challenge.WebHome" path="BrainAtlas.png" size="5220" user="SimonMiles" version="1.1"

META FILEATTACHMENT BrainAtlas?.pdf attr="" comment="Brain Atlas workflow (hi-res)" date="1149007254" path="BrainAtlas.pdf" size="121603" user="SimonMiles" version="1.2"
META FILEATTACHMENT workflow.sh attr="" comment="Shell script version of workflow" date="1149088025" path="workflow.sh" size="798" user="SimonMiles" version="1.2"
META FILEATTACHMENT BrainAtlas?.gif attr="" comment="" date="1149615566" path="BrainAtlas.gif" size="41095" user="LucMoreau" version="1.1"
 <<O>>  Difference Topic FirstProvenanceChallenge (r1.8 - 06 Jun 2006 - LucMoreau)

META TOPICPARENT WebHome

Provenance Challenge

Line: 13 to 13

Our focus in this challenge is on provenance and not on running the experiment. Hence, to facilitate take-up, while based on a real experiment, the procedures can be implemented as "dummies", i.e. we provide the input, output and intermediate data and participants can use fake procedures that take the right input and produce the right output. Alternatively, if participants wish to enact the real workflow, this is also possible. In addition to this, we define a set of core queries that all participants should show how they address, so we can compare systems.

Changed:
<
<
Each participant in the challenge will have its own page on this TWiki, following this template, where they can inform the rest of their efforts in meeting the challenge. During the provenance challenge, we expect the participants to upload the following to their page, to then allow comparison.
>
>
Each participant in the challenge will have its own page on this TWiki, following the ChallengeTemplate, where they can inform the rest of their efforts in meeting the challenge. During the provenance challenge, we expect the participants to upload the following to their page, to then allow comparison.

  • Representations of provenance for the example workflow
  • Representations of the result of the core (and other) queries
  • Contributions to a matrix of queries vs systems, indicating for each that: (1) the query can be answered by the system, (2) the system cannot answer the query now but considers it relevant, (3) the query is not relevant to the project.
Line: 29 to 29

We propose an example workflow for creating population-based "brain atlases" from the fMRI Data Center's archive of high resolution anatomical data. The workflow is shown below (click for larger image).

Changed:
<
<
BrainAtlas.png
>
>
BrainAtlas.gif

Added:
>
>


It is comprised of procedures, shown as orange ovals, and data items flowing between them, shown as rectangles. It can be seen as five stages, where each stage is depicted as a horizontal row of the same procedure in the figure (the term stage is introduced only to help description of the workflow, and we do not dictate how it is apparent in a concrete implementation). The procedures employ the AIR (automated image registration) suite to create an averaged brain from a collection of high resolution anatomical data. In addition to the data items shown in the figure, there are other inputs to procedures (constant string options), defined below.

The inputs to a workflow are a set of new brain images (Anatomy Image 1 to 4) and a single reference brain image (Reference Image). All input images are 3D scans of a brain of varying resolutions, so that different features are evident. For each image, there is the actual image and the metadata information for that image (Anatomy Header 1 to 4). The image data was published with article Accession Number: 2-2004-1168X Frontal-Hippocampal Double Dissociation Between Normal Aging and Alzheimer's Disease by Head, D, Snyder, AZ, Girton, LE, Morris, JC, Buckner, RL in the fMRI Data Center.

Line: 143 to 146

META FILEATTACHMENT BrainAtlas?.png attr="" comment="Brain Atlas workflow" date="1149007829" path="BrainAtlas.png" size="16079" user="SimonMiles" version="1.7"
META FILEATTACHMENT BrainAtlas?.pdf attr="" comment="Brain Atlas workflow (hi-res)" date="1149007254" path="BrainAtlas.pdf" size="121603" user="SimonMiles" version="1.2"
META FILEATTACHMENT workflow.sh attr="" comment="Shell script version of workflow" date="1149088025" path="workflow.sh" size="798" user="SimonMiles" version="1.2"
Added:
>
>
META FILEATTACHMENT BrainAtlas?.gif attr="" comment="" date="1149615566" path="BrainAtlas.gif" size="41095" user="LucMoreau" version="1.1"
 <<O>>  Difference Topic FirstProvenanceChallenge (r1.7 - 06 Jun 2006 - SimonMiles)

META TOPICPARENT WebHome

Provenance Challenge

Line: 32 to 32

BrainAtlas.png

Changed:
<
<
It is comprised of procedures, shown as orange ovals, and data items flowing between them, shown as rectangles. It can be seen as five stages, where each stage is depicted as a horizontal row of the same procedure in the figure. The procedures employ the AIR (automated image registration) suite to create an averaged brain from a collection of high resolution anatomical data. In addition to the data items shown in the figure, there are other inputs to procedures (constant string options), defined below.
>
>
It is comprised of procedures, shown as orange ovals, and data items flowing between them, shown as rectangles. It can be seen as five stages, where each stage is depicted as a horizontal row of the same procedure in the figure (the term stage is introduced only to help description of the workflow, and we do not dictate how it is apparent in a concrete implementation). The procedures employ the AIR (automated image registration) suite to create an averaged brain from a collection of high resolution anatomical data. In addition to the data items shown in the figure, there are other inputs to procedures (constant string options), defined below.

The inputs to a workflow are a set of new brain images (Anatomy Image 1 to 4) and a single reference brain image (Reference Image). All input images are 3D scans of a brain of varying resolutions, so that different features are evident. For each image, there is the actual image and the metadata information for that image (Anatomy Header 1 to 4). The image data was published with article Accession Number: 2-2004-1168X Frontal-Hippocampal Double Dissociation Between Normal Aging and Alzheimer's Disease by Head, D, Snyder, AZ, Girton, LE, Morris, JC, Buckner, RL in the fMRI Data Center.

The stages of the workflow are as follows.

Changed:
<
<
  1. For each new brain image, align_warp compares it with the reference image to determine how the new image should be warped, i.e. the position and shape of the image adjusted, to match the reference brain. The output of each procedure in the stage is a [[http://bishopw.loni.ucla.edu/AIR5/file_types.html#warp][_warp parameter set_]] defining the spatial transformation to be performed (Warp Params 1 to 4).
>
>
  1. For each new brain image, align_warp compares it with the reference image to determine how the new image should be warped, i.e. the position and shape of the image adjusted, to match the reference brain. The output of each procedure in the stage is a _warp parameter set_ defining the spatial transformation to be performed (Warp Params 1 to 4).

  1. For each warp parameter set, the actual transformation of the image is done by reslice, which creates a new version of the original new brain image with the configuration defined in the warp parameter set. The output is a resliced image.
  2. All the resliced images are averaged into one single image using softmean.
  3. For each dimension (x, y and z), the averaged image is sliced to give a 2D atlas along a plane in that dimension, taken through the centre of the 3D image. The output is an atlas data set, using slicer.
Line: 100 to 100

  1. Find the process that led to Atlas X Graphic / everything that caused Atlas X Graphic to be as it is. This should tell us the new brain images from which the averaged atlas was generated, the warping performed etc.
  2. Find the process that led to Atlas X Graphic, excluding everything prior to the averaging of images with softmean.
  3. Find the Stage 3, 4 and 5 details of the process that led to Atlas X Graphic.
Changed:
<
<
  1. Find all calls to procedure alignwarp, and their runtimes, with argument model=rigid that ran in less than 30 minutes on non-ia64 processors. The argument is given as a string option to the alignwarp procedure.
  2. Find the average runtime of all alignwarp calls with argument model=rigid that ran in less than 30 minutes.
  3. Find all procedure calls within runs of the workflow named ‘/prod/2005/0305/prep’, a version of the brain atlas workflow given above, whose inputs were linearly aligned with model=affine
  4. Find all output averaged images of softmean (average) procedures, where the warped images taken as input were align-warped with model=affine (i.e., “where softmean was preceded in the workflow, directly or indirectly, by an alignwarp procedure with argument model=affine”)
  5. Find all outputs of softmean that were resliced with intensify=3
  6. Find the outputs that were linearly aligned with model=affine and with input LFN metadata center=UChicago
  7. Find all the graphical atlas sets that have metadata annotation studyModality with values speech, visual or audio, and return all the annotation tags of this set of data
  8. Find all the metadata tags study on actual results of softmean that were linearly aligned with model= affine, and whose outputs have annotation state = IL
>
>
  1. Find all invocations of procedure align_warp using a twelfth order nonlinear 1365 parameter model (this is apparent as a "-m 12" argument to the procedure) that ran in less than 30 minutes on non-ia64 processors.
  2. Find all output averaged images of softmean (average) procedures, where the warped images taken as input were align_warped using a twelfth order nonlinear 1365 parameter model, i.e. "where softmean was preceded in the workflow, directly or indirectly, by an align_warp procedure with argument -m 12."
  3. A user has run the workflow twice, in the second instance replacing each procedure (convert) in the final stage with two procedures: pgmtoppm, then pnmtojpeg. Find the differences between the two workflow runs.
  4. A user has annotated some anatomy images with a key-value pair center=UChicago. Find the outputs of align_warp where the inputs are annotated with center=UChicago.
  5. A user has annotated some atlas graphics with key-value pair where the key is studyModality. Find all the graphical atlas sets that have metadata annotation studyModality with values speech, visual or audio, and return all other annotations to these files.
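
Several of the queries above filter procedure invocations by their arguments, runtime, processor type or annotations. The sketch below shows one way such filters might look in Python over a hypothetical record format. The `Invocation` class, its field names, the `annotations` dictionary and the sample runtimes and architectures are all invented for illustration and are not part of the challenge data. The annotation query is read here as "at least one input carries the annotation", which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Invocation:
    """Hypothetical record of one procedure execution."""
    procedure: str
    parameters: list
    inputs: list
    outputs: list
    runtime_minutes: float
    architecture: str


def has_model_12(inv):
    """True if the parameters contain '-m' immediately followed by '12'."""
    params = inv.parameters
    return any(a == "-m" and b == "12" for a, b in zip(params, params[1:]))


def fast_align_warps(invocations):
    """align_warp runs with -m 12, under 30 minutes, not on ia64."""
    return [inv for inv in invocations
            if inv.procedure == "align_warp"
            and has_model_12(inv)
            and inv.runtime_minutes < 30
            and inv.architecture != "ia64"]


def outputs_for_annotated_inputs(invocations, annotations, key, value):
    """Outputs of align_warp where some input is annotated key=value."""
    result = []
    for inv in invocations:
        if inv.procedure != "align_warp":
            continue
        if any(annotations.get(item, {}).get(key) == value
               for item in inv.inputs):
            result.extend(inv.outputs)
    return result


def align_warp(i, minutes, arch):
    """Build a sample align_warp record for anatomy image i."""
    return Invocation(
        "align_warp", ["-m", "12", "-q"],
        [f"Anatomy Image {i}", f"Anatomy Header {i}",
         "Reference Image", "Reference Header"],
        [f"Warp Parameters {i}"], minutes, arch)


# Made-up sample data: runtimes and architectures are illustrative only.
invocations = [align_warp(1, 12.0, "x86"),
               align_warp(2, 45.0, "x86"),
               align_warp(3, 10.0, "ia64")]
annotations = {"Anatomy Image 2": {"center": "UChicago"}}

fast = fast_align_warps(invocations)
chicago_outputs = outputs_for_annotated_inputs(
    invocations, annotations, "center", "UChicago")
```

Matching the "-m 12" argument on tokenised parameters rather than on the raw string avoids false matches such as "-m 120".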

Participant Instructions

Line: 141 to 138

  • 2006-September: Face-to-face meeting at which results are discussed
  • 2006-October-15: Comparisons performed, minutes of discussion, proposed next steps uploaded

Changed:
<
<
-- SimonMiles - 31 May 2006
>
>
-- SimonMiles - 6 June 2006

META FILEATTACHMENT BrainAtlas?.png attr="" comment="Brain Atlas workflow" date="1149007829" path="BrainAtlas.png" size="16079" user="SimonMiles" version="1.7"
META FILEATTACHMENT BrainAtlas?.pdf attr="" comment="Brain Atlas workflow (hi-res)" date="1149007254" path="BrainAtlas.pdf" size="121603" user="SimonMiles" version="1.2"
 <<O>>  Difference Topic FirstProvenanceChallenge (r1.6 - 05 Jun 2006 - LucMoreau)

META TOPICPARENT WebHome

Provenance Challenge

Line: 34 to 34

It is comprised of procedures, shown as orange ovals, and data items flowing between them, shown as rectangles. It can be seen as five stages, where each stage is depicted as a horizontal row of the same procedure in the figure. The procedures employ the AIR (automated image registration) suite to create an averaged brain from a collection of high resolution anatomical data. In addition to the data items shown in the figure, there are other inputs to procedures (constant string options), defined below.

Changed:
<
<
The inputs to a workflow are a set of new brain images (Anatomy Image 1 to 4) and a single reference brain image (Reference Image). All input images are 3D scans of a brain of varying resolutions, so that different features are evident. For each image, there is the actual image and the metadata information for that image (Anatomy Header 1 to 4). The image data was published with article Frontal-Hippocampal Double Dissociation Between Normal Aging and Alzheimer's Disease by Head, D, Snyder, AZ, Girton, LE, Morris, JC, Buckner, RL in the fMRI Data Center, Accession Number: 2-2004-1168X.
>
>
The inputs to a workflow are a set of new brain images (Anatomy Image 1 to 4) and a single reference brain image (Reference Image). All input images are 3D scans of a brain of varying resolutions, so that different features are evident. For each image, there is the actual image and the metadata information for that image (Anatomy Header 1 to 4). The image data was published with article Accession Number: 2-2004-1168X Frontal-Hippocampal Double Dissociation Between Normal Aging and Alzheimer's Disease by Head, D, Snyder, AZ, Girton, LE, Morris, JC, Buckner, RL in the fMRI Data Center.

The stages of the workflow are as follows.

  1. For each new brain image, align_warp compares it with the reference image to determine how the new image should be warped, i.e. the position and shape of the image adjusted, to match the reference brain. The output of each procedure in the stage is a [[http://bishopw.loni.ucla.edu/AIR5/file_types.html#warp][_warp parameter set_]] defining the spatial transformation to be performed (Warp Params 1 to 4).
 <<O>>  Difference Topic FirstProvenanceChallenge (r1.5 - 31 May 2006 - SimonMiles)

META TOPICPARENT WebHome

Provenance Challenge

Line: 13 to 13

Our focus in this challenge is on provenance and not on running the experiment. Hence, to facilitate take-up, while based on a real experiment, the procedures can be implemented as "dummies", i.e. we provide the input, output and intermediate data and participants can use fake procedures that take the right input and produce the right output. Alternatively, if participants wish to enact the real workflow, this is also possible. In addition to this, we define a set of core queries that all participants should show how they address, so we can compare systems.

Changed:
<
<
Each project participating in the challenge will have its own page on this TWiki where they can inform the rest of their efforts in meeting the challenge. During the provenance challenge, we expect the participants to upload the following to their page, to then allow comparison.
>
>
Each participant in the challenge will have its own page on this TWiki, following this template, where they can inform the rest of their efforts in meeting the challenge. During the provenance challenge, we expect the participants to upload the following to their page, to then allow comparison.

  • Representations of provenance for the example workflow
  • Representations of the result of the core (and other) queries
  • Contributions to a matrix of queries vs systems, indicating for each that: (1) the query can be answered by the system, (2) the system cannot answer the query now but considers it relevant, (3) the query is not relevant to the project.
Line: 46 to 46

The full steps, procedures, data and parameters are enumerated in the table below. The procedure names are linked to the manual pages for those utilities, and the input and output names to the actual data exchanged between procedures.

Step Procedure Data Role Item 1 Item 2 Item 3 Item 4
Changed:
<
<
1 align_warp Inputs Anatomy Image 1 Anatomy Header 1 Reference Image Reference Header
    Outputs Warp Parameters 1      
    Parameters -m 6 -t1 80 -t2 80 -s 81 27 3 -q      
2 align_warp Inputs Anatomy Image 2 Anatomy Header 2 Reference Image Reference Header
    Outputs Warp Parameters 2      
    Parameters -m 6 -t1 80 -t2 80 -s 81 27 3 -q      
3 align_warp Inputs Anatomy Image 3 Anatomy Header 3 Reference Image Reference Header
    Outputs Warp Parameters 3      
    Parameters -m 6 -t1 80 -t2 80 -s 81 27 3 -q      
4 align_warp Inputs Anatomy Image 4 Anatomy Header 4 Reference Image Reference Header
    Outputs Warp Parameters 4      
    Parameters -m 6 -t1 80 -t2 80 -s 81 27 3 -q      
5 reslice Inputs Warp Parameters 1      
    Outputs Resliced Image 1 Resliced Header 1    
    Parameters        
6 reslice Inputs Warp Parameters 2      
    Outputs Resliced Image 2 Resliced Header 2    
    Parameters        
7 reslice Inputs Warp Parameters 3      
    Outputs Resliced Image 3 Resliced Header 3    
    Parameters        
8 reslice Inputs Warp Parameters 4      
    Outputs Resliced Image 4 Resliced Header 4    
    Parameters        
9 softmean Inputs Resliced Image 1 Resliced Header 1 Resliced Image 2 Resliced Header 2
    Inputs Resliced Image 3 Resliced Header 3 Resliced Image 4 Resliced Header 4
    Outputs Atlas Image Atlas Header    
>
>
1 align_warp Inputs Anatomy Image 1 Anatomy Header 1 Reference Image Reference Header
    Outputs Warp Parameters 1      
    Parameters -m 12 -q      
2 align_warp Inputs Anatomy Image 2 Anatomy Header 2 Reference Image Reference Header
    Outputs Warp Parameters 2      
    Parameters -m 12 -q      
3 align_warp Inputs Anatomy Image 3 Anatomy Header 3 Reference Image Reference Header
    Outputs Warp Parameters 3      
    Parameters -m 12 -q      
4 align_warp Inputs Anatomy Image 4 Anatomy Header 4 Reference Image Reference Header
    Outputs Warp Parameters 4      
    Parameters -m 12 -q      
5 reslice Inputs Warp Parameters 1      
    Outputs Resliced Image 1 Resliced Header 1    
    Parameters        
6 reslice Inputs Warp Parameters 2      
    Outputs Resliced Image 2 Resliced Header 2    
    Parameters        
7 reslice Inputs Warp Parameters 3      
    Outputs Resliced Image 3 Resliced Header 3    
    Parameters        
8 reslice Inputs Warp Parameters 4      
    Outputs Resliced Image 4 Resliced Header 4    
    Parameters        
9 softmean Inputs Resliced Image 1 Resliced Header 1 Resliced Image 2 Resliced Header 2
    Inputs Resliced Image 3 Resliced Header 3 Resliced Image 4 Resliced Header 4
    Outputs Atlas Image Atlas Header    

    Parameters y null      
Changed:
<
<
10 slicer Inputs Atlas Image Atlas Header    
    Outputs Atlas X Slice      
>
>
10 slicer Inputs Atlas Image Atlas Header    
    Outputs Atlas X Slice      

    Parameters -x .5      
Changed:
<
<
11 slicer Inputs Atlas Image Atlas Header    
    Outputs Atlas Y Slice      
>
>
11 slicer Inputs Atlas Image Atlas Header    
    Outputs Atlas Y Slice      

    Parameters -y .5      
Changed:
<
<
12 slicer Inputs Atlas Image Atlas Header    
    Outputs Atlas Z Slice      
>
>
12 slicer Inputs Atlas Image Atlas Header    
    Outputs Atlas Z Slice      

    Parameters -z .5      
Changed:
<
<
13 convert Inputs Atlas X Slice      
    Outputs Atlas X Graphic      
>
>
13 convert Inputs Atlas X Slice      
    Outputs Atlas X Graphic      

    Parameters        
Changed:
<
<
14 convert Inputs Atlas Y Slice      
    Outputs Atlas Y Graphic      
>
>
14 convert Inputs Atlas Y Slice      
    Outputs Atlas Y Graphic      

    Parameters        
Changed:
<
<
15 convert Inputs Atlas Z Slice      
    Outputs Atlas Z Graphic      
>
>
15 convert Inputs Atlas Z Slice      
    Outputs Atlas Z Graphic      

    Parameters        

Core Provenance Queries

Line: 109 to 109

  1. Find all the graphical atlas sets that have metadata annotation studyModality with values speech, visual or audio, and return all the annotation tags of this set of data
  2. Find all the metadata tags study on actual results of softmean that were linearly aligned with model= affine, and whose outputs have annotation state = IL
Added:
>
>

Participant Instructions

We here give the specific steps that we expect each participating team to perform in completing the challenge.

  • The participant should determine how they are going to execute the workflow (or a simulation of it) and how it will record data (provenance) about the execution.
  • The team should add the provenance to their TWiki page, and declare the way in which they executed the workflow, e.g. upload a workflow script.
  • If the participant has varied the workflow to make it more suitable for their system or to demonstrate an aspect important to their approach, then they should declare what this variation is.
  • The team should then use their systems to answer the core provenance queries, and any others that they wish to perform to demonstrate key aspects of their system.
  • The participant then uploads to the TWiki the queries performed, the way in which the queries were expressed/realised, and the answers they got.
  • For core queries that were not performed, the participant should say why they were not performed, i.e. whether the query is considered out of scope for the system or in scope but not currently possible to answer.
  • For any data given above, each team should provide a link to an explanation of the representation used so that other participants can interpret it.
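
For teams that simulate the workflow, the first step above (execute dummy procedures and record provenance about each execution) might be sketched as follows in Python. The `ProvenanceRecorder` class, its method names and the record fields are hypothetical choices made for this illustration; each participating system will have its own representation.

```python
import json
import time


class ProvenanceRecorder:
    """Collects one record per (dummy) procedure execution."""

    def __init__(self):
        self.records = []

    def run_dummy(self, procedure, inputs, outputs, parameters=""):
        """Pretend to run `procedure` and record what it consumed and produced."""
        started = time.time()
        # A real enactment would invoke the executable here. The dummy
        # computes nothing and relies on the provided intermediate data.
        finished = time.time()
        record = {
            "procedure": procedure,
            "parameters": parameters,
            "inputs": list(inputs),
            "outputs": list(outputs),
            "started": started,
            "finished": finished,
        }
        self.records.append(record)
        return record

    def to_json(self):
        """Serialise all records, e.g. for upload to a TWiki page."""
        return json.dumps(self.records, indent=2)


recorder = ProvenanceRecorder()
recorder.run_dummy(
    "align_warp",
    ["Anatomy Image 1", "Anatomy Header 1",
     "Reference Image", "Reference Header"],
    ["Warp Parameters 1"],
    "-m 12 -q")
recorder.run_dummy(
    "reslice",
    ["Warp Parameters 1"],
    ["Resliced Image 1", "Resliced Header 1"])
```

Calling `recorder.to_json()` then yields a plain-text representation of the recorded executions that other participants could inspect, provided the fields are explained alongside it.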

Sample Workflow Implementations

As it may be useful to some, we provide sample implementations of the workflow here. This should not preclude the use of any other technology. The implementations assume that the executables referenced above are all installed.

Line: 116 to 129

Timetable

Changed:
<
<
  • 2006-May-30: First full draft of challenge, asking potential participants to help finalise
>
>
  • 2006-May-31: First full draft of challenge, asking potential participants to help finalise

    • Fix provenance challenge proposal TWiki page, with aims, objectives, workflow, sample queries, timetable
    • Upload input, output, intermediate data used in workflow
    • Send provenance challenge call to other possible participants, including:
Line: 128 to 141

  • 2006-September: Face-to-face meeting at which results are discussed
  • 2006-October-15: Comparisons performed, minutes of discussion, proposed next steps uploaded

Changed:
<
<
-- SimonMiles - 30 May 2006
>
>
-- SimonMiles - 31 May 2006

META FILEATTACHMENT BrainAtlas?.png attr="" comment="Brain Atlas workflow" date="1149007829" path="BrainAtlas.png" size="16079" user="SimonMiles" version="1.7"
META FILEATTACHMENT BrainAtlas?.pdf attr="" comment="Brain Atlas workflow (hi-res)" date="1149007254" path="BrainAtlas.pdf" size="121603" user="SimonMiles" version="1.2"
Changed:
<
<
META FILEATTACHMENT workflow.sh attr="" comment="Shell script version of workflow" date="1149006866" path="workflow.sh" size="930" user="SimonMiles" version="1.1"
>
>
META FILEATTACHMENT workflow.sh attr="" comment="Shell script version of workflow" date="1149088025" path="workflow.sh" size="798" user="SimonMiles" version="1.2"
 <<O>>  Difference Topic FirstProvenanceChallenge (r1.4 - 30 May 2006 - SimonMiles)

META TOPICPARENT WebHome

Provenance Challenge

Line: 34 to 34

It is comprised of procedures, shown as orange ovals, and data items flowing between them, shown as rectangles. It can be seen as five stages, where each stage is depicted as a horizontal row of the same procedure in the figure. The procedures employ the AIR (automated image registration) suite to create an averaged brain from a collection of high resolution anatomical data. In addition to the data items shown in the figure, there are other inputs to procedures (constant string options), defined below.

Changed:
<
<
The inputs to a workflow are a set of new brain images (Anatomy Image 1 to 4) and a single reference brain image (Reference Image). All input images are 3D scans of a brain of varying resolutions, so that different features are evident. For each image, there is the actual image and the metadata information for that image (Anatomy Header 1 to 4). The stages of the workflow are as follows.
>
>
The inputs to a workflow are a set of new brain images (Anatomy Image 1 to 4) and a single reference brain image (Reference Image). All input images are 3D scans of a brain of varying resolutions, so that different features are evident. For each image, there is the actual image and the metadata information for that image (Anatomy Header 1 to 4). The image data was published with article Frontal-Hippocampal Double Dissociation Between Normal Aging and Alzheimer's Disease by Head, D, Snyder, AZ, Girton, LE, Morris, JC, Buckner, RL in the fMRI Data Center, Accession Number: 2-2004-1168X.

The stages of the workflow are as follows.


  1. For each new brain image, align_warp compares it with the reference image to determine how the new image should be warped, i.e. the position and shape of the image adjusted, to match the reference brain. The output of each procedure in the stage is a [[http://bishopw.loni.ucla.edu/AIR5/file_types.html#warp][_warp parameter set_]] defining the spatial transformation to be performed (Warp Params 1 to 4).
  2. For each warp parameter set, the actual transformation of the image is done by reslice, which creates a new version of the original new brain image with the configuration defined in the warp parameter set. The output is a resliced image.
  3. All the resliced images are averaged into one single image using softmean.
Line: 44 to 46

The full steps, procedures, data and parameters are enumerated in the table below. The procedure names are linked to the manual pages for those utilities, and the input and output names to the actual data exchanged between procedures.

Step Procedure Data Role Item 1 Item 2 Item 3 Item 4
Changed:
<
<
1 align_warp Inputs Anatomy Image 1 Anatomy Header 1 Reference Image Reference Header
    Outputs Warp Parameters 1      
    Parameters        
2 align_warp Inputs Anatomy Image 2 Anatomy Header 2 Reference Image Reference Header
    Outputs Warp Parameters 2      
    Parameters        
3 align_warp Inputs Anatomy Image 3 Anatomy Header 3 Reference Image Reference Header
    Outputs Warp Parameters 3      
    Parameters        
4 align_warp Inputs Anatomy Image 4 Anatomy Header 4 Reference Image Reference Header
    Outputs Warp Parameters 4      
    Parameters        
5 reslice Inputs Warp Parameters 1      
    Outputs Resliced Image 1 Resliced Header 1    
    Parameters        
6 reslice Inputs Warp Parameters 2      
    Outputs Resliced Image 2 Resliced Header 2    
    Parameters        
7 reslice Inputs Warp Parameters 3      
    Outputs Resliced Image 3 Resliced Header 3    
    Parameters        
8 reslice Inputs Warp Parameters 4      
    Outputs Resliced Image 4 Resliced Header 4    
>
>
1 align_warp Inputs Anatomy Image 1 Anatomy Header 1 Reference Image Reference Header
    Outputs Warp Parameters 1      
    Parameters -m 6 -t1 80 -t2 80 -s 81 27 3 -q      
2 align_warp Inputs Anatomy Image 2 Anatomy Header 2 Reference Image Reference Header
    Outputs Warp Parameters 2      
    Parameters -m 6 -t1 80 -t2 80 -s 81 27 3 -q      
3 align_warp Inputs Anatomy Image 3 Anatomy Header 3 Reference Image Reference Header
    Outputs Warp Parameters 3      
    Parameters -m 6 -t1 80 -t2 80 -s 81 27 3 -q      
4 align_warp Inputs Anatomy Image 4 Anatomy Header 4 Reference Image Reference Header
    Outputs Warp Parameters 4      
    Parameters -m 6 -t1 80 -t2 80 -s 81 27 3 -q      
5 reslice Inputs Warp Parameters 1      
    Outputs Resliced Image 1 Resliced Header 1    
    Parameters        
6 reslice Inputs Warp Parameters 2      
    Outputs Resliced Image 2 Resliced Header 2    
    Parameters        
7 reslice Inputs Warp Parameters 3      
    Outputs Resliced Image 3 Resliced Header 3    
    Parameters        
8 reslice Inputs Warp Parameters 4      
    Outputs Resliced Image 4 Resliced Header 4    
    Parameters        
9 softmean Inputs Resliced Image 1 Resliced Header 1 Resliced Image 2 Resliced Header 2
    Inputs Resliced Image 3 Resliced Header 3 Resliced Image 4 Resliced Header 4
    Outputs Atlas Image Atlas Header    
    Parameters y null      
10 slicer Inputs Atlas Image Atlas Header    
    Outputs Atlas X Slice      
    Parameters -x .5      
11 slicer Inputs Atlas Image Atlas Header    
    Outputs Atlas Y Slice      
    Parameters -y .5      
12 slicer Inputs Atlas Image Atlas Header    
    Outputs Atlas Z Slice      
    Parameters -z .5      
13 convert Inputs Atlas X Slice      
    Outputs Atlas X Graphic      

    Parameters        
Changed:
<
<
9 softmean Inputs Resliced Image 1 Resliced Header 1 Resliced Image 2 Resliced Header 2
    Inputs Resliced Image 3 Resliced Header 3 Resliced Image 4 Resliced Header 4
    Outputs Atlas Image Atlas Header    
>
>
14 convert Inputs Atlas Y Slice      
    Outputs Atlas Y Graphic      

    Parameters        
Changed:
<
<
10 slicer Inputs Atlas Image      
    Outputs Atlas X Slice      
    Parameters        
11 slicer Inputs Atlas Image      
    Outputs Atlas Y Slice      
    Parameters        
12 slicer Inputs Atlas Image      
    Outputs Atlas Z Slice      
    Parameters        
13 convert Inputs Atlas X Slice      
    Outputs Atlas X Graphic      
    Parameters        
14 convert Inputs Atlas Y Slice      
    Outputs Atlas Y Graphic      
    Parameters        
15 convert Inputs Atlas Z Slice      
    Outputs Atlas Z Graphic      
>
>
15 convert Inputs Atlas Z Slice      
    Outputs Atlas Z Graphic      

    Parameters        

Core Provenance Queries

Line: 107 to 109

  1. Find all the graphical atlas sets that have metadata annotation studyModality with values speech, visual or audio, and return all the annotation tags of this set of data
  2. Find all the metadata tags study on actual results of softmean that were linearly aligned with model=affine, and whose outputs have annotation state = IL
Added:
>
>

Sample Workflow Implementations

As it may be useful to some, we provide sample implementations of the workflow here. This should not preclude the use of any other technology. The implementations assume that the executables referenced above are all installed.


Timetable

  • 2006-May-30: First full draft of challenge, asking potential participants to help finalise
Line: 121 to 128

  • 2006-September: Face-to-face meeting at which results are discussed
  • 2006-October-15: Comparisons performed, minutes of discussion, proposed next steps uploaded

Changed:
<
<
-- SimonMiles - 24 May 2006
>
>
-- SimonMiles - 30 May 2006

Changed:
<
<
META FILEATTACHMENT BrainAtlas.png attr="" comment="Brain Atlas workflow" date="1148311076" path="BrainAtlas.png" size="14326" user="SimonMiles" version="1.3"
META FILEATTACHMENT BrainAtlas.pdf attr="" comment="Brain Atlas workflow (hi-res)" date="1148311137" path="BrainAtlas.pdf" size="143791" user="SimonMiles" version="1.1"
>
>
META FILEATTACHMENT BrainAtlas.png attr="" comment="Brain Atlas workflow" date="1149007829" path="BrainAtlas.png" size="16079" user="SimonMiles" version="1.7"
META FILEATTACHMENT BrainAtlas.pdf attr="" comment="Brain Atlas workflow (hi-res)" date="1149007254" path="BrainAtlas.pdf" size="121603" user="SimonMiles" version="1.2"
META FILEATTACHMENT workflow.sh attr="" comment="Shell script version of workflow" date="1149006866" path="workflow.sh" size="930" user="SimonMiles" version="1.1"
 <<O>>  Difference Topic FirstProvenanceChallenge (r1.3 - 24 May 2006 - SimonMiles)

META TOPICPARENT WebHome

Provenance Challenge

Aims

The provenance challenge aims to establish an understanding of the capabilities of available provenance-related systems and, in particular, the following details.

Changed:
<
<
  • The representations that systems use to document details of processes that have occurred (e.g. workflows run)
>
>
  • The representations that systems use to document details of processes that have occurred

  • The capabilities of each system in answering provenance-related queries
  • What each system considers to be within scope of the topic of provenance (regardless of whether the system can yet achieve all problems in that scope)
Changed:
<
<
To help achieve the aims, we define a single example process that forms the basis of the challenge. It is inspired by a real experiment, in the area of fMRI (Functional Magnetic Resonance Imaging). Here, we use the term process to denote a series of procedures being performed in a system, each taking some data as input and producing other data as output. We do not assume that these procedures must use some particular form of technology (EXE files, Web Services etc.) or that the process is explicitly defined in a workflow script, but individual participants will adopt their technology of choice to execute the process. Our focus in this challenge is on provenance and not on running the experiment. Hence, to facilitate take-up, while based on a real experiment, the procedures can be implemented as "dummies", i.e. we provide the input, output and intermediate data and the projects use fake procedures that take the right input and produce the right output. If a project wishes to enact the real process, this is also possible.
>
>
To help achieve the aims, we define a single example workflow that forms the basis of the challenge. It is inspired by a real experiment, in the area of Functional Magnetic Resonance Imaging (fMRI). Here, we use the term workflow to denote a series of procedures being performed in a system, each taking some data as input and producing other data as output. We do not assume that these procedures must use some particular form of technology (EXE files, Web Services etc.) or that the workflow is explicitly defined in a workflow technology (BPEL, compiled executable, Scufl, batch file etc.), but individual participants will adopt their technology of choice.

Changed:
<
<
In addition to this, we will define a set of core queries that all participants should show how they address, so we can compare systems. The process and core queries will be agreed upon collectively by the challenge participants.
>
>
Our focus in this challenge is on provenance and not on running the experiment. Hence, to facilitate take-up, although the workflow is based on a real experiment, the procedures can be implemented as "dummies", i.e. we provide the input, output and intermediate data and participants can use fake procedures that take the right input and produce the right output. Alternatively, participants who wish to enact the real workflow may do so. In addition, we define a set of core queries that all participants should show how they address, so that we can compare systems.
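
As an illustration of the "dummy" idea, the following is a minimal sketch and not part of the challenge definition. The helper `make_dummy`, the `provenance_log` list and the placeholder byte strings are hypothetical names introduced here; a real participant would substitute the supplied data files and their own provenance store.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory provenance log: one record per procedure call.
provenance_log: List[dict] = []


def make_dummy(
    procedure: str,
    input_names: List[str],
    canned_outputs: Dict[str, bytes],
) -> Callable[[Dict[str, bytes]], Dict[str, bytes]]:
    """Build a fake procedure that checks its inputs and returns pre-supplied outputs."""

    def run(inputs: Dict[str, bytes]) -> Dict[str, bytes]:
        # The dummy does no image processing; it only verifies that the
        # expected inputs were handed over.
        missing = [name for name in input_names if name not in inputs]
        if missing:
            raise ValueError(f"{procedure}: missing inputs {missing}")
        # Record what happened so that provenance queries have something to use.
        provenance_log.append(
            {
                "procedure": procedure,
                "inputs": list(input_names),
                "outputs": list(canned_outputs),
            }
        )
        return dict(canned_outputs)

    return run


# Step 1 of the workflow as a dummy; the byte strings stand in for the supplied files.
align_warp_1 = make_dummy(
    "align_warp",
    ["Anatomy Image 1", "Anatomy Header 1", "Reference Image", "Reference Header"],
    {"Warp Parameters 1": b"<supplied warp parameter data>"},
)

result = align_warp_1(
    {
        "Anatomy Image 1": b"<image>",
        "Anatomy Header 1": b"<header>",
        "Reference Image": b"<image>",
        "Reference Header": b"<header>",
    }
)
```

The dummy takes the right input and produces the right output without running the real tool, which is all that is needed to exercise a provenance system.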

Each project participating in the challenge will have its own page on this TWiki, where it can inform the other participants of its efforts in meeting the challenge. During the provenance challenge, we expect the participants to upload the following to their page, to allow comparison.

Changed:
<
<
  • Representations of provenance for the example process
>
>
  • Representations of provenance for the example workflow
  • Representations of the result of the core (and other) queries

  • Contributions to a matrix of queries vs systems, indicating for each that: (1) the query can be answered by the system, (2) the system cannot answer the query now but considers it relevant, (3) the query is not relevant to the project.

Optionally, the participants may like to contribute the following.

Deleted:
<
<
  • Extensions to the example process to best illustrate the unique aspects of their system

  • Additional queries (beyond the core queries) that illustrate the scope of their system
Added:
>
>
  • Representations of the workflow in their system
  • Extensions to the example workflow to best illustrate the unique aspects of their system

  • Any categorisation of queries that the project considers to have practical value
Changed:
<
<

Example Process

>
>

Example Workflow


Changed:
<
<
We propose an example process for creating population-based "brain atlases" from the fMRI Data Center's archive of high resolution anatomical MR data, using a multi-stage I/O intensive pipeline that warps the anatomical features of the brain of multiple subjects into a standard space. The process is shown below (click for larger image).
>
>
We propose an example workflow for creating population-based "brain atlases" from the fMRI Data Center's archive of high resolution anatomical data. The workflow is shown below (click for larger image).

BrainAtlas.png

Line: 32 to 34

It is comprised of procedures, shown as orange ovals, and data items flowing between them, shown as rectangles. It can be seen as five stages, where each stage is depicted as a horizontal row of the same procedure in the figure. The procedures employ the AIR (automated image registration) suite to create an averaged brain from a collection of high resolution anatomical data. In addition to the data items shown in the figure, there are other inputs to procedures (constant string options), defined below.

Changed:
<
<
The inputs to a workflow are a set of new brain images (Anatomy Image 1 to 4) and a single reference brain image (Reference Image). All input images are 3D scans of a brain of varying resolutions, so that different features are evident. For each image, there is the actual image and the header/metadata information for that image (Anatomy Header 1 to 4). The stages of the workflow do as follows.
  1. For each new brain image, align_warp compares the reference image to determine how the new image should be warped, i.e. the position and shape of the image adjusted, to match the reference brain. The output of each procedure in the stage is a warp parameter set defining the warp to be done (Warp Params 1 to 4).
  2. For each warp parameter set, the actual transformation of the image is done by reslice, which creates a new version of the original new brain image with the configuration defined in the warp parameter set. The output is a warped image.
  3. All the warped images are averaged into one single image using softmean.
>
>
The inputs to a workflow are a set of new brain images (Anatomy Image 1 to 4) and a single reference brain image (Reference Image). All input images are 3D scans of a brain of varying resolutions, so that different features are evident. For each image, there is the actual image and the metadata information for that image (Anatomy Header 1 to 4). The stages of the workflow are as follows.
  1. For each new brain image, align_warp compares it with the reference image to determine how the new image should be warped, i.e. the position and shape of the image adjusted, to match the reference brain. The output of each procedure in the stage is a [[http://bishopw.loni.ucla.edu/AIR5/file_types.html#warp][_warp parameter set_]] defining the spatial transformation to be performed (Warp Params 1 to 4).
  2. For each warp parameter set, the actual transformation of the image is done by reslice, which creates a new version of the original new brain image with the configuration defined in the warp parameter set. The output is a resliced image.
  3. All the resliced images are averaged into one single image using softmean.

  4. For each dimension (x, y and z), slicer slices the averaged image to give a 2D atlas along a plane in that dimension, taken through the centre of the 3D image. The output is an atlas data set.
  5. Each atlas data set is converted into a graphical atlas image using the ImageMagick utility convert.
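
The five stages can be enumerated programmatically. The sketch below is illustrative only: the `Step` type and `build_workflow` function are names assumed here, and the data item names follow the step table for this revision.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    number: int
    procedure: str
    inputs: List[str]
    outputs: List[str]


def build_workflow(n_images: int = 4) -> List[Step]:
    """Enumerate the workflow steps stage by stage, numbering them in order."""
    steps: List[Step] = []

    def add(procedure: str, inputs: List[str], outputs: List[str]) -> None:
        steps.append(Step(len(steps) + 1, procedure, inputs, outputs))

    # Stage 1: one align_warp per new brain image, against the shared reference.
    for i in range(1, n_images + 1):
        add(
            "align_warp",
            [f"Anatomy Image {i}", f"Anatomy Header {i}", "Reference Image", "Reference Header"],
            [f"Warp Parameters {i}"],
        )

    # Stage 2: one reslice per warp parameter set.
    for i in range(1, n_images + 1):
        add("reslice", [f"Warp Parameters {i}"], [f"Resliced Image {i}", f"Resliced Header {i}"])

    # Stage 3: a single softmean averaging all resliced images.
    resliced = [
        name
        for i in range(1, n_images + 1)
        for name in (f"Resliced Image {i}", f"Resliced Header {i}")
    ]
    add("softmean", resliced, ["Atlas Image", "Atlas Header"])

    # Stage 4: one slicer per dimension.
    for axis in "XYZ":
        add("slicer", ["Atlas Image"], [f"Atlas {axis} Slice"])

    # Stage 5: one convert per atlas data set.
    for axis in "XYZ":
        add("convert", [f"Atlas {axis} Slice"], [f"Atlas {axis} Graphic"])

    return steps


steps = build_workflow()
for step in steps:
    print(step.number, step.procedure, step.inputs, "->", step.outputs)
```

With four input images this yields 15 steps, matching the step numbering used in the table.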

The full steps, procedures, data and parameters are enumerated in the table below. The procedure names are linked to the manual pages for those utilities, and the input and output names to the actual data exchanged between procedures.

Changed:
<
<
Step Procedure Input 1 Input 2 Input 3 Input 4 Input 5 Input 6 Input 7 Input 8 Output 1 Output 2 Parameters
1 align_warp Anatomy Image 1 Anatomy Header 1 Reference Image Reference Header         Warp Parameters 1    
  Parameters
  Outputs    
2 align_warp Anatomy Image 2 Anatomy Header 2 Reference Image Reference Header         Warp Parameters 2    
3 align_warp Anatomy Image 3 Anatomy Header 3 Reference Image Reference Header         Warp Parameters 3    
4 align_warp Anatomy Image 4 Anatomy Header 4 Reference Image Reference Header         Warp Parameters 4    
5 reslice Warp Parameters 1               Resliced Image 1 Resliced Header 1  
6 reslice Warp Parameters 2               Resliced Image 2 Resliced Header 2  
7 reslice Warp Parameters 3               Resliced Image 3 Resliced Header 3  
8 reslice Warp Parameters 4               Resliced Image 4 Resliced Header 4  
9 softmean Resliced Image 1 Resliced Header 1 Resliced Image 2 Resliced Header 2 Resliced Image 3 Resliced Header 3 Resliced Image 4 Resliced Header 4 Atlas Image Atlas Header  
10 slicer Atlas Image               Atlas X Slice    
11 slicer Atlas Image               Atlas Y Slice    
12 slicer Atlas Image               Atlas Z Slice    
13 convert Atlas X Slice               Atlas X Graphic    
14 convert Atlas Y Slice               Atlas Y Graphic    
15 convert Atlas Z Slice               Atlas Z Graphic    
>
>
Step Procedure Data Role Item 1 Item 2 Item 3 Item 4
1 align_warp Inputs Anatomy Image 1 Anatomy Header 1 Reference Image Reference Header
    Outputs Warp Parameters 1      
    Parameters        
2 align_warp Inputs Anatomy Image 2 Anatomy Header 2 Reference Image Reference Header
    Outputs Warp Parameters 2      
    Parameters        
3 align_warp Inputs Anatomy Image 3 Anatomy Header 3 Reference Image Reference Header
    Outputs Warp Parameters 3      
    Parameters        
4 align_warp Inputs Anatomy Image 4 Anatomy Header 4 Reference Image Reference Header
    Outputs Warp Parameters 4      
    Parameters        
5 reslice Inputs Warp Parameters 1      
    Outputs Resliced Image 1 Resliced Header 1    
    Parameters        
6 reslice Inputs Warp Parameters 2      
    Outputs Resliced Image 2 Resliced Header 2    
    Parameters        
7 reslice Inputs Warp Parameters 3      
    Outputs Resliced Image 3 Resliced Header 3    
    Parameters        
8 reslice Inputs Warp Parameters 4      
    Outputs Resliced Image 4 Resliced Header 4    
    Parameters        
9 softmean Inputs Resliced Image 1 Resliced Header 1 Resliced Image 2 Resliced Header 2
    Inputs Resliced Image 3 Resliced Header 3 Resliced Image 4 Resliced Header 4
    Outputs Atlas Image Atlas Header    
    Parameters        
10 slicer Inputs Atlas Image      
    Outputs Atlas X Slice      
    Parameters        
11 slicer Inputs Atlas Image      
    Outputs Atlas Y Slice      
    Parameters        
12 slicer Inputs Atlas Image      
    Outputs Atlas Z Slice      
    Parameters        
13 convert Inputs Atlas X Slice      
    Outputs Atlas X Graphic      
    Parameters        
14 convert Inputs Atlas Y Slice      
    Outputs Atlas Y Graphic      
    Parameters        
15 convert Inputs Atlas Z Slice      
    Outputs Atlas Z Graphic      
    Parameters        

Core Provenance Queries

Changed:
<
<
An initial set of provenance-related queries is given below. These are intended only as a starting point for discussion to find the core provenance questions for the challenge.
>
>
An initial set of provenance-related queries is given below.

  1. Find the process that led to Atlas X Graphic / everything that caused Atlas X Graphic to be as it is. This should tell us the new brain images from which the averaged atlas was generated, the warping performed etc.
  2. Find the process that led to Atlas X Graphic, excluding everything prior to the averaging of images with softmean.
Added:
>
>
  1. Find the Stage 3, 4 and 5 details of the process that led to Atlas X Graphic.

  1. Find all calls to procedure align_warp, and their runtimes, with argument model=rigid that ran in less than 30 minutes on non-ia64 processors. The argument is given as a string option to the align_warp procedure.
  2. Find the average runtime of all align_warp calls with argument model=rigid that ran in less than 30 minutes.
  3. Find all procedure calls within runs of the workflow named ‘/prod/2005/0305/prep’, a version of the brain atlas workflow given above, whose inputs were linearly aligned with model=affine
Line: 89 to 121

  • 2006-September: Face-to-face meeting at which results are discussed
  • 2006-October-15: Comparisons performed, minutes of discussion, proposed next steps uploaded

Changed:
<
<
-- SimonMiles - 22 May 2006
>
>
-- SimonMiles - 24 May 2006

META FILEATTACHMENT BrainAtlas.png attr="" comment="Brain Atlas workflow" date="1148311076" path="BrainAtlas.png" size="14326" user="SimonMiles" version="1.3"
META FILEATTACHMENT BrainAtlas.pdf attr="" comment="Brain Atlas workflow (hi-res)" date="1148311137" path="BrainAtlas.pdf" size="143791" user="SimonMiles" version="1.1"
 <<O>>  Difference Topic FirstProvenanceChallenge (r1.2 - 23 May 2006 - LucMoreau)

META TOPICPARENT WebHome

Provenance Challenge

Line: 37 to 37

  1. For each warp parameter set, the actual transformation of the image is done by reslice, which creates a new version of the original new brain image with the configuration defined in the warp parameter set. The output is a warped image.
  2. All the warped images are averaged into one single image using softmean.
  3. For each dimension (x, y and z), the averaged image is sliced to give a 2D atlas along a plane in that dimension, taken through the centre of the 3D image. The output is an atlas data set, using slicer.
Changed:
<
<
  1. For each atlas data set, it is converted into a graphical atlas image using (the UNIX utility) convert.
>
>
  1. For each atlas data set, it is converted into a graphical atlas image using (the ImageMagick utility) convert.

The full steps, procedures, data and parameters are enumerated in the table below. The procedure names are linked to the manual pages for those utilities, and the input and output names to the actual data exchanged between procedures.

Step Procedure Input 1 Input 2 Input 3 Input 4 Input 5 Input 6 Input 7 Input 8 Output 1 Output 2 Parameters
1 align_warp Anatomy Image 1 Anatomy Header 1 Reference Image Reference Header         Warp Parameters 1    
Added:
>
>
  Parameters
  Outputs    

2 align_warp Anatomy Image 2 Anatomy Header 2 Reference Image Reference Header         Warp Parameters 2    
3 align_warp Anatomy Image 3 Anatomy Header 3 Reference Image Reference Header         Warp Parameters 3    
4 align_warp Anatomy Image 4 Anatomy Header 4 Reference Image Reference Header         Warp Parameters 4    
Line: 54 to 56

10 slicer Atlas Image               Atlas X Slice    
11 slicer Atlas Image               Atlas Y Slice    
12 slicer Atlas Image               Atlas Z Slice    
Changed:
<
<
13 convert Atlas X Slice               Atlas X Graphic    
14 convert Atlas Y Slice               Atlas Y Graphic    
15 convert Atlas Z Slice               Atlas Z Graphic    
>
>
13 convert Atlas X Slice               Atlas X Graphic    
14 convert Atlas Y Slice               Atlas Y Graphic    
15 convert Atlas Z Slice               Atlas Z Graphic    

Core Provenance Queries

 <<O>>  Difference Topic FirstProvenanceChallenge (r1.1 - 22 May 2006 - SimonMiles)
Line: 1 to 1
Added:
>
>
META TOPICPARENT WebHome

Provenance Challenge

Aims

The provenance challenge aims to establish an understanding of the capabilities of available provenance-related systems and, in particular, the following details.

  • The representations that systems use to document details of processes that have occurred (e.g. workflows run)
  • The capabilities of each system in answering provenance-related queries
  • What each system considers to be within scope of the topic of provenance (regardless of whether the system can yet achieve all problems in that scope)

To help achieve the aims, we define a single example process that forms the basis of the challenge. It is inspired by a real experiment, in the area of fMRI (Functional Magnetic Resonance Imaging). Here, we use the term process to denote a series of procedures being performed in a system, each taking some data as input and producing other data as output. We do not assume that these procedures must use some particular form of technology (EXE files, Web Services etc.) or that the process is explicitly defined in a workflow script, but individual participants will adopt their technology of choice to execute the process. Our focus in this challenge is on provenance and not on running the experiment. Hence, to facilitate take-up, while based on a real experiment, the procedures can be implemented as "dummies", i.e. we provide the input, output and intermediate data and the projects use fake procedures that take the right input and produce the right output. If a project wishes to enact the real process, this is also possible.

In addition to this, we will define a set of core queries that all participants should show how they address, so we can compare systems. The process and core queries will be agreed upon collectively by the challenge participants.

Each project participating in the challenge will have its own page on this TWiki, where it can inform the other participants of its efforts in meeting the challenge. During the provenance challenge, we expect the participants to upload the following to their page, to allow comparison.

  • Representations of provenance for the example process
  • Contributions to a matrix of queries vs systems, indicating for each that: (1) the query can be answered by the system, (2) the system cannot answer the query now but considers it relevant, (3) the query is not relevant to the project.

Optionally, the participants may like to contribute the following.

  • Extensions to the example process to best illustrate the unique aspects of their system
  • Additional queries (beyond the core queries) that illustrate the scope of their system
  • Any categorisation of queries that the project considers to have practical value

Example Process

We propose an example process for creating population-based "brain atlases" from the fMRI Data Center's archive of high resolution anatomical MR data, using a multi-stage I/O intensive pipeline that warps the anatomical features of the brain of multiple subjects into a standard space. The process is shown below (click for larger image).

BrainAtlas.png

It is comprised of procedures, shown as orange ovals, and data items flowing between them, shown as rectangles. It can be seen as five stages, where each stage is depicted as a horizontal row of the same procedure in the figure. The procedures employ the AIR (automated image registration) suite to create an averaged brain from a collection of high resolution anatomical data. In addition to the data items shown in the figure, there are other inputs to procedures (constant string options), defined below.

The inputs to a workflow are a set of new brain images (Anatomy Image 1 to 4) and a single reference brain image (Reference Image). All input images are 3D scans of a brain of varying resolutions, so that different features are evident. For each image, there is the actual image and the header/metadata information for that image (Anatomy Header 1 to 4). The stages of the workflow do as follows.

  1. For each new brain image, align_warp compares the reference image to determine how the new image should be warped, i.e. the position and shape of the image adjusted, to match the reference brain. The output of each procedure in the stage is a warp parameter set defining the warp to be done (Warp Params 1 to 4).
  2. For each warp parameter set, the actual transformation of the image is done by reslice, which creates a new version of the original new brain image with the configuration defined in the warp parameter set. The output is a warped image.
  3. All the warped images are averaged into one single image using softmean.
  4. For each dimension (x, y and z), the averaged image is sliced to give a 2D atlas along a plane in that dimension, taken through the centre of the 3D image. The output is an atlas data set, using slicer.
  5. For each atlas data set, it is converted into a graphical atlas image using (the UNIX utility) convert.

The full steps, procedures, data and parameters are enumerated in the table below. The procedure names are linked to the manual pages for those utilities, and the input and output names to the actual data exchanged between procedures.

Step Procedure Input 1 Input 2 Input 3 Input 4 Input 5 Input 6 Input 7 Input 8 Output 1 Output 2 Parameters
1 align_warp Anatomy Image 1 Anatomy Header 1 Reference Image Reference Header         Warp Parameters 1    
2 align_warp Anatomy Image 2 Anatomy Header 2 Reference Image Reference Header         Warp Parameters 2    
3 align_warp Anatomy Image 3 Anatomy Header 3 Reference Image Reference Header         Warp Parameters 3    
4 align_warp Anatomy Image 4 Anatomy Header 4 Reference Image Reference Header         Warp Parameters 4    
5 reslice Warp Parameters 1               Resliced Image 1 Resliced Header 1  
6 reslice Warp Parameters 2               Resliced Image 2 Resliced Header 2  
7 reslice Warp Parameters 3               Resliced Image 3 Resliced Header 3  
8 reslice Warp Parameters 4               Resliced Image 4 Resliced Header 4  
9 softmean Resliced Image 1 Resliced Header 1 Resliced Image 2 Resliced Header 2 Resliced Image 3 Resliced Header 3 Resliced Image 4 Resliced Header 4 Atlas Image Atlas Header  
10 slicer Atlas Image               Atlas X Slice    
11 slicer Atlas Image               Atlas Y Slice    
12 slicer Atlas Image               Atlas Z Slice    
13 convert Atlas X Slice               Atlas X Graphic    
14 convert Atlas Y Slice               Atlas Y Graphic    
15 convert Atlas Z Slice               Atlas Z Graphic    

Core Provenance Queries

An initial set of provenance-related queries is given below. These are intended only as a starting point for discussion to find the core provenance questions for the challenge.

  1. Find the process that led to Atlas X Graphic / everything that caused Atlas X Graphic to be as it is. This should tell us the new brain images from which the averaged atlas was generated, the warping performed etc.
  2. Find the process that led to Atlas X Graphic, excluding everything prior to the averaging of images with softmean.
  3. Find all calls to procedure align_warp, and their runtimes, with argument model=rigid that ran in less than 30 minutes on non-ia64 processors. The argument is given as a string option to the align_warp procedure.
  4. Find the average runtime of all align_warp calls with argument model=rigid that ran in less than 30 minutes.
  5. Find all procedure calls within runs of the workflow named ‘/prod/2005/0305/prep’, a version of the brain atlas workflow given above, whose inputs were linearly aligned with model=affine
  6. Find all output averaged images of softmean (average) procedures, where the warped images taken as input were align-warped with model=affine (i.e., “where softmean was preceded in the workflow, directly or indirectly, by an alignwarp procedure with argument model=affine”)
  7. Find all outputs of softmean that were resliced with intensify=3
  8. Find the outputs that were linearly aligned with model=affine and with input LFN metadata center=UChicago
  9. Find all the graphical atlas sets that have metadata annotation studyModality with values speech, visual or audio, and return all the annotation tags of this set of data
  10. Find all the metadata tags study on actual results of softmean that were linearly aligned with model=affine, and whose outputs have annotation state = IL
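
To make the first two queries concrete, here is a minimal sketch of how they might be answered over a simple dependency map. It is an illustration under stated assumptions, not a prescribed representation: the `produced_by` map and the `lineage` function are hypothetical names, the data item names follow the step table above, and query 2 is read as "stop expanding at the inputs of softmean".

```python
from typing import Dict, Iterable, List, Set, Tuple

# Output data item -> (procedure that produced it, its input data items).
produced_by: Dict[str, Tuple[str, List[str]]] = {}

for i in range(1, 5):
    produced_by[f"Warp Parameters {i}"] = (
        "align_warp",
        [f"Anatomy Image {i}", f"Anatomy Header {i}", "Reference Image", "Reference Header"],
    )
    for kind in ("Image", "Header"):
        produced_by[f"Resliced {kind} {i}"] = ("reslice", [f"Warp Parameters {i}"])

resliced = [f"Resliced {kind} {i}" for i in range(1, 5) for kind in ("Image", "Header")]
for kind in ("Image", "Header"):
    produced_by[f"Atlas {kind}"] = ("softmean", resliced)

for axis in "XYZ":
    produced_by[f"Atlas {axis} Slice"] = ("slicer", ["Atlas Image"])
    produced_by[f"Atlas {axis} Graphic"] = ("convert", [f"Atlas {axis} Slice"])


def lineage(item: str, stop_at: Iterable[str] = ()) -> Set[str]:
    """Return every data item that `item` depends on, directly or indirectly.

    Items listed in `stop_at` are reported but not traced any further back.
    """
    boundary = set(stop_at)
    seen: Set[str] = set()
    stack = [item]
    while stack:
        current = stack.pop()
        # Workflow inputs have no producer; boundary items are not expanded.
        if current not in produced_by or current in boundary:
            continue
        _, inputs = produced_by[current]
        for name in inputs:
            if name not in seen:
                seen.add(name)
                stack.append(name)
    return seen


# Query 1: everything that led to Atlas X Graphic.
full = lineage("Atlas X Graphic")

# Query 2: the same, but nothing prior to the averaging with softmean.
from_softmean = lineage("Atlas X Graphic", stop_at=resliced)
```

A real provenance system would also return the procedure calls and their parameters, not only the data items; the traversal is the same, with more information attached to each edge.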

Timetable

  • 2006-May-30: First full draft of challenge, asking potential participants to help finalise
    • Fix provenance challenge proposal TWiki page, with aims, objectives, workflow, sample queries, timetable
    • Upload input, output, intermediate data used in workflow
    • Send provenance challenge call to other possible participants, including:
      • Aims, objectives, workflow, sample queries, timetable
      • Request for comments on the above
      • Request to suggest what should be core queries
  • 2006-June-15: Challenge finalised, participants start!
  • 2006-September: Deadline for challenge results to be uploaded
  • 2006-September: Face-to-face meeting at which results are discussed
  • 2006-October-15: Comparisons performed, minutes of discussion, proposed next steps uploaded

-- SimonMiles - 22 May 2006

META FILEATTACHMENT BrainAtlas.png attr="" comment="Brain Atlas workflow" date="1148311076" path="BrainAtlas.png" size="14326" user="SimonMiles" version="1.3"
META FILEATTACHMENT BrainAtlas.pdf attr="" comment="Brain Atlas workflow (hi-res)" date="1148311137" path="BrainAtlas.pdf" size="143791" user="SimonMiles" version="1.1"
Revision r1.1 - 22 May 2006 - 15:08 - SimonMiles
Revision r1.18 - 10 Nov 2006 - 14:34 - SimonMiles