Provenance Challenge Template | ||||||||||||||||
| Line: 17 to 17 | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Provenance Trace | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
Our model is composed of base tables which provide minimal information about the workflow execution, the workflow specification, and the relationships between the specification and execution. The model is represented through the E/R diagram below. The tables IMMCONTAINS and USERVIEW will be defined in the section Workflow Variation. The other tables and their schemas are listed below. We list the type of each column (VARCHAR are strings) and the primary keys (in bold). For each table, we give an example based on the Challenge workflow. | |||||||||||||||
| > > |
Our model is composed of base tables which provide minimal information about the workflow execution, the workflow specification, and the relationships between the specification and execution. The model is represented through the E/R diagram below. The tables IMMCONTAINS, IMMSTEPCONTAINS and USERVIEW will be defined in the section Workflow Variants. The other tables and their schemas are listed below. We list the type of each column (VARCHAR are strings) and the primary keys (in bold). For each table, we give an example based on the Challenge workflow. | |||||||||||||||
Figure 1 – E/R schema for the model of Provenance
| ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
Note: The content of the tables is provided at the end of this page (see images tables1.png to tables4.png). | |||||||||||||||
| > > |
Note: The content of the tables is provided at the end of this page (see images tablesData-1.jpg to tablesData-4.jpg). | |||||||||||||||
| ||||||||||||||||
| Line: 515 to 515 | ||||||||||||||||
| box3 reslice | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
| |||||||||||||||
| > > |
| |||||||||||||||
|
| ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
CREATE VIEW virtualInstanceOf | |||||||||||||||
| > > |
CREATE VIEW | |||||||||||||||
| AS | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
SELECT step step, stepClass, ts
FROM instanceOf
UNION
select min(step) || '-' || compoStepClass, compoStepClass, min(ts)
from contains, instanceOf
where contains.stepClass = instanceOf.stepClass
group By compoStepClass
STEP      STEPCLASS    TS
-------------------------------
1         align_warp   8/7/2006
1-box1    box1         8/7/2006 | |||||||||||||||
| > > |
SELECT DISTINCT CONNECT_BY_ROOT compoStep compoStep, step
FROM immStepContains
CONNECT BY PRIOR step = compoStep
ORDER BY compoStep | |||||||||||||||
| Added: | ||||||||||||||||
| > > |
COMPOSTEP     STEP
---------------------------------
stepBox1-1    1
stepBox1-1    5
stepBox1-2    2
stepBox1-2    6
stepBox1-3    3
stepBox1-3    7
stepBox1-4    4
stepBox1-4    8
stepBox2-1    10
stepBox2-1    13
stepBox2-2    11
stepBox2-2    14
stepBox2-3    12
stepBox2-3    15
stepBox3-1    1
stepBox3-1    10
stepBox3-1    13
stepBox3-1    5
stepBox3-1    9
stepBox3-1    stepBox1-1
stepBox3-1    stepBox2-1 | |||||||||||||||
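The hierarchical query above computes the transitive closure of the immediate containment relation. As a rough, runnable illustration, the sketch below reproduces the same kind of closure with SQLite's `WITH RECURSIVE` standing in for Oracle's `CONNECT BY`. The rows loaded into `immStepContains` are an assumption inferred from the result listing (only the stepBox1-1, stepBox2-1 and stepBox3-1 branches are included); they are not the authors' actual table content, and the helper name `contains_step` is hypothetical.

```python
import sqlite3

# Assumed immediate-containment rows: compoStep directly contains step.
IMM_STEP_CONTAINS = [
    ("stepBox1-1", "1"), ("stepBox1-1", "5"),
    ("stepBox2-1", "10"), ("stepBox2-1", "13"),
    ("stepBox3-1", "stepBox1-1"), ("stepBox3-1", "stepBox2-1"),
    ("stepBox3-1", "9"),
]


def contains_step(rows):
    """Return the transitive closure of (compoStep, step) pairs, sorted."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE immStepContains (compoStep TEXT, step TEXT)")
    con.executemany("INSERT INTO immStepContains VALUES (?, ?)", rows)
    closure = con.execute(
        """
        WITH RECURSIVE closure(compoStep, step) AS (
            -- Base case: every immediate containment.
            SELECT compoStep, step FROM immStepContains
            UNION
            -- Recursive case: follow a contained step that is itself composite.
            SELECT c.compoStep, i.step
            FROM closure c JOIN immStepContains i ON i.compoStep = c.step
        )
        SELECT compoStep, step FROM closure ORDER BY compoStep, step
        """
    ).fetchall()
    con.close()
    return closure


closure = contains_step(IMM_STEP_CONTAINS)
```

With these assumed rows, stepBox3-1 ends up containing both its direct children and the basic steps nested inside stepBox1-1 and stepBox2-1, which is the behaviour the listing shows.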
| ||||||||||||||||
| Line: 565 to 588 | ||||||||||||||||
| -- First union operand: "For each containment, the input is an input of
-- the higher level step class if and only if this input was not an output
-- from another of its sub steps" | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
SELECT inst2.step, input.dataId, min(input.ts) ts | |||||||||||||||
| > > |
SELECT containsStep.compoStep step, input.dataId, min(input.ts) ts | |||||||||||||||
| FROM input, | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
instanceOf inst1, contains, virtualInstanceOf inst2 | |||||||||||||||
| > > |
containsStep | |||||||||||||||
| WHERE | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
input.step = inst1.step AND inst1.stepClass = contains.stepClass AND contains.compoStepClass = inst2.stepClass | |||||||||||||||
| > > |
input.step = containsStep.step | |||||||||||||||
| AND NOT EXISTS ( SELECT o.step | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
FROM output o, instanceOf inst3, contains c2 | |||||||||||||||
| > > |
FROM output o, containsStep c2 | |||||||||||||||
| WHERE o.dataId = input.dataId | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
AND o.step = inst3.step AND inst3.stepClass = c2.stepClass AND c2.compoStepClass = contains.compoStepClass | |||||||||||||||
| > > |
AND o.step = c2.step AND c2.compoStep = containsStep.compoStep | |||||||||||||||
| ) | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
GROUP BY inst2.step, input.dataId | |||||||||||||||
| > > |
GROUP BY containsStep.compoStep, input.dataId | |||||||||||||||
| -- Second union operand
-- We want to keep the direct input relations
UNION
SELECT step, dataId, ts | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
FROM input | |||||||||||||||
| > > |
FROM input; | |||||||||||||||
| Changed: | ||||||||||||||||
| < < |
As an example, the step 10-box2 takes DATAID 23 (Atlas Image) and DATAID 24 (Atlas Header) as inputs. | |||||||||||||||
| > > |
As an example, the step stepBox2-1 takes DATAID 23 (Atlas Image) and DATAID 24 (Atlas Header) as inputs. | |||||||||||||||
STEP DATAID TS | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
10-box2       23    8/17/2006
10-box2       24    8/17/2006 | |||||||||||||||
| > > |
stepBox2-1    23    8/17/2006
stepBox2-1    24    8/17/2006 | |||||||||||||||
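To make the first union operand concrete, here is a minimal sketch of the rule "an input of a sub-step is an input of the composite step unless a sibling sub-step produced it", run on SQLite rather than Oracle. The containment of steps 10 and 13 in stepBox2-1 comes from the containment listing; the intermediate data id 25 and every timestamp other than 8/17/2006 are assumptions made for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE input  (step TEXT, dataId INTEGER, ts TEXT);
CREATE TABLE output (step TEXT, dataId INTEGER, ts TEXT);
CREATE TABLE containsStep (compoStep TEXT, step TEXT);
-- Assumed rows: composite step stepBox2-1 wraps basic steps 10 and 13.
INSERT INTO containsStep VALUES ('stepBox2-1', '10'), ('stepBox2-1', '13');
INSERT INTO input VALUES ('10', 23, '2006-08-17'), ('10', 24, '2006-08-17'),
                         ('13', 25, '2006-08-20');
INSERT INTO output VALUES ('10', 25, '2006-08-17'), ('13', 28, '2006-08-20');
""")

# Keep an input only if no step of the same composite step produced it.
composite_inputs = con.execute("""
SELECT cs.compoStep AS step, i.dataId, MIN(i.ts) AS ts
FROM input i JOIN containsStep cs ON i.step = cs.step
WHERE NOT EXISTS (
    SELECT 1
    FROM output o JOIN containsStep c2 ON o.step = c2.step
    WHERE o.dataId = i.dataId AND c2.compoStep = cs.compoStep
)
GROUP BY cs.compoStep, i.dataId
ORDER BY i.dataId
""").fetchall()
con.close()
```

Data id 25 is consumed by step 13 but produced by step 10 inside the same box, so it is filtered out and only data ids 23 and 24 remain as inputs of stepBox2-1.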
| ||||||||||||||||
| Line: 605 to 623 | ||||||||||||||||
| AS
-- First union operand: "For each containment, the output is an output of
-- the higher level step class if and only if this output is used as an | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
-- input for another step class which is not contained in the same step class | |||||||||||||||
| > > |
-- input for another step which is not contained in the same step | |||||||||||||||
| -- or if this is a final output of the workflow | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
SELECT inst2.step, output.dataId, output.ts | |||||||||||||||
| > > |
SELECT containsStep.compoStep step, output.dataId, output.ts | |||||||||||||||
| FROM output, | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
instanceOf inst1, contains, virtualInstanceOf inst2 | |||||||||||||||
| > > |
containsStep | |||||||||||||||
| WHERE | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
output.step = inst1.step AND inst1.stepClass = contains.stepClass AND contains.compoStepClass = inst2.stepClass | |||||||||||||||
| > > |
output.step = containsStep.step | |||||||||||||||
| AND EXISTS ( | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
/* Step class is used as an entry of * at least one step * class which is not a substep of this step class */ | |||||||||||||||
| > > |
/* Step class is used as an entry of at least one step * which is not a substep of this step class */ | |||||||||||||||
| SELECT i.step | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
FROM input i, instanceOf inst3 | |||||||||||||||
| > > |
FROM input i | |||||||||||||||
| WHERE i.dataId = output.dataId | ||||||||||||||||
| Deleted: | ||||||||||||||||
| < < |
AND i.step = inst3.step | |||||||||||||||
| AND NOT EXISTS ( | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
SELECT contains.compoStepClass
FROM contains
WHERE contains.stepClass = inst3.stepClass
AND contains.compoStepClass = inst2.stepClass | |||||||||||||||
| > > |
SELECT cont2.compoStep
FROM containsStep cont2
WHERE cont2.step = i.step
AND cont2.compoStep = containsStep.compoStep | |||||||||||||||
| ) UNION -- Step class is a final output? | ||||||||||||||||
| Line: 649 to 661 | ||||||||||||||||
| SELECT step, dataId, ts FROM output | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
As an example, step 10-box2 produced data ID 28 to 30 (Atlas Graphic X, Y and Z) on 08/20/2006, 08/21/2006, and 08/22/2006 respectively. | |||||||||||||||
| > > |
As an example, step stepBox2-1 produced data ID 28 (Atlas Graphic X) on 08/20/2006. | |||||||||||||||
STEP          DATAID    TS
---------------------------- | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
10-box2       28    8/20/2006
10-box2       29    8/21/2006
10-box2       30    8/22/2006 | |||||||||||||||
| > > |
stepBox2-1 28 8/20/2006 | |||||||||||||||
| ||||||||||||||||
| Line: 674 to 684 | ||||||||||||||||
| input.ts time FROM userView, | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
virtualInstanceOf instanceOf, | |||||||||||||||
| > > |
instanceOf, | |||||||||||||||
| cinput input, data dataInput, coutput output, | ||||||||||||||||
| Line: 686 to 696 | ||||||||||||||||
| AND input.step = output.step AND output.dataId = dataOutput.dataId AND input.ts <= output.ts | ||||||||||||||||
| Deleted: | ||||||||||||||||
| < < |
--ORDER BY usr, step, input | |||||||||||||||
| As an example, let us consider three user views and their perspectives on stage 1 of the workflow (align_warp). | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
The user ‘uAdmin’ can see all the steps of the workflow. First, he can see the wrap param images produced (DATAID 11 to 14). Then, he can see that only data ID 1, 2, 9, and 10 are used as inputs to step 1. | |||||||||||||||
| > > |
The user ‘uAdmin’ can see all the steps of the workflow, and thus can see all data. | |||||||||||||||
| Changed: | ||||||||||||||||
| < < |
The user ‘uBio’ sees stages 1 and 2 of the workflow as a single black box. First, he cannot see data whose DATAID are 11 to 13. Second, for him DATAIDs 1 through 10 are used to produce DATAIDs 15 through 22 as outputs. | |||||||||||||||
| > > |
The user ‘uBio’ sees each pair of align_warp and reslice calls as a single box (step class box1), so he cannot see the data whose ids are 11 to 14 (Warp Params 1 to 4). He also sees each pair of slicer and convert calls as a single box (step class box2), so he cannot see the data with ids 25 to 27 (Atlas X Slice, Atlas Y Slice and Atlas Z Slice). | |||||||||||||||
| Changed: | ||||||||||||||||
| < < |
The user ‘uBlackBox’ cannot see the details of step 1 either. He is in the same case as UBio except that he cannot see data whose DATAIDs are 15 through 22. The only outputs of the workflow that he can see are the final outputs that is, DATAID 28 through 30. | |||||||||||||||
| > > |
The user ‘uBlackBox’ cannot see any detail of the workflow. He sees everything as one big box whose inputs are data ids 1 to 10 and whose outputs are data ids 28 to 30. | |||||||||||||||
The implications of user views for query answering are discussed in the following section.
Suggested Queries | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
In this section, we answer query 1 considering user views and we introduce two other queries to emphasize the benefit of our approach. We will show that user views allow
| |||||||||||||||
| > > |
In this section, we answer query 1 considering user views and we introduce another query to emphasize the benefit of our approach. | |||||||||||||||
| ||||||||||||||||
| Line: 722 to 728 | ||||||||||||||||
STEP STEPCLASS INPUT INPUTNAME OUTPUT OUTPUTNAME | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
1-box3      box3  1   Anatomy Image1    28  Atlas X Graphic
1-box3      box3  10  Reference Header  28  Atlas X Graphic
1-box3      box3  2   Anatomy Header1   28  Atlas X Graphic
1-box3      box3  3   Anatomy Image2    28  Atlas X Graphic
1-box3      box3  4   Anatomy Header2   28  Atlas X Graphic
1-box3      box3  5   Anatomy Image3    28  Atlas X Graphic
1-box3      box3  6   Anatomy Header3   28  Atlas X Graphic
1-box3      box3  7   Anatomy Image4    28  Atlas X Graphic
1-box3      box3  8   Anatomy Header4   28  Atlas X Graphic
1-box3      box3  9   Reference Image   28  Atlas X Graphic | |||||||||||||||
| > > |
stepBox3-1  box3  1   Anatomy Image1    23  Atlas Image
stepBox3-1  box3  1   Anatomy Image1    24  Atlas Header
stepBox3-1  box3  1   Anatomy Image1    28  Atlas X Graphic
stepBox3-1  box3  10  Reference Header  23  Atlas Image
stepBox3-1  box3  10  Reference Header  24  Atlas Header
stepBox3-1  box3  10  Reference Header  28  Atlas X Graphic
stepBox3-1  box3  17  Resliced Image2   23  Atlas Image
stepBox3-1  box3  17  Resliced Image2   24  Atlas Header
stepBox3-1  box3  17  Resliced Image2   28  Atlas X Graphic
stepBox3-1  box3  18  Resliced Header2  23  Atlas Image
stepBox3-1  box3  18  Resliced Header2  24  Atlas Header
stepBox3-1  box3  18  Resliced Header2  28  Atlas X Graphic
stepBox3-1  box3  19  Resliced Image3   23  Atlas Image
stepBox3-1  box3  19  Resliced Image3   24  Atlas Header
stepBox3-1  box3  19  Resliced Image3   28  Atlas X Graphic
stepBox3-1  box3  2   Anatomy Header1   23  Atlas Image
stepBox3-1  box3  2   Anatomy Header1   24  Atlas Header
stepBox3-1  box3  2   Anatomy Header1   28  Atlas X Graphic
stepBox3-1  box3  20  Resliced Header3  23  Atlas Image
stepBox3-1  box3  20  Resliced Header3  24  Atlas Header
stepBox3-1  box3  20  Resliced Header3  28  Atlas X Graphic
stepBox3-1  box3  21  Resliced Image4   23  Atlas Image
stepBox3-1  box3  21  Resliced Image4   24  Atlas Header
stepBox3-1  box3  21  Resliced Image4   28  Atlas X Graphic
stepBox3-1  box3  22  Resliced Header4  23  Atlas Image
stepBox3-1  box3  22  Resliced Header4  24  Atlas Header
stepBox3-1  box3  22  Resliced Header4  28  Atlas X Graphic
stepBox3-1  box3  9   Reference Image   23  Atlas Image
stepBox3-1  box3  9   Reference Image   24  Atlas Header
stepBox3-1  box3  9   Reference Image   28  Atlas X Graphic | |||||||||||||||
| ||||||||||||||||
| Line: 743 to 768 | ||||||||||||||||
STEP STEPCLASS INPUT INPUTNAME OUTPUT OUTPUTNAME | ||||||||||||||||
| Deleted: | ||||||||||||||||
| < < |
1-box1      box1  1   Anatomy Image1    15  Resliced Image1
1-box1      box1  1   Anatomy Image1    16  Resliced Header1
1-box1      box1  1   Anatomy Image1    17  Resliced Image2
1-box1      box1  1   Anatomy Image1    18  Resliced Header2
1-box1      box1  1   Anatomy Image1    19  Resliced Image3
1-box1      box1  1   Anatomy Image1    20  Resliced Header3
1-box1      box1  1   Anatomy Image1    21  Resliced Image4
1-box1      box1  1   Anatomy Image1    22  Resliced Header4
1-box1      box1  10  Reference Header  15  Resliced Image1
1-box1      box1  10  Reference Header  16  Resliced Header1
1-box1      box1  10  Reference Header  17  Resliced Image2
1-box1      box1  10  Reference Header  18  Resliced Header2
1-box1      box1  10  Reference Header  19  Resliced Image3
1-box1      box1  10  Reference Header  20  Resliced Header3
1-box1      box1  10  Reference Header  21  Resliced Image4
1-box1      box1  10  Reference Header  22  Resliced Header4
1-box1      box1  2   Anatomy Header1   15  Resliced Image1
1-box1      box1  2   Anatomy Header1   16  Resliced Header1
1-box1      box1  2   Anatomy Header1   17  Resliced Image2
1-box1      box1  2   Anatomy Header1   18  Resliced Header2
1-box1      box1  2   Anatomy Header1   19  Resliced Image3
1-box1      box1  2   Anatomy Header1   20  Resliced Header3
1-box1      box1  2   Anatomy Header1   21  Resliced Image4
1-box1      box1  2   Anatomy Header1   22  Resliced Header4
1-box1      box1  3   Anatomy Image2    15  Resliced Image1
1-box1      box1  3   Anatomy Image2    16  Resliced Header1
1-box1      box1  3   Anatomy Image2    17  Resliced Image2
1-box1      box1  3   Anatomy Image2    18  Resliced Header2
1-box1      box1  3   Anatomy Image2    19  Resliced Image3
1-box1      box1  3   Anatomy Image2    20  Resliced Header3
1-box1      box1  3   Anatomy Image2    21  Resliced Image4
1-box1      box1  3   Anatomy Image2    22  Resliced Header4
1-box1      box1  4   Anatomy Header2   15  Resliced Image1
1-box1      box1  4   Anatomy Header2   16  Resliced Header1
1-box1      box1  4   Anatomy Header2   17  Resliced Image2
1-box1      box1  4   Anatomy Header2   18  Resliced Header2
1-box1      box1  4   Anatomy Header2   19  Resliced Image3
1-box1      box1  4   Anatomy Header2   20  Resliced Header3
1-box1      box1  4   Anatomy Header2   21  Resliced Image4
1-box1      box1  4   Anatomy Header2   22  Resliced Header4
1-box1      box1  5   Anatomy Image3    15  Resliced Image1
1-box1      box1  5   Anatomy Image3    16  Resliced Header1
1-box1      box1  5   Anatomy Image3    17  Resliced Image2
1-box1      box1  5   Anatomy Image3    18  Resliced Header2
1-box1      box1  5   Anatomy Image3    19  Resliced Image3
1-box1      box1  5   Anatomy Image3    20  Resliced Header3
1-box1      box1  5   Anatomy Image3    21  Resliced Image4
1-box1      box1  5   Anatomy Image3    22  Resliced Header4
1-box1      box1  6   Anatomy Header3   15  Resliced Image1
1-box1      box1  6   Anatomy Header3   16  Resliced Header1
1-box1      box1  6   Anatomy Header3   17  Resliced Image2
1-box1      box1  6   Anatomy Header3   18  Resliced Header2
1-box1      box1  6   Anatomy Header3   19  Resliced Image3
1-box1      box1  6   Anatomy Header3   20  Resliced Header3
1-box1      box1  6   Anatomy Header3   21  Resliced Image4
1-box1      box1  6   Anatomy Header3   22  Resliced Header4
1-box1      box1  7   Anatomy Image4    15  Resliced Image1
1-box1      box1  7   Anatomy Image4    16  Resliced Header1
1-box1      box1  7   Anatomy Image4    17  Resliced Image2
1-box1      box1  7   Anatomy Image4    18  Resliced Header2
1-box1      box1  7   Anatomy Image4    19  Resliced Image3
1-box1      box1  7   Anatomy Image4    20  Resliced Header3
1-box1      box1  7   Anatomy Image4    21  Resliced Image4
1-box1      box1  7   Anatomy Image4    22  Resliced Header4
1-box1      box1  8   Anatomy Header4   15  Resliced Image1
1-box1      box1  8   Anatomy Header4   16  Resliced Header1
1-box1      box1  8   Anatomy Header4   17  Resliced Image2
1-box1      box1  8   Anatomy Header4   18  Resliced Header2
1-box1      box1  8   Anatomy Header4   19  Resliced Image3
1-box1      box1  8   Anatomy Header4   20  Resliced Header3
1-box1      box1  8   Anatomy Header4   21  Resliced Image4
1-box1      box1  8   Anatomy Header4   22  Resliced Header4
1-box1      box1  9   Reference Image   15  Resliced Image1
1-box1      box1  9   Reference Image   16  Resliced Header1
1-box1      box1  9   Reference Image   17  Resliced Image2
1-box1      box1  9   Reference Image   18  Resliced Header2
1-box1      box1  9   Reference Image   19  Resliced Image3
1-box1      box1  9   Reference Image   20  Resliced Header3
1-box1      box1  9   Reference Image   21  Resliced Image4
1-box1      box1  9   Reference Image   22  Resliced Header4
10-box2     box2  23  Atlas Image       28  Atlas X Graphic
10-box2     box2  24  Atlas Header      28  Atlas X Graphic | |||||||||||||||
| 9           softmean  15  Resliced Image1   23  Atlas Image
9           softmean  15  Resliced Image1   24  Atlas Header
9           softmean  16  Resliced Header1  23  Atlas Image | ||||||||||||||||
| Line: 841 to 784 | ||||||||||||||||
| 9           softmean  21  Resliced Image4   24  Atlas Header
9           softmean  22  Resliced Header4  23  Atlas Image
9           softmean  22  Resliced Header4  24  Atlas Header | ||||||||||||||||
| Added: | ||||||||||||||||
| > > |
stepBox1-1  box1  1   Anatomy Image1    15  Resliced Image1
stepBox1-1  box1  1   Anatomy Image1    16  Resliced Header1
stepBox1-1  box1  10  Reference Header  15  Resliced Image1
stepBox1-1  box1  10  Reference Header  16  Resliced Header1
stepBox1-1  box1  2   Anatomy Header1   15  Resliced Image1
stepBox1-1  box1  2   Anatomy Header1   16  Resliced Header1
stepBox1-1  box1  9   Reference Image   15  Resliced Image1
stepBox1-1  box1  9   Reference Image   16  Resliced Header1
stepBox1-2  box1  10  Reference Header  17  Resliced Image2
stepBox1-2  box1  10  Reference Header  18  Resliced Header2
stepBox1-2  box1  3   Anatomy Image2    17  Resliced Image2
stepBox1-2  box1  3   Anatomy Image2    18  Resliced Header2
stepBox1-2  box1  4   Anatomy Header2   17  Resliced Image2
stepBox1-2  box1  4   Anatomy Header2   18  Resliced Header2
stepBox1-2  box1  9   Reference Image   17  Resliced Image2
stepBox1-2  box1  9   Reference Image   18  Resliced Header2
stepBox1-3  box1  10  Reference Header  19  Resliced Image3
stepBox1-3  box1  10  Reference Header  20  Resliced Header3
stepBox1-3  box1  5   Anatomy Image3    19  Resliced Image3
stepBox1-3  box1  5   Anatomy Image3    20  Resliced Header3
stepBox1-3  box1  6   Anatomy Header3   19  Resliced Image3
stepBox1-3  box1  6   Anatomy Header3   20  Resliced Header3
stepBox1-3  box1  9   Reference Image   19  Resliced Image3
stepBox1-3  box1  9   Reference Image   20  Resliced Header3
stepBox1-4  box1  10  Reference Header  21  Resliced Image4
stepBox1-4  box1  10  Reference Header  22  Resliced Header4
stepBox1-4  box1  7   Anatomy Image4    21  Resliced Image4
stepBox1-4  box1  7   Anatomy Image4    22  Resliced Header4
stepBox1-4  box1  8   Anatomy Header4   21  Resliced Image4
stepBox1-4  box1  8   Anatomy Header4   22  Resliced Header4
stepBox1-4  box1  9   Reference Image   21  Resliced Image4
stepBox1-4  box1  9   Reference Image   22  Resliced Header4
stepBox2-1  box2  23  Atlas Image       28  Atlas X Graphic
stepBox2-1  box2  24  Atlas Header      28  Atlas X Graphic | |||||||||||||||
| ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
| |||||||||||||||
| > > |
| |||||||||||||||
He thinks that DATAIDs 1 through 10 are necessary. The same can be said for the steps inside box1 and box2.
| ||||||||||||||||
| Line: 898 to 876 | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
SELECT DISTINCT input | |||||||||||||||
| > > |
SELECT * | |||||||||||||||
| FROM uProcess upc
WHERE usr = 'uBlackBox'
START WITH outputName = 'Resliced Image1' | ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
CONNECT BY PRIOR upc.output = upc.input; | |||||||||||||||
| > > |
CONNECT BY PRIOR upc.input = upc.output; | |||||||||||||||
| Changed: | ||||||||||||||||
| < < |
INPUT
| |||||||||||||||
| > > |
USR     STEP        STEPCLASS  INPUT  INPUTNAME         OUTPUT  OUTPUTNAME       TIME
uBio    stepBox1-1  box1       1      Anatomy Image1    15      Resliced Image1  8/7/2006
uBio    stepBox1-1  box1       10     Reference Header  15      Resliced Image1  8/7/2006
uBio    stepBox1-1  box1       2      Anatomy Header1   15      Resliced Image1  8/7/2006
uBio    stepBox1-1  box1       9      Reference Image   15      Resliced Image1  8/7/2006 | |||||||||||||||
| Changed: | ||||||||||||||||
| < < |
Note that as expected, a lot of inputs are provided since this user sees the two first stages as a black box. Consequently, he cannot distinguish which inputs are used to produce a given output. This can be interesting for security reason. | |||||||||||||||
| > > |
| |||||||||||||||
| Changed: | ||||||||||||||||
| < < |
| |||||||||||||||
| > > |
| |||||||||||||||
| ||||||||||||||||
| Changed: | ||||||||||||||||
| < < |
INPUT
| |||||||||||||||
| > > |
USR     STEP  STEPCLASS   INPUT  INPUTNAME         OUTPUT  OUTPUTNAME        TIME
uAdmin  5     reslice     11     Warp Parameters1  15      Resliced Image1   8/12/2006
uAdmin  1     align_warp  1      Anatomy Image1    11      Warp Parameters1  8/7/2006
uAdmin  1     align_warp  10     Reference Header  11      Warp Parameters1  8/7/2006
uAdmin  1     align_warp  2      Anatomy Header1   11      Warp Parameters1  8/7/2006
uAdmin  1     align_warp  9      Reference Image   11      Warp Parameters1  8/7/2006 | |||||||||||||||
| Changed: | ||||||||||||||||
| < < |
Since uAdmin can see everything, he knows that only DATAIDs 1, 2 and 9 through 11 have been used as inputs. | |||||||||||||||
| > > |
Note that uAdmin can see that data id 11 (Warp Params 1) is part of the provenance while uBio cannot see it.
| |||||||||||||||
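The difference between the two perspectives can be checked mechanically. The sketch below loads the (usr, input, output) pairs from the uBio and uAdmin result listings into a small `uProcess` table and walks the provenance of data id 15 backwards for each user. It uses SQLite's `WITH RECURSIVE` as a stand-in for Oracle's `CONNECT BY PRIOR input = output`, and the reduced three-column `uProcess` table and the helper `provenance_inputs` are assumptions made for illustration.

```python
import sqlite3

# (usr, input, output) pairs taken from the uBio and uAdmin result listings.
U_PROCESS = [
    ("uBio", 1, 15), ("uBio", 10, 15), ("uBio", 2, 15), ("uBio", 9, 15),
    ("uAdmin", 11, 15),
    ("uAdmin", 1, 11), ("uAdmin", 10, 11), ("uAdmin", 2, 11), ("uAdmin", 9, 11),
]


def provenance_inputs(con, usr, data_id):
    """Return every data id that user `usr` sees upstream of `data_id`."""
    rows = con.execute(
        """
        WITH RECURSIVE prov(input) AS (
            -- Direct inputs of the requested data item.
            SELECT input FROM uProcess WHERE usr = :usr AND output = :d
            UNION
            -- Inputs of anything already found upstream.
            SELECT p.input
            FROM uProcess p JOIN prov ON p.output = prov.input
            WHERE p.usr = :usr
        )
        SELECT input FROM prov ORDER BY input
        """,
        {"usr": usr, "d": data_id},
    ).fetchall()
    return [r[0] for r in rows]


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE uProcess (usr TEXT, input INTEGER, output INTEGER)")
con.executemany("INSERT INTO uProcess VALUES (?, ?, ?)", U_PROCESS)
admin_view = provenance_inputs(con, "uAdmin", 15)
bio_view = provenance_inputs(con, "uBio", 15)
hidden_from_bio = sorted(set(admin_view) - set(bio_view))
con.close()
```

On these rows the only data id present in the uAdmin provenance but absent from the uBio provenance is 11, the warp parameters hidden inside box1.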
| Deleted: | ||||||||||||||||
| < < |
SELECT DISTINCT input
FROM uProcess
WHERE usr='uAdmin'
START WITH output = (
SELECT dataId FROM data WHERE name = 'Resliced Image1'
)
CONNECT BY PRIOR input = output
INTERSECT
SELECT DISTINCT input, inputName
FROM uProcess
WHERE usr='uBio'
START WITH output = (
SELECT dataId FROM data WHERE name = 'Resliced Image1'
)
CONNECT BY PRIOR input = output
INPUT
----------
1
2
9
10
Note that, as expected, only the right inputs are returned and no intermediate results are included.
| |||||||||||||||
Categorisation of queries | ||||||||||||||||
| Line: 1004 to 947 | ||||||||||||||||
| -- SarahCohenBoulakia - 11 Sep 2006 | ||||||||||||||||
Provenance Challenge Template | ||||||||
| Line: 904 to 904 | ||||||||
|---|---|---|---|---|---|---|---|---|
| ||||||||
| Changed: | ||||||||
| < < |
SELECT * | |||||||
| > > |
SELECT DISTINCT input | |||||||
| FROM uProcess upc
WHERE usr = 'uBlackBox'
START WITH outputName = 'Resliced Image1' | ||||||||
| Line: 949 to 949 | ||||||||
| ||||||||
| Changed: | ||||||||
| < < |
SELECT DISTINCT input, inputName | |||||||
| > > |
SELECT DISTINCT input | |||||||
| FROM uProcess
WHERE usr='uAdmin'
START WITH output = ( | ||||||||
Provenance Challenge Template | ||||||||||||||||||||
| Line: 13 to 13 | ||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Workflow Representation | ||||||||||||||||||||
| Changed: | ||||||||||||||||||||
| < < |
Our aim is to provide a framework to represent minimal information necessary such that we can reason about provenance in scientific workflows. With this in mind, we distinguish between the specification of the workflow and the execution of the workflow. On the one hand, the workflow specification is composed of step-classes (alternatively called procedures, tasks, actors, processes, boxes) such as reslice in the challenge workflow. On the other hand, an execution of a workflow generates a partial order of steps, each of which has a set of input and output data objects. Each step is an instance of a step-class, and the input-output flow of data and class associated with each step must conform to the workflow specification. An example of step in the challenge workflow is “8.reslice” which is an instance of the step-class “reslice”. The proposed model of provenance for workflows can represent the relationship between a step-class (specification) and a step (execution). This is necessary for keeping track of the data and parameters used by each step of a workflow execution as well as to trace the data produced. The model is represented within a relational framework extended with transitive closure. It is implemented under Oracle 10.i and Java. An object layer together with a user interface (JDBC component) is provided. More details on our model and implementation of the Challenge workflow are provided in the following section. | |||||||||||||||||||
| > > |
Our aim is to provide a framework that represents the minimal information necessary to reason about provenance in scientific workflows. With this in mind, we distinguish between the specification of the workflow and the execution of the workflow. On the one hand, the workflow specification is composed of step-classes (alternatively called procedures, tasks, actors, processes, boxes) such as reslice in the challenge workflow. On the other hand, an execution of a workflow generates a partial order of steps, each of which has a set of input and output data objects. Each step is an instance of a step-class, and the input-output flow of data and class associated with each step must conform to the workflow specification. An example of a step in the challenge workflow is “8.reslice”, which is an instance of the step-class “reslice”. The proposed model of provenance for workflows can represent the relationship between a step-class (specification) and a step (execution). This is necessary for keeping track of the data and parameters used by each step of a workflow execution as well as for tracing the data produced. The model is represented within a relational framework extended with transitive closure. It is implemented under Oracle 10g and Java. An object layer together with a user interface (JDBC component) is provided. More details on our model and implementation of the Challenge workflow are provided in the following section. | |||||||||||||||||||
Provenance Trace | ||||||||||||||||||||
| Line: 95 to 95 | ||||||||||||||||||||
| ||||||||||||||||||||
| Changed: | ||||||||||||||||||||
| < < |
Note: The notion of stage was not initially in our model and we think it needs to be better defined. However, we added it to be able to answer some of the queries in the challenge. | |||||||||||||||||||
| > > |
Note: The notion of stage was not initially in our model and we think it needs to be better defined. We will return to this point. However, we added it to be able to answer some of the queries in the challenge. | |||||||||||||||||||
|
stageInstance(step string, stage string) stores the relationship between a stage and a step. As an example, steps 1 through 4 are in stage 1.
| ||||||||||||||||||||
| Line: 144 to 144 | ||||||||||||||||||||
| ||||||||||||||||||||
| Changed: | ||||||||||||||||||||
| < < |
For each query, we provide the corresponding SQL query and the results obtained. We also provide these results in a user-friendly format (as screenshots of our JAVA user interface). | |||||||||||||||||||
| > > |
For each query, we provide the corresponding SQL query and the results obtained. We also provide these results in a user-friendly format (as screenshots of our Java user interface). | |||||||||||||||||||
| ||||||||||||||||||||
| Line: 158 to 158 | ||||||||||||||||||||
| CONNECT BY PRIOR input = output ORDER BY step | ||||||||||||||||||||
| Changed: | ||||||||||||||||||||
| < < |
| |||||||||||||||||||
| > > |
| |||||||||||||||||||
STEP STEPCLASS INPUT INPUTNAME OUTPUT OUTPUTNAME | ||||||||||||||||||||
| Line: 229 to 229 | ||||||||||||||||||||
| CONNECT BY PRIOR input = output ) | ||||||||||||||||||||
| Added: | ||||||||||||||||||||
| > > |
| |||||||||||||||||||
STEP STEPCLASS INPUT INPUTNAME OUTPUT OUTPUTNAME | ||||||||||||||||||||
| Line: 272 to 273 | ||||||||||||||||||||
| ) CONNECT BY PRIOR input = output | ||||||||||||||||||||
| Added: | ||||||||||||||||||||
| > > |
| |||||||||||||||||||
STEP STEPCLASS INPUT INPUTNAME OUTPUT OUTPUTNAME | ||||||||||||||||||||
| Line: 298 to 300 | ||||||||||||||||||||
| ||||||||||||||||||||
| Added: | ||||||||||||||||||||
| > > |
| |||||||||||||||||||
| ||||||||||||||||||||
| Line: 323 to 326 | ||||||||||||||||||||
| ) AND rtrim(to_char(time, 'DAY')) = 'MONDAY' | ||||||||||||||||||||
| Added: | ||||||||||||||||||||
| > > |
| |||||||||||||||||||
STEP STEPCLASS INPUT INPUTNAME OUTPUT OUTPUTNAME TIME | ||||||||||||||||||||
| Line: 357 to 361 | ||||||||||||||||||||
| ) AND type = 'Atlas Graphic' | ||||||||||||||||||||
| Added: | ||||||||||||||||||||
| > > |
| |||||||||||||||||||
NAME | ||||||||||||||||||||
| Line: 398 to 403 | ||||||||||||||||||||
| AND pred.orig = output.step AND output.dataId = data.dataId | ||||||||||||||||||||
| Added: | ||||||||||||||||||||
| > > |
Note that we understand "-m12" as a search for a parameter whose order is equal to 12. We could also have chosen to search for a parameter whose description is like "-m 12". | |||||||||||||||||||
ORIG NAME | ||||||||||||||||||||
| Line: 421 to 428 | ||||||||||||||||||||
| AND dataAttributes.value = 'UChicago' AND process.output = data.dataId | ||||||||||||||||||||
| Added: | ||||||||||||||||||||
| > > |
| |||||||||||||||||||
NAME | ||||||||||||||||||||
| Line: 440 to 448 | ||||||||||||||||||||
| AND dataAttributes.attribute = 'studyModality' AND dataAttributes.value IN ('speech', 'visual', 'audio') | ||||||||||||||||||||
| Added: | ||||||||||||||||||||
| > > |
| |||||||||||||||||||
DATAID NAME ATTRIBUTE VALUE | ||||||||||||||||||||
| Line: 476 to 485 | ||||||||||||||||||||
| ||||||||||||||||||||
| Changed: | ||||||||||||||||||||
| < < |
As an example, box 1 covers all the step-class align_warp and reslice | |||||||||||||||||||
| > > |
As an example, box1 covers the step-classes align_warp and reslice, and box3 contains box1. | |||||||||||||||||||
COMPOSTEPCLASS STEPCLASS --------------------------------- | ||||||||||||||||||||
| Line: 985 to 994 | ||||||||||||||||||||
| We have shown how our model is able to represent the challenge workflow and to answer the proposed queries. We have also shown how the notion of a user view can allow the user to manage the complexity of a workflow through higher levels of abstraction and layering. | ||||||||||||||||||||
| Changed: | ||||||||||||||||||||
| < < |
We would be interested in discussing with the workshop attendees about three points: (i) the meaning of a stage in a workflow (and in particular, how to interpret this notion with respect to user views), (ii) the biological significance of the procedures that can compose a workflow (in the workflow provided, what are the “significant” steps for a biologist?) and (iii) the (general) problem of computing a difference between two workflows (cf. query 7). | |||||||||||||||||||
| > > |
We would be interested in discussing three points with the workshop attendees: (i) the meaning of a stage in a workflow (and in particular, how to interpret this notion with respect to user views), (ii) the biological significance of the procedures that can compose a workflow (in the workflow provided, what are the “significant” steps for a scientist?) and (iii) the (general) problem of computing a difference between two workflows (cf. query 7). | |||||||||||||||||||
Acknowledgments
This work is supported by NSF grants 0513778, 0415810, and 0612177*. | ||||||||||||||||||||
Provenance Challenge Template | ||||||||
| Changed: | ||||||||
| < < |
Participating Team (Under construction) | |||||||
| > > |
Participating Team | |||||||
| ||||||||
| Changed: | ||||||||
| < < |
| |||||||
| > > |
| |||||||
| ||||||||
| Line: 146 to 146 | ||||||||
|---|---|---|---|---|---|---|---|---|
| For each query, we provide the corresponding SQL query and the results obtained. We also provide these results in a user-friendly format (as screenshots of our JAVA user interface). | ||||||||
| Changed: | ||||||||
| < < |
| |||||||
| > > |
| |||||||
SELECT DISTINCT step, stepClass, input, inputName, output, outputName | ||||||||
| Line: 211 to 211 | ||||||||
| ||||||||
| Changed: | ||||||||
| < < |
* Query2 | |||||||
| > > |
* Query2 | |||||||
SELECT DISTINCT step, stepClass, input, inputName, output, outputName | ||||||||
| Line: 256 to 256 | ||||||||
| ||||||||
| Changed: | ||||||||
| < < |
| |||||||
| > > |
| |||||||
SELECT DISTINCT step, stepClass, input, inputName, output, outputName | ||||||||
| Line: 299 to 299 | ||||||||
| ||||||||
| Changed: | ||||||||
| < < |
| |||||||
| > > |
| |||||||
SELECT step, stepClass, input, inputName, output, outputName, time | ||||||||
| Line: 332 to 332 | ||||||||
| 1 align_warp 10 Reference Header 11 Warp Parameters1 8/7/2006 1 align_warp 9 Reference Image 11 Warp Parameters1 8/7/2006 | ||||||||
| Changed: | ||||||||
| < < |
| |||||||
| > > |
| |||||||
| Added: | ||||||||
| > > |
| |||||||
| Changed: | ||||||||
| < < |
| |||||||
| > > |
| |||||||
SELECT name | ||||||||
| Line: 365 to 366 | ||||||||
| Atlas Z Graphic | ||||||||
| Changed: | ||||||||
| < < |
| |||||||
| > > |
| |||||||
SELECT distinct pred.orig, data.name FROM | ||||||||
| Line: 405 to 406 | ||||||||
| 9 Atlas Image | ||||||||
| Changed: | ||||||||
| < < |
| |||||||
| > > |
| |||||||
| ||||||||
| Deleted: | ||||||||
| < < |
||||||||
| ||||||||
| Changed: | ||||||||
| < < |
| |||||||
| > > |
| |||||||
SELECT distinct data.name | ||||||||
| Line: 429 to 429 | ||||||||
| Warp Parameters1 | ||||||||
| Changed: | ||||||||
| < < |
| |||||||
| > > |
| |||||||
SELECT distinct data.dataId, data.name, attribute, value | ||||||||
| Line: 481 to 482 | ||||||||
|
box1 align_warp box1 reslice | ||||||||
| Added: | ||||||||
| > > |
box3 box1 | |||||||
| ||||||||
| Line: 494 to 496 | ||||||||
| ORDER BY compoStepClass | ||||||||
| Added: | ||||||||
| > > |
COMPOSTEPCLASS STEPCLASS --------------------------------- box1 align_warp box1 reslice box3 box1 box3 align_warp box3 reslice | |||||||
| ||||||||
| Line: 673 to 685 | ||||||||
| The user ‘uBio’ sees stages 1 and 2 of the workflow as a single black box. First, he cannot see data whose DATAID are 11 to 13. Second, for him DATAIDs 1 through 10 are used to produce DATAIDs 15 through 22 as outputs. | ||||||||
| Changed: | ||||||||
| < < |
The user ‘uBlackBox’ cannot see the details of step 1 either. He is in the same case as UBio except that he cannot see data whose DATAIDs are 15 through 22. The only output that he can see are the final outputs that is, DATAID 28 through 30. | |||||||
| > > |
The user ‘uBlackBox’ cannot see the details of step 1 either. He is in the same situation as uBio, except that he cannot see the data whose DATAIDs are 15 through 22. The only outputs of the workflow that he can see are the final outputs, that is, DATAIDs 28 through 30. | |||||||
The implications of user views for query answering are discussed in the following section.
Suggested Queries
In this section, we answer query 1 considering user views and we introduce two other queries to emphasize the benefit of our approach. We will show that user views allow | ||||||||
| Changed: | ||||||||
| < < |
| |||||||
| > > |
| |||||||
* Query1 for each user
| ||||||||
| Line: 713 to 728 | ||||||||
| ||||||||
| Changed: | ||||||||
| < < |
| |||||||
| > > |
| |||||||
| ||||||||
| Line: 820 to 835 | ||||||||
| ||||||||
| Changed: | ||||||||
| < < |
| |||||||
| > > |
| |||||||
| ||||||||
| Line: 871 to 887 | ||||||||
| 9 softmean 22 Resliced Header4 23 Atlas Image 9 softmean 22 Resliced Header4 24 Atlas Header | ||||||||
| Changed: | ||||||||
| < < |
| |||||||
| > > |
| |||||||
| ||||||||
| Changed: | ||||||||
| < < |
We will consider two versions of this query. The first version (security) considers that for security reasons, the users are not allowed to know the details about the inputs used to produced a given output. The second version (high-level) considers that the users are allowed to know the details about the inputs used to produced a given output. In both situations, intermediate data and steps will not be known by the users according to their user view. | |||||||
| > > |
We will consider two versions of this query. The first version (security) assumes that, for security reasons, the users are not allowed to know the details about the inputs used to produce a given output. The second version (high-level) assumes that the users are allowed to know the details about the inputs used to produce a given output. However, in both situations intermediate data and steps will not be known by the users according to their user view. | |||||||
| Changed: | ||||||||
| < < |
| |||||||
| > > |
| |||||||
SELECT * FROM uProcess upc | ||||||||
| Changed: | ||||||||
| < < |
WHERE usr = 'uBlack' | |||||||
| > > |
WHERE usr = 'uBlackBox' | |||||||
| START WITH outputName = 'Resliced Image1' CONNECT BY PRIOR upc.output = upc.input; | ||||||||
| Line: 905 to 921 | ||||||||
| ||||||||
| Changed: | ||||||||
| < < |
Note that as expected, a lot of inputs are provided since this user sees the two first stages as a black box. Consequently, he cannot distinguish which input is used to produce which output. This can be interesting for security reason. | |||||||
| > > |
Note that, as expected, many inputs are returned since this user sees the first two stages as a black box. Consequently, he cannot distinguish which inputs are used to produce a given output. This can be useful for security reasons. | |||||||
| ||||||||
| Line: 949 to 965 | ||||||||
| ||||||||
| Changed: | ||||||||
| < < |
Note that as expected, only the right inputs are provided. | |||||||
| > > |
Note that, as expected, only the right inputs are provided and no intermediate results are returned. | |||||||
Categorisation of queries | ||||||||
| Deleted: | ||||||||
| < < |
According to your provenance approach, you may be able to provide a categorisation of queries. Can you elaborate on the categorisation and its rationale | |||||||
Generally speaking, our provenance queries can be categorized as such:
| ||||||||
| Line: 979 to 993 | ||||||||
| (* Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.) | ||||||||
| Changed: | ||||||||
| < < |
-- SarahCohenBoulakia - 10 Sep 2006 | |||||||
| > > |
-- SarahCohenBoulakia - 11 Sep 2006 | |||||||
| ||||||||
Provenance Challenge Template | |||||||||||||||||||
| Line: 207 to 207 | |||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 9 softmean 22 Resliced Header4 23 Atlas Image 9 softmean 22 Resliced Header4 24 Atlas Header | |||||||||||||||||||
| Changed: | |||||||||||||||||||
| < < |
The results may also be visualized with a user-friendly interface, where the data is grouped by steps. The data can be retrieved either by name or by id (the id option may be useful when a lot of data is retuned by a query). | ||||||||||||||||||
| > > |
The results can also be visualized within the user-friendly interface. At the top, the query is given by the user, and at the bottom the result (provenance) is represented as a graph. The user can access the characteristics of a given step or data item by clicking on it (they are displayed on the right-hand side). | ||||||||||||||||||
| |||||||||||||||||||
| Changed: | |||||||||||||||||||
| < < |
To be included soon | ||||||||||||||||||
| > > |
| ||||||||||||||||||
* Query2
| |||||||||||||||||||
| Line: 253 to 253 | |||||||||||||||||||
| 9 softmean 22 Resliced Header4 23 Atlas Image 9 softmean 22 Resliced Header4 24 Atlas Header | |||||||||||||||||||
| Changed: | |||||||||||||||||||
| < < |
| ||||||||||||||||||
| > > |
| ||||||||||||||||||
| |||||||||||||||||||
| Line: 295 to 296 | |||||||||||||||||||
| 10 slicer 24 Atlas Header 25 Atlas X Slice 9 softmean 21 Resliced Image4 24 Atlas Header | |||||||||||||||||||
| Changed: | |||||||||||||||||||
| < < |
| ||||||||||||||||||
| > > |
| ||||||||||||||||||
* Query4
| |||||||||||||||||||
| Line: 330 to 332 | |||||||||||||||||||
| 1 align_warp 10 Reference Header 11 Warp Parameters1 8/7/2006 1 align_warp 9 Reference Image 11 Warp Parameters1 8/7/2006 | |||||||||||||||||||
| Changed: | |||||||||||||||||||
| < < |
| ||||||||||||||||||
| > > |
| ||||||||||||||||||
| |||||||||||||||||||
| Line: 680 to 683 | |||||||||||||||||||
| |||||||||||||||||||
| Deleted: | |||||||||||||||||||
| < < |
| ||||||||||||||||||
| |||||||||||||||||||
| Line: 693 to 695 | |||||||||||||||||||
| CONNECT BY PRIOR input = output ORDER BY step; | |||||||||||||||||||
| Changed: | |||||||||||||||||||
| < < |
| ||||||||||||||||||
| > > |
| ||||||||||||||||||
STEP STEPCLASS INPUT INPUTNAME OUTPUT OUTPUTNAME ----------------------------------------------------------------------------- | |||||||||||||||||||
| Line: 708 to 710 | |||||||||||||||||||
| 1-box3 box3 8 Anatomy Header4 28 Atlas X Graphic 1-box3 box3 9 Reference Image 28 Atlas X Graphic | |||||||||||||||||||
| Added: | |||||||||||||||||||
| > > |
| ||||||||||||||||||
| |||||||||||||||||||
| Line: 813 to 818 | |||||||||||||||||||
| 9 softmean 22 Resliced Header4 23 Atlas Image 9 softmean 22 Resliced Header4 24 Atlas Header | |||||||||||||||||||
| Added: | |||||||||||||||||||
| > > |
| ||||||||||||||||||
| |||||||||||||||||||
| Line: 914 to 921 | |||||||||||||||||||
| Since uAdmin can see everything, he knows that only DATAIDs 1, 2 and 9 through 11 have been used as inputs. | |||||||||||||||||||
| Changed: | |||||||||||||||||||
| < < |
| ||||||||||||||||||
| > > |
| ||||||||||||||||||
| |||||||||||||||||||
| Added: | |||||||||||||||||||
| > > |
SELECT DISTINCT input, inputName FROM uProcess WHERE usr='uAdmin' START WITH output = ( SELECT dataId FROM data WHERE name = 'Resliced Image1' ) CONNECT BY PRIOR input = output INTERSECT SELECT DISTINCT input, inputName FROM uProcess WHERE usr='uBio' START WITH output = ( SELECT dataId FROM data WHERE name = 'Resliced Image1' ) CONNECT BY PRIOR input = output | ||||||||||||||||||
| Deleted: | |||||||||||||||||||
| < < |
| ||||||||||||||||||
INPUT | |||||||||||||||||||
| Line: 931 to 948 | |||||||||||||||||||
| |||||||||||||||||||
| Deleted: | |||||||||||||||||||
| < < |
11 | ||||||||||||||||||
| Note that as expected, only the right inputs are provided. | |||||||||||||||||||
| Deleted: | |||||||||||||||||||
| < < |
INPUT
----------
1
2
9
10
11
Same results as previously.
| ||||||||||||||||||
Categorisation of queries
According to your provenance approach, you may be able to provide a categorisation of queries. Can you elaborate on the categorisation and its rationale | |||||||||||||||||||
| Line: 997 to 999 | |||||||||||||||||||
| |||||||||||||||||||
| Added: | |||||||||||||||||||
| > > |
| ||||||||||||||||||
Provenance Challenge Template | ||||||||
| Line: 971 to 971 | ||||||||
|---|---|---|---|---|---|---|---|---|
| We would be interested in discussing with the workshop attendees about three points: (i) the meaning of a stage in a workflow (and in particular, how to interpret this notion with respect to user views), (ii) the biological significance of the procedures that can compose a workflow (in the workflow provided, what are the “significant” steps for a biologist?) and (iii) the (general) problem of computing a difference between two workflows (cf. query 7). | ||||||||
| Added: | ||||||||
| > > |
Acknowledgments
This work is supported by NSF grants 0513778, 0415810, and 0612177*. (* Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.) | |||||||
| -- SarahCohenBoulakia - 10 Sep 2006 | ||||||||
Provenance Challenge Template | ||||||||||||||||||||||||||||||||
| Line: 140 to 140 | ||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Provenance Queries | ||||||||||||||||||||||||||||||||
| Changed: | ||||||||||||||||||||||||||||||||
| < < |
Also, make sure you complete the ProvenanceQueriesMatrix. | |||||||||||||||||||||||||||||||
| > > |
| |||||||||||||||||||||||||||||||
| For each query, we provide the corresponding SQL query and the results obtained. We also provide these results in a user-friendly format (as screenshots of our JAVA user interface). | ||||||||||||||||||||||||||||||||
Provenance Challenge Template
Participating Team (Under construction)
| |||||||||||||||||||
| Changed: | |||||||||||||||||||
| < < |
| ||||||||||||||||||
| > > |
| ||||||||||||||||||
| |||||||||||||||||||
| Changed: | |||||||||||||||||||
| < < |
| ||||||||||||||||||
| > > |
| ||||||||||||||||||
| |||||||||||||||||||
| Deleted: | |||||||||||||||||||
| < < |
The following will be updated soon! | ||||||||||||||||||
Workflow Representation | |||||||||||||||||||
| Changed: | |||||||||||||||||||
| < < |
Provide here a description of how you have encoded the Challenge workflow. | ||||||||||||||||||
| > > |
Our aim is to provide a framework to represent minimal information necessary such that we can reason about provenance in scientific workflows. With this in mind, we distinguish between the specification of the workflow and the execution of the workflow. On the one hand, the workflow specification is composed of step-classes (alternatively called procedures, tasks, actors, processes, boxes) such as reslice in the challenge workflow. On the other hand, an execution of a workflow generates a partial order of steps, each of which has a set of input and output data objects. Each step is an instance of a step-class, and the input-output flow of data and class associated with each step must conform to the workflow specification. An example of step in the challenge workflow is “8.reslice” which is an instance of the step-class “reslice”. The proposed model of provenance for workflows can represent the relationship between a step-class (specification) and a step (execution). This is necessary for keeping track of the data and parameters used by each step of a workflow execution as well as to trace the data produced. The model is represented within a relational framework extended with transitive closure. It is implemented under Oracle 10.i and Java. An object layer together with a user interface (JDBC component) is provided. More details on our model and implementation of the Challenge workflow are provided in the following section. | ||||||||||||||||||
Provenance Trace | |||||||||||||||||||
| Changed: | |||||||||||||||||||
| < < |
Upload a representation of the information you captured when executing the workflow. Explain the structure (provide pointers to documents describing your schemas etc.) | ||||||||||||||||||
| > > |
Our model is composed of base tables which provide minimal information about the workflow execution, the workflow specification, and the relationships between the specification and execution. The model is represented through the E/R diagram below. The tables IMMCONTAINS and USERVIEW will be defined in the section Suggested Workflow Variants. The other tables and their schemas are listed below. We list the type of each column (VARCHAR are strings) and the primary keys (in bold). For each table, we give an example based on the Challenge workflow. | ||||||||||||||||||
| Changed: | |||||||||||||||||||
| < < |
Provenance Queries | ||||||||||||||||||
| > > |
Figure 1 – E/R schema for the model of Provenance
Note: The content of the tables is provided at the end of this page (see images tables1.png to tables4.png).
| ||||||||||||||||||
| Changed: | |||||||||||||||||||
| < < |
For each query, if your system can support your query, provide a description of how you implement the query, what result is returned; otherwise, explain whether the query is in the remit of your system. | ||||||||||||||||||
| > > |
DATAID NAME TYPE
-----------------------------------------------------
1 Anatomy Image1 Anatomy Image
DATAID ATTRIBUTE VALUE
-----------------------------------------------------
1 center UChicago
STEP ATTRIBUTE VALUE
-----------------------------------------------------
1 linear false
1 order 12
1 model 1365
1 description -m 12 -q
STEP DATAID TS
-----------------------------------------------------
2 3 8/8/2006
2 4 8/8/2006
2 9 8/8/2006
2 10 8/8/2006
* output
output(step int, dataId int, ts Date) stores information about an output from a step and the timestamp it occurred. As an example, Step 10 (10. slicer) produced one output with a DATAID of 25 (Atlas X Slice) on 08/17/2006.
STEP DATAID TS
-----------------------------------------------------
10 25 8/17/2006
STEP STEPCLASS TS
-----------------------------------------------------
1 align_warp 8/7/2006
2 align_warp 8/8/2006
3 align_warp 8/10/2006
4 align_warp 8/11/2006
STEP STAGE
-----------------------------------------------------
1 1
2 1
3 1
4 1
CREATE VIEW Process
AS
SELECT DISTINCT
instanceOf.step step,
instanceOf.stepClass stepClass,
input.dataId input,
dataInput.name inputName,
output.dataId output,
dataOutput.name outputName,
input.ts time
FROM
instanceOf,
input,
data dataInput,
output,
data dataOutput
WHERE instanceOf.step = input.step
AND input.dataId = dataInput.dataId
AND input.step = output.step
AND output.dataId = dataOutput.dataId
AND input.ts <= output.ts
As an example, step 13 is an execution of the step-class convert. It takes an input of DATAID 25 (Atlas X Slice) and was executed on 08/20/2006. It does not take any parameters. It produces the output of DATAID 28 (Atlas X Graphic).
STEP STEPCLASS INPUT INPUTNAME OUTPUT OUTPUTNAME TIME
----------------------------------------------------------------
13 convert 25 Atlas X Slice 28 Atlas X Graphic 8/20/2006
Provenance Queries | ||||||||||||||||||
| Also, make sure you complete the ProvenanceQueriesMatrix. | |||||||||||||||||||
| Added: | |||||||||||||||||||
| > > |
For each query, we provide the corresponding SQL query and the results obtained. We also provide these results in a user-friendly format (as screenshots of our JAVA user interface).
SELECT DISTINCT step, stepClass, input, inputName, output, outputName
FROM process
START WITH output =
(
SELECT dataId FROM data WHERE name = 'Atlas X Graphic'
)
CONNECT BY PRIOR input = output
ORDER BY step
STEP STEPCLASS INPUT INPUTNAME OUTPUT OUTPUTNAME
----------------------------------------------------------------
1 align_warp 1 Anatomy Image1 11 Warp Parameters1
1 align_warp 10 Reference Header 11 Warp Parameters1
1 align_warp 2 Anatomy Header1 11 Warp Parameters1
1 align_warp 9 Reference Image 11 Warp Parameters1
10 slicer 23 Atlas Image 25 Atlas X Slice
10 slicer 24 Atlas Header 25 Atlas X Slice
13 convert 25 Atlas X Slice 28 Atlas X Graphic
2 align_warp 10 Reference Header 12 Warp Parameters2
2 align_warp 3 Anatomy Image2 12 Warp Parameters2
2 align_warp 4 Anatomy Header2 12 Warp Parameters2
2 align_warp 9 Reference Image 12 Warp Parameters2
3 align_warp 10 Reference Header 13 Warp Parameters3
3 align_warp 5 Anatomy Image3 13 Warp Parameters3
3 align_warp 6 Anatomy Header3 13 Warp Parameters3
3 align_warp 9 Reference Image 13 Warp Parameters3
4 align_warp 10 Reference Header 14 Warp Parameters4
4 align_warp 7 Anatomy Image4 14 Warp Parameters4
4 align_warp 8 Anatomy Header4 14 Warp Parameters4
4 align_warp 9 Reference Image 14 Warp Parameters4
5 reslice 11 Warp Parameters1 15 Resliced Image1
5 reslice 11 Warp Parameters1 16 Resliced Header1
6 reslice 12 Warp Parameters2 17 Resliced Image2
6 reslice 12 Warp Parameters2 18 Resliced Header2
7 reslice 13 Warp Parameters3 19 Resliced Image3
7 reslice 13 Warp Parameters3 20 Resliced Header3
8 reslice 14 Warp Parameters4 21 Resliced Image4
8 reslice 14 Warp Parameters4 22 Resliced Header4
9 softmean 15 Resliced Image1 23 Atlas Image
9 softmean 15 Resliced Image1 24 Atlas Header
9 softmean 16 Resliced Header1 23 Atlas Image
9 softmean 16 Resliced Header1 24 Atlas Header
9 softmean 17 Resliced Image2 23 Atlas Image
9 softmean 17 Resliced Image2 24 Atlas Header
9 softmean 18 Resliced Header2 23 Atlas Image
9 softmean 18 Resliced Header2 24 Atlas Header
9 softmean 19 Resliced Image3 23 Atlas Image
9 softmean 19 Resliced Image3 24 Atlas Header
9 softmean 20 Resliced Header3 23 Atlas Image
9 softmean 20 Resliced Header3 24 Atlas Header
9 softmean 21 Resliced Image4 23 Atlas Image
9 softmean 21 Resliced Image4 24 Atlas Header
9 softmean 22 Resliced Header4 23 Atlas Image
9 softmean 22 Resliced Header4 24 Atlas Header
The results may also be visualized with a user-friendly interface, where the data is grouped by steps. The data can be retrieved either by name or by id (the id option may be useful when a lot of data is returned by a query).
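The CONNECT BY PRIOR input = output clause of Query 1 walks backwards from the requested data item to every step that contributed to it. The following is a minimal Python sketch of that traversal, not our implementation: the list PROCESS is a hand-copied subset of the Process rows (only the chain through Anatomy Image1), and the function name provenance and the tuple layout are illustrative assumptions.

```python
from collections import deque

# (step, stepClass, input, output): a subset of the Process rows, copied by
# hand for illustration. The full view contains many more rows.
PROCESS = [
    (1, "align_warp", 1, 11),
    (1, "align_warp", 2, 11),
    (1, "align_warp", 9, 11),
    (1, "align_warp", 10, 11),
    (5, "reslice", 11, 15),
    (5, "reslice", 11, 16),
    (9, "softmean", 15, 23),
    (9, "softmean", 16, 23),
    (9, "softmean", 15, 24),
    (9, "softmean", 16, 24),
    (10, "slicer", 23, 25),
    (10, "slicer", 24, 25),
    (13, "convert", 25, 28),
]


def provenance(rows, target):
    """Return every row reachable backwards from the data item `target`."""
    found = set()
    seen = {target}
    queue = deque([target])
    while queue:
        data_id = queue.popleft()
        for row in rows:
            if row[3] == data_id:        # this row produced the data item
                found.add(row)
                if row[2] not in seen:   # continue from its input
                    seen.add(row[2])
                    queue.append(row[2])
    return sorted(found)


if __name__ == "__main__":
    for row in provenance(PROCESS, 28):  # 28 is Atlas X Graphic
        print(row)
```

Starting from DATAID 15 (Resliced Image1) instead of 28 returns only the rows of steps 1 and 5, which is the behaviour the hierarchical query relies on.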
SELECT DISTINCT step, stepClass, input, inputName, output, outputName
FROM Process
START WITH output = (
SELECT dataId FROM data WHERE name = 'Atlas X Graphic'
)
CONNECT BY PRIOR input = output
MINUS (
SELECT DISTINCT step, stepClass, input, inputName, output, outputName
FROM Process
START WITH output IN (
SELECT input FROM Process WHERE stepClass = 'softmean'
)
CONNECT BY PRIOR input = output
)
STEP STEPCLASS INPUT INPUTNAME OUTPUT OUTPUTNAME
----------------------------------------------------------------
10 slicer 23 Atlas Image 25 Atlas X Slice
10 slicer 24 Atlas Header 25 Atlas X Slice
13 convert 25 Atlas X Slice 28 Atlas X Graphic
9 softmean 15 Resliced Image1 23 Atlas Image
9 softmean 15 Resliced Image1 24 Atlas Header
9 softmean 16 Resliced Header1 23 Atlas Image
9 softmean 16 Resliced Header1 24 Atlas Header
9 softmean 17 Resliced Image2 23 Atlas Image
9 softmean 17 Resliced Image2 24 Atlas Header
9 softmean 18 Resliced Header2 23 Atlas Image
9 softmean 18 Resliced Header2 24 Atlas Header
9 softmean 19 Resliced Image3 23 Atlas Image
9 softmean 19 Resliced Image3 24 Atlas Header
9 softmean 20 Resliced Header3 23 Atlas Image
9 softmean 20 Resliced Header3 24 Atlas Header
9 softmean 21 Resliced Image4 23 Atlas Image
9 softmean 21 Resliced Image4 24 Atlas Header
9 softmean 22 Resliced Header4 23 Atlas Image
9 softmean 22 Resliced Header4 24 Atlas Header
SELECT DISTINCT step, stepClass, input, inputName, output, outputName
FROM process
WHERE step IN (
SELECT step FROM StageInstance
WHERE (
stage in (3,4,5)
)
)
START WITH output = (
SELECT dataId FROM data WHERE name = 'Atlas X Graphic'
)
CONNECT BY PRIOR input = output
STEP STEPCLASS INPUT INPUTNAME OUTPUT OUTPUTNAME
----------------------------------------------------------------
9 softmean 19 Resliced Image3 23 Atlas Image
9 softmean 21 Resliced Image4 23 Atlas Image
9 softmean 17 Resliced Image2 24 Atlas Header
9 softmean 20 Resliced Header3 24 Atlas Header
9 softmean 15 Resliced Image1 24 Atlas Header
9 softmean 19 Resliced Image3 24 Atlas Header
13 convert 25 Atlas X Slice 28 Atlas X Graphic
9 softmean 15 Resliced Image1 23 Atlas Image
9 softmean 18 Resliced Header2 23 Atlas Image
9 softmean 16 Resliced Header1 23 Atlas Image
9 softmean 18 Resliced Header2 24 Atlas Header
9 softmean 17 Resliced Image2 23 Atlas Image
10 slicer 23 Atlas Image 25 Atlas X Slice
9 softmean 22 Resliced Header4 23 Atlas Image
9 softmean 16 Resliced Header1 24 Atlas Header
9 softmean 22 Resliced Header4 24 Atlas Header
9 softmean 20 Resliced Header3 23 Atlas Image
10 slicer 24 Atlas Header 25 Atlas X Slice
9 softmean 21 Resliced Image4 24 Atlas Header
SELECT step, stepClass, input, inputName, output, outputName, time
FROM process
WHERE stepClass = 'align_warp'
AND EXISTS
(
SELECT value
FROM stepParams
WHERE stepParams.step = process.step
AND attribute='order'
AND value='12'
)
AND EXISTS
(
SELECT value
FROM stepParams
WHERE stepParams.step = process.step
AND attribute='model'
AND value='1365'
)
AND rtrim(to_char(time, 'DAY')) = 'MONDAY'
STEP STEPCLASS INPUT INPUTNAME OUTPUT OUTPUTNAME TIME
----------------------------------------------------------------
1 align_warp 1 Anatomy Image1 11 Warp Parameters1 8/7/2006
1 align_warp 2 Anatomy Header1 11 Warp Parameters1 8/7/2006
1 align_warp 10 Reference Header 11 Warp Parameters1 8/7/2006
1 align_warp 9 Reference Image 11 Warp Parameters1 8/7/2006
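The day-of-week test in Query 4 relies on to_char(time, 'DAY'), whose result depends on the date language of the Oracle session. As a cross-check that is independent of that setting, the small Python sketch below applies the same Monday filter to the align_warp timestamps of the instanceOf example. The dictionary STEP_DATES is copied by hand from that table, and the helper name steps_run_on is hypothetical.

```python
from datetime import date

# date.weekday() returns 0 for Monday through 6 for Sunday.
WEEKDAYS = ["MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY",
            "FRIDAY", "SATURDAY", "SUNDAY"]

# step -> execution date, copied from the instanceOf example table.
STEP_DATES = {
    1: date(2006, 8, 7),
    2: date(2006, 8, 8),
    3: date(2006, 8, 10),
    4: date(2006, 8, 11),
}


def steps_run_on(weekday_name, step_dates):
    """Return the steps whose date falls on the given English weekday name."""
    wanted = weekday_name.strip().upper()
    return sorted(step for step, day in step_dates.items()
                  if WEEKDAYS[day.weekday()] == wanted)


if __name__ == "__main__":
    print(steps_run_on("Monday", STEP_DATES))
```

Only step 1 (8/7/2006) falls on a Monday, which agrees with the rows returned by the SQL query.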
SELECT name
FROM data
WHERE dataId IN
(
SELECT output
FROM process
START WITH input IN (
SELECT data.dataId
FROM data, dataAttributes
WHERE data.dataId = dataAttributes.dataId
AND data.type = 'Anatomy Header'
AND dataAttributes.attribute = 'global maximum'
AND dataAttributes.value = '4095'
)
CONNECT BY PRIOR output=input
)
AND type = 'Atlas Graphic'
NAME
---------------------
Atlas X Graphic
Atlas Y Graphic
Atlas Z Graphic
SELECT distinct pred.orig, data.name
FROM
(
SELECT distinct CONNECT_BY_ROOT step orig, step
FROM process
START WITH stepClass='softmean'
CONNECT BY PRIOR input=output
) pred,
output,
data
WHERE
EXISTS (
SELECT *
FROM stepParams
WHERE attribute = 'order'
AND value = 12
AND stepParams.step = pred.step
)
AND EXISTS
(
SELECT *
FROM stepParams
WHERE attribute = 'model'
AND value = 1365
AND stepParams.step = pred.step
)
AND pred.orig = output.step
AND output.dataId = data.dataId
ORIG NAME
---------------------------------
9 Atlas Header
9 Atlas Image
SELECT distinct data.name
FROM process, dataAttributes, data
WHERE process.stepClass='align_warp'
AND process.input = dataAttributes.dataId
AND dataAttributes.attribute = 'center'
AND dataAttributes.value = 'UChicago'
AND process.output = data.dataId
NAME
------------------
Warp Parameters2
Warp Parameters1
SELECT distinct data.dataId, data.name, attribute, value
FROM data, dataAttributes
WHERE
data.type='Atlas Graphic'
AND data.dataId = dataAttributes.dataId
AND dataAttributes.attribute = 'studyModality'
AND dataAttributes.value IN ('speech', 'visual', 'audio')
DATAID NAME ATTRIBUTE VALUE
---------------------------------------------------------------------
29 Atlas Y Graphic studyModality audio
29 Atlas Y Graphic studyModality visual
30 Atlas Z Graphic studyModality speech | ||||||||||||||||||
Suggested Workflow Variants | |||||||||||||||||||
| Changed: | |||||||||||||||||||
| < < |
Suggest variants of the workflow that can exhibit capabilities that your system support. | ||||||||||||||||||
| > > |
The model we propose takes into account the capabilities of many of the current scientific workflow systems; one such capability is the idea of composite step-classes, that is, step-classes which are composed of step-classes. There are several reasons why composite step-classes are useful in workflows. First, they help to manage the complexity of a workflow; users may wish to focus on a certain level of abstraction and ignore the lower levels of detail. Second, composite step-classes can also represent levels of “authorization”; users without the appropriate clearance level would not be allowed to see the lower level executions of a step-class. | ||||||||||||||||||
| Added: | |||||||||||||||||||
| > > |
Typically, an input to a composite step-class is also an input to one or more of its substep classes while an output of a substep class is either an input to another substep class or becomes the output of a composite step class. Also, composite step-classes define a partition of the step-classes of the workflow (there is no step-class which belongs to multiple composite step-classes). | ||||||||||||||||||
| Changed: | |||||||||||||||||||
| < < |
Suggested Queries | ||||||||||||||||||
| > > |
We have therefore defined the following notion of user views:
Given a workflow specification, the user view of a user (or class of users) U, userView(U), is the set of lowest-level step classes that U is entitled to see.
Note that a user view cannot contain two step classes such that one is contained in the other. We also assume that the user view is valid, i.e. that each of the highest-level step classes in the workflow specification is either in the view, or that at some lower-level all of its contained substeps are in the user view.
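The two conditions on a user view (no step class nested inside another one of the view, and every top-level step class covered) can be checked mechanically. The Python sketch below illustrates this under stated assumptions: box1 follows the example used on this page, whereas the contents of box2 and box3 and all function names are assumptions made only for illustration.

```python
# Immediate containment between step-classes. box1 matches the example in
# this page; the contents of box2 and box3 are assumptions for illustration.
IMM_CONTAINS = {
    "box1": ["align_warp", "reslice"],
    "box2": ["slicer", "convert"],
    "box3": ["box1", "softmean", "box2"],
}
TOP_LEVEL = ["box3"]  # highest-level step classes of the specification


def descendants(step_class):
    """All step-classes contained, directly or indirectly, in `step_class`."""
    result = set()
    for child in IMM_CONTAINS.get(step_class, []):
        result.add(child)
        result |= descendants(child)
    return result


def is_covered(step_class, view):
    """True if `step_class` is in the view or all its substeps are covered."""
    if step_class in view:
        return True
    children = IMM_CONTAINS.get(step_class, [])
    return bool(children) and all(is_covered(c, view) for c in children)


def is_valid_view(view):
    """Check the nesting and coverage conditions for a user view."""
    view = set(view)
    no_nesting = all(not (descendants(s) & view) for s in view)
    return no_nesting and all(is_covered(s, view) for s in TOP_LEVEL)


if __name__ == "__main__":
    print(is_valid_view({"box3"}))                      # uBlackBox
    print(is_valid_view({"box1", "softmean", "box2"}))  # uBio
    print(is_valid_view({"box1", "align_warp"}))        # nested, rejected
```

Under these assumptions, the views of uBlackBox, uBio and uAdmin all pass the check, while a view that lists both box1 and align_warp is rejected because one is contained in the other.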
To answer questions of provenance, we must take the user view into account and reason about the input and output to steps which are instances of step-classes that are in the user view. We must know the containment relationship between step-classes and the relationship between each user and the step-classes he is entitled to see.
In the following, we introduce the part of our model which deals with user views and we will give examples based on a variation of the provenance workflow in the challenge. In the following discussion, we refer to composite steps as “boxes”.
Figure 2 – Workflow for the challenge with user views (boxes 1 to 3)
UserView Application
In this section, we introduce some additional tables which are needed for reasoning about provenance in the context of user views:
COMPOSTEPCLASS STEPCLASS
---------------------------------
box1 align_warp
box1 reslice
CREATE VIEW contains
AS
SELECT DISTINCT CONNECT_BY_ROOT compoStepClass as compoStepClass, stepClass
FROM immcontains
CONNECT BY PRIOR stepClass=compoStepClass
ORDER BY compoStepClass
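This view is the transitive closure of the immediate containment relation: a composite step-class contains whatever its substep classes contain. A minimal Python sketch of the same computation follows. The pairs in IMM_CONTAINS assume the variant in which box3 contains box1, and the function name containment_closure is illustrative only.

```python
# (compoStepClass, stepClass) pairs of the immediate containment relation,
# assuming the variant in which box3 contains box1.
IMM_CONTAINS = [
    ("box1", "align_warp"),
    ("box1", "reslice"),
    ("box3", "box1"),
]


def containment_closure(imm_contains):
    """Return all (composite, step-class) pairs, direct or indirect."""
    closure = set(imm_contains)
    changed = True
    while changed:                      # repeat until no new pair appears
        changed = False
        for outer, mid in list(closure):
            for mid2, inner in imm_contains:
                if mid == mid2 and (outer, inner) not in closure:
                    closure.add((outer, inner))
                    changed = True
    return sorted(closure)


if __name__ == "__main__":
    for pair in containment_closure(IMM_CONTAINS):
        print(pair)
```

With these three immediate pairs, the closure adds (box3, align_warp) and (box3, reslice), so box3 is known to contain everything that box1 contains.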
CREATE VIEW virtualInstanceOf
AS
SELECT step step, stepClass, ts
FROM instanceOf
UNION
select min(step) || '-' || compoStepClass, compoStepClass, min(ts)
from contains, instanceOf
where contains.stepClass = instanceOf.stepClass
group By compoStepClass
For example, step 1 is an instance of align_warp (as previously).
Step 1-box1 denotes the first step contained in box1; it is an instance of the composite step-class box1.
STEP STEPCLASS TS
-------------------------------
1 align_warp 8/7/2006
1-box1 box1 8/7/2006
USR STEPCLASS
--------------------------------
uBlackBox box3
uBio box1
uBio softmean
uBio box2
uAdmin align_warp
uAdmin reslice
uAdmin softmean
uAdmin slicer
uAdmin convert
The following two views, cInput and cOutput, are input and output extended to composite steps.
CREATE VIEW cInput
AS
-- First union operand: for each containment, a data item is an input of
-- the higher-level step class if and only if it was not an output
-- of another of its substeps
SELECT inst2.step, input.dataId, min(input.ts) ts
FROM input,
instanceOf inst1,
contains,
virtualInstanceOf inst2
WHERE
input.step = inst1.step
AND inst1.stepClass = contains.stepClass
AND contains.compoStepClass = inst2.stepClass
AND NOT EXISTS
(
SELECT o.step
FROM output o, instanceOf inst3, contains c2
WHERE o.dataId = input.dataId
AND o.step = inst3.step
AND inst3.stepClass = c2.stepClass
AND c2.compoStepClass = contains.compoStepClass
)
GROUP BY inst2.step, input.dataId
-- Second union operand
-- We want to keep the direct input relations
UNION
SELECT step, dataId, ts
FROM input
As an example, the step 10-box2 takes DATAID 23 (Atlas Image) and DATAID 24 (Atlas Header) as inputs.
STEP DATAID TS
--------------------------
10-box2 23 8/17/2006
10-box2 24 8/17/2006
CREATE VIEW coutput
AS
-- First union operand: for each containment, a data item is an output of
-- the higher-level step class if and only if it is used as an input
-- by a step class which is not contained in the same step class,
-- or if it is a final output of the workflow
SELECT inst2.step, output.dataId, output.ts
FROM output,
instanceOf inst1,
contains,
virtualInstanceOf inst2
WHERE
output.step = inst1.step
AND inst1.stepClass = contains.stepClass
AND contains.compoStepClass = inst2.stepClass
AND EXISTS
(
/* The output is used as an input of
 * at least one step whose step class
 * is not a sub-step of this step
 * class */
SELECT i.step
FROM input i, instanceOf inst3
WHERE i.dataId = output.dataId
AND i.step = inst3.step
AND NOT EXISTS
(
SELECT contains.compoStepClass
FROM contains
WHERE contains.stepClass = inst3.stepClass
AND contains.compoStepClass = inst2.stepClass
)
UNION
-- Or the output is a final output (never used as an input)
(
SELECT output.dataId
FROM dual
MINUS
SELECT i.dataId
FROM input i
)
)
UNION
-- Second union operand
-- We want to keep the direct output relations
SELECT step, dataId, ts
FROM output
As an example, step 10-box2 produced data ID 28 to 30 (Atlas Graphic X, Y and Z) on 08/20/2006, 08/21/2006, and 08/22/2006 respectively.
STEP     DATAID  TS
----------------------------
10-box2  28      8/20/2006
10-box2  29      8/21/2006
10-box2  30      8/22/2006
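The two conditions in the first union operand of cOutput (consumed outside the composite step, or never consumed at all) can be sketched the same way. This is an illustrative restatement, not the original code: composite_outputs is a hypothetical name, and the sample covers only steps whose rows appear in the uAdmin listing on this page (steps 1 to 10 and 13).

```python
def composite_outputs(sub_steps, inputs, outputs):
    """Data ids that a composite step exposes as its own outputs.

    An output of a sub-step is exposed if a step outside the composite
    reads it, or if no step reads it at all (final output).
    """
    produced = set()
    for step in sub_steps:
        produced |= outputs.get(step, set())
    read_outside = set()
    read_anywhere = set()
    for step, data_ids in inputs.items():
        read_anywhere |= data_ids
        if step not in sub_steps:
            read_outside |= data_ids
    return {d for d in produced if d in read_outside or d not in read_anywhere}


inputs = {
    1: {1, 2, 9, 10}, 2: {3, 4, 9, 10}, 3: {5, 6, 9, 10}, 4: {7, 8, 9, 10},
    5: {11}, 6: {12}, 7: {13}, 8: {14},
    9: set(range(15, 23)), 10: {23, 24}, 13: {25},
}
outputs = {
    1: {11}, 2: {12}, 3: {13}, 4: {14},
    5: {15, 16}, 6: {17, 18}, 7: {19, 20}, 8: {21, 22},
    9: {23, 24}, 10: {25}, 13: {28},
}

box1_outputs = composite_outputs(set(range(1, 9)), inputs, outputs)
partial_box2_outputs = composite_outputs({10, 13}, inputs, outputs)
print(sorted(box1_outputs), sorted(partial_box2_outputs))
```

For box1 the warp parameters 11 to 14 stay hidden because only reslice, a sub-step, reads them, while 15 to 22 are exposed because softmean reads them from outside. For the two box2 steps in the sample, 28 is exposed as a final output.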
CREATE VIEW UProcess
AS
SELECT DISTINCT
userView.usr usr,
instanceOf.step step,
instanceOf.stepClass stepClass,
input.dataId input,
dataInput.name inputName,
output.dataId output,
dataOutput.name outputName,
input.ts time
FROM
userView,
virtualInstanceOf instanceOf,
cinput input,
data dataInput,
coutput output,
data dataOutput
WHERE
userView.stepClass = instanceOf.stepClass
AND instanceOf.step = input.step
AND input.dataId = dataInput.dataId
AND input.step = output.step
AND output.dataId = dataOutput.dataId
AND input.ts <= output.ts
--ORDER BY usr, step, input
As an example, let us consider three user views and their perspectives on stage 1 of the workflow (align_warp).
The user ‘uAdmin’ can see all the steps of the workflow. First, he can see the warp parameters produced (DATAID 11 to 14). Then, he can see that only data IDs 1, 2, 9, and 10 are used as inputs to step 1.
The user ‘uBio’ sees stages 1 and 2 of the workflow as a single black box. First, he cannot see the data whose DATAIDs are 11 to 13. Second, for him, DATAIDs 1 through 10 are used to produce DATAIDs 15 through 22 as outputs.
The user ‘uBlackBox’ cannot see the details of step 1 either. He is in the same situation as uBio, except that he cannot see the data whose DATAIDs are 15 through 22. The only outputs that he can see are the final outputs, that is, DATAIDs 28 through 30.
The implications of user views for query answering are discussed in the following section.
Suggested Queries
In this section, we answer query 1 considering user views and we introduce two other queries to emphasize the benefit of our approach. We will show that user views allow
SELECT DISTINCT step, stepClass, input, inputName, output, outputName
FROM uProcess
WHERE usr='uBlackBox'
START WITH output = (
SELECT dataId FROM data WHERE name = 'Atlas X Graphic'
)
CONNECT BY PRIOR input = output
ORDER BY step;
STEP     STEPCLASS   INPUT  INPUTNAME          OUTPUT  OUTPUTNAME
-----------------------------------------------------------------------------
1-box3   box3        1      Anatomy Image1     28      Atlas X Graphic
1-box3   box3        10     Reference Header   28      Atlas X Graphic
1-box3   box3        2      Anatomy Header1    28      Atlas X Graphic
1-box3   box3        3      Anatomy Image2     28      Atlas X Graphic
1-box3   box3        4      Anatomy Header2    28      Atlas X Graphic
1-box3   box3        5      Anatomy Image3     28      Atlas X Graphic
1-box3   box3        6      Anatomy Header3    28      Atlas X Graphic
1-box3   box3        7      Anatomy Image4     28      Atlas X Graphic
1-box3   box3        8      Anatomy Header4    28      Atlas X Graphic
1-box3   box3        9      Reference Image    28      Atlas X Graphic
STEP     STEPCLASS   INPUT  INPUTNAME          OUTPUT  OUTPUTNAME
-----------------------------------------------------------------------------
1-box1   box1        1      Anatomy Image1     15      Resliced Image1
1-box1   box1        1      Anatomy Image1     16      Resliced Header1
1-box1   box1        1      Anatomy Image1     17      Resliced Image2
1-box1   box1        1      Anatomy Image1     18      Resliced Header2
1-box1   box1        1      Anatomy Image1     19      Resliced Image3
1-box1   box1        1      Anatomy Image1     20      Resliced Header3
1-box1   box1        1      Anatomy Image1     21      Resliced Image4
1-box1   box1        1      Anatomy Image1     22      Resliced Header4
1-box1   box1        10     Reference Header   15      Resliced Image1
1-box1   box1        10     Reference Header   16      Resliced Header1
1-box1   box1        10     Reference Header   17      Resliced Image2
1-box1   box1        10     Reference Header   18      Resliced Header2
1-box1   box1        10     Reference Header   19      Resliced Image3
1-box1   box1        10     Reference Header   20      Resliced Header3
1-box1   box1        10     Reference Header   21      Resliced Image4
1-box1   box1        10     Reference Header   22      Resliced Header4
1-box1   box1        2      Anatomy Header1    15      Resliced Image1
1-box1   box1        2      Anatomy Header1    16      Resliced Header1
1-box1   box1        2      Anatomy Header1    17      Resliced Image2
1-box1   box1        2      Anatomy Header1    18      Resliced Header2
1-box1   box1        2      Anatomy Header1    19      Resliced Image3
1-box1   box1        2      Anatomy Header1    20      Resliced Header3
1-box1   box1        2      Anatomy Header1    21      Resliced Image4
1-box1   box1        2      Anatomy Header1    22      Resliced Header4
1-box1   box1        3      Anatomy Image2     15      Resliced Image1
1-box1   box1        3      Anatomy Image2     16      Resliced Header1
1-box1   box1        3      Anatomy Image2     17      Resliced Image2
1-box1   box1        3      Anatomy Image2     18      Resliced Header2
1-box1   box1        3      Anatomy Image2     19      Resliced Image3
1-box1   box1        3      Anatomy Image2     20      Resliced Header3
1-box1   box1        3      Anatomy Image2     21      Resliced Image4
1-box1   box1        3      Anatomy Image2     22      Resliced Header4
1-box1   box1        4      Anatomy Header2    15      Resliced Image1
1-box1   box1        4      Anatomy Header2    16      Resliced Header1
1-box1   box1        4      Anatomy Header2    17      Resliced Image2
1-box1   box1        4      Anatomy Header2    18      Resliced Header2
1-box1   box1        4      Anatomy Header2    19      Resliced Image3
1-box1   box1        4      Anatomy Header2    20      Resliced Header3
1-box1   box1        4      Anatomy Header2    21      Resliced Image4
1-box1   box1        4      Anatomy Header2    22      Resliced Header4
1-box1   box1        5      Anatomy Image3     15      Resliced Image1
1-box1   box1        5      Anatomy Image3     16      Resliced Header1
1-box1   box1        5      Anatomy Image3     17      Resliced Image2
1-box1   box1        5      Anatomy Image3     18      Resliced Header2
1-box1   box1        5      Anatomy Image3     19      Resliced Image3
1-box1   box1        5      Anatomy Image3     20      Resliced Header3
1-box1   box1        5      Anatomy Image3     21      Resliced Image4
1-box1   box1        5      Anatomy Image3     22      Resliced Header4
1-box1   box1        6      Anatomy Header3    15      Resliced Image1
1-box1   box1        6      Anatomy Header3    16      Resliced Header1
1-box1   box1        6      Anatomy Header3    17      Resliced Image2
1-box1   box1        6      Anatomy Header3    18      Resliced Header2
1-box1   box1        6      Anatomy Header3    19      Resliced Image3
1-box1   box1        6      Anatomy Header3    20      Resliced Header3
1-box1   box1        6      Anatomy Header3    21      Resliced Image4
1-box1   box1        6      Anatomy Header3    22      Resliced Header4
1-box1   box1        7      Anatomy Image4     15      Resliced Image1
1-box1   box1        7      Anatomy Image4     16      Resliced Header1
1-box1   box1        7      Anatomy Image4     17      Resliced Image2
1-box1   box1        7      Anatomy Image4     18      Resliced Header2
1-box1   box1        7      Anatomy Image4     19      Resliced Image3
1-box1   box1        7      Anatomy Image4     20      Resliced Header3
1-box1   box1        7      Anatomy Image4     21      Resliced Image4
1-box1   box1        7      Anatomy Image4     22      Resliced Header4
1-box1   box1        8      Anatomy Header4    15      Resliced Image1
1-box1   box1        8      Anatomy Header4    16      Resliced Header1
1-box1   box1        8      Anatomy Header4    17      Resliced Image2
1-box1   box1        8      Anatomy Header4    18      Resliced Header2
1-box1   box1        8      Anatomy Header4    19      Resliced Image3
1-box1   box1        8      Anatomy Header4    20      Resliced Header3
1-box1   box1        8      Anatomy Header4    21      Resliced Image4
1-box1   box1        8      Anatomy Header4    22      Resliced Header4
1-box1   box1        9      Reference Image    15      Resliced Image1
1-box1   box1        9      Reference Image    16      Resliced Header1
1-box1   box1        9      Reference Image    17      Resliced Image2
1-box1   box1        9      Reference Image    18      Resliced Header2
1-box1   box1        9      Reference Image    19      Resliced Image3
1-box1   box1        9      Reference Image    20      Resliced Header3
1-box1   box1        9      Reference Image    21      Resliced Image4
1-box1   box1        9      Reference Image    22      Resliced Header4
10-box2  box2        23     Atlas Image        28      Atlas X Graphic
10-box2  box2        24     Atlas Header       28      Atlas X Graphic
9        softmean    15     Resliced Image1    23      Atlas Image
9        softmean    15     Resliced Image1    24      Atlas Header
9        softmean    16     Resliced Header1   23      Atlas Image
9        softmean    16     Resliced Header1   24      Atlas Header
9        softmean    17     Resliced Image2    23      Atlas Image
9        softmean    17     Resliced Image2    24      Atlas Header
9        softmean    18     Resliced Header2   23      Atlas Image
9        softmean    18     Resliced Header2   24      Atlas Header
9        softmean    19     Resliced Image3    23      Atlas Image
9        softmean    19     Resliced Image3    24      Atlas Header
9        softmean    20     Resliced Header3   23      Atlas Image
9        softmean    20     Resliced Header3   24      Atlas Header
9        softmean    21     Resliced Image4    23      Atlas Image
9        softmean    21     Resliced Image4    24      Atlas Header
9        softmean    22     Resliced Header4   23      Atlas Image
9        softmean    22     Resliced Header4   24      Atlas Header
STEP     STEPCLASS   INPUT  INPUTNAME          OUTPUT  OUTPUTNAME
--------------------------------------------------------------------------
1        align_warp  1      Anatomy Image1     11      Warp Parameters1
1        align_warp  10     Reference Header   11      Warp Parameters1
1        align_warp  2      Anatomy Header1    11      Warp Parameters1
1        align_warp  9      Reference Image    11      Warp Parameters1
10       slicer      23     Atlas Image        25      Atlas X Slice
10       slicer      24     Atlas Header       25      Atlas X Slice
13       convert     25     Atlas X Slice      28      Atlas X Graphic
2        align_warp  10     Reference Header   12      Warp Parameters2
2        align_warp  3      Anatomy Image2     12      Warp Parameters2
2        align_warp  4      Anatomy Header2    12      Warp Parameters2
2        align_warp  9      Reference Image    12      Warp Parameters2
3        align_warp  10     Reference Header   13      Warp Parameters3
3        align_warp  5      Anatomy Image3     13      Warp Parameters3
3        align_warp  6      Anatomy Header3    13      Warp Parameters3
3        align_warp  9      Reference Image    13      Warp Parameters3
4        align_warp  10     Reference Header   14      Warp Parameters4
4        align_warp  7      Anatomy Image4     14      Warp Parameters4
4        align_warp  8      Anatomy Header4    14      Warp Parameters4
4        align_warp  9      Reference Image    14      Warp Parameters4
5        reslice     11     Warp Parameters1   15      Resliced Image1
5        reslice     11     Warp Parameters1   16      Resliced Header1
6        reslice     12     Warp Parameters2   17      Resliced Image2
6        reslice     12     Warp Parameters2   18      Resliced Header2
7        reslice     13     Warp Parameters3   19      Resliced Image3
7        reslice     13     Warp Parameters3   20      Resliced Header3
8        reslice     14     Warp Parameters4   21      Resliced Image4
8        reslice     14     Warp Parameters4   22      Resliced Header4
9        softmean    15     Resliced Image1    23      Atlas Image
9        softmean    15     Resliced Image1    24      Atlas Header
9        softmean    16     Resliced Header1   23      Atlas Image
9        softmean    16     Resliced Header1   24      Atlas Header
9        softmean    17     Resliced Image2    23      Atlas Image
9        softmean    17     Resliced Image2    24      Atlas Header
9        softmean    18     Resliced Header2   23      Atlas Image
9        softmean    18     Resliced Header2   24      Atlas Header
9        softmean    19     Resliced Image3    23      Atlas Image
9        softmean    19     Resliced Image3    24      Atlas Header
9        softmean    20     Resliced Header3   23      Atlas Image
9        softmean    20     Resliced Header3   24      Atlas Header
9        softmean    21     Resliced Image4    23      Atlas Image
9        softmean    21     Resliced Image4    24      Atlas Header
9        softmean    22     Resliced Header4   23      Atlas Image
9        softmean    22     Resliced Header4   24      Atlas Header
SELECT *
FROM uProcess upc
WHERE usr = 'uBlackBox'
START WITH outputName = 'Resliced Image1'
CONNECT BY PRIOR upc.output = upc.input;
INPUT
----------
1
2
3
4
5
6
7
8
9
10
Note that, as expected, many inputs are returned, since this user sees the first two stages as a black box. Consequently, he cannot distinguish which input is used to produce which output. This can be useful for security reasons.
INPUT
----------
1
2
9
10
11
Since uAdmin can see everything, he knows that only DATAIDs 1, 2 and 9 through 11 have been used as inputs.
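The difference between the two answers can be reproduced with a small backward traversal over uProcess-style rows. The sketch below is an assumption-laden illustration rather than the original query: upstream_inputs is a hypothetical helper, rows are reduced to (step, input, output) triples, and only the rows relevant to Resliced Image1 (DATAID 15) are included.

```python
from collections import deque


def upstream_inputs(rows, target):
    """Return, sorted, every data id reachable backwards from `target`.

    rows: iterable of (step, input_id, output_id) triples, i.e. what one
    user is allowed to see of uProcess.
    """
    producers = {}
    for _step, input_id, output_id in rows:
        producers.setdefault(output_id, set()).add(input_id)
    seen = set()
    queue = deque([target])
    while queue:
        data_id = queue.popleft()
        for input_id in producers.get(data_id, ()):
            if input_id not in seen:
                seen.add(input_id)
                queue.append(input_id)
    return sorted(seen)


# uAdmin: align_warp (step 1) feeds reslice (step 5), which produces 15.
admin_rows = [(1, i, 11) for i in (1, 2, 9, 10)] + [(5, 11, 15)]
# A user for whom stages 1 and 2 are one black box (step 1-box1).
box_rows = [("1-box1", i, 15) for i in range(1, 11)]

admin_inputs = upstream_inputs(admin_rows, 15)
box_inputs = upstream_inputs(box_rows, 15)
print(admin_inputs, box_inputs)
```

Under these assumptions the fine-grained view yields the five inputs listed for uAdmin, while the black-box view can only report all ten raw inputs.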
INPUT
----------
1
2
9
10
11
Note that as expected, only the right inputs are provided.
INPUT
----------
1
2
9
10
11
Same results as previously.
Categorisation of queries
According to your provenance approach, you may be able to provide a categorisation of queries. Can you elaborate on the categorisation and its rationale?
Generally speaking, our provenance queries can be categorized as follows:
Live systems
Our implementation is not presently available to the public at large. However, we will be happy to share the code with those who are interested in seeing it. Please send us an email.
Further Comments
Conclusions
We have shown how our model is able to represent the challenge workflow and to answer the proposed queries. We have also shown how the notion of a user view allows the user to manage the complexity of a workflow through higher levels of abstraction and layering.
We would be interested in discussing three points with the workshop attendees: (i) the meaning of a stage in a workflow (and in particular, how to interpret this notion with respect to user views), (ii) the biological significance of the procedures that can compose a workflow (in the workflow provided, what are the “significant” steps for a biologist?), and (iii) the general problem of computing a difference between two workflows (cf. query 7).
-- SarahCohenBoulakia - 10 Sep 2006