An iteration strategy must be defined when a processor has two or more input
ports. By default, the workflow parser applies a dot iteration strategy to all
inputs. Iteration strategies are built from operators that combine data items
according to the index of the items received or produced by workflow
processors. For a data item produced by a source, the index is its order
number in the source data file; for a data item produced by a standard
processor, the index is derived from the indexes of the input data items,
possibly combined by the operators. There are 4 data manipulation operators:
- dot (groups 2 or more ports): data from the different
ports are processed together when their indexes match exactly (data
with index 0 of one port is matched with data with index 0 of the
other ports). The output index is the same as the index of the input
data.
- cross (groups 2 ports): processes each data instance of
the first port with each data instance of the second port. This
operator increases the index depth of the output by one (for
example: if data inputs have indexes 0 and 1 then the output has
the index 0_1).
- flatcross (groups 2 ports): same as cross but with a
different output indexing scheme. This operator does not increase
the depth of the output index but creates new indexes (e.g.,
if data inputs have indexes 1 and 2 with a maximum index of 3 for
the right input, then the output has the index 6 = 4 * 1 + 2). Note
that this operator creates a synchronization constraint among all
instances, as the workflow engine must know the maximum index of the
right input before it can create new indexes.
- match (groups 2 ports): processes each data instance of
the first port with all the data instances of the second port that
have an index prefix that matches the first port's index (for
example: if left data has index 1_1, it will be processed with all
right data having an index beginning with 1_1). The output
index is the second port's index.
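The index rules of the four operators can be illustrated with a small Python
sketch. This is not the DIET workflow engine: the function names, the use of
tuples for indexes (1_1 is written (1, 1)) and the dictionaries standing for
ports are assumptions made for illustration only. The flatcross numbering in
particular assumes depth-1 indexes laid out row by row, which is one reading of
the description above.

```python
from itertools import product

# Each port is modelled as a dict mapping an index tuple to a data item.
# The index written 1_1 in the text is the tuple (1, 1) here.

def dot(*ports):
    """Group items whose indexes match exactly; the index is unchanged."""
    common = set(ports[0]).intersection(*ports[1:])
    return {idx: tuple(p[idx] for p in ports) for idx in sorted(common)}

def cross(left, right):
    """Pair every left item with every right item; indexes are concatenated."""
    return {li + ri: (lv, rv)
            for (li, lv), (ri, rv) in product(left.items(), right.items())}

def flatcross(left, right):
    """Same pairs as cross, but with a flat index (assumed depth-1 inputs).

    The maximum right index must be known first, which mirrors the
    synchronization constraint mentioned in the text.
    """
    width = max(ri[0] for ri in right) + 1
    return {(li[0] * width + ri[0],): (lv, rv)
            for (li, lv), (ri, rv) in product(left.items(), right.items())}

def match(left, right):
    """Pair each left item with right items whose index starts with its index."""
    return {ri: (lv, rv)
            for (li, lv), (ri, rv) in product(left.items(), right.items())
            if ri[:len(li)] == li}

# Hypothetical data: two left items, four right items (maximum right index 3).
a = {(0,): "a0", (1,): "a1"}
b = {(0,): "b0", (1,): "b1", (2,): "b2", (3,): "b3"}

print(dot(a, b))              # only indexes 0 and 1 exist on both ports
print(cross(a, b)[(0, 1)])    # ('a0', 'b1'), index 0_1
print(flatcross(a, b)[(6,)])  # ('a1', 'b2'), index 6 = 4 * 1 + 2
print(match({(1, 1): "x"}, {(1, 1, 0): "y0", (1, 2, 0): "z"}))
```

In this sketch cross and flatcross produce the same pairs and differ only in
how the output is numbered, while match keeps the index of the second port.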
Here is an example of a Gwendia workflow (to be continued with the links part
below):
<workflow>
  <interface>
    <constant name="parameter" type="integer" value="50"/>
    <source name="key" type="double" />
    <sink name="results" type="file" />
  </interface>
  <processors>
    <processor name="genParam">
      <in name="paramKey" type="double"/>
      <out name="paramFiles" type="file" depth="1"/>
      <diet path="gen" estimation="constant"/>
    </processor>
    <processor name="docking">
      <in name="param" type="integer" />
      <in name="input" type="file" />
      <out name="result" type="double" />
      <iterationstrategy>
        <cross>
          <port name="param" />
          <port name="input" />
        </cross>
      </iterationstrategy>
      <diet path="dock" estimation="constant"/>
    </processor>
    <processor name="statisticaltest">
      <in name="values" type="double" depth="1"/>
      <out name="result" type="file"/>
      <iterationstrategy>
        <cross>
          <port name="coefficient" />
          <match>
            <port name="values" />
            <port name="weights" />
          </match>
        </cross>
      </iterationstrategy>
      <diet path="weightedaverage" />
    </processor>
  </processors>
  <links>
    <!-- LINKS (see below) -->
  </links>
</workflow>
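Operators can be nested, as in the statisticaltest processor above. To make the
nesting easier to read, the following Python sketch parses an excerpt of that
processor with the standard xml.etree.ElementTree module and prints its
strategy as a nested expression. The helper names strategy_expr and describe
are hypothetical and are not part of DIET; the fallback to dot follows the
default described earlier in this section.

```python
import xml.etree.ElementTree as ET

# Excerpt of the statisticaltest processor from the example workflow.
PROCESSOR_XML = """
<processor name="statisticaltest">
  <iterationstrategy>
    <cross>
      <port name="coefficient" />
      <match>
        <port name="values" />
        <port name="weights" />
      </match>
    </cross>
  </iterationstrategy>
</processor>
"""

def strategy_expr(node):
    """Render an operator subtree as a nested expression string."""
    if node.tag == "port":
        return node.get("name")
    args = ", ".join(strategy_expr(child) for child in node)
    return f"{node.tag}({args})"

def describe(processor):
    """Return the strategy of a <processor> element as text.

    Without an explicit <iterationstrategy>, a dot over all declared
    inputs is reported, following the default stated in the text.
    """
    strategy = processor.find("iterationstrategy")
    if strategy is None or len(strategy) == 0:
        ports = ", ".join(p.get("name") for p in processor.findall("in"))
        return f"dot({ports})"
    return strategy_expr(strategy[0])

processor = ET.fromstring(PROCESSOR_XML)
print(describe(processor))  # cross(coefficient, match(values, weights))
```

Read from the inside out, the values and weights ports are first combined by
match, and each resulting item is then combined with every item of the
coefficient port by cross.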
The DIET Team - Wed 29 Nov 2017 15:13:36 EST