Iteration strategies

Iteration strategies must be defined when a processor has two or more input ports. If none is specified, the workflow parser applies a dot iteration strategy to all inputs. An iteration strategy is built from operators that combine data items according to the index of the data items received or produced by workflow processors. For a data item produced by a source, the index is its position (order number) in the source data file; for a data item produced by a standard processor, the index is derived from the indexes of the input data items, as combined by the operators where applicable. There are four data manipulation operators:

dot (groups 2 or more ports): data items from the different ports are processed together when their indexes match exactly (the item with index 0 on one port is matched with the items with index 0 on the other ports). The output index is the same as the index of the input data.
cross (groups 2 ports): processes each data instance of the first port with each data instance of the second port. This operator increases the index depth of the output by one (for example, if the data inputs have indexes 0 and 1, then the output has the index 0_1).
flatcross (groups 2 ports): same as cross but with a different output indexing scheme. This operator does not increase the depth of the output index but creates new indexes (e.g., if the data inputs have indexes 1 and 2, with a maximum index of 3 for the right input, then the output has the index 6 = 4 * 1 + 2). Note that this operator creates a synchronization constraint among all instances, because the workflow engine must know the maximum index of the right input before it can create new indexes.
match (groups 2 ports): processes each data instance of the first port with all the data instances of the second port whose index begins with the first port's index (for example, if the left data item has index 1_1, it is processed with all right data items whose index begins with 1_1). The output index is the second port's index.
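
The index rules of the four operators can be illustrated with a small Python sketch. This is not the workflow engine's code: the representation of a port as a dictionary from index tuples to data items, and the function names, are assumptions made for the illustration. The flatcross formula (left index times the number of right items, plus right index) is likewise an assumption, chosen for depth-1 indexes with a maximum right index of 3, where inputs 1 and 2 give output index 6.

```python
from itertools import product

# A port is modelled as a dict mapping an index (tuple of ints) to a data item.
# This model is hypothetical; it only mirrors the index rules described above.

def dot(*ports):
    """Combine items whose indexes match exactly; the output keeps that index."""
    common = set(ports[0]).intersection(*ports[1:])
    return {idx: tuple(p[idx] for p in ports) for idx in sorted(common)}

def cross(left, right):
    """Combine every left item with every right item.

    The output index is the left index followed by the right index,
    so the index depth grows (0 and 1 give 0_1).
    """
    return {li + ri: (lv, rv)
            for (li, lv), (ri, rv) in product(left.items(), right.items())}

def flatcross(left, right):
    """Like cross for depth-1 indexes, but the output index stays at depth 1.

    Assumed formula: left * (max_right + 1) + right. The maximum right index
    must be known before any output index can be computed.
    """
    width = max(ri[0] for ri in right) + 1
    return {(li[0] * width + ri[0],): (lv, rv)
            for (li, lv), (ri, rv) in product(left.items(), right.items())}

def match(left, right):
    """Combine each left item with the right items whose index starts with
    the left index; the output index is the right item's index."""
    return {ri: (lv, rv)
            for (li, lv), (ri, rv) in product(left.items(), right.items())
            if ri[:len(li)] == li}

# Usage with two small hypothetical ports.
a = {(0,): "a0", (1,): "a1"}
b = {(0,): "b0", (1,): "b1", (2,): "b2", (3,): "b3"}
print(dot(a, b))
print(cross(a, b)[(0, 1)])
print(flatcross(a, b)[(6,)])  # ('a1', 'b2'), since 4 * 1 + 2 = 6
```

The sketch also shows why flatcross imposes a synchronization constraint: `width` depends on every item of the right port, whereas cross and match can compute each output index from a single pair of items.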

Here is an example of a Gwendia workflow (to be continued with the links part below):

<workflow>
  <interface>
    <constant name="parameter" type="integer" value="50"/>
    <source name="key" type="double" />
    <sink name="results" type="file" />
  </interface>

  <processors>
    <processor name="genParam">
      <in name="paramKey" type="double"/>
      <out name="paramFiles" type="file" depth="1"/>
      <diet path="gen" estimation="constant"/>
    </processor>

    <processor name="docking">
      <in name="param" type="integer" />
      <in name="input" type="file" />
      <out name="result" type="double" />
      <iterationstrategy>
        <cross>
          <port name="param" />
          <port name="input" />
        </cross>
      </iterationstrategy>
      <diet path="dock" estimation="constant"/>
    </processor>

    <processor name="statisticaltest">
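      <in name="values" type="double" depth="1"/>
      <!-- Assumed declarations: the iteration strategy below references these two ports -->
      <in name="weights" type="double" depth="1"/>
      <in name="coefficient" type="double"/>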
      <out name="result" type="file"/>
      <iterationstrategy>
        <cross>
          <port name="coefficient" />
          <match>
            <port name="values" />
            <port name="weights" />
          </match>
        </cross>
      </iterationstrategy>
      <diet path="weightedaverage" />
    </processor>
  </processors>
  <links>
    <!-- LINKS (see below) -->
  </links>
</workflow>

The DIET Team - Wed 29 Nov 2017 15:13:36 EST