Stream and batch transformations in Jitterbit Design Studio

Jitterbit supports these methods for processing a transformation:

Streaming transformation
Batch transformation
Chunking

Streaming and batch transformations are the preferred methods to use when the amount of memory a Jitterbit transformation uses needs to be limited. In cases where you are unable to use either a streaming or batch transformation, chunking may be applicable as described below.

Note

From the perspective of a transformation, a SOAP web service response/request corresponds to an XML source/target when considering these limitations on processing.

Streaming transformation

A streaming transformation loads one record at a time into memory, performs the transformation of the record, and writes the target to disk. This minimizes the amount of memory that is used during the transformation to what is needed to transform one record.
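
The record-at-a-time pattern described above can be sketched in Python. This is only an illustration of the general technique, not Jitterbit's implementation; the names stream_transform and upper_name and the sample data are hypothetical.

```python
import csv
import io

def stream_transform(source, target, transform_record):
    """Read one record at a time, transform it, and write it out immediately."""
    reader = csv.DictReader(source)
    writer = None
    count = 0
    for record in reader:  # only the current source record is held in memory
        out = transform_record(record)
        if writer is None:
            # Create the writer lazily so the header matches the first output record.
            writer = csv.DictWriter(target, fieldnames=list(out.keys()))
            writer.writeheader()
        writer.writerow(out)
        count += 1
    return count

def upper_name(record):
    # Hypothetical per-record mapping: copy id, upper-case name.
    return {"id": record["id"], "name": record["name"].upper()}

# In-memory files stand in for a flat CSV source and target.
source = io.StringIO("id,name\n1,ada\n2,grace\n")
target = io.StringIO()
written = stream_transform(source, target, upper_name)
```

Because each output record depends only on the current source record, memory use stays roughly constant regardless of how many records the source contains.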

Streaming is automatically applied to transformations where the source and target are both flat structures (for example, a single database table or a single CSV file) and these requirements are fulfilled:

Streaming has not been explicitly disabled by setting AutoStreaming=0 in the jitterbit.conf file.
Streaming has not been explicitly disabled by setting the Jitterbit variable $jitterbit.transformation.auto_streaming to 0 or false.
No instance-resolving functions are used, such as FindByPos, FindValue, or Sum.
These dictionary and array functions are not present: GetSourceInstanceMap, GetSourceAttrNames, GetSourceElementNames, GetSourceInstanceElementMap, GetSourceInstanceArray, GetSourceInstanceElementArray.
The XML function GetXMLString is not present.
There are no multiple mappings in the target.
The transformation does not have a condition defined in the target.

No other action is needed; streaming will automatically be used by the transformation when it is processed.
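
As a reading aid, the requirements above can be expressed as a single check. The following Python sketch is hypothetical: TransformationInfo and can_auto_stream are illustrative names, not part of Jitterbit, and the set of instance-resolving functions contains only the examples named above, so it is not exhaustive.

```python
from dataclasses import dataclass, field

# Function names taken from the requirements listed above. The
# instance-resolving functions are examples only, so this set is incomplete.
BLOCKING_FUNCTIONS = {
    "FindByPos", "FindValue", "Sum",
    "GetSourceInstanceMap", "GetSourceAttrNames", "GetSourceElementNames",
    "GetSourceInstanceElementMap", "GetSourceInstanceArray",
    "GetSourceInstanceElementArray", "GetXMLString",
}

@dataclass
class TransformationInfo:
    """Hypothetical description of a transformation (not a Jitterbit type)."""
    source_is_flat: bool
    target_is_flat: bool
    functions_used: set = field(default_factory=set)
    conf_auto_streaming: bool = True   # False models AutoStreaming=0 in jitterbit.conf
    var_auto_streaming: bool = True    # False models $jitterbit.transformation.auto_streaming = 0 or false
    has_multiple_mappings: bool = False
    has_target_condition: bool = False

def can_auto_stream(info):
    """Return True only if every requirement listed above is met."""
    return (
        info.source_is_flat
        and info.target_is_flat
        and info.conf_auto_streaming
        and info.var_auto_streaming
        and not (info.functions_used & BLOCKING_FUNCTIONS)
        and not info.has_multiple_mappings
        and not info.has_target_condition
    )

csv_to_csv = TransformationInfo(source_is_flat=True, target_is_flat=True)
uses_sum = TransformationInfo(True, True, functions_used={"Sum"})
hierarchical = TransformationInfo(source_is_flat=False, target_is_flat=True)
```

Here csv_to_csv qualifies for streaming, while uses_sum and hierarchical do not.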

Example

Transformations that automatically use streaming include these data structures:

CSV to CSV
Single table to single table
Single table to CSV
CSV to single table

Batch transformation

With transformations that do not meet the criteria for streaming, the entire source is read into memory and the transformation is performed in memory. This method is usually the most efficient in terms of time, but can lead to out-of-memory errors if the source is very large. In those cases, either a batch transformation or chunking must be used.

A batch transformation is similar to a streaming transformation, but it processes several source records at a time (the batches) and has fewer limitations than streaming. It can be used in cases where streaming is not automatically used and is applicable for hierarchical sources (for example, for multiple database tables with one or more parent-child relationships or for hierarchical file formats).

To enable a batch transformation, right-click the source node that is to become the batch source node and select Define batch transformation from the menu that appears.

In the dialog that appears, enter the maximum batch size. This is the number of records that should be read into memory in each batch. A batch size that is too small slows the transformation, but the batch size must still be small enough that all the source and target data for a batch fits into the available memory.
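
The effect of the maximum batch size can be illustrated with a small Python sketch. It shows the general batching technique under assumed names (read_batches, batch_transform, flatten, and the sample orders are all hypothetical) and does not reflect how Jitterbit implements batch transformations internally.

```python
from itertools import islice

def read_batches(records, max_batch_size):
    """Yield lists containing at most max_batch_size source records."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, max_batch_size))
        if not batch:
            return
        yield batch

def batch_transform(records, max_batch_size, transform_record, write_batch):
    """Transform and write one batch at a time; return the number of batches."""
    batches = 0
    for batch in read_batches(records, max_batch_size):
        # Only the current batch and its output are held in memory here.
        write_batch([transform_record(record) for record in batch])
        batches += 1
    return batches

# Hypothetical hierarchical source: each parent order has child lines.
orders = [
    {"order_id": 1, "lines": ["pen", "ink"]},
    {"order_id": 2, "lines": ["paper"]},
    {"order_id": 3, "lines": []},
]

def flatten(order):
    # Turn one parent record and its children into flat target rows.
    return [(order["order_id"], item) for item in order["lines"]]

output = []

def write_batch(rows_per_record):
    for rows in rows_per_record:
        output.extend(rows)

batches = batch_transform(orders, 2, flatten, write_batch)
```

With a maximum batch size of 2, the three parent records are processed in two batches. A larger batch size means fewer batches but more data held in memory at once.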

Example

Batch transformations can be used in these cases where the source is very large:

Hierarchical database to either CSV or a single table
Hierarchical database to hierarchical text/database
Hierarchical text to either CSV or a single table
Hierarchical text to hierarchical text/database

Chunking

In a situation where a streaming or batch transformation is neither applicable nor possible, you may be able to use chunking to make the transformation require less memory. For very large XML sources and targets, chunking may be the only option. (If memory use is not an issue, a streaming or batch transformation is always the preferred choice.) For more information, see Chunking.