
Amazon Redshift Insert Bulk Activity

Introduction

An Amazon Redshift Insert Bulk activity, using its Amazon Redshift connection, inserts multiple records into a table in Amazon Redshift and is intended to be used as a target to consume data in an operation. The activity lets you set the number of records per batch and choose whether to stop processing the remaining records when an error is encountered.

Create an Amazon Redshift Insert Bulk Activity

An instance of an Amazon Redshift Insert Bulk activity is created from an Amazon Redshift connection using its Insert Bulk activity type.

To create an instance of an activity, drag the activity type to the design canvas or copy the activity type and paste it on the design canvas. For details, see Creating an Activity Instance in Component Reuse.

An existing Amazon Redshift Insert Bulk activity can be edited from these locations:

Configure an Amazon Redshift Insert Bulk Activity

Follow these steps to configure an Amazon Redshift Insert Bulk activity:

Step 1: Enter a Name and Select a Schema

In this step, provide a name for the activity and select a schema. Each user interface element of this step is described below.

Amazon Redshift Insert Bulk Activity Configuration Step 1

  • Name: Enter a name to identify the activity. The name must be unique for each Amazon Redshift Insert Bulk activity and must not contain forward slashes (/) or colons (:).

  • Select a Schema: This section displays schemas available in the Amazon Redshift endpoint. When reopening an existing activity configuration, only the selected schema is displayed instead of reloading the entire schema list.

    • Selected Schema Name: After a schema is selected, it is listed here.

    • Search: Enter any part of the schema name into the search box to filter the list of schemas. The search is not case-sensitive. If schemas are already displayed within the table, the table results are filtered in real time with each keystroke. To reload schemas from the endpoint when searching, enter search criteria and then refresh, as described below.

    • Refresh: Click the refresh icon or the word Refresh to reload schemas from the Amazon Redshift endpoint. This may be useful if schemas have been added to Amazon Redshift. This action refreshes all metadata used to build the table of schemas displayed in the configuration.

    • Selecting a Schema: Within the table, click anywhere on a row to select a schema. Only one schema can be selected. The information available for each schema is fetched from the Amazon Redshift endpoint:

    Tip

    If the table does not populate with available schemas, the Amazon Redshift connection may not be successful. Ensure you are connected by reopening the connection and retesting the credentials.

  • Optional Settings: Click to expand additional optional settings:

    • Batch Size: Enter a batch size that is greater than 0 and less than 10000. Default value: 100.
    • Continue on Error: Select to continue the activity execution if an error is encountered for a dataset in a batch request. If any errors are encountered, they are written to the operation log.
  • Save & Exit: If enabled, click to save the configuration for this step and close the activity configuration.

  • Next: Click to temporarily store the configuration for this step and continue to the next step. The configuration will not be saved until you click the Finished button on the last step.

  • Discard Changes: After making changes, click to close the configuration without saving changes made to any step. A message asks you to confirm that you want to discard changes.
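The Batch Size and Continue on Error settings described above can be pictured with a short sketch. The Python below is illustrative only and is not the connector's implementation: the function names (split_into_batches, run_bulk_insert) and the insert_batch callback are hypothetical, and the sketch assumes that errors are handled at batch granularity, which the settings description does not confirm.

```python
from typing import Callable, List, Sequence, Tuple


def split_into_batches(records: Sequence[dict], batch_size: int = 100) -> List[Sequence[dict]]:
    """Split records into consecutive batches of at most batch_size items."""
    # Mirrors the documented range: greater than 0 and less than 10000.
    if not 0 < batch_size < 10000:
        raise ValueError("batch_size must be greater than 0 and less than 10000")
    return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]


def run_bulk_insert(
    records: Sequence[dict],
    insert_batch: Callable[[Sequence[dict]], None],  # hypothetical stand-in for the real insert
    batch_size: int = 100,
    continue_on_error: bool = False,
) -> Tuple[int, List[str]]:
    """Send each batch to insert_batch and collect error messages.

    Assumption: a failing batch is skipped as a whole. Processing stops at
    the first failure unless continue_on_error is True.
    """
    inserted = 0
    errors: List[str] = []
    for number, batch in enumerate(split_into_batches(records, batch_size), start=1):
        try:
            insert_batch(batch)
            inserted += len(batch)
        except Exception as exc:
            errors.append(f"batch {number}: {exc}")
            if not continue_on_error:
                break
    return inserted, errors
```

With 250 records and the default batch size of 100, this sketch produces three batches of 100, 100, and 50 records. With continue_on_error left at False, a failure in the second batch would leave the third batch unprocessed.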

Step 2: Select a Table

In this step, select a table. Each user interface element of this step is described below.

Amazon Redshift Insert Bulk Activity Configuration Step 2

  • Select a Table: This section displays tables available in the Amazon Redshift endpoint. When reopening an existing activity configuration, only the selected table is displayed instead of reloading the entire table list.

    • Selected Schema Name: The schema name selected in the previous step is listed here.

    • Select Table Name: After a table is selected, it is listed here.

    • Search: Enter any part of the table name into the search box to filter the list of tables. The search is not case-sensitive. If tables are already displayed within the table, the table results are filtered in real time with each keystroke. To reload tables from the endpoint when searching, enter search criteria and then refresh, as described below.

    • Refresh: Click the refresh icon or the word Refresh to reload tables from the Amazon Redshift endpoint. This may be useful if tables have been added to Amazon Redshift. This action refreshes all metadata used to build the table of tables displayed in the configuration.

    • Selecting a Table: Within the table, click anywhere on a row to select a table. Only one table can be selected. The information available for each table is fetched from the Amazon Redshift endpoint:

    Tip

    If the table does not populate with available tables, the Amazon Redshift connection may not be successful. Ensure you are connected by reopening the connection and retesting the credentials.

  • Back: Click to temporarily store the configuration for this step and return to the previous step.

  • Next: Click to temporarily store the configuration for this step and continue to the next step. The configuration will not be saved until you click the Finished button on the last step.

  • Discard Changes: After making changes, click to close the configuration without saving changes made to any step. A message asks you to confirm that you want to discard changes.
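The Search behavior described in this step (match on any part of the name, not case-sensitive) amounts to a case-insensitive substring filter. A minimal sketch follows. The function name filter_names and the table names in the usage line are hypothetical and are not taken from the product.

```python
from typing import Iterable, List


def filter_names(names: Iterable[str], query: str) -> List[str]:
    """Return the names that contain query anywhere, ignoring case."""
    needle = query.casefold()
    return [name for name in names if needle in name.casefold()]


# Hypothetical table names, for illustration only.
matches = filter_names(["Accounts", "orders", "account_history"], "ACC")
```

Here matches would contain "Accounts" and "account_history", because both contain "acc" when case is ignored.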

Step 3: Review the Data Schemas

Any request or response schemas generated from the endpoint are displayed. Each user interface element of this step is described below.

Amazon Redshift Insert Bulk Activity Configuration Step 3

  • Data Schemas: These data schemas are inherited by adjacent transformations and are displayed again during transformation mapping.

    Note

    Data supplied in a transformation takes precedence over the activity configuration.

    The Amazon Redshift connector uses the Amazon Redshift JDBC Driver and Amazon Redshift SQL Commands. Refer to the Amazon Redshift documentation and the Amazon Redshift System Overview documentation for additional information.

    The request and response data schemas consist of these nodes and fields:

    • Request

      Request Schema Field/Node   Notes
      accounts                    Node representing the accounts where records are to be bulk inserted
      id                          ID to be inserted
      name                        Name to be inserted
      balance                     Value to be inserted
    • Response

      Response Schema Field/Node   Notes
      bulkErrorResponse            The format of the request schema
      tableName                    Name of the table where records were bulk inserted
      responseDetails              Node of details from response
      batchSize                    Number of records that were bulk inserted per batch
      totalRecords                 Total number of records that were processed
      recordsAffected              Total number of records that were bulk inserted
      errorDetails                 Node containing any error messages
      SqlState                     Code which identifies SQL error conditions
      errorMsg                     Error message
      errorCode                    Error code
  • Refresh: Click the refresh icon or the word Refresh to regenerate schemas from the Amazon Redshift endpoint. This action also regenerates a schema in other locations throughout the project where the same schema is referenced, such as in an adjacent transformation.

  • Back: Click to temporarily store the configuration for this step and return to the previous step.

  • Finished: Click to save the configuration for all steps and close the activity configuration.

  • Discard Changes: After making changes, click to close the configuration without saving changes made to any step. A message asks you to confirm that you want to discard changes.
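To make the request and response fields listed in this step more concrete, the sketch below models them as plain Python dictionaries. It is an illustration under assumptions: the nesting shown (records under accounts; responseDetails and errorDetails under bulkErrorResponse) is inferred from the order of the field list and is not confirmed by it, the helper build_request is hypothetical, and all values are made up.

```python
from typing import Dict, Iterable, List

# Field names taken from the request schema listed above.
REQUEST_FIELDS = ("id", "name", "balance")


def build_request(rows: Iterable[dict]) -> Dict[str, List[dict]]:
    """Build a request-shaped dict, keeping only the documented fields."""
    accounts = []
    for row in rows:
        missing = [field for field in REQUEST_FIELDS if field not in row]
        if missing:
            raise KeyError(f"row is missing fields: {missing}")
        accounts.append({field: row[field] for field in REQUEST_FIELDS})
    return {"accounts": accounts}


# Assumed shape of a response with no errors; all values are illustrative.
example_response = {
    "bulkErrorResponse": {
        "tableName": "accounts",
        "responseDetails": {
            "batchSize": 100,
            "totalRecords": 2,
            "recordsAffected": 2,
        },
        "errorDetails": [],  # assumed to hold SqlState, errorMsg, errorCode entries
    }
}
```

In an actual project, the authoritative structure is the one displayed in this configuration step and in the adjacent transformation, so treat the shapes above only as a reading aid.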

Next Steps

After configuring an Amazon Redshift Insert Bulk activity, complete the configuration of the operation by adding and configuring other activities, transformations, or scripts as operation steps. You can also configure the operation settings, which include the ability to chain operations together that are in the same or different workflows.

Menu actions for an activity are accessible from the project pane and the design canvas. For details, see Activity Actions Menu in Connector Basics.

Amazon Redshift Insert Bulk activities can be used as a target with these operation patterns:

A typical use case is to use an Amazon Redshift Insert Bulk activity in the Two-transformation Pattern. In this example, the first transformation (Insert Bulk Request) creates a request structure that is passed to the Amazon Redshift Insert Bulk activity. The second transformation (Insert Bulk Response) receives the response structure, which is then written to a variable by a Variable Write activity (Write Insert Bulk Response). Finally, the Write to Operation Log script logs a message:

Amazon Redshift Insert Bulk operation

To use the activity with scripting functions, write the data to a temporary location and then use that temporary location in the scripting function.

When ready, deploy and run the operation and validate behavior by checking the operation logs.
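When checking results, it can help to reduce the response to a one-line summary plus one line per error. The Python sketch below shows the idea outside of the product. The function summarize_response is hypothetical, the response is assumed to be available as a nested dictionary with errorDetails as a list of entries, and the sample values (including the SqlState and error code) are made up for illustration.

```python
def summarize_response(response: dict) -> str:
    """Build a short log message from a response-shaped dict."""
    body = response.get("bulkErrorResponse", {})
    details = body.get("responseDetails", {})
    errors = body.get("errorDetails") or []
    total = details.get("totalRecords", 0)
    affected = details.get("recordsAffected", 0)
    lines = [
        f"table={body.get('tableName')} total={total} "
        f"inserted={affected} errors={len(errors)}"
    ]
    # One indented line per error entry, using the documented field names.
    for err in errors:
        lines.append(
            f"  SqlState={err.get('SqlState')} "
            f"errorCode={err.get('errorCode')} errorMsg={err.get('errorMsg')}"
        )
    return "\n".join(lines)


# Illustrative response with one failed record; values are not real output.
sample = {
    "bulkErrorResponse": {
        "tableName": "accounts",
        "responseDetails": {"batchSize": 100, "totalRecords": 3, "recordsAffected": 2},
        "errorDetails": [
            {"SqlState": "22001", "errorCode": 1, "errorMsg": "value too long"}
        ],
    }
}
message = summarize_response(sample)
```

A gap between totalRecords and recordsAffected, as in the sample, is the kind of discrepancy worth looking for in the operation log.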