Amazon Redshift Insert Bulk activity

Introduction

An Amazon Redshift Insert Bulk activity, using its Amazon Redshift connection, inserts multiple records into a table in Amazon Redshift and is intended to be used as a target to consume data in an operation. The activity provides options to set the number of records per batch and to either stop or continue processing the remaining records when an error is encountered.

Create an Amazon Redshift Insert Bulk activity

An instance of an Amazon Redshift Insert Bulk activity is created from an Amazon Redshift connection using its Insert Bulk activity type.

To create an instance of an activity, drag the activity type to the design canvas or copy the activity type and paste it on the design canvas. For details, see Creating an activity instance in Component reuse.

An existing Amazon Redshift Insert Bulk activity can be edited from these locations:

The design canvas (see Component actions menu in Design canvas).
The project pane's Components tab (see Component actions menu in Project pane Components tab).

Configure an Amazon Redshift Insert Bulk activity

Follow these steps to configure an Amazon Redshift Insert Bulk activity:

Step 1: Enter a name and select a schema
Provide a name for the activity and select a schema.
Step 2: Select a table
Select the table into which the records are to be inserted.
Step 3: Review the data schemas
Any request or response schemas generated from the endpoint are displayed.

Step 1: Enter a name and select a schema

In this step, provide a name for the activity and select a schema. Each user interface element of this step is described below.

Amazon Redshift Insert Bulk activity configuration step 1

Name: Enter a name to identify the activity. The name must be unique for each Amazon Redshift Insert Bulk activity and must not contain forward slashes (/) or colons (:).
Select a Schema: This section displays schemas available in the Amazon Redshift endpoint. When reopening an existing activity configuration, only the selected schema is displayed instead of reloading the entire schema list.
- Selected Schema Name: After a schema is selected, it is listed here.
- Search: Enter any column's value into the search box to filter the list of schemas. The search is not case-sensitive. If schemas are already displayed within the table, the table results are filtered in real time with each keystroke. To reload schemas from the endpoint when searching, enter search criteria and then refresh, as described below.
- Refresh: Click the refresh icon or the word Refresh to reload schemas from the Amazon Redshift endpoint. This may be useful if schemas have been added to Amazon Redshift. This action refreshes all metadata used to build the table of schemas displayed in the configuration.
- Selecting a Schema: Within the table, click anywhere on a row to select a schema. Only one schema can be selected. The information available for each schema is fetched from the Amazon Redshift endpoint:
  - Schema: The name of the Amazon Redshift schema.
Tip

If the table does not populate with available schemas, the Amazon Redshift connection may not be successful. Ensure you are connected by reopening the connection and retesting the credentials.
Optional Settings: Click to expand additional optional settings:
- Batch Size: Enter a batch size that is greater than 0 and less than 10000. Default value: 100.
- Continue on Error: Select to continue the activity execution if an error is encountered for a dataset in a batch request. If any errors are encountered, they are written to the operation log.
Save & Exit: If enabled, click to save the configuration for this step and close the activity configuration.
Next: Click to temporarily store the configuration for this step and continue to the next step. The configuration will not be saved until you click the Finished button on the last step.
Discard Changes: After making changes, click to close the configuration without saving changes made to any step. A message asks you to confirm that you want to discard changes.
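
The effect of the Batch Size and Continue on Error settings can be pictured with a short sketch. This is only an illustration of the behavior described above, not the connector's implementation: the function names and the `insert_batch` callback are hypothetical, and the sketch assumes that an error causes the whole batch to be skipped, which the connector may handle differently.

```python
from typing import Callable, List, Sequence, Tuple


def split_into_batches(records: Sequence, batch_size: int = 100) -> List[list]:
    """Split records into consecutive batches of at most batch_size items."""
    # Assumption: valid sizes are greater than 0 and less than 10000.
    if not 0 < batch_size < 10000:
        raise ValueError("batch_size must be greater than 0 and less than 10000")
    return [list(records[i:i + batch_size]) for i in range(0, len(records), batch_size)]


def insert_bulk(
    records: Sequence,
    insert_batch: Callable[[list], int],  # hypothetical callback: inserts one batch, returns row count
    batch_size: int = 100,
    continue_on_error: bool = False,
) -> Tuple[int, List[str]]:
    """Send each batch to insert_batch and collect error messages.

    With continue_on_error=False, processing stops at the first failing batch.
    With continue_on_error=True, the failure is recorded and later batches still run.
    """
    inserted = 0
    errors: List[str] = []
    for batch in split_into_batches(records, batch_size):
        try:
            inserted += insert_batch(batch)
        except Exception as exc:  # record the error, as an operation log would
            errors.append(str(exc))
            if not continue_on_error:
                break
    return inserted, errors
```

In this sketch, 250 records with the default batch size of 100 are sent as three batches of 100, 100, and 50 records.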

Step 2: Select a table

In this step, select a table. Each user interface element of this step is described below.

Amazon Redshift Insert Bulk activity configuration step 2

Select a Table: This section displays tables available in the Amazon Redshift endpoint. When reopening an existing activity configuration, only the selected table is displayed instead of reloading the entire table list.
- Selected Schema Name: The schema name selected in the previous step is listed here.
- Selected Table Name: After a table is selected, it is listed here.
- Search: Enter any column's value into the search box to filter the list of tables. The search is not case-sensitive. If tables are already displayed within the table, the table results are filtered in real time with each keystroke. To reload tables from the endpoint when searching, enter search criteria and then refresh, as described below.
- Refresh: Click the refresh icon or the word Refresh to reload tables from the Amazon Redshift endpoint. This may be useful if tables have been added to Amazon Redshift. This action refreshes all metadata used to build the table of tables displayed in the configuration.
- Selecting a Table: Within the table, click anywhere on a row to select a table. Only one table can be selected. The information available for each table is fetched from the Amazon Redshift endpoint:
  - Table Name: The name of the Amazon Redshift table.
  - Schema: The name of the Amazon Redshift schema.
  - Catalog: The name of the Amazon Redshift catalog.
Tip

If the table does not populate with available tables, the Amazon Redshift connection may not be successful. Ensure you are connected by reopening the connection and retesting the credentials.
Back: Click to temporarily store the configuration for this step and return to the previous step.
Next: Click to temporarily store the configuration for this step and continue to the next step. The configuration will not be saved until you click the Finished button on the last step.
Discard Changes: After making changes, click to close the configuration without saving changes made to any step. A message asks you to confirm that you want to discard changes.

Step 3: Review the data schemas

Any request or response schemas generated from the endpoint are displayed. Each user interface element of this step is described below.

Amazon Redshift Insert Bulk activity configuration step 3

Data Schemas: These data schemas are inherited by adjacent transformations and are displayed again during transformation mapping.

Note

Data supplied in a transformation takes precedence over the activity configuration.

The Amazon Redshift connector uses the Amazon Redshift JDBC Driver version 2.1.0.28 and Amazon Redshift SQL Commands. Refer to the Amazon Redshift documentation and the Amazon Redshift System Overview documentation for additional information.

The request and response data schemas consist of these nodes and fields:

Request

Request Schema Field/Node	Notes
`accounts`	Node representing the accounts where records are to be bulk inserted
`id`	ID to be inserted
`name`	Name to be inserted
`balance`	Value to be inserted
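
As a rough illustration of the request structure, the sketch below builds a payload for the example `accounts` table. It assumes a JSON-style representation in which `accounts` holds a list of records; the actual structure is generated from the selected table and may differ. The helper name `build_request` and the field types are hypothetical.

```python
import json
from typing import Iterable, Tuple


def build_request(rows: Iterable[Tuple[int, str, float]]) -> dict:
    """Build a request payload with one entry per record to insert.

    Assumption: the accounts node is represented as a list of objects
    with id, name, and balance fields.
    """
    return {
        "accounts": [
            {"id": row_id, "name": name, "balance": balance}
            for row_id, name, balance in rows
        ]
    }


# Hypothetical sample data, for illustration only.
payload = build_request([(1, "Acme", 10.5), (2, "Globex", 0.0)])
print(json.dumps(payload, indent=2))
```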

Response

Response Schema Field/Node	Notes
`bulkErrorResponse`	The format of the request schema
`tableName`	Name of the table where records were bulk inserted

`responseDetails`	Node of details from response
`batchSize`	Number of records that were bulk inserted per batch
`totalRecords`	Total number of records that were processed
`recordsAffected`	Total number of records that were bulk inserted

`errorDetails`	Node containing any error messages
`SqlState`	Code which identifies SQL error conditions
`errorMsg`	Error message
`errorCode`	Error code
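
A response with these fields could be checked after the operation runs, for example to count the records that were not inserted. The sketch below assumes the response has been parsed into a Python dictionary in which `tableName`, `responseDetails`, and `errorDetails` are nested under `bulkErrorResponse`; that nesting, the helper name `summarize_response`, and the sample values in the test data are assumptions made for illustration.

```python
from typing import Any, Dict


def summarize_response(response: Dict[str, Any]) -> Dict[str, Any]:
    """Summarize a parsed bulk insert response.

    Assumption: tableName, responseDetails, and errorDetails are children
    of bulkErrorResponse, and errorDetails is either one object or a list.
    """
    root = response.get("bulkErrorResponse", {})
    details = root.get("responseDetails", {})
    errors = root.get("errorDetails") or []
    if isinstance(errors, dict):  # normalize a single error object to a list
        errors = [errors]

    total = int(details.get("totalRecords", 0))
    affected = int(details.get("recordsAffected", 0))
    return {
        "table": root.get("tableName"),
        "batch_size": int(details.get("batchSize", 0)),
        "inserted": affected,
        "failed": total - affected,  # processed but not inserted
        "messages": [
            f"{e.get('SqlState')}/{e.get('errorCode')}: {e.get('errorMsg')}"
            for e in errors
        ],
    }
```

A nonzero `failed` count together with a non-empty `messages` list indicates that some records were processed but not inserted.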

Refresh: Click the refresh icon or the word Refresh to regenerate schemas from the Amazon Redshift endpoint. This action also regenerates a schema in other locations throughout the project where the same schema is referenced, such as in an adjacent transformation.
Back: Click to temporarily store the configuration for this step and return to the previous step.
Finished: Click to save the configuration for all steps and close the activity configuration.
Discard Changes: After making changes, click to close the configuration without saving changes made to any step. A message asks you to confirm that you want to discard changes.

Next steps

After configuring an Amazon Redshift Insert Bulk activity, complete the configuration of the operation by adding and configuring other activities, transformations, or scripts as operation steps. You can also configure the operation settings, which include the ability to chain operations together that are in the same or different workflows.

Menu actions for an activity are accessible from the project pane and the design canvas. For details, see Activity actions menu in Connector basics.

Amazon Redshift Insert Bulk activities can be used as a target with these operation patterns:

Transformation pattern
Two-transformation pattern (as the first or second target)

To use the activity with scripting functions, write the data to a temporary location and then use that temporary location in the scripting function.

When ready, deploy and run the operation and validate behavior by checking the operation logs.
