Amazon Relational Database Services (RDS) is a managed database service that runs on familiar database engines like PostgreSQL.

In this tutorial, we’ll walk you through how to spin up a Postgres-RDS instance and then connect it to Stitch as a destination, or data warehouse.


Prerequisites

The only requirement for using PostgreSQL with Stitch is an Amazon Web Services (AWS) account.

To sign up for an AWS account, click here or enter this URL in your browser: http://aws.amazon.com/

Follow the instructions on the website to complete setting up your account.


Step 1: Define Data Warehouse Settings

In this step, you’ll sign into your AWS account and configure the basic settings for the Postgres-RDS database.

  1. Sign into your AWS account.
  2. Once you’re signed into the AWS console, click the Services menu located in the top-left corner of the page.
  3. Locate the RDS option. This should be in the Database section of the page.
  4. Select the Region you want to launch the instance in. You can do this by clicking the Region drop-down menu in the upper right corner of the console and selecting the appropriate region:

    Selecting a Region in the RDS-AWS console.

  5. In the Create Instance section of the page, click the Launch a DB Instance button.
  6. In the Select Engine page, click the PostgreSQL icon and then the Select button.
  7. In the Production? page, you have two options:
    • Production: The Production option uses Multi-AZ Deployment and Provisioned IOPS Storage, which are features that are intended to guard against downtime and disk I/O performance issues. This option is a good idea if you or your company can’t afford downtime or you anticipate heavy usage of the database.
    • Dev/Test: This option is suitable if this database will operate outside of production, can handle downtime, don’t anticipate heavy usage, or if you simply are giving RDS a try by using the Free Usage Tier.
  8. Select the option you want and click the Next Step button.

Step 2: Configure the Database

There are two steps to configuring the database: specifying the basic details and configuring more advanced settings.

Specifying the Database Details

In the Specify DB Details page, you’ll define the basic settings for your Postgres-RDS database. There are two sections on this page:

Instance Specifications

In the Instance Specifications section, you can select the licensing model, version, and more.

Setting What it means What you should do
License Model This is the license used by the database engine. Leave as-is - Postgres only has one license.
DB Engine Version This is the version of PostgreSQL the database will run on. Select your version, but keep in mind that Stitch only supports versions +.
DB Instance Class This indicates the Instance Class your database will run on. Select the Instance Class you want your database to run on.

If you’re not sure what this means or how to select the best option, check out Amazon’s documentation.
Multi-AZ Deployment This indicates whether your instance will use Multi-AZ Deployment. This is only available if you selected Production in the last section. This will default to Yes. Leave this setting as-is if you want Amazon to maintain a synchronous standby replica in a different Availability Zone. Amazon recommends this for Production Instances.
Storage Type This indicates the type of storage your database will use. This is only available if you selected Production in the last section. Select the type of storage you want your database to use.

If you’re not sure what this means or how to select the best option, check out Amazon’s documentation.
Allocated Storage This is the amount of storage for your database. Enter the amount of storage (in GB) you want for your database. If you need help deciding the amount, check out Amazon’s documentation.

Settings

In the Settings section, you’ll define the name of the database and the master user credentials.

  • DB Instance Identifier: Enter a name for the database instance. This name must be unique for your account in the Region you selected.
  • Master Username: Enter a username for the master user. For info on the permissions this user is granted, click here.
  • Master Password: Enter a password for the master user.
  • Confirm Password: Enter the master user’s password again to confirm.

When finished, click the Next Step button.

Configuring the Advanced Settings

In the Configuring Advanced Settings page, you’ll define the last group of settings the instance needs to run. There are several sections on this page:

Network & Security

In this section, you’ll define the network security settings for the database.

Setting What it means What you should do
VPC This setting refers to the Virtual Private Cloud, or the group of servers hosting the instance. If you’re a new AWS customer or don’t want to set up a VPC, you can use the Default option.

Otherwise, select the VPC you want the instance to reside in. Steps for how to create your own VPC are outside the scope of this tutorial, but Amazon has instructions here.
Subnet Group This setting refers to the subnets that the instance can use in the selected VPC. If you’re a new AWS customer or you’re not sure what this means, select the Default option.

You can refer to Amazon’s documentation for more info on subnets and VPCs.
Publicly Accessible This setting determines whether your instance has a publicly accessible IP address, which means it can be accessed from outside the selected VPC. Set this to Yes. This is required to connect the instance to Stitch.
Availability Zone This setting defines what availability zone you want the instance to reside in. If you have a preference, select it.

Otherwise, leave this as No Preference.
Security Groups This setting defines the in- and outbound traffic that can access the instance. If you have a Security Group you’d like to use, select it from the list.

If you’re a new AWS customer, you can use the Default option or Create new Security Group.

If you opt to create a new group, don’t worry too much about the settings for it - you’ll configure them in the next section of this tutorial.

Database Options

In this section, you’ll define the database’s name and parameter settings.

Setting What it means What you should do
Database Name This is the name of the initial database in the instance. This is the name you’ll enter into Stitch at the end of this tutorial to complete the connection setup. Enter a name for the database.
Database Port This is the port that the database will listen on. Use the default (5432) or enter a port.
DB Parameter Group This is the parameter group for the database. If you’ve created your own parameter group, select it here.

Otherwise use the Default group.
Option Group This is the option group for the database. If you’ve created your own option group, select it here.

Otherwise use the Default group.
Snapshots This will copy any database instance tags to a database snapshot when you create a snapshot. You can learn more about tagging here. This is your preference; it isn’t required by Stitch.
Encryption This determines if your data will be encrypted with AES-256 encryption in both the instances themselves and at rest. There are additional steps to setting this up, which you can find help for here. This is your preference; it isn’t required by Stitch.

Backup

In this section, you’ll define the backup settings for the database.

Setting What it means What you should do
Backup Retention Period This defines the number of days you want automatic backups of your database to be retained. This is your preference; it isn’t required by Stitch.
Backup Window This defines the time your database is backed up. Unless you have a preference, use the default of No Preference.

This isn’t required by Stitch.

Maintenance

In this section, you’ll define the maintenance settings for the database.

Setting What it means What you should do
Auto Minor Version Upgrade This defines whether your database instance will automatically receive minor database engine upgrades when they’re available. This is your preference; it isn’t required by Stitch.
Maintenance Window This defines the 30 minute window in which pending changes to your database instance are applied. Unless you have a preference, use the default of No Preference.

This is your preference; it isn’t required by Stitch.

Step 3: Launch the Database

When you’re finished defining the configuration settings, click Launch DB Instance to create and launch the instance.

Note that it may take a few minutes for the instance to complete the provisioning process. The status in the RDS Dashboard page will change to Available when the process completes:

Completed provisioning of the PostgreSQL-RDS database.

Once the status has changed to Available, you can move onto the next step.


Step 4: Configure the Access Settings

For Stitch to successfully connect with your Postgres-RDS instance, you’ll need to add our IP addresses to the appropriate database Security Group.

  1. In the RDS Dashboard page, click the grey selection box (this is the first column in the table) next to the PostgreSQL instance you created.
  2. Click the Instance Actions menu.
  3. Select the See Details option. This will open the details page for the instance.
  4. Locate the Security Groups field in the Security and Network section.
  5. Click the Security Group. This will open the Security Groups page.

  6. Click the Inbound tab located towards the bottom of the page.
  7. Click the Edit button.
  8. This is where you’ll add the Stitch IP addresses that can access the instance. Fill out the fields in this screen as follows:
    • Type: Set this to Custom TCP.
    • Protocol: This will default to TCP - leave it as-is.
    • Port Range: Enter the port associated with the instance. This will be the same port you used to create the instance.
    • Source: Paste one of the IP addresses listed below into this field. You need to add them one at a time, meaning each IP address gets its own rule:

      • 52.23.137.21/32

      • 52.204.223.208/32

      • 52.204.228.32/32

      • 52.204.230.227/32

  9. Repeat this process until all of the Stitch IP addresses (along with any others you may want to whitelist) have been added.
  10. When finished, click Save.

Step 5: Create a Stitch Database User

As we mentioned when you created the instance, the credentials used to launch belong to the master user, or admin.

If you want to control the permission settings, you’ll need to create an additional database user for Stitch. An added benefit is that Stitch will be easily distinguishable in any logs or audits.

Requirements for Creating a PostgreSQL User

You must be a superuser to create a PostgreSQL user. If you’re not a superuser or you’re not sure what this means, please contact someone on your technical team before proceeding.

Required PostgreSQL Permissions

To successfully pipe your data to your PostgreSQL data warehouse, the Stitch user must:

  • Be able to CREATE schemas, tables, and views
  • Be able to SELECT from systems tables
  • Be the owner of all integration schemas and tables. We occasionally need to run COMMENT and ALTER TABLE commands to properly insert your data.

While you technically can manually create integration schemas in PostgreSQL and then grant Stitch access to those schemas, we don’t recommend it. We understand that security is a high priority and permissions should be carefully granted, but this approach will likely lead to headaches for you.

What do we mean? If you take this approach, you’ll need to first create the schema in your data warehouse and then grant the Stitch user the appropriate permissions to access it every single time a new integration is added in Stitch. If this isn’t done, Stitch won’t be able to load data into your data warehouse.

Additionally, if there are any differences between the name of the schema you create in your data warehouse and the schema name you enter in Stitch when setting up an integration, Stitch won’t be able to load data into your data warehouse.

Creating the Stitch User

Depending on the type of connection this is, the steps for creating a PostgreSQL database user for Stitch will vary a bit:

  • If Stitch hasn’t ever replicated data to your PostgreSQL destination, follow the instructions in the NEW Connections tab.
  • If Stitch has replicated data to your PostgreSQL destination in the past, follow the instructions in the EXISTING Connections tab.

If you haven’t already, connect to your PostgreSQL instance using your SQL client. Note that anything inside square brackets - [like this] - is something you need to define when running the commands yourself.

After connecting, run this command to create the user:

create user [stitch_username] with password '[password]';

Next, you’ll assign the CREATE permissions to the user. For [database_name], enter the name of the database where all Stitch-replicated data should be loaded.

Note that this must be a pre-existing database.

grant create on database [database_name] to [stitch_username];

If you restricted access to the information schema, you’ll also need to run this command to grant the Stitch user SELECT permissions:

grant select on all tables in schema information_schema to [stitch_username];

If you haven’t already, connect to your PostgreSQL instance using your SQL client. Note that anything inside square brackets - [like this] - is something you need to define when running the commands yourself.

After connecting, run this command to create the user:

create user [stitch_username] with password '[password]';

Next, you’ll assign the CREATE permissions to the user. For [database_name], enter the name of the database where all Stitch-replicated data should be loaded.

Note that this must be a pre-existing database.

grant create on database [database_name] to [stitch_username];

If you restricted access to the information schema, you’ll also need to run this command to grant the Stitch user SELECT permissions:

grant select on all tables in schema information_schema to [stitch_username];

Transferring Integration Schema & Table Ownership

When an integration is initially connected to Stitch, a schema specific to that connection is created in your data warehouse. Ownership to those schemas and all the tables contained within them must be granted to the Stitch user for data replication to be successful.

First, you need to list all tables and their owners. Query the catalogue tables using this statement:

select * from [stitch_username] pg_tables;

Take note of any integration tables that aren’t owned by the Stitch user - those are the tables you’ll need to update.

Now you’ll transfer ownership of the Stitch integration schemas:

alter schema [schema_name] owner to [stitch_username];

Next, transfer ownership of all the tables in the integration schema to the Stitch user. You can do this table-by-table or run a script that will detect every table in an integration schema and re-assign ownership to Stitch.

To transfer ownership table-by-table, run this command:

alter table [table_name] owner to [stitch_username];

Or, run this script from the command line to do it automatically:

for table in `psql -qAt --host [your.postgres.amazonaws.com] --port [port] --user [admin_user] -c "select tablename from pg_tables where schemaname = '[schema_name]';" [database_name]` ; 
do psql -qAt --host [your.postgres.amazonaws.com] --port [port] --user [admin_user] -c "alter table \"[schema_name]\".\"$table\" owner to [stitch_username];" [database_name] ; done

Repeat the steps in this section for every Stitch integration schema in your data warehouse.


Step 6: Connect Stitch

To complete the setup, you need to locate your PostgreSQL connection details and enter them into the Destination Settings page in Stitch.

Locate the PostgreSQL Connection Details

  1. In the RDS Dashboard page, click the grey selection box (this is the first column in the table) next to the PostgreSQL instance you created.
  2. Click the Instance Actions menu.
  3. Select the See Details option. This will open the details page for the instance.

  4. Locate the following fields:
    • Endpoint
    • Database (DB) Name: This is the name of the database you want to connect to Stitch. This must be a pre-existing database - Stitch won’t create one for you. It could be:
      • The name of the database you used to launch the instance.
      • The name of another database within PostgreSQL.
    • Username: You’ll only need this if you’re using the master user’s credentials to connect to Stitch.
    • Port: This is the port used by the database.

Entering the Connection Details into Stitch

To wrap things up, you need to enter the PostgreSQL database connection and user info into Stitch.

  1. Click the Destination tab.
  2. Click the PostgreSQL icon.
  3. Fill in the fields as follows:
    • Host: Enter the host address (endpoint) used by the PostgreSQL instance.
    • Port: Enter the port used by the PostgreSQL instance.
    • Username: Enter the Stitch PostgreSQL user’s username.
    • Password: Enter the password for the Stitch PostgreSQL user.
    • Database: Enter the name of the PostgreSQL database that you created for Stitch.
  4. When finished, click Update PostgreSQL Settings.

Stitch will perform a connection test to the PostgreSQL database; if successful, a Success! mesage will display at the top of the screen. Note that this test may take a few minutes to complete.


Troubleshooting Connection Errors

If you receive an error message, we recommend trying these troubleshooting steps for common destination connection problems before reaching out to support.

We’ve found that the majority of the time, these resources can resolve most issues.


Learning About Stitch & PostgreSQL

Now that your PostgreSQL data warehouse is up and running, we recommend learning about how Stitch will load your data into it.

  • PostgreSQL Destination Overview: This overview will give you the high-level basics for working with your PostgreSQL data warehouse.
  • Destination Data Loading Guide: This article covers many of the common scenarios Stitch will encounter when loading data into your PostgreSQL destination and the expected behavior for each scenario.


Questions? Feedback?

Did this article help? If you have questions or feedback, please reach out to us.