Panoply Data Loading Behavior | Stitch Documentation

Because data can come from a variety of integrations and all those integrations may structure or handle data differently, Stitch will encounter numerous scenarios when replicating and loading data into your Panoply data warehouse.

This doc covers many of the common scenarios Stitch will encounter and how Panoply specifically handles them.

To make browsing easier, we’ve grouped these scenarios into the following categories:

New table scenarios - These scenarios apply when Stitch creates a new table in Panoply.
Data loading scenarios - These scenarios apply when Stitch encounters specific data types (ex: timezones).
Data typing scenarios - These scenarios apply when Stitch encounters specific data types (ex: timezones).
Schema change scenarios - These scenarios apply when a table undergoes structural changes.
Nested data scenarios - These scenarios apply when Stitch loads nested data into Panoply.
Direct data warehouse changes - These scenarios apply when you make a direct change to your Panoply data warehouse.

Tip: Enter some keywords in the Search box above the tables to display specific rows of information.

New Table Scenarios

These scenarios apply when Stitch creates a new table in Panoply.

Search:

Summary	Scenario	Expected Behavior
Single Primary Key	First batch of data arrives with a single Primary Key	Table is created without Primary Key and no `NOT-NULL` columns. Primary Key info is stored as a comment on the table.
Multiple Primary Keys	First batch of data arrives with multiple Primary Keys	Table is created without Primary Key and no `NOT-NULL` columns. Primary Key info is stored as a comment on the table.
No Primary Keys	First batch of data arrives without Primary Keys	Table is created without Primary Key and no `NOT-NULL` columns.
Replication Key contains `NULLs`	First batch of data arrives where the Replication Key column contains `NULLs`	Table is created and rows with `NULL` Replication Keys are created. Rows with `NULL` Replication Keys will not be replicated after the initial sync if there is at least one non-`NULL` value in the column.
Some empty columns	First batch of data arrives with some columns that are completely empty	Only columns that are populated in at least one record are created.
ALL empty columns	First batch of data arrives with columns that are all `NULL`	Table is created and contains only the Stitch replication (`_sdc`) columns.
Table name is too long	Data arrives with a table name that exceeds the maximum length for Panoply. Table names are limited to 127 characters.	Panoply will reject all data for the table.
Column name is too long	Data arrives with a column name that exceeds the maximum length for Panoply. Column names are limited to 115 characters.	Columns with names longer than 115 characters will be rejected; columns with names less than 115 characters will persist to Panoply. Note that the limit of 115 characters is to leave room for suffixes that are a result of column splitting.
Too many columns in table	A table arrives with more columns than Panoply allows. The maximum number of allowed columns is 1,600.	Panoply will reject all data for the table and surface a `“too many columns”` error.
Duplicate fields	Data arrives with two column names that canonicalize to the same name	Panoply will reject the entire record containing the two columns. Note that Panoply is case-insensitive.
Negative/positive infinity, NaN	Special cases of `FLOAT/DOUBLE` values	These values will be converted into `NULLs`.
Mixed case column name	Data arrives with columns named `LiKeThIs`	Column names will be converted to lowercase.
Column name with spaces	Data arrives with column names containing spaces	Spaces are maintained.
Unsupported special characters in column name	Data arrives with column names containing special characters that are unsupported by Panoply	Special characters are removed.
Column name starts with non-letter	Data arrives with column names that start with numeric characters: `123Column`	This scenario is unremarkable to Panoply.

Data Loading Scenarios

These scenarios apply when Stitch encounters specific data types (ex: timezones).

Summary	Scenario	Expected Behavior
Data for existing table version	Data arrives that updates existing records in the data warehouse (matched on Primary Key)	Also known as an upsert, the original row is deleted and a new row is created with the updated data.
Data with a new table version	A full set of data for a table (set to Full Table Replication) arrives	Table is truncated and new data is loaded in a single atomic action.
Replication Key contains `NULLs` (initial batch)	Initial batch of data arrives where the Replication Key column contains `NULLs`	Table is created and all rows are loaded.
Replication Key contains `NULLs` (subsequent batch)	Subsequent batch of data arrives where the Replication Key column contains `NULLs`	Stitch will not replicate rows where the Replication Key is NULL. Rows with values in the Replication Key column will persist to Panoply.
Timestamps out of range	Data arrives with a date that is out of range for Panoply to handle. Panoply’s range is 4713 BC to 294276 AD.	Panoply will reject records that contains that fall outside the supported time range.
`VARCHARs` max width	Data arrives within a text column that exceeds the Panoply size maximum. `VARCHARs` are limited to 65K.	Data will be truncated to the maximum width.
Decimal range	Data arrives within a decimal column that exceeds the Panoply size limit. The limit is 38 numbers, or places.	Panoply will reject the entire record.
Integer range	Data arrives within an integer column that exceeds the Panoply size limit	Panoply will reject records that exceed the size limit.

Data Typing Scenarios

These scenarios apply when Stitch encounters specific data types (ex: timezones).

Summary	Scenario	Expected Behavior
Single column contains mixed data types (same batch)	Initial batch of data contains mixed data types within a single column. For example: a column contains `string` and `INTEGER` data.	Column will be “split” and the column name will be appended with the data type. For example: `age_fl, age_bi`
Single column contains mixed data types (different batch)	Initial batch of data contains single data type columns, but a subsequent batch contains mixed data types in a single column. For example: a column contains `string` and `INTEGER` data.	Column will be “split” and the column name will be appended with the data type. For example: `age_fl, age_bi`
`VARCHARs` (same batch)	Initial data for a `VARCHAR` column arrives with mixed sizes. Some records contain more than 128 characters.	Column is created at a width large enough to contain the largest value.
`VARCHARs` (different batch)	Data for an existing `VARCHAR` column arrives; some records exceed the max size for the existing column.	New column is created at a size to accommodate the largest value. Data is copied into the new column, the old column is dropped, and the new column is renamed. Note that if you create views with dependencies on a table where this occurs, you will have to temporarily drop the view via `CASCADE` to allow Stitch to update the table.
Dates with timezones	Timestamp data arrives with no timestamp information	Data type is changed to `TIMESTAMP WITHOUT TIME ZONE` and adjusted to show as UTC.
Decimals	Data arrives in decimal format with an arbitrary amount of precision.	Panoply will truncate the data to 6 places after the decimal.

Schema Changing Scenarios

These scenarios apply when a table undergoes structural changes.

Summary	Scenario	Expected Behavior
New column	A new column is added to an existing table	New column is appended to the end of the table.
Column removed	A column is no longer seen within rows of data coming for an existing table	Default `NULL` values are placed into the column.
Column added to nested table	Data arrives with a nested table that contains a field that did not exist in that nested table before	New field is added to the nested table.

Nested Data Scenarios

These scenarios apply when Stitch loads nested data into Panoply.

Summary	Scenario	Expected Behavior
Nested JSON objects	Data arrives that is nested with many top-level properties	Keys in the object are de-nested and become columns in the table where the column name follows this naming convention: `top_level_key__secondary__third__etc`
Nested JSON top-level arrays	Data arrives that is nested and amongst the nested data are properties that contain arrays	A subtable is created. Subtables follow this naming convention: `top_level_table__array_name`
Nested JSON sub-level arrays	Data arrives that is nested and amongst the nested data are properties that contain arrays, which themselves contain properties that are arrays	See above.
Nested JSON new columns in arrays	The sub-level arrays in nested JSON data have a varying set of columns	See above.
Nested table name too long	Data arrives for a nested table with a name that exceeds the maximum length for Panoply. Table names are limited to 127 characters.	Panoply will reject all records in the table.

Direct Data Warehouse Changes

These scenarios apply when you make a direct change to your Panoply data warehouse.

Summary	Scenario	Expected Behavior
You add columns	You add additional columns to a Stitch-generated table in Panoply	Fine, as long as nullable.
You apply indexes	You apply indexes to columns created by Stitch	No issue; Stitch will not remove them.
You apply Sort / Distribution Keys	You apply Sort / Distribution Keys to columns created by Stitch	Stitch operations are unaffected as long as the Primary Key info is maintained via a table comment.
You remove Stitch (`_sdc`) columns from a table	You remove one or more of the `_sdc` columns created by Stitch	Inserting and updating data will break as Stitch requires the `_sdc_table_version` and `_sdc_sequence` columns.
You remove a data column from the data source	You remove one or more of the data columns from the data source	Default `NULL` values are placed into the column.
You remove a data column from the data warehouse	You remove one or more of the data columns created by Stitch from the data warehouse Note: it’s sometimes possible to remove columns from database integrations only.	Removing a column with a data type suffix (ex: `__boolean`) will break the table. If new data is detected for the deleted column, Stitch will recreate it in your data warehouse.
You revoke permissions to schemas and/or tables	You remove Stitch’s access to create schemas/tables or to write to Stitch-created tables	An error will surface in Stitch, data replication will stop until sufficient permissions are re-instated, and you will be notified.
You revoke data warehouse permissions	You remove access to the data warehouse as a whole	An error will surface in Stitch, data will fail to load, and you will be notified.

Questions? Feedback?

Did this article help? If you have questions or feedback, please reach out to us.

Tags:

Related	Troubleshooting
Loading Data into Your Destination Table Schema Changes Nested Structures & Row Count Impact Querying Append-Only Tables	Data Discrepancy Guide Identifying Rejected Records Destination Data Loading Errors Postgres Data Loading Errors Redshift Data Loading Errors