- If your database is IP access restricted, run `ping stackname.datatap.adverity.com` (e.g. `ping helpcenter.datatap.adverity.com`) in the Windows Command Prompt to identify Adverity's IP address, and whitelist that IP so it can access the database.
- Navigate to Transfer > Destinations > +Add.
- Choose SQL Database from the list of Destinations.
- Populate all the fields required.
- If using a port other than the respective default, you can optionally define it by entering hostname:port in the hostname field.
- Each Datastream will automatically generate a separate table in the database (see below).
- Assign the Destination to a Workspace.
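If you prefer to script the IP lookup from the first step instead of using `ping`, a minimal sketch (the stack name is a placeholder; substitute your own):

```python
import socket

def resolve_stack_ip(stack_hostname):
    """Resolve an Adverity stack hostname to the IP address that
    needs to be whitelisted on the database firewall."""
    return socket.gethostbyname(stack_hostname)

# Hypothetical stack name; replace with your own, e.g.:
# resolve_stack_ip("helpcenter.datatap.adverity.com")
```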
TARGET TABLE NAMES
- The default name of the output table is datastreamtype_datastream-id, e.g. mailgun_83.
- To configure the table name of a Datastream, select the Datastream in the left column and enter a value in the right column.
- Allowed characters for Table Name are "a-z", "A-Z", "0-9", and "_".
- To import several Datastreams into one table, give them the same table name. This performs a full outer join on common columns and concatenates values; any fields not shared between the streams will contain null values for rows from the other stream.
- To optionally define the schema, enter schema.tablename.
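The naming rules above can be sketched as a small validator (the function name and behavior are assumptions for illustration, not Adverity's implementation):

```python
import re

# Allowed table-name characters per the UI: a-z, A-Z, 0-9, and _.
TABLE_NAME_RE = re.compile(r"^[A-Za-z0-9_]+$")

def parse_target_table(value):
    """Split an optional 'schema.tablename' value and validate the
    table-name part against the characters the UI accepts."""
    schema, sep, table = value.rpartition(".")
    if not TABLE_NAME_RE.match(table):
        raise ValueError(f"invalid table name: {table!r}")
    return (schema or None), table
```

For example, `parse_target_table("analytics.mailgun_83")` splits off the schema, while a bare `"mailgun_83"` yields no schema.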
DATATYPES AND SCHEMA MAPPING
Datatypes in the SQL Destination are set based on the Datatype or Schema applied to each column field when the first Data Extract is transferred to the SQL database. After that first import, changing the Data Schema or Datatype of a column field in the Adverity platform will not be applied to your SQL instance.
Example: if a column field is set to Float in the Adverity platform, then once imported that field will always remain Numeric in the SQL table, even if you change the datatype in Adverity from Float to Currency and export it again.
To change a field's datatype after export, you must drop the table from your SQL database entirely and repopulate it from scratch with another Data Extract that has the correct Schema or datatypes.
ADVERITY - SQL DATATYPE EQUIVALENCIES
As a rule, Adverity uses field types as defined by SQLAlchemy.
Below is a list of the most common field types, giving their Adverity titles and SQLAlchemy equivalents.
|Adverity UI|SQLAlchemy|Additional information|
|---|---|---|
|String|Unicode|Length is applied|
|JSON|String|Length is applied|
|Float|Numeric|Precision: 13, Scale: 4|
|DateTime|DateTime|If Snowflake: TIMESTAMP_NTZ|
How these types later appear in each specific engine (e.g. PostgreSQL, Redshift, etc.) depends on the database engine's dialect definition in SQLAlchemy, which is based on what the database actually supports.
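As a rough sketch, the table above can be expressed as a simple lookup (string values only, for illustration; the real mapping lives inside Adverity):

```python
# Adverity-UI -> SQLAlchemy type mapping, as listed in the table above.
# How each type renders (PostgreSQL, Redshift, Snowflake, ...) is decided
# by the SQLAlchemy dialect for that engine.
ADVERITY_TO_SQLALCHEMY = {
    "String": "Unicode(length)",     # length is applied
    "JSON": "String(length)",        # length is applied
    "Float": "Numeric(13, 4)",       # precision 13, scale 4
    "DateTime": "DateTime",          # TIMESTAMP_NTZ on Snowflake
}
```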
INSERT AND OVERWRITE DATA
For SQL Databases, Adverity automatically creates 3 additional columns in the target table:
- dt_created (creation date of current extract).
- dt_updated (date when current extract was last updated).
- dt_filename (filename containing the Datastream slug [gsc-104] and a hash, or a custom filename, e.g. gsc-104-20180708-64ecec6ab9c94a29bd72b24deeb1bfd0.csv). The Datastream slug works as the unique identifier in each instance.
- Key Columns:
Can be defined in Schema Mapping. This setting is only recommended for 1:1 Datastream-to-database-table relationships, and Key Columns always take precedence over other criteria. As soon as a column is marked as a key column within a Datastream's schema mapping, it is used as a primary database key, upon which updates are based. [Default=Disabled]
- Same Datastream:
Deletes all rows from a particular Datastream. The Datastream is recognized by its Datastream slug: all rows whose dt_filename column starts with the Datastream slug (e.g. gsc-104) will be deleted. [Default=Enabled]
- Duplicate Filename:
Uses the current filename and deletes all rows whose dt_filename column holds the same value, e.g. gsc-104-20180708-64ecec6ab9c94a29bd72b24deeb1bfd0.csv. [Default=Disabled]
- Date Range:
Uses the selected date column and overwrites data from the current extract's earliest to latest date. [Default=Enabled]
- Regular: Datastream + Date Range deletes data from the same Datastream within the same date range. A quick solution that behaves similarly to the default Adverity Explore.
- Advanced: Key Columns without any other option enabled.
Can be used whenever ALL old extracts from the same Datastream should be deleted from the table.
- Filename + Unique Retention Per Day:
Has the same effect as the regular case.
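A minimal sketch of the Same Datastream vs. Duplicate Filename deletion criteria, under the semantics described above (function and mode names are assumptions for illustration):

```python
def rows_to_delete(dt_filenames, slug, current_filename, mode):
    """Return which existing dt_filename values would be deleted
    before inserting the new extract.

    mode "same_datastream": delete every row whose dt_filename starts
    with the Datastream slug (e.g. gsc-104).
    mode "duplicate_filename": delete only rows holding exactly the
    current extract's filename.
    """
    if mode == "same_datastream":
        return [f for f in dt_filenames if f.startswith(slug + "-")]
    if mode == "duplicate_filename":
        return [f for f in dt_filenames if f == current_filename]
    raise ValueError(f"unknown mode: {mode!r}")
```

With rows from gsc-104 and another stream in the same table, "same_datastream" removes every gsc-104 row, while "duplicate_filename" removes only the row matching the current file.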
- The following overwrite options are available for importing into databases:
- Use Key columns.
- Same Datastreams.
- Duplicate Filename.
- Date Range.
- Combination of Datastream and/or date range and/or key columns.
- When overwriting on Key Columns, the following has to be considered:
- Create a primary key by choosing as many key columns as necessary to uniquely identify a dataset within one data extract.
- Whenever the structure of the database table has changed, it’s necessary to drop or alter the table. The table structure changes whenever:
- key columns are modified
- datatype of a column is changed
- length of a column is changed
During a data transfer to a database Destination, a temporary "_stage" table is created in your database during processing to ensure atomicity. Once the transfer is complete or aborted, this table is removed.
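The staging behavior can be illustrated with SQLite (illustrative only; the actual mechanism is Adverity-internal and the table shape is simplified to one column):

```python
import sqlite3

def load_with_stage(conn, table, rows):
    """Load rows into a temporary "<table>_stage" table first, then
    swap it in, so an aborted transfer never leaves the target table
    half-written."""
    stage = f"{table}_stage"
    cur = conn.cursor()
    cur.execute(f"DROP TABLE IF EXISTS {stage}")
    cur.execute(f"CREATE TABLE {stage} (value TEXT)")
    cur.executemany(f"INSERT INTO {stage} (value) VALUES (?)",
                    [(r,) for r in rows])
    # Only once the stage table is fully populated does it replace the target.
    cur.execute(f"DROP TABLE IF EXISTS {table}")
    cur.execute(f"ALTER TABLE {stage} RENAME TO {table}")
    conn.commit()
```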
LOAD STATE TABLE
The load state table is a table created in the database which contains the activity logs and an overview of the load operations. You can optionally provide a name for the table; otherwise it defaults to `dt_loadstate`.
Every time a new dataset is imported, the state table is updated with the latest information. Some of the information you can find in the state table is:
- Datastream display name
- Datastream extract column count
- Datastream extract created
- Datastream extract date pattern
- Datastream extract range start date
- Datastream extract range end date
- Datastream extract row count
NOTE: For a complete list, take a look at the load state table in your database.
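A hypothetical sketch of a load state table holding only the fields listed above (the column names are assumptions; the real table contains additional columns, so check your own database for the actual layout):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed column names derived from the field list above.
conn.execute("""
    CREATE TABLE dt_loadstate (
        datastream_display_name  TEXT,
        extract_column_count     INTEGER,
        extract_created          TEXT,
        extract_date_pattern     TEXT,
        extract_range_start_date TEXT,
        extract_range_end_date   TEXT,
        extract_row_count        INTEGER
    )
""")
```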