- The most common file types are CSV, XLS, and XLSX
- The file type has to be defined as the very first step in the script, using either the CSV or the XLSX command. See Import Operations in the DataTap transformation reference for the full list of parameters and supported file types
- Apply the convertnumbers command to every column containing metrics to ensure they are imported correctly into your target destination
- If the source file is inside a zip or gzip container, this has to be reflected in the File Pattern. Example: ^%Y%m%d.gz$ or ^%Y%m%d.zip$
FILE-SPECIFIC CONFIGURATION OPTIONS
If a fetch finds no available file or valid URL, a warning is sent to the user via the 'Issues' pane in the Overview.
- File Pattern:
A regular expression used to recognize attachment files.
- Zip Match:
The file-type suffix for which Datatap should search any attached .zip files.
- Filename Date Match:
If attached files follow a consistent name format to define e.g. their date of creation, this field defines the date expression format that will be used, e.g. "filename-%Y-%m-%d".
- Filename Date Pattern:
If attached files follow a consistent name format to define e.g. their date of creation, this field defines the date expression format that will be used, e.g. "%Y-%m-%d".
- Keep filename:
If ticked, the extract keeps the name of the original source file. If the source file always has the same name, every extract is also named identically and is therefore overwritten with each new fetch.
- Has Adverity Header:
If the Datasource is also an Adverity product, then ticking this will automatically allow the Mailgun Datastream to parse header information in the standard Adverity format.
A drop-down menu of compatible file formats. Select the appropriate option for your attachment files: RAW, CSV, EXCEL, AVRO, or PARQUET.
- Source Encoding (RAW, CSV, EXCEL, AVRO, PARQUET):
Select the encoding convention used by your file structure.
- Delimiter (CSV):
A one-character string used to separate fields.
- Quote Char (CSV):
A one-character string used to quote fields containing special characters, such as the delimiter or quotechar, or which contain new-line characters.
- Quoting (CSV):
Controls when quotes should be recognized when parsing files.
- Sheet (EXCEL):
If your attachments are Excel files that contain multiple sheets, specify the sheet that should be parsed (e.g. "Sheet1", "Conversion Figures").
- Column Offset (EXCEL):
Allows you to skip columns in the sheet that should not be imported, e.g. if Column A contains header information unnecessary to Datatap.
- Row Offset (RAW, CSV, EXCEL, AVRO, PARQUET):
Allows you to skip rows in the sheet that should not be imported, e.g. if Row 1 contains header information unnecessary to Datatap.
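The CSV options above (Delimiter, Quote Char, Quoting, Row Offset) can be illustrated with Python's standard csv module. This is only a sketch of how such settings typically interact, not a description of how Datatap parses files internally; the function name parse_csv, its parameters, and the sample data are hypothetical.

```python
import csv
import io


def parse_csv(text, delimiter=",", quotechar='"',
              quoting=csv.QUOTE_MINIMAL, row_offset=0):
    """Parse CSV text and drop the first `row_offset` records.

    Assumption: the offset is applied to parsed records, so a quoted
    field spanning several physical lines still counts as one row.
    """
    reader = csv.reader(
        io.StringIO(text, newline=""),
        delimiter=delimiter,
        quotechar=quotechar,
        quoting=quoting,
    )
    return list(reader)[row_offset:]


# Hypothetical export: one unwanted title row, ';' as delimiter, and a
# quoted field that contains both the delimiter and a line break.
sample = (
    "Exported by reporting tool\n"
    "name;comment;clicks\n"
    '"Campaign A";"contains ; and\na line break";12\n'
)

for row in parse_csv(sample, delimiter=";", row_offset=1):
    print(row)
```

Because the comment field is quoted, the delimiter and the line break inside it are treated as data rather than as field or row separators.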
TIME RANGE OPTIONS
- Time range options are currently not supported for the File datasource
- In order to fetch files for a certain date, the file name must contain the date. See section FETCHING FILES OF A CERTAIN DATE below for further information
FETCHING FILES OF A CERTAIN DATE
- Under Configuration --> File Matching Options --> File pattern you can use %Y, %m, %d as placeholders in the regular expression for the file name
- In this case only files where the name matches the date of the fetch are processed.
- The placeholders always relate to the date of the current fetch. It is currently not possible to define a range of dates that should be fetched
- However, when using Preset: Yesterday or Preset: Today on the Time Range tab, the files matching yesterday's or today's date are processed
- Tick Configuration --> File Processing --> Process all to process all files matching the regular expression
- Example: your file pattern is set to filename-%Y-%m-%d. In this case, a fetch performed on 1 January 2018 would only fetch files whose name equals filename-2018-01-01
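The placeholder behaviour described above can be sketched in Python as follows. This is an illustration only: the helper names expand_pattern and matching_files are hypothetical, and it assumes that the placeholders are substituted with the fetch date before the pattern is applied as a regular expression search, which may differ from the actual implementation.

```python
import re
from datetime import date


def expand_pattern(pattern, fetch_date):
    """Replace %Y, %m and %d with the zero-padded values of the fetch date."""
    return (
        pattern.replace("%Y", f"{fetch_date.year:04d}")
        .replace("%m", f"{fetch_date.month:02d}")
        .replace("%d", f"{fetch_date.day:02d}")
    )


def matching_files(pattern, fetch_date, filenames):
    """Return the file names that match the pattern for the given fetch date."""
    regex = re.compile(expand_pattern(pattern, fetch_date))
    return [name for name in filenames if regex.search(name)]


files = ["filename-2018-01-01", "filename-2018-01-02", "filename-2017-12-31"]
print(matching_files("filename-%Y-%m-%d", date(2018, 1, 1), files))
# prints ['filename-2018-01-01']
```

The same helper works for compressed files, e.g. the pattern ^%Y%m%d.gz$ expands to ^20180101.gz$ for a fetch on 1 January 2018.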
FILENAME DATE MATCH + FILENAME DATE PATTERN
- When populated, these two options enable the Datastream to sort file processing based on Sortorder --> Date Match. They are not related to which dates are fetched.
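To illustrate what sorting by a date taken from the file name means, here is a minimal Python sketch. It assumes a single date format such as "filename-%Y-%m-%d" that describes the whole file name. The helper names are hypothetical, and the way the Datastream actually combines the two options is not shown here.

```python
from datetime import datetime


def parse_filename_date(name, date_format):
    """Return the date encoded in `name`, or None if it does not fit the format."""
    try:
        return datetime.strptime(name, date_format).date()
    except ValueError:
        return None


def sort_by_filename_date(names, date_format):
    """Sort file names by their embedded date, skipping names without one."""
    dated = [(parse_filename_date(name, date_format), name) for name in names]
    return [name for found, name in sorted(
        (found, name) for found, name in dated if found is not None
    )]


names = [
    "filename-2018-01-03",
    "filename-2017-12-31",
    "notes.txt",
    "filename-2018-01-01",
]
print(sort_by_filename_date(names, "filename-%Y-%m-%d"))
# prints ['filename-2017-12-31', 'filename-2018-01-01', 'filename-2018-01-03']
```

The sketch only orders the files. As stated above, the sort order does not influence which files a fetch selects.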