WHAT IS A BUNDLE STREAM?
A bundle Datastream is a special kind of Datastream that combines data from one or more existing Datastreams. It has many applications, two of which are particularly useful:
- Combining multiple separate Datastreams (e.g. different sources)
- Combining multiple extracts into one extract
Note that the bundle Datastream was previously called "fork", so the names of all its extracts are prefixed with "fork-" followed by the Datastream ID, e.g. "fork-123".
HOW TO CREATE A BUNDLE STREAM
1. No connection is needed, but at least one Datastream must exist on the Workspace.
2. Datastreams -> Add -> Bundle
3. Select at least one existing Datastream to be part of the bundle.
Several match options are available for finding existing extracts in the selected Datastreams, and each works in combination with a fetch date range. For every option, all extracts of the chosen Datastreams with status collected or status imported are returned if they meet the match criteria and the fetch date range.
- Pattern: The default match option. If the chosen underlying Datastream has "Manage Extract Names" enabled, this option can be used with a regular expression pattern that includes placeholders for the date. Extracts are returned if they match the regular expression pattern and the selected fetch date range.
Default Regular Expression:
- Created Date: Extracts will be returned if the fetch date and the creation date of the extract match.
- Scheduled Date: Extracts will be returned if the fetch date and the scheduled date of the extract match. Note that the scheduled date refers to the earliest date contained within an extract, e.g. for an extract which contains 30 days, it indicates the first of those days.
To find the scheduled or created date, look in the metadata section of the extract.
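The Created Date and Scheduled Date options can be pictured with a minimal Python sketch. It is not the platform's implementation: the Extract class, the select_extracts function, the file names and the "error" status are all hypothetical, and it assumes that a "match" means the relevant date falls within the fetch date range.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Extract:
    # Hypothetical stand-in for an extract's metadata, not a platform object.
    name: str
    status: str
    created: date
    scheduled: date  # earliest date contained within the extract


def select_extracts(extracts, start, end, match_on="created"):
    """Return extracts whose created or scheduled date lies in [start, end]."""
    eligible = {"collected", "imported"}  # only these statuses are returned
    selected = []
    for extract in extracts:
        if extract.status not in eligible:
            continue
        match_date = extract.created if match_on == "created" else extract.scheduled
        if start <= match_date <= end:
            selected.append(extract)
    return selected


extracts = [
    Extract("a.csv", "imported", date(2024, 3, 31), date(2024, 3, 1)),
    Extract("b.csv", "collected", date(2024, 4, 2), date(2024, 4, 1)),
    Extract("c.csv", "error", date(2024, 4, 2), date(2024, 4, 1)),  # made-up status
]

by_created = select_extracts(extracts, date(2024, 4, 1), date(2024, 4, 30), "created")
by_scheduled = select_extracts(extracts, date(2024, 3, 1), date(2024, 3, 31), "scheduled")
print([e.name for e in by_created])
print([e.name for e in by_scheduled])
```

In this sketch, the same extract can be picked up by one option and missed by the other, because its created date and its scheduled date may fall in different months.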
Apply Schema Mapping: This is ticked by default. It applies the schema mapping of each chosen Datastream, so only mapped fields are shown, and they appear in the bundle stream under their schema mapping names. Columns from different Datastreams that are mapped to the same name are automatically combined into one column. When this is not ticked, all fields from the underlying Datastream appear under their original names.
To harmonize data, the corresponding fields in the underlying Datastreams should be mapped to the same target field, for example:
ga:campaign -> campaign
Campaign -> campaign
CampaignName -> campaign
Read more about the data schema & schema mapping here.
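The effect of Apply Schema Mapping can be illustrated with a short Python sketch. It only mirrors the behaviour described above and is not how the platform implements it; the Datastream names, the ga:sessions and Clicks fields, the sample values and the apply_schema_mapping function are hypothetical.

```python
# Hypothetical schema mappings per Datastream: source field -> target field.
MAPPINGS = {
    "stream_a": {"ga:campaign": "campaign"},
    "stream_b": {"Campaign": "campaign"},
    "stream_c": {"CampaignName": "campaign"},
}


def apply_schema_mapping(row, mapping):
    """Keep only mapped fields and rename them to their target names."""
    return {mapping[field]: value for field, value in row.items() if field in mapping}


# One sample row per Datastream; ga:sessions and Clicks are unmapped.
rows = [
    ("stream_a", {"ga:campaign": "spring_sale", "ga:sessions": 120}),
    ("stream_b", {"Campaign": "summer_sale", "Clicks": 45}),
    ("stream_c", {"CampaignName": "winter_sale"}),
]

bundled = [apply_schema_mapping(row, MAPPINGS[stream]) for stream, row in rows]
for row in bundled:
    print(row)
```

All three source fields end up in a single campaign column, while the unmapped fields are dropped, which corresponds to the ticked state of the option.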
Concatenate: This is ticked by default. It combines all extracts for a fetched time range into one extract instead of creating separate extracts per day.
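As a rough illustration of the Concatenate option, the following Python sketch contrasts the two outcomes. The bundle function and the sample rows are hypothetical, and it assumes each day's data is simply a list of rows.

```python
def bundle(rows_by_day, concatenate=True):
    """Return a list of extracts, each extract being a list of rows.

    rows_by_day maps an ISO date string to the rows fetched for that day.
    """
    days = sorted(rows_by_day)
    if concatenate:
        combined = []
        for day in days:
            combined.extend(rows_by_day[day])
        return [combined]  # one extract for the whole fetched range
    return [rows_by_day[day] for day in days]  # one extract per day


rows_by_day = {
    "2024-04-01": [{"campaign": "spring_sale", "clicks": 10}],
    "2024-04-02": [{"campaign": "spring_sale", "clicks": 12}],
}

single = bundle(rows_by_day)
per_day = bundle(rows_by_day, concatenate=False)
print(len(single), len(per_day))
```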
BUNDLING STREAMS FROM PARENT WORKSPACES
By default, a bundle stream cannot access data from Datastreams that reside in a parent Workspace. If this is needed, the relevant Datastreams can be made available through the 'Share with children' option.
- Navigate to the relevant Datastream in the parent-level Workspace.
- In the left-hand menu, under Advanced Settings, click on Other.
- Tick the checkbox 'Share with children'.
- The Datastream is now available for its child Workspaces.