For data transfers to data warehouse and cloud storage destinations in Supermetrics, you can set up two update operations: scheduled refreshes and historical backfills. Combine these with different update types to find the right refresh type and cadence for your data.
Learn more about update operations and update types below. We've included an overview table that summarizes each option.
Update operations
Scheduled refresh
When you configure a transfer, you can specify when and how frequently its results should be updated with a scheduled refresh operation.
When you set this operation up, you'll define a refresh window for each query. The refresh window determines how many days of data will be updated every time the data is refreshed. For example, a refresh window of 7 ensures that each time the transfer runs, the last 7 days of data, including today's data, will be replaced individually.
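To make the refresh window concrete, here's a minimal Python sketch of how a window translates into individual run dates; the function name and structure are our own illustration, not Supermetrics' implementation.

```python
from datetime import date, timedelta

def refresh_run_dates(refresh_window: int, today: date) -> list[date]:
    """Return the dates a scheduled refresh re-pulls, newest first.

    A refresh window of 7 covers today plus the 6 preceding days,
    and each day is refreshed as its own single-day query.
    """
    return [today - timedelta(days=offset) for offset in range(refresh_window)]

# A refresh window of 7 replaces the last 7 days of data, including today.
print(refresh_run_dates(7, date(2024, 5, 20)))
# [datetime.date(2024, 5, 20), datetime.date(2024, 5, 19), ..., datetime.date(2024, 5, 14)]
```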
Historical backfill
Historical backfills allow you to transfer data from a specific period in the past. Once you specify a range of dates, Supermetrics will pull this data from the source one day at a time.
A query's maximum historical backfill range is determined by the following:
- Your Supermetrics license
- The data source's data retention policies
  - This can also apply to retention policies for specific fields. For example, some data sources limit retention for demographic data fields.
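To illustrate the one-day-at-a-time behavior, here's a small sketch of how a backfill range breaks down into single-day pulls; the helper below is hypothetical, not part of the Supermetrics product.

```python
from datetime import date, timedelta

def backfill_days(start: date, end: date):
    """Yield each day in the backfill range, one single-day pull at a time."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)

# A backfill of January 2024 results in 31 separate single-day pulls from the source.
for day in backfill_days(date(2024, 1, 1), date(2024, 1, 31)):
    ...  # each `day` becomes its own request to the data source
```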
Update types
Rolling updates
Update operations default to rolling updates as long as the "Date" dimension is included in the query. This is because most of the data sources Supermetrics connects to supply data that can be partitioned by date (sometimes called "time-series data").
Rolling updates supply a range of run dates, beginning with the start date and ending with the end date, to the processing queue. The data is processed in reverse chronological order, one day at a time, with each day considered independently. This applies to both scheduled refreshes and historical backfills.
Different destinations process this update type in these ways (a short sketch follows the list):
- BigQuery: Single days of data are stored as individual table shards. Update operations will overwrite the date-partitioned shard, eliminating the chance of introducing duplicates to the dataset.
- Data warehouse destinations: Queries with rolling updates will store data in the same table. Supermetrics will perform a DELETE and COPY INTO operation based on the date associated with the data being updated.
- Cloud storage (including SFTP): Data from each day is stored in a separate file. Update operations will replace files in the destination folder with new data from Supermetrics.
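As a rough picture of how a single run date maps to a destination-side write, here's an illustrative sketch. The shard naming, SQL, and file path below are assumptions made for the example, not Supermetrics' exact implementation.

```python
from datetime import date

def rolling_update_action(destination: str, table: str, run_date: date) -> str:
    """Describe the per-day write a rolling update performs (illustrative only)."""
    day = run_date.strftime("%Y%m%d")
    if destination == "bigquery":
        # Overwrite the date-partitioned shard for this day, e.g. table_20240520.
        return f"overwrite shard {table}_{day}"
    if destination == "warehouse":
        # Remove the existing rows for this day, then load the fresh ones.
        return (f"DELETE FROM {table} WHERE date = '{run_date}'; "
                f"COPY INTO {table} (new rows for {run_date})")
    if destination == "storage":
        # Replace the single file that holds this day's data (file layout assumed).
        return f"replace file {table}/{day}.csv"
    raise ValueError(f"unknown destination: {destination}")

# Run dates are processed newest first, with one independent write per day.
for run_date in (date(2024, 5, 20), date(2024, 5, 19), date(2024, 5, 18)):
    print(rolling_update_action("bigquery", "ad_performance", run_date))
```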
Single-day snapshot updates
Single-day snapshot updates capture data from a data source at a specific point in time.
They're useful when:
- The API or report type doesn't contain historical data. This is common in organic social APIs when tracking metrics like Follower Count or Video Views, which aren't split into daily values.
Single-day snapshot updates replace the entire table's data every day. Tables in data warehouse destinations will be dropped and recreated with the query results each day, while files in cloud storage destinations will be completely overwritten.
Single-day snapshot updates occur when the "Date" dimension is omitted from the query but "Today" is included. For snapshot updates to run successfully, it's important to have the refresh window set to at least 2 to ensure that at least one of the run dates is equal to the previous day.
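To see why the refresh window needs to be at least 2, here's a small sketch (our own illustration) of the check that at least one run date falls on the previous day.

```python
from datetime import date, timedelta

def snapshot_runs(refresh_window: int, today: date) -> bool:
    """True when the run dates include yesterday, which is what triggers the snapshot."""
    run_dates = [today - timedelta(days=offset) for offset in range(refresh_window)]
    return (today - timedelta(days=1)) in run_dates

today = date(2024, 5, 20)
print(snapshot_runs(1, today))  # False -- only today is scheduled, so the snapshot never fires
print(snapshot_runs(2, today))  # True  -- yesterday is included, so the snapshot runs
```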
Historical snapshot
Historical snapshots request data for the available lifetime of the data source based on the underlying API limits. Learn more about data source historical range limits.
Historical snapshots are useful when:
- You want to pull lifetime metrics from a given data source. For example, non-aggregatable metrics like Reach and Frequency benefit from snapshot updates.
- The data from the API isn't time-series data. This can happen when pulling data from fields like Contacts, Companies, or Deals from customer relationship management (CRM) sources.
When the "Run date" is equal to the previous day, this will trigger the snapshot update, adding a start- and end-date time range to the underlying query. The start date will be the maximum allowable historical limit defined by the underlying data source, and the end date is the previous day.
Historical snapshots can be achieved by omitting both "Date" and "Today" from your query. This update type will replace the entire table in your data warehouse or the entire file in your data lake every day. Historical snapshots are not compatible with backfills. For historical snapshot updates to run successfully, it's important to have the refresh window set to at least 2 to ensure that at least one of the run dates is equal to the previous day.
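Here's a minimal sketch of the date range a historical snapshot requests; the function and the way the historical limit is anchored are assumptions for illustration.

```python
from datetime import date, timedelta

def historical_snapshot_range(run_date: date, today: date, max_history_days: int):
    """Return the (start, end) range requested, or None if the run date isn't yesterday."""
    yesterday = today - timedelta(days=1)
    if run_date != yesterday:
        return None  # only the run for the previous day triggers the snapshot update
    # `max_history_days` stands in for the data source's historical limit.
    return yesterday - timedelta(days=max_history_days), yesterday

# Assuming a source that allows 365 days of history:
print(historical_snapshot_range(date(2024, 5, 19), date(2024, 5, 20), 365))
# (datetime.date(2023, 5, 20), datetime.date(2024, 5, 19))
```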
Rolling historical snapshot
Rolling historical snapshots split the results of a historical snapshot query by day and store each day's results for historical use.
These are useful when:
- You want to track how an aggregatable metric, such as cost or impressions, evolves over time in a single table.
- You're comfortable processing and storing a high volume of data.
- The amount of data processed by your query is small enough to be run in a single query. Depending on the quantity of data you have access to and the granularity of your query, you may run into quota or query execution issues due to the data size.
Rolling historical snapshots can be created by adding both "Date" and "Today" fields to your query. The resulting query will contain the maximum amount of data permitted from the data source, split by date, and be appended to the existing table in the data warehouse. For cloud storage destinations, a new file will be created each day the transfer runs. In cases where the maximum amount of data isn't specified by the data source, a default value of 730 days is used.
For rolling historical snapshot updates to run successfully, it's important to have the refresh window set to at least 2 to ensure that at least one of the run dates is equal to the previous day. Like historical snapshot updates, rolling historical snapshots are incompatible with historical backfills.
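For illustration, here's a sketch of the range a rolling historical snapshot covers, using the 730-day default when the source doesn't specify a limit; the names below are ours, not Supermetrics' API.

```python
from datetime import date, timedelta

DEFAULT_HISTORY_DAYS = 730  # fallback when the source doesn't define a historical limit

def rolling_snapshot_range(today: date, source_limit_days: int | None = None):
    """Return the (start, end) range that is pulled, split by date, ending yesterday."""
    limit = source_limit_days if source_limit_days is not None else DEFAULT_HISTORY_DAYS
    yesterday = today - timedelta(days=1)
    return yesterday - timedelta(days=limit), yesterday

# With no source-specific limit, roughly the last two years are pulled, split by day,
# and appended to the existing table (or written as a new file in cloud storage).
print(rolling_snapshot_range(date(2024, 5, 20)))
```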
Update types overview
| | Rolling updates | Single-day snapshot | Historical snapshot | Rolling historical snapshot |
| --- | --- | --- | --- | --- |
| Description | Updates each day of data individually | Captures a snapshot of yesterday's data | Captures a snapshot of all data available | Captures and appends a snapshot of all data available |
| Metrics | Aggregatable, date-dependent metrics (clicks, cost, impressions) | Date-independent, non-aggregatable metrics (followers, likes) | Date-dependent, non-aggregatable metrics (reach, frequency, unique users) | Aggregatable, date-dependent metrics (clicks, cost, impressions) |
| Required time dimensions | Date | Today | None | Date + Today |
| Refresh window (days) | 2 to 30 | 2 | 2 | 2 |
| Supports historical backfills | Yes | No | No | No |
| Accumulation of data history | Yes | No | No | Yes |
| Run date execution | All days | Yesterday | Yesterday | Yesterday |
| Query date range start | Run date | Yesterday | Max permitted by data source | Max permitted by data source |
| Query date range end | Run date | Yesterday | Yesterday | Yesterday |