How to use Parquet output format for data lake destinations

The Parquet output format simplifies setting up data pipelines for data lakes. Parquet is a columnar file format that stores and queries data more efficiently than CSV. Each file also carries metadata, such as the data type of each field, which makes downstream processing easier.
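
To illustrate why embedded type metadata matters, here is a minimal sketch that round-trips a small table through CSV using only the Python standard library. The field names and values are made up for illustration and are not tied to any particular Supermetrics data source. Because CSV stores text only, every value comes back as a string and the consumer has to re-declare the types, which is the step a typed format such as Parquet makes unnecessary.

```python
import csv
import io

# Hypothetical sample rows; the field names are illustrative only.
rows = [
    {"date": "2024-01-01", "clicks": 120, "cost": 35.5},
    {"date": "2024-01-02", "clicks": 98, "cost": 27.25},
]


def csv_round_trip(records):
    """Write the records to CSV text in memory and read them back."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    buffer.seek(0)
    return list(csv.DictReader(buffer))


restored = csv_round_trip(rows)

# CSV has no type information, so the int and float columns return as str.
types_after = {name: type(value).__name__ for name, value in restored[0].items()}
print(types_after)
```

With CSV, a pipeline has to cast `clicks` and `cost` back to numbers itself, whereas a Parquet reader can take the types from the file's own schema.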


You can select the Parquet output format for cloud storage destinations when setting up a destination on the Supermetrics Hub.


Instructions

  1. Log in to the Supermetrics Hub.
  2. In the sidebar, select Storage → Connect to data warehouse.
  3. Click Create destination, or if you have already set up some destinations, click Create new.
  4. Fill in the required details as usual. For specifics, see our prerequisite and configuration guides for the various data warehouse destinations.
  5. Set a unique upload path for the data, avoiding conflicts with existing destinations.
  6. In the Output format setting, select Parquet. This setting is visible only for cloud storage destinations, not for data warehouses.
  7. Click Save to apply the changes.
  8. Create a transfer to your data lake destination. The standard steps for creating transfers apply: specify the source, the destination, and any additional transfer settings required.
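
After a transfer has run, one quick way to sanity-check a file downloaded from your storage bucket is to look for the `PAR1` magic bytes that an unencrypted Parquet file has at its start and end. The sketch below uses only the Python standard library. The function name is hypothetical, and the check covers the file framing only, not the schema or the data.

```python
from pathlib import Path

# 4-byte marker found at the start and end of an unencrypted Parquet file.
PARQUET_MAGIC = b"PAR1"


def looks_like_parquet(path):
    """Return True if the file starts and ends with the Parquet magic bytes."""
    file_path = Path(path)
    # Smallest possible framing: header magic, 4-byte footer length, footer magic.
    if file_path.stat().st_size < 12:
        return False
    with file_path.open("rb") as handle:
        head = handle.read(4)
        handle.seek(-4, 2)  # 2 means "relative to the end of the file"
        tail = handle.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

For example, `looks_like_parquet("downloads/transfer_output.parquet")`, where the path is a placeholder for wherever you saved the file. To inspect the actual schema and rows, you would need a Parquet reader library.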
