
Airbyte is an open-source platform for syncing data from applications, APIs, and databases to warehouses, lakes, and other destinations. Use Airbyte’s connectors to build data pipelines that consolidate many input sources.

Table of contents

  1. Using lakeFS with Airbyte
  2. Use-cases
  3. S3 Connector
    1. Configuring lakeFS using the connector

Using lakeFS with Airbyte

The integration between the two open-source projects brings resilience and manageability when you use Airbyte connectors to sync data to your S3 buckets, by leveraging lakeFS branches together with atomic commits and merges.

Use-cases

You can leverage lakeFS consistency guarantees and CI/CD capabilities when ingesting data to S3 using lakeFS:

  1. Consolidate many data sources to a single branch and expose them to the consumers simultaneously when merging to the main branch.
  2. Test incoming data for breaking schema changes, using lakeFS hooks.
  3. Prevent consumers from reading partial data from connectors that failed halfway through a sync.
  4. Experiment with ingested data on a branch before exposing it.
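
The first use case can be sketched as a single merge issued once every connector has finished writing to an ingestion branch. The snippet below only builds the request and does not send it. It assumes the lakeFS REST API merge endpoint (`POST /api/v1/repositories/{repository}/refs/{sourceRef}/merge/{destinationBranch}`) with HTTP basic authentication; the branch name `airbyte-ingest` and the helper `build_merge_request` are hypothetical, and the host, repository, and credentials are this page's example values.

```python
import base64
import urllib.request


def build_merge_request(endpoint, repo, source_branch, dest_branch, key_id, secret_key):
    """Build (but do not send) a request that merges one lakeFS branch into another.

    Assumption: the lakeFS REST API exposes
    POST /api/v1/repositories/{repo}/refs/{source}/merge/{destination}
    and accepts HTTP basic auth with the lakeFS access key pair.
    """
    url = (
        f"{endpoint.rstrip('/')}/api/v1/repositories/{repo}"
        f"/refs/{source_branch}/merge/{dest_branch}"
    )
    token = base64.b64encode(f"{key_id}:{secret_key}".encode()).decode()
    return urllib.request.Request(
        url,
        data=b"{}",  # empty JSON body: no extra merge options
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )


request = build_merge_request(
    "https://cute-axolotol.lakefs-demo.io",
    "example-repo",
    "airbyte-ingest",  # hypothetical branch that the connectors write to
    "main",
    "AKIAlakefs12345EXAMPLE",
    "abc/lakefs/1234567bPxRfiCYEXAMPLEKEY",
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the merge, at which point everything on the ingestion branch becomes visible on `main` together.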

S3 Connector

lakeFS exposes an S3 Gateway that enables applications to communicate with lakeFS in the same way they would with Amazon S3. You can use Airbyte’s S3 Destination for uploading the data to lakeFS.
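
As an illustration of how S3-style addresses map onto lakeFS through the gateway, the sketch below splits a URI into its parts, assuming the bucket name is the repository and the first component of the object key is the branch. The helper `parse_gateway_uri` and the `LakeFSLocation` type are hypothetical names introduced for this example only.

```python
from typing import NamedTuple


class LakeFSLocation(NamedTuple):
    repository: str
    branch: str
    path: str


def parse_gateway_uri(uri: str) -> LakeFSLocation:
    """Split an S3-style URI addressed to the lakeFS gateway into its parts."""
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError(f"expected an s3:// URI, got {uri!r}")
    # The bucket is the lakeFS repository; the rest is the object key.
    bucket, _, key = uri[len(prefix):].partition("/")
    # The first key component is the branch; the remainder is the path under it.
    branch, _, path = key.partition("/")
    if not bucket or not branch:
        raise ValueError("URI must contain a repository and a branch")
    return LakeFSLocation(bucket, branch, path)


location = parse_gateway_uri("s3://example-repo/main/data/from/airbyte")
print(location)
```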

Configuring lakeFS using the connector

Set the following parameters when creating a new Destination of type S3:

| Name | Value | Example |
| --- | --- | --- |
| Endpoint | The lakeFS S3 gateway URL | https://cute-axolotol.lakefs-demo.io |
| S3 Bucket Name | The lakeFS repository where the data will be written | example-repo |
| S3 Bucket Path | The branch and the path where the data will be written | main/data/from/airbyte, where main is the branch name and data/from/airbyte is the path under the branch |
| S3 Bucket Region | Not applicable to lakeFS, use us-east-1 | us-east-1 |
| S3 Key ID | The lakeFS access key id used to authenticate to lakeFS | AKIAlakefs12345EXAMPLE |
| S3 Access Key | The lakeFS secret access key used to authenticate to lakeFS | abc/lakefs/1234567bPxRfiCYEXAMPLEKEY |
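
The same settings can be expressed as a configuration object, which may help when the destination is created programmatically rather than through the UI. This is a minimal sketch: the JSON key names (`s3_endpoint`, `s3_bucket_name`, and so on) are assumed to mirror the Airbyte S3 Destination specification and should be verified against your connector version, and `lakefs_destination_config` is a hypothetical helper. Other settings the connector may require, such as the output format, are omitted.

```python
import json


def lakefs_destination_config(endpoint, repository, branch, path, key_id, secret_key):
    """Assemble S3 Destination settings that point at a lakeFS repository.

    Assumption: the keys below match the Airbyte S3 Destination spec;
    check them against the connector version you run.
    """
    return {
        "s3_endpoint": endpoint,        # the lakeFS S3 gateway URL
        "s3_bucket_name": repository,   # the lakeFS repository
        "s3_bucket_path": f"{branch}/{path.strip('/')}",  # branch, then path under it
        "s3_bucket_region": "us-east-1",  # not applicable to lakeFS; placeholder value
        "access_key_id": key_id,
        "secret_access_key": secret_key,
    }


config = lakefs_destination_config(
    endpoint="https://cute-axolotol.lakefs-demo.io",
    repository="example-repo",
    branch="main",
    path="data/from/airbyte",
    key_id="AKIAlakefs12345EXAMPLE",
    secret_key="abc/lakefs/1234567bPxRfiCYEXAMPLEKEY",
)
print(json.dumps(config, indent=2))
```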

Note: The S3 Destination connector supports custom S3 endpoints starting with Airbyte version v0.26.0-alpha, released on June 17, 2021.

The UI configuration will look like:

[Image: S3 Destination Connector Configuration]