
On-Prem deployment

⏰ Expected deployment time: 25 min

Prerequisites

To use lakeFS, you need access to an S3-compatible object store such as MinIO.

For more information on how to set up MinIO, see the official deployment guide.

Setting up a database

lakeFS requires a PostgreSQL database to synchronize actions on your repositories. This section assumes that you already have a PostgreSQL database, version 11.0 or later, that is reachable from the lakeFS server.
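
The [DATABASE_CONNECTION_STRING] placeholder used throughout this page is typically a PostgreSQL connection URI. As a rough sketch, the snippet below assembles one from its parts. Every value in it (user, password, host, port, and database name) is a hypothetical placeholder, not something lakeFS prescribes.

```shell
# Hypothetical values; replace them with your own database details.
PG_USER="lakefs"
PG_PASSWORD="change-me"
PG_HOST="db.example.internal"
PG_PORT="5432"
PG_DB="lakefs"

# Assemble the URI to use wherever [DATABASE_CONNECTION_STRING] appears.
DATABASE_CONNECTION_STRING="postgres://${PG_USER}:${PG_PASSWORD}@${PG_HOST}:${PG_PORT}/${PG_DB}"
echo "${DATABASE_CONNECTION_STRING}"

# Optional connectivity check (needs the psql client and a reachable server):
# psql "${DATABASE_CONNECTION_STRING}" -c "SELECT version();"
```

If the password contains characters that are special in a URI, such as @ or /, percent-encode them before placing them in the string.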

Setting up a lakeFS Server

Connect to your host using SSH:

  1. Create a config.yaml on your VM, with the following parameters:

    ---
    database:
      type: "postgres"
      postgres:
        connection_string: "[DATABASE_CONNECTION_STRING]"
      
    auth:
      encrypt:
        # replace this with a randomly-generated string. Make sure to keep it safe!
        secret_key: "[ENCRYPTION_SECRET_KEY]"
       
    blockstore:
      type: s3
      s3:
        force_path_style: true
        endpoint: http://<minio_endpoint>
        discover_bucket_region: false
        credentials:
          access_key_id: <minio_access_key>
          secret_access_key: <minio_secret_key>
    

    ⚠️ Note that the lakeFS blockstore type is set to s3. This configuration works with S3-compatible storage engines such as MinIO.

  2. Download the lakeFS binary to the server.
  3. Run the lakefs binary:

    lakefs --config config.yaml run
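
The config.yaml in step 1 asks for a randomly generated secret_key. One way to produce such a value on a Linux host is sketched below. The 32-byte length and the hexadecimal encoding are assumptions of this example, not a stated lakeFS requirement.

```shell
# Read 32 random bytes and print them as 64 hexadecimal characters.
# od emits space-separated hex bytes; tr strips the spaces and newlines.
ENCRYPTION_SECRET_KEY="$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')"
echo "${ENCRYPTION_SECRET_KEY}"
```

Store the generated value somewhere safe and use the same value on every lakeFS instance that shares the database.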
    

Note: It’s preferable to run the binary as a service using systemd or your operating system’s facilities.
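
As an illustration of the systemd option, the sketch below writes a minimal unit file to the current directory. The binary path /usr/local/bin/lakefs, the config path /etc/lakefs/config.yaml, and the lakefs service user are all assumptions; adjust them to match your host.

```shell
# Write a minimal systemd unit file for lakeFS into the current directory.
# The paths and the "lakefs" user below are assumptions for this example.
cat > lakefs.service <<'EOF'
[Unit]
Description=lakeFS server
After=network-online.target
Wants=network-online.target

[Service]
User=lakefs
ExecStart=/usr/local/bin/lakefs --config /etc/lakefs/config.yaml run
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

# Install and start the service (requires root):
# sudo cp lakefs.service /etc/systemd/system/lakefs.service
# sudo systemctl daemon-reload
# sudo systemctl enable --now lakefs
```

With Restart=on-failure, systemd restarts the server if it exits with an error, and enable --now also starts it on boot.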

To support container-based environments, you can configure lakeFS using environment variables. Here is a docker run command to demonstrate starting lakeFS using Docker:

docker run \
  --name lakefs \
  -p 8000:8000 \
  -e LAKEFS_DATABASE_TYPE="postgres" \
  -e LAKEFS_DATABASE_POSTGRES_CONNECTION_STRING="[DATABASE_CONNECTION_STRING]" \
  -e LAKEFS_AUTH_ENCRYPT_SECRET_KEY="[ENCRYPTION_SECRET_KEY]" \
  -e LAKEFS_BLOCKSTORE_TYPE="s3" \
  -e LAKEFS_BLOCKSTORE_S3_FORCE_PATH_STYLE="true" \
  -e LAKEFS_BLOCKSTORE_S3_ENDPOINT="http://<minio_endpoint>" \
  -e LAKEFS_BLOCKSTORE_S3_DISCOVER_BUCKET_REGION="false" \
  -e LAKEFS_BLOCKSTORE_S3_CREDENTIALS_ACCESS_KEY_ID="<minio_access_key>" \
  -e LAKEFS_BLOCKSTORE_S3_CREDENTIALS_SECRET_ACCESS_KEY="<minio_secret_key>" \
  treeverse/lakefs:latest run

⚠️ Note that the lakeFS blockstore type is set to s3. This configuration works with S3-compatible storage engines such as MinIO.

See the reference for a complete list of environment variables.
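
Judging by the docker run command above, each environment variable mirrors a config.yaml key: the dotted key path is upper-cased, dots become underscores, and a LAKEFS_ prefix is added. The helper below sketches that mapping. The function name config_key_to_env is hypothetical and not part of lakeFS.

```shell
# Convert a dotted config.yaml key path into the matching environment
# variable name, following the pattern in the docker run example.
config_key_to_env() {
  key="$(printf '%s' "$1" | tr '.' '_' | tr '[:lower:]' '[:upper:]')"
  printf 'LAKEFS_%s\n' "${key}"
}

# Prints LAKEFS_BLOCKSTORE_S3_ENDPOINT
config_key_to_env "blockstore.s3.endpoint"
```

Treat the reference list as authoritative; this helper only reproduces the naming pattern visible in the examples on this page.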

You can install lakeFS on Kubernetes using a Helm chart.

To install lakeFS with Helm:

  1. Copy the Helm values file for S3-compatible storage (MinIO in this example):

    secrets:
        # replace this with the connection string of the database you created in a previous step:
        databaseConnectionString: [DATABASE_CONNECTION_STRING]
        # replace this with a randomly-generated string
        authEncryptSecretKey: [ENCRYPTION_SECRET_KEY]
    lakefsConfig: |
        blockstore:
          type: s3
          s3:
            force_path_style: true
            endpoint: http://<minio_endpoint>
            discover_bucket_region: false
            credentials:
              access_key_id: <minio_access_key>
              secret_access_key: <minio_secret_key>
    

    ⚠️ Note that the lakeFS blockstore type is set to s3. This configuration works with S3-compatible storage engines such as MinIO.

  2. Fill in the missing values and save the file as conf-values.yaml. For more configuration options, see our Helm chart README.

    The lakefsConfig parameter holds the lakeFS configuration (see the configuration reference), minus any sensitive values. Sensitive values such as databaseConnectionString are passed through separate parameters, and the chart injects them into Kubernetes secrets.

  3. In the directory where you created conf-values.yaml, run the following commands:

    # Add the lakeFS repository
    helm repo add lakefs https://charts.lakefs.io
    # Deploy lakeFS
    helm install my-lakefs lakefs/lakefs -f conf-values.yaml
    

    my-lakefs is the Helm Release name.

    Load balancing

    To direct requests to the lakeFS servers through a load balancer, you can use the LoadBalancer Service type or a Kubernetes Ingress. By default, lakeFS listens on port 8000 and exposes a /_health endpoint that you can use for health checks.
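
Before wiring the /_health endpoint into a load balancer, it can help to probe it by hand. The sketch below assumes curl is installed and that lakeFS is reachable at localhost on the default port 8000. The host value and the function name check_lakefs_health are placeholders for this example.

```shell
# Hypothetical address; replace with your lakeFS host and port.
LAKEFS_HOST="localhost"
LAKEFS_PORT="8000"
HEALTH_URL="http://${LAKEFS_HOST}:${LAKEFS_PORT}/_health"

# Succeeds (exit code 0) only when the endpoint answers with a non-error
# HTTP status within 5 seconds.
check_lakefs_health() {
  curl --silent --fail --output /dev/null --max-time 5 "$1"
}

if check_lakefs_health "${HEALTH_URL}"; then
  echo "lakeFS is healthy"
else
  echo "lakeFS is not reachable at ${HEALTH_URL}"
fi
```

A load balancer health check would point at the same path and port.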

    💡 By default, the NGINX Ingress Controller limits the client body size to 1 MiB. Some clients upload objects in larger chunks, for example a multipart upload to lakeFS through the S3-compatible Gateway or a simple PUT request through the OpenAPI Server. Check out the NGINX documentation for increasing the limit, or an example of NGINX configuration with MinIO.
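
With the NGINX Ingress Controller, the body size limit is commonly raised through the nginx.ingress.kubernetes.io/proxy-body-size annotation on the Ingress resource. The sketch below is one way to apply it. The Ingress name my-lakefs and the 64m limit are assumptions; use the name of the Ingress you actually created and a limit that fits your clients' chunk size.

```shell
# Hypothetical values: the Ingress name depends on how you exposed lakeFS,
# and the size limit depends on the upload chunk size of your clients.
INGRESS_NAME="my-lakefs"
MAX_BODY_SIZE="64m"
ANNOTATION="nginx.ingress.kubernetes.io/proxy-body-size=${MAX_BODY_SIZE}"

# Apply the annotation only when kubectl is available on this machine.
if command -v kubectl >/dev/null 2>&1; then
  kubectl annotate ingress "${INGRESS_NAME}" "${ANNOTATION}" --overwrite \
    || echo "kubectl annotate failed; check the Ingress name and cluster access"
else
  echo "kubectl not found; would set ${ANNOTATION} on ingress/${INGRESS_NAME}"
fi
```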

Create the admin user

When you first open the lakeFS UI, you will be asked to create an initial admin user.

  1. Open http://<lakefs-host>/ in your browser. If you haven’t set up a load balancer, this will likely be http://<instance ip address>:8000/.
  2. On first use, you’ll be redirected to the setup page:

    Create user

  3. Follow the steps to create an initial administrator user. Save the credentials you receive somewhere safe; you won’t be able to see them again!

    Setup Done

  4. Follow the link and go to the login screen. Use the credentials from the previous step to log in.

Create your first repository

  1. Use the credentials from the previous step to log in.
  2. Click Create Repository and choose Blank Repository.

    Create Repo

  3. Under Storage Namespace, enter a path to your desired location on the object store. This is where data written to this repository will be stored.
  4. Click Create Repository.
  5. You should now have a configured repository, ready to use!

    Repo Created
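
The same repository can also be created from the command line with lakectl. The sketch below assumes that lakectl is installed and already configured with the admin credentials, and that the bucket exists on the object store. The repository name example-repo and the bucket example-bucket are hypothetical.

```shell
# Hypothetical names; the bucket is assumed to exist already in MinIO.
REPO_NAME="example-repo"
STORAGE_NAMESPACE="s3://example-bucket/repos/${REPO_NAME}"

# Create the repository only when lakectl is available on this machine.
if command -v lakectl >/dev/null 2>&1; then
  lakectl repo create "lakefs://${REPO_NAME}" "${STORAGE_NAMESPACE}" \
    || echo "lakectl repo create failed; check credentials and the bucket"
else
  echo "lakectl not found; would create lakefs://${REPO_NAME} at ${STORAGE_NAMESPACE}"
fi
```

The storage namespace uses the s3:// scheme here because the blockstore type is s3, even though the backing store is MinIO.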

Congratulations! Your environment is now ready 🤩