You are now ready to install the lakeFS server. Following are some options for doing that.
lakeFS can be easily installed on Kubernetes using a Helm chart. To install lakeFS with Helm:

Create a `conf-values.yaml` file, replacing values as described in the comments:
```yaml
secrets:
  # replace this with the connection string of the database you created in a previous step:
  databaseConnectionString: postgres://postgres:myPassword@my-lakefs-db.rds.amazonaws.com:5432/lakefs?search_path=lakefs
  # replace this with a randomly-generated string
  authEncryptSecretKey: <some random string>
lakefsConfig: |
  blockstore:
    type: s3
    s3:
      region: us-east-1
  gateways:
    s3:
      # replace this with the host you will use for the lakeFS S3-compatible endpoint:
      domain_name: s3.lakefs.example.com
```
See below for more configuration options. The `lakefsConfig` parameter is the lakeFS configuration documented here, but without sensitive information. Sensitive information like `databaseConnectionString` is provided through separate parameters, and the chart will inject it into Kubernetes secrets.
In the directory where you created `conf-values.yaml`, run the following commands:
```shell
# Add the lakeFS repository
helm repo add lakefs https://charts.lakefs.io
# Deploy lakeFS
helm install example-lakefs lakefs/lakefs -f conf-values.yaml
```
`example-lakefs` is the Helm release name.
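Once the release is deployed, you can check that the lakeFS pods are running and reach the server locally. The label selector and service name below are assumptions based on common Helm chart conventions; adjust them to match your release:

```shell
# Watch the lakeFS pods come up (release name from the install command above)
kubectl get pods -l app.kubernetes.io/instance=example-lakefs

# Inspect the release status
helm status example-lakefs

# Forward the service locally to reach lakeFS on http://localhost:8000
kubectl port-forward svc/example-lakefs 8000:80
```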
You should give your Kubernetes nodes access to all S3 buckets you intend to use with lakeFS. If you can't provide such access, lakeFS can be configured to authenticate using an AWS key pair (as part of the `lakefsConfig` YAML below).
| Parameter | Description | Default value |
|-----------|-------------|---------------|
| `secrets.databaseConnectionString` | PostgreSQL connection string to be used by lakeFS | |
| `secrets.authEncryptSecretKey` | A random (cryptographically safe) generated string that is used for encryption and HMAC signing | |
| `lakefsConfig` | lakeFS config YAML stringified, as shown above. See reference for available configurations. | |
| `replicaCount` | Number of lakeFS pods | |
| `resources` | Pod resource requests & limits | |
| `service.type` | Kubernetes service type | `ClusterIP` |
| `service.port` | Kubernetes service external port | `80` |
| `extraEnvVarsSecret` | Name of a Kubernetes secret containing extra environment variables | |
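Any of these parameters can also be overridden at install time with Helm's standard `--set` flag, without editing `conf-values.yaml` (the parameter names are those from the table above):

```shell
# Example: expose lakeFS through a LoadBalancer instead of the default ClusterIP
helm upgrade --install example-lakefs lakefs/lakefs \
  -f conf-values.yaml \
  --set service.type=LoadBalancer
```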
To deploy using Docker, create a YAML configuration file. Here is a minimal example; see the reference for the full list of configurations.
```yaml
database:
  connection_string: "postgres://user:pass@<RDS_ENDPOINT>:5432/postgres"
auth:
  encrypt:
    secret_key: "<RANDOM_GENERATED_STRING>"
blockstore:
  type: s3
gateways:
  s3:
    domain_name: s3.lakefs.example.com
```
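For `secret_key`, one way to generate a cryptographically safe random string is with `openssl` (assuming it is installed; any equivalent generator works):

```shell
# 32 random bytes, hex-encoded: a 64-character string suitable for secret_key
openssl rand -hex 32
```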
Depending on your runtime environment, running lakeFS using Docker would look like this:
```shell
docker run \
  --name lakefs \
  -p 8000:8000 \
  -v <PATH_TO_CONFIG_FILE>:/home/lakefs/.lakefs.yaml \
  treeverse/lakefs:latest run
```
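After the container starts, you can check that the server came up. lakeFS exposes a health endpoint; the path shown here is an assumption to verify against the reference for your version:

```shell
# Tail the server logs
docker logs --tail 20 lakefs

# The server should answer on the published port
curl -i http://localhost:8000/_health
```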
Some environments make it harder to use a configuration file; in those cases, lakeFS is best configured using environment variables.
Here is an example of running lakeFS using environment variables. See the reference for the full list of configurations.
```shell
docker run \
  --name lakefs \
  -p 8000:8000 \
  -e LAKEFS_DATABASE_CONNECTION_STRING="postgres://user:pass@<RDS ENDPOINT>..." \
  -e LAKEFS_AUTH_ENCRYPT_SECRET_KEY="<RANDOM_GENERATED_STRING>" \
  -e LAKEFS_BLOCKSTORE_TYPE="s3" \
  -e LAKEFS_GATEWAYS_S3_DOMAIN_NAME="s3.lakefs.example.com" \
  treeverse/lakefs:latest run
```
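If the list of `-e` flags grows unwieldy, the same variables can be kept in an env file and passed with Docker's standard `--env-file` flag (the file name here is arbitrary):

```shell
# Store the configuration in a file, one VAR=value per line
cat > lakefs.env <<'EOF'
LAKEFS_DATABASE_CONNECTION_STRING=postgres://user:pass@<RDS ENDPOINT>...
LAKEFS_AUTH_ENCRYPT_SECRET_KEY=<RANDOM_GENERATED_STRING>
LAKEFS_BLOCKSTORE_TYPE=s3
LAKEFS_GATEWAYS_S3_DOMAIN_NAME=s3.lakefs.example.com
EOF

# Pass the whole file to the container
docker run \
  --name lakefs \
  -p 8000:8000 \
  --env-file lakefs.env \
  treeverse/lakefs:latest run
```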
Alternatively, you can run lakeFS directly on an EC2 instance:
- Download the binary for your operating system
- `lakefs` is a single binary; you can run it directly, but it is preferable to run it as a service using systemd or your operating system's facilities.
```shell
lakefs --config <PATH_TO_CONFIG_FILE> run
```
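To run it under systemd, a minimal unit file might look like the following. The unit name, binary path, config path, and user are assumptions; adapt them to your setup:

```shell
# Write a minimal unit file (paths and user are placeholders)
sudo tee /etc/systemd/system/lakefs.service <<'EOF'
[Unit]
Description=lakeFS server
After=network-online.target

[Service]
User=lakefs
ExecStart=/usr/local/bin/lakefs --config /etc/lakefs/config.yaml run
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

# Load and start the service
sudo systemctl daemon-reload
sudo systemctl enable --now lakefs
```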