Install¶
Info
For production deployments of lakeFS Enterprise, follow this guide.
lakeFS Enterprise Architecture¶
We recommend reviewing the lakeFS Enterprise architecture to understand the components you will be deploying.
Note
The Fluffy service is deprecated in chart version 1.5.0 and later. For more information, see the Upgrade Guide.
Deploy lakeFS Enterprise on Kubernetes¶
This guide uses the lakeFS Helm Chart to deploy a fully functional lakeFS Enterprise installation.
The guide includes example configurations. Follow the steps below and adjust the examples according to:
- The platform you run on, among the platforms supported by lakeFS
- Type of KV store you use
- Your SSO IdP and protocol
Prerequisites¶
- You have a Kubernetes cluster running in one of the platforms supported by lakeFS.
- Helm is installed
- Access to download treeverse/lakefs-enterprise from Docker Hub. Contact us to gain access to lakeFS Enterprise features.
- A KV Database. The available options depend on your deployment platform.
- A method to route traffic into lakeFS from outside of the cluster (via Ingress or Service).
Optional¶
Access to configure your SSO IdP supported by lakeFS Enterprise.
Info
You can install lakeFS Enterprise without configuring SSO and still benefit from all other lakeFS Enterprise features.
Add the lakeFS Helm Chart¶
- Add the lakeFS Helm repository with helm repo add lakefs https://charts.lakefs.io
- The chart contains a values.yaml file you can customize to suit your needs as you follow this guide. Use helm show values lakefs/lakefs to see the default values.
- Configure image.privateRegistry.secretToken with the Docker Hub token you received.
Authentication Configuration¶
Authentication in lakeFS Enterprise is handled directly by the lakeFS Enterprise service. This section explains the configurations required for setting up SSO.
See SSO for lakeFS Enterprise for the supported identity providers and protocols.
The examples below provide a configuration for each of the supported SSO protocols. Replace the IdP-specific details with the values from your own IdP.
The following values file will run lakeFS Enterprise with OIDC integration.
Tip
The full OIDC configuration is explained here.
enterprise:
  enabled: true
  auth:
    oidc:
      enabled: true
      # secret given by the OIDC provider (e.g. Auth0, Okta)
      client_secret: <oidc-client-secret>
image:
  privateRegistry:
    enabled: true
    secretToken: <dockerhub-token>
lakefsConfig: |
  logging:
    level: "INFO"
  blockstore:
    type: s3
  auth:
    logout_redirect_url: https://oidc-provider-url.com/logout/example
    oidc:
      # the claim provided by the OIDC provider (e.g. Okta) after successful authentication that will be used as the username
      friendly_name_claim_name: "<some-oidc-provider-claim-name>"
      default_initial_groups: ["Developers", "Admins"]
      # if true then the value of friendly_name_claim_name will be refreshed during each login to maintain the latest value
      # and the claim value (i.e. the user name) will be stored in the lakeFS database
      persist_friendly_name: true
    providers:
      oidc:
        post_login_redirect_url: /
        url: https://oidc-provider-url.com/
        client_id: <oidc-client-id>
        callback_base_url: https://<lakefs.acme.com>
        # the claim name that represents the client identifier in the OIDC provider (e.g. Okta)
        logout_client_id_query_parameter: client_id
        # the query parameters that will be used to redirect the user to the OIDC provider after logout
        logout_endpoint_query_parameters:
          - returnTo
          - https://<lakefs.acme.com>/oidc/login
ingress:
  enabled: true
  ingressClassName: <class-name>
  hosts:
    - host: <lakefs.acme.com>
      paths:
        - /
The following values file will run lakeFS Enterprise with SAML using Azure AD as the IdP.
You can use this example configuration to configure Active Directory Federation Services (AD FS) with SAML.
Tip
The full SAML configuration is explained here.
Azure App Configuration¶
- Create an Enterprise Application with SAML toolkit - see Azure quickstart
- Add users: App > Users and groups: Attach users and roles from the existing AD users list. Only attached users will be able to log in to lakeFS.
- Configure SAML: App > Single sign-on > SAML:
- Entity ID: Add 2 IDs, lakefs-url + lakefs-url/saml/metadata (e.g. https://lakefs.acme.com and https://lakefs.acme.com/saml/metadata)
- Reply URL: lakefs-url/saml (e.g. https://lakefs.acme.com/saml)
- Sign on URL: lakefs-url/sso/login-saml (e.g. https://lakefs.acme.com/sso/login-saml)
- Relay State (Optional, controls where to redirect after login): /
SAML Configuration¶
- Configure a SAML application in your IdP (e.g. Azure AD) and fill in the required parameters in the values.yaml below.
- To generate a certificate key pair, use:
openssl req -x509 -newkey rsa:2048 -keyout myservice.key -out myservice.cert -days 365 -nodes -subj "/CN=lakefs.acme.com"
enterprise:
  enabled: true
  auth:
    saml:
      enabled: true
      createCertificateSecret: true # NEW: Auto-creates secret
      certificate:
        # certificate and private key for the SAML service provider to sign outgoing SAML requests
        samlRsaPublicCert: | # RENAMED: from saml_rsa_public_cert
          -----BEGIN CERTIFICATE-----
          ...
          -----END CERTIFICATE-----
        samlRsaPrivateKey: | # RENAMED: from saml_rsa_private_key
          -----BEGIN PRIVATE KEY-----
          ...
          -----END PRIVATE KEY-----
secrets:
  authEncryptSecretKey: "some random secret string"
image:
  privateRegistry:
    enabled: true
    secretToken: <dockerhub-token>
lakefsConfig: |
  logging:
    level: "DEBUG"
  blockstore:
    type: local
  auth:
    logout_redirect_url: https://<lakefs.acme.com>
    cookie_auth_verification:
      auth_source: saml
      # claim name to use for friendly name in lakeFS UI
      friendly_name_claim_name: displayName
      external_user_id_claim_name: samName
      default_initial_groups:
        - "Developers"
    providers:
      saml:
        post_login_redirect_url: https://<lakefs.acme.com>
        sp_root_url: https://<lakefs.acme.com>
        sp_sign_request: false
        sp_signature_method: "http://www.w3.org/2001/04/xmldsig-more#rsa-sha256"
        idp_metadata_url: "https://<adfs-auth.company.com>/federationmetadata/2007-06/federationmetadata.xml"
        # the default id format urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified
        # idp_authn_name_id_format: "urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified"
        idp_skip_verify_tls_cert: true
ingress:
  enabled: true
  ingressClassName: <class-name>
  annotations: {}
  hosts:
    - host: <lakefs.acme.com>
      paths:
        - /
The following values file will run lakeFS Enterprise with LDAP.
Tip
The full LDAP configuration is explained here.
enterprise:
  enabled: true
  auth:
    ldap:
      enabled: true
      bindPassword: <ldap bind password>
image:
  privateRegistry:
    enabled: true
    secretToken: <dockerhub-token>
lakefsConfig: |
  logging:
    level: "INFO"
  blockstore:
    type: local
  auth:
    ui_config:
      login_url: /auth/login
      logout_url: /logout
      login_cookie_names:
        - internal_auth_session
    providers:
      ldap:
        server_endpoint: 'ldaps://ldap.company.com:636'
        bind_dn: uid=<bind-user-name>,ou=Users,o=<org-id>,dc=<company>,dc=com
        username_attribute: uid
        user_base_dn: ou=Users,o=<org-id>,dc=<company>,dc=com
        user_filter: (objectClass=inetOrgPerson)
        connection_timeout_seconds: 15
        request_timeout_seconds: 17
        # RBAC group for first time users
        default_user_group: "Developers"
ingress:
  enabled: true
  ingressClassName: <class-name>
  hosts:
    - host: <lakefs.acme.com>
      paths:
        - /
See the additional examples we provide on GitHub for each authentication method (oidc, saml, ldap, rbac, external AWS IAM).
Database Configuration¶
In this section, you will learn how to configure lakeFS Enterprise to work with the KV Database you created (see prerequisites).
Notes:
- By default, the lakeFS Helm chart comes with useDevPostgres: false. You can change it to useDevPostgres: true for dev use. This setup is useful when you want to run a setup with multiple replicas or want to prevent data loss between container restarts.
- See lakeFS database configuration.
The database configuration can be set directly via lakefsConfig, via K8S Secret Kind, or via environment variables.
This example uses Postgres as the KV Database, configured via environment variables.
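A minimal sketch of such a values.yaml fragment is shown here, under stated assumptions: the Secret name postgres-credentials and its key connection_string are hypothetical placeholders for a Kubernetes Secret you create yourself, and the environment variable name follows the lakeFS convention of mapping the database.postgres.connection_string setting to LAKEFS_DATABASE_POSTGRES_CONNECTION_STRING. Verify both against the chart and configuration reference for your version.

```yaml
# Sketch only: adjust names to your environment.
useDevPostgres: false
lakefsConfig: |
  database:
    type: postgres
extraEnvVars:
  # The connection string is read from a Kubernetes Secret rather than
  # being written into values.yaml.
  - name: LAKEFS_DATABASE_POSTGRES_CONNECTION_STRING
    valueFrom:
      secretKeyRef:
        name: postgres-credentials   # hypothetical Secret name
        key: connection_string       # hypothetical key inside the Secret
```

Keeping the connection string in a Secret avoids storing database credentials in the values file or in version control.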
This example uses DynamoDB as the KV Database.
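As a hedged sketch, a DynamoDB setup might look like the fragment below. The table name and region are placeholders, and the example assumes the lakeFS pods obtain AWS credentials from their environment (for example, an IAM role attached to the pod's service account) instead of static keys in the configuration.

```yaml
# Sketch only: <table-name> and <aws-region> are placeholders.
useDevPostgres: false
lakefsConfig: |
  database:
    type: dynamodb
    dynamodb:
      table_name: <table-name>
      aws_region: <aws-region>
```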
Install the lakeFS Helm Chart¶
After populating your values.yaml file with the relevant configuration, run helm install lakefs lakefs/lakefs -f values.yaml in the desired K8S namespace.
Access the lakeFS UI¶
In your browser, go to the Ingress host to access the lakeFS UI.
Log Collection¶
The recommended practice for collecting logs is to send them to the container's standard output (the default configuration) and let an external service collect them and forward them to a sink. An example of such a log collector is Fluent Bit, which can collect container logs, format them, and ship them to a target like S3.
There are 2 kinds of logs:
- Regular logs, like an API error or some event description used for debugging
- Audit logs that describe user actions (e.g. create branch)
The distinction between regular logs and audit logs is in the boolean field log_audit.
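If your log collector filters on structured fields, emitting JSON makes the log_audit field easier to match. The fragment below is a sketch rather than a recommended setting: it assumes the logging.format and logging.audit_log_level keys behave as described in the lakeFS configuration reference, so confirm them for your lakeFS version before relying on them.

```yaml
# Sketch only: merge these keys into your existing lakefsConfig.
lakefsConfig: |
  logging:
    # JSON output lets a collector select records by the log_audit field
    format: json
    level: "INFO"
    # level at which audit records are written (assumed key, see lead-in)
    audit_log_level: "INFO"
```

With JSON output, a collector can route records where log_audit is true to a separate sink from regular debugging logs.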
Advanced Deployment Configurations¶
The following example demonstrates a scenario where you need to configure an HTTP proxy for lakeFS, set up TLS certificates for the Ingress, and extend the K8S manifests without forking the Helm chart.
ingress:
  enabled: true
  ingressClassName: <class-name>
  # configure TLS certificate for the Ingress
  tls:
    - hosts:
        - lakefs.acme.com
      secretName: somesecret
  hosts:
    - host: lakefs.acme.com
      paths:
        - /
# configure proxy for lakeFS
extraEnvVars:
  - name: HTTP_PROXY
    value: 'http://my.company.proxy:8081'
  - name: HTTPS_PROXY
    value: 'http://my.company.proxy:8081'
# advanced: extra manifests to extend the K8S resources
extraManifests:
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      name: '{{ .Values.lakefs.name }}-extra-config'
    data:
      config.yaml: my-data