Configuration Reference
Table of contents
- Reference
- Using Environment Variables
- Example: Local Development
- Example: AWS Deployment
- Example: Google Storage
- Example: MinIO
Configuring lakeFS is done using a YAML configuration file. This reference uses `.` to denote the nesting of values.
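For example, a key written here as `logging.format` refers to a `format` key nested under a top-level `logging` key:

```yaml
logging:
  format: text
```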
Reference
- `logging.format` (one of ["json", "text"] : "text") - Format to output log messages in
- `logging.level` (one of ["DEBUG", "INFO", "WARN", "ERROR", "NONE"] : "DEBUG") - Logging level to output
- `logging.output` (string : "-") - Path name to write logs to. "-" means standard output
- `database.connection_string` (string : "postgres://localhost:5432/postgres?sslmode=disable") - PostgreSQL connection string to use
- `database.max_open_connections` (int : 25) - Maximum number of open connections to the database
- `database.max_idle_connections` (int : 25) - Maximum number of connections in the idle connection pool
- `database.connection_max_lifetime` (duration : 5m) - Maximum amount of time a connection may be reused
- `listen_address` (string : "0.0.0.0:8000") - A `<host>:<port>` structured string representing the address to listen on
- `auth.cache.enabled` (bool : true) - Whether to cache access credentials and user policies in memory. Can greatly improve throughput when enabled.
- `auth.cache.size` (int : 1024) - How many items to store in the auth cache. Systems with a very high user count should use a larger value at the expense of ~1KB of memory per cached user.
- `auth.cache.ttl` (time duration : "20s") - How long to store an item in the auth cache. A higher value reduces load on the database, but changes will take longer to take effect for cached users.
- `auth.cache.jitter` (time duration : "3s") - A random amount of time between 0 and this value is added to each item's TTL. This is done to avoid a large bulk of keys expiring at once and overwhelming the database.
- `auth.encrypt.secret_key` (string : required) - A random (cryptographically safe) generated string that is used for encryption and HMAC signing. Note: it is best to keep this somewhere safe, such as KMS or HashiCorp Vault, and provide it to the system at run time
- `blockstore.type` (one of ["local", "s3", "gs", "mem"] : "mem") - Block adapter to use. This controls where the underlying data will be stored
- `blockstore.local.path` (string : "~/lakefs/data") - When using the local block adapter, the directory to store files in
- `blockstore.gs.credentials_file` (string : ) - If specified, will be used as the path to a JSON file containing your Google service account key
- `blockstore.gs.credentials_json` (string : ) - If specified, will be used as a JSON string containing your Google service account key (when credentials_file is not set)
- `blockstore.s3.region` (string : "us-east-1") - When using the S3 block adapter, the AWS region to use
- `blockstore.s3.profile` (string : ) - If specified, will be used as a named credentials profile
- `blockstore.s3.credentials_file` (string : ) - If specified, will be used as a credentials file
- `blockstore.s3.credentials.access_key_id` (string : ) - If specified, will be used as part of a static set of credentials
- `blockstore.s3.credentials.access_secret_key` (string : ) - If specified, will be used as part of a static set of credentials
- `blockstore.s3.credentials.session_token` (string : ) - If specified, will be used as a static session token
- `blockstore.s3.endpoint` (string : ) - If specified, a custom endpoint for the AWS S3 API (https://s3_compatible_service_endpoint:port)
- `blockstore.s3.force_path_style` (boolean : false) - When true, use path-style S3 URLs (https://<host>/<bucket> instead of https://<bucket>.<host>)
- `blockstore.s3.streaming_chunk_size` (int : 1048576) - Object chunk size to buffer before streaming to the blockstore (use a lower value for less reliable networks). Minimum is 8192.
- `blockstore.s3.streaming_chunk_timeout` (time duration : "60s") - Per object chunk timeout for blockstore streaming operations (use a larger value for less reliable networks)
- `blockstore.s3.retention.role_arn` - ARN of the IAM role to use to perform AWS S3 Batch tagging operations. This role must be configured according to Granting permissions for Amazon S3 Batch Operations for "PUT object tagging", with these permissions:
  - `ListBucket` on all buckets used for storing repositories
  - `PutObjectTagging` and `PutObjectVersionTagging` on all buckets and prefixes used for storing repositories
  - `GetObject` under `blockstore.s3.retention.manifest_base_url`
  - `PutObject` under `blockstore.s3.retention.report_s3_prefix_url`
- `blockstore.s3.retention.manifest_base_url` - Base S3 URL to use for uploading batch tagging manifest files. Must be readable by `blockstore.s3.retention.role_arn` and writable by the configured AWS role running lakeFS
- `blockstore.s3.retention.report_s3_prefix_url` - Base S3 URL to use for writing batch tagging completion reports. Must be writable by `blockstore.s3.retention.role_arn` (a combined sketch of the retention keys appears after this list)
(string : "s3.local.lakefs.io")
- a FQDN representing the S3 endpoint used by S3 clients to call this server (*.s3.local.lakefs.io
always resolves to 127.0.0.1, useful for local developmentgateways.s3.region
(string : "us-east-1")
- AWS region we’re pretending to be. Should match the region configuration used in AWS SDK clientsgateways.s3.fallback_url
(string)
- If specified, requests with a non-existing repository will be forwarded to this url. This can be useful for using lakeFS side-by-side with S3, with the URL pointing at an S3Proxy instance.stats.enabled
(boolean : true)
- Whether or not to periodically collect anonymous usage statistics
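As referenced above, here is a minimal sketch combining the `blockstore.s3.retention.*` keys into one nested block. The role ARN and S3 URLs are hypothetical placeholders, not values from this reference:

```yaml
blockstore:
  s3:
    retention:
      # Hypothetical ARN: an IAM role granted the S3 Batch tagging permissions listed above
      role_arn: "arn:aws:iam::123456789012:role/lakefs-retention"
      # Hypothetical S3 prefixes for manifest files and completion reports
      manifest_base_url: "s3://example-bucket/retention/manifests/"
      report_s3_prefix_url: "s3://example-bucket/retention/reports/"
```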
Using Environment Variables
All configuration variables can be set or overridden using environment variables. To set an environment variable, prepend `LAKEFS_` to its name, convert it to upper case, and replace `.` with `_`.

For example, `logging.format` becomes `LAKEFS_LOGGING_FORMAT`, `blockstore.s3.region` becomes `LAKEFS_BLOCKSTORE_S3_REGION`, etc.
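As a sketch, a containerized deployment could supply its entire configuration this way. The docker-compose snippet below is illustrative only; the image name and values are assumptions, not part of this reference:

```yaml
# Hypothetical docker-compose service configuring lakeFS via environment variables
services:
  lakefs:
    image: treeverse/lakefs
    environment:
      LAKEFS_LOGGING_FORMAT: json                 # logging.format
      LAKEFS_BLOCKSTORE_S3_REGION: us-east-1      # blockstore.s3.region
      LAKEFS_AUTH_ENCRYPT_SECRET_KEY: "<random secret>"  # auth.encrypt.secret_key
```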
Example: Local Development
```yaml
---
listen_address: "0.0.0.0:8000"

database:
  connection_string: "postgres://localhost:5432/postgres?sslmode=disable"

logging:
  format: text
  level: DEBUG
  output: "-"

auth:
  encrypt:
    secret_key: "10a718b3f285d89c36e9864494cdd1507f3bc85b342df24736ea81f9a1134bcc09e90b6641"

blockstore:
  type: local
  local:
    path: "~/lakefs/dev/data"

gateways:
  s3:
    domain_name: s3.local.lakefs.io
    region: us-east-1
```
Example: AWS Deployment
```yaml
---
logging:
  format: json
  level: WARN
  output: "-"

database:
  connection_string: "postgres://user:pass@lakefs.rds.amazonaws.com:5432/postgres"

auth:
  encrypt:
    secret_key: "10a718b3f285d89c36e9864494cdd1507f3bc85b342df24736ea81f9a1134bcc"

blockstore:
  type: s3
  s3:
    region: us-east-1
    credentials_file: /secrets/aws/credentials
    profile: default

gateways:
  s3:
    domain_name: s3.my-company.com
    region: us-east-1
```
Example: Google Storage
```yaml
---
logging:
  format: json
  level: WARN
  output: "-"

database:
  connection_string: "postgres://user:pass@lakefs.rds.amazonaws.com:5432/postgres"

auth:
  encrypt:
    secret_key: "10a718b3f285d89c36e9864494cdd1507f3bc85b342df24736ea81f9a1134bcc"

blockstore:
  type: gs
  gs:
    credentials_file: /secrets/lakefs-service-account.json

gateways:
  s3:
    domain_name: s3.my-company.com
    region: us-east-1
```
Example: MinIO
```yaml
---
logging:
  format: json
  level: WARN
  output: "-"

database:
  connection_string: "postgres://user:pass@lakefs.rds.amazonaws.com:5432/postgres"

auth:
  encrypt:
    secret_key: "10a718b3f285d89c36e9864494cdd1507f3bc85b342df24736ea81f9a1134bcc"

blockstore:
  type: s3
  s3:
    region: us-east-1
    force_path_style: true
    endpoint: http://localhost:9000
    credentials:
      access_key_id: minioadmin
      access_secret_key: minioadmin

gateways:
  s3:
    domain_name: s3.my-company.com
    region: us-east-1
```