Configuring lakeFS to use S3 Virtual-Host addressing
Understanding virtual-host addressing
Some systems require S3 endpoints (such as lakeFS’s S3 Gateway) to support virtual-host style addressing.
lakeFS supports this, but requires some configuration in order to extract the bucket name (used as the lakeFS repository ID) from the host address.
For example:
GET http://foo.example.com/some/location
There are two ways to interpret the URL above:
- as a virtual-host URL where the endpoint URL is
example.com
, the bucket name isfoo
, and the path is/some/location
- as a path-based URL where the endpoint is
foo.example.com
, the bucket name issome
and the path islocation
By default, lakeFS reads URLs as path-based. To read the URL as a virtual-host request, lakeFS requires additional configuration which includes defining an explicit set of DNS records for the lakeFS S3 gateway.
Adding an explicit S3 domain name to the S3 Gateway configuration
The first step would be to tell the lakeFS installation which hostnames are used for the S3 Gateway. This should be a different DNS record from the one used for e.g. the UI or API.
Typically, if the lakeFS installation is served under lakefs.example.com
, a good choice would be s3.lakefs.example.com
.
This could be done using either an environment variable:
LAKEFS_GATEWAYS_S3_DOMAIN_NAME="s3.lakefs.example.com"
Or by adding the gateways.s3.domain_name
setting to the lakeFS config.yaml
file:
---
database:
connection_string: "..."
...
# This section defines an explict S3 gateway address that supports virtual-host addressing
gateways:
s3:
domain_name: s3.lakefs.example.com
For more information on how to configure lakeFS, check out the configuration reference
Setting up the appropriate DNS records
Once our lakeFS installation is configured with an explicit S3 gateway endpoint address, we need to define 2 DNS records and have them point at our lakeFS installation. This requires 2 CNAME records:
s3.lakefs.example.com
- CNAME tolakefs.example.com
. This would be used as the S3 endpoint when configuring clients and will serve as our bare domain.*.s3.lakefs.example.com
- Also a CNAME tolakefs.example.com
. This will resolve virtual-host requests such asexample-repo.s3.lakefs.example.com
that lakeFS would now know how to parse.
For more information on how to configure these, see the official documentation of your DNS provider.
On AWS, This could also be done using ALIAS records for a load balancer.