Configuring lakeFS to use S3 Virtual-Host addressing
Understanding virtual-host addressing
Some systems require S3 endpoints (such as lakeFS’ S3 Gateway) to support virtual-host style addressing.
lakeFS supports this, but requires some configuration in order to extract the bucket name (used as the lakeFS repository ID) from the host address.
For example:
GET http://foo.example.com/some/location
In this case, there’s no way for lakeFS to determine whether this is a virtual-host request where the endpoint URL is example.com, the bucket name is foo and the path is /some/location, or a path-based request where the endpoint is foo.example.com, the bucket name is some and the path is location.
Resolving this ambiguity requires an extra step: defining an explicit set of DNS records for the lakeFS S3 gateway.
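To make the ambiguity concrete, here is a minimal sketch of how a gateway could split a request into bucket and key once it knows its own domain name. It is an illustration only, not lakeFS's actual implementation: the `resolve_target` function and the `S3Target` type are hypothetical names, and keys are normalized to have no leading slash.

```python
from typing import NamedTuple, Optional


class S3Target(NamedTuple):
    bucket: Optional[str]  # lakeFS repository ID, if one could be determined
    key: str               # object path inside the bucket, without a leading slash
    style: str             # "virtual-host" or "path"


def resolve_target(host: str, path: str, s3_domain: str) -> S3Target:
    """Split a request into bucket and key, given the gateway's known domain name.

    Hypothetical helper for illustration; not part of lakeFS.
    """
    host = host.split(":", 1)[0].lower()  # drop an optional port
    suffix = "." + s3_domain.lower()
    if host.endswith(suffix) and len(host) > len(suffix):
        # Virtual-host style: everything before the known domain is the bucket.
        return S3Target(host[: -len(suffix)], path.lstrip("/"), "virtual-host")
    # Path style: the first path segment is the bucket.
    bucket, _, key = path.lstrip("/").partition("/")
    return S3Target(bucket or None, key, "path")


# The same request is parsed differently depending on the configured domain.
as_virtual_host = resolve_target("foo.example.com", "/some/location", "example.com")
as_path_style = resolve_target("foo.example.com", "/some/location", "foo.example.com")
```

Without the `s3_domain` argument the function would have no basis for choosing between the two branches, which is why the gateway's domain name has to be configured explicitly.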
Adding an explicit S3 domain name to the S3 Gateway configuration
The first step would be to tell the lakeFS installation which hostnames are used for the S3 Gateway. This should be a different DNS record from the one used for e.g. the UI or API.
Typically, if the lakeFS installation is served under lakefs.example.com, a good choice would be s3.lakefs.example.com.
This could be done using either an environment variable:
LAKEFS_GATEWAYS_S3_DOMAIN_NAME="s3.lakefs.example.com"
Or by adding the gateways.s3.domain_name setting to the lakeFS config.yaml file:
---
database:
  connection_string: "..."
...
# This section defines an explicit S3 gateway address that supports virtual-host addressing
gateways:
  s3:
    domain_name: s3.lakefs.example.com
For more information on how to configure lakeFS, check out the configuration reference.
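The environment variable and the YAML key shown above appear to follow a simple naming pattern: the dotted key is upper-cased, dots become underscores, and a LAKEFS_ prefix is added. The sketch below encodes that pattern as inferred from this one example; the `env_var_for` helper is hypothetical, and the configuration reference remains the authority for any other setting.

```python
def env_var_for(config_key: str, prefix: str = "LAKEFS") -> str:
    """Derive an environment variable name from a dotted config key.

    Assumption: the pattern is inferred from the single example in this page.
    """
    return f"{prefix}_{config_key.upper().replace('.', '_')}"


domain_name_var = env_var_for("gateways.s3.domain_name")
```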
Setting up the appropriate DNS records
Once our lakeFS installation is configured with an explicit S3 gateway endpoint address, we need to define two CNAME records that point at our lakeFS installation:
s3.lakefs.example.com - CNAME to lakefs.example.com. This would be used as the S3 endpoint when configuring clients and will serve as our bare domain.
*.s3.lakefs.example.com - Also a CNAME to lakefs.example.com. This will resolve virtual-host requests such as example-repo.s3.lakefs.example.com that lakeFS would now know how to parse.
For more information on how to configure these, see the official documentation of your DNS provider.
On AWS, this could also be done using ALIAS records for a load balancer.
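Once the records are in place, a quick way to sanity-check them is to resolve both the bare domain and one name under the wildcard. The sketch below uses only the Python standard library; `check_gateway_dns` is a hypothetical helper, and the resolver is injectable so the logic can be exercised without touching real DNS.

```python
import socket
from typing import Callable, Dict


def check_gateway_dns(
    s3_domain: str,
    repo: str,
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> Dict[str, str]:
    """Resolve the bare gateway domain and one virtual-host name under it.

    Hypothetical helper for illustration; returns a mapping of name to
    resolved address, or to an "unresolved (...)" message on failure.
    """
    results: Dict[str, str] = {}
    for name in (s3_domain, f"{repo}.{s3_domain}"):
        try:
            results[name] = resolve(name)
        except OSError as err:  # socket.gaierror is a subclass of OSError
            results[name] = f"unresolved ({err})"
    return results
```

For example, calling `check_gateway_dns("s3.lakefs.example.com", "example-repo")` performs two lookups. If the wildcard record is missing, only the second entry should come back as unresolved.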