Rclone is a command-line program for syncing files and directories between cloud providers. To use it with lakeFS, create an Rclone remote as described below, and then use it as you would any other Rclone remote.
- Creating a remote for lakeFS in Rclone
To add the remote to Rclone, choose one of the following options:
To add the remote by editing the configuration file, first find the path to your Rclone configuration file and copy it for the next step.
rclone config file
# output:
# Configuration file is stored at:
# /home/myuser/.config/rclone/rclone.conf
If your lakeFS access key pair is already set in an AWS profile or in environment variables, run the following command, replacing the endpoint value with your lakeFS endpoint:
cat <<EOT >> /home/myuser/.config/rclone/rclone.conf
[lakefs]
type = s3
provider = AWS
endpoint = https://s3.lakefs.example.com

EOT
Otherwise, also include your lakeFS access key pair in the Rclone configuration file:
cat <<EOT >> /home/myuser/.config/rclone/rclone.conf
[lakefs]
type = s3
provider = AWS
env_auth = false
access_key_id = AKIAIOSFODNN7EXAMPLE
secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
endpoint = https://s3.lakefs.example.com
EOT
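Because the commands above append to the configuration file, running them twice would leave two [lakefs] sections. The following is a minimal sketch of a guarded append, not part of the lakeFS or Rclone tooling. The variable RCLONE_CONF and the function add_lakefs_remote are hypothetical names introduced for illustration, and the sketch defaults to a temporary file so it never touches a real configuration.

```shell
#!/bin/sh
# Assumption: RCLONE_CONF is a hypothetical variable for this sketch. It defaults
# to a throwaway temp file; point it at the path printed by `rclone config file`
# to use it for real.
RCLONE_CONF="${RCLONE_CONF:-$(mktemp)}"

add_lakefs_remote() {
    # Skip the append when a [lakefs] section already exists.
    if grep -q '^\[lakefs\]$' "$RCLONE_CONF" 2>/dev/null; then
        return 0
    fi
    cat <<EOT >> "$RCLONE_CONF"
[lakefs]
type = s3
provider = AWS
endpoint = https://s3.lakefs.example.com

EOT
}

add_lakefs_remote
add_lakefs_remote   # the second call finds the section and does nothing

# Count the [lakefs] sections in the file.
grep -c '^\[lakefs\]$' "$RCLONE_CONF"
```

If the file holds an access key pair, restricting it to your own user (for example with chmod 600) is a reasonable precaution.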
Alternatively, run rclone config and follow the interactive prompts:
Choose AWS S3 as your type of storage, and enter your lakeFS endpoint as your S3 endpoint. You will have to choose whether to use your environment for authentication (recommended) or to enter the lakeFS access key pair into the Rclone configuration.
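As a sketch of what authenticating from the environment can look like, the snippet below exports the standard AWS credential variables before Rclone is run. It assumes the remote is set up to read credentials from the runtime environment (in Rclone's S3 backend this is the env_auth option), and it reuses the example key pair from this page as a placeholder for your own lakeFS keys.

```shell
#!/bin/sh
# Placeholder credentials copied from the examples on this page;
# replace them with your own lakeFS access key pair.
export AWS_ACCESS_KEY_ID="AKIAIOSFODNN7EXAMPLE"
export AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"

# Count how many of the two variables child processes (such as rclone) will see,
# without printing the secret values themselves.
env | grep -E -c '^AWS_(ACCESS_KEY_ID|SECRET_ACCESS_KEY)='
```

Keeping the keys in the environment means they are not written to the Rclone configuration file.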
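To sync data from an existing S3 remote (here named mys3remote) into the master branch of the lakeFS repository example-repo:
rclone sync mys3remote:mybucket/path/ lakefs:example-repo/master/path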
To sync a local directory into the same lakeFS branch:
rclone sync /home/myuser/path/ lakefs:example-repo/master/path
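Since rclone sync makes the destination match the source, which can delete files at the destination, it may be worth previewing a sync first. The sketch below builds the lakeFS destination from its parts (repository, branch, path) and wraps a preview using Rclone's --dry-run flag. The variable names and the sync_preview function are hypothetical, introduced only for this example, and the function is left uncalled so the snippet runs without a configured remote.

```shell
#!/bin/sh
# lakeFS paths take the form <repository>/<branch>/<path>.
repo="example-repo"
branch="master"
path="path"
dest="lakefs:${repo}/${branch}/${path}"
echo "$dest"

# Hypothetical helper: show what a sync from the given source would change
# without transferring or deleting anything.
sync_preview() {
    rclone sync --dry-run "$1" "$dest"
}

# Example call (requires rclone and a configured lakefs remote):
# sync_preview /home/myuser/path/
```

Once the preview looks right, running the same command without --dry-run performs the sync.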