I am using saveAsTextFile() to store the results of a Spark job in the folder dbfs:/FileStore/my_result.
I can access the different "part-xxxxx" files using the web browser, but I would like to automate the process of downloading all the files to my local machine.
I have tried using cURL, but I can't find the REST API call to download a dbfs:/FileStore file.
Question: How can I download a dbfs:/FileStore file to my local machine?
I am using Databricks Community Edition to teach an undergraduate module in Big Data Analytics in college. I have Windows 7 installed on my local machine. I have checked that cURL and the _netrc file are properly installed and configured, as I can successfully run some of the commands provided by the REST API.
Thank you very much in advance for your help! Best regards, Nacho
Using a browser, you can access individual files in the FileStore, but you cannot access or even list directories, so you first need to have put a file into the FileStore. If you've got a file "example.txt" at "/FileStore/example_directory/", you can download it via the following URL:
https://community.cloud.databricks.com/files/example_directory/example.txt?o=###
In that URL, "###" has to be replaced by the long number you find at the end of your Community Edition URL after you have logged into your account.
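For example, if your browser's address bar ends in ?o=1234567890123456 after login (a made-up number here), the download link becomes:

https://community.cloud.databricks.com/files/example_directory/example.txt?o=1234567890123456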
There are a few options for downloading FileStore files to your local machine.
Easier options:

- Install and configure the Databricks CLI, then use its `dbfs cp` command. For example: `dbfs cp dbfs:/FileStore/test.txt ./test.txt`. If you want to download an entire folder of files, you can use `dbfs cp -r` (see the sketch after this list).
- Download files directly in your browser from `https://<YOUR_DATABRICKS_INSTANCE_NAME>.cloud.databricks.com/files/`. If you are using Databricks Community Edition then you may need to use a slightly different path, as shown in the answer above. This download method is described in more detail in the FileStore docs.
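A minimal shell sketch of the CLI route, assuming the legacy databricks-cli Python package; the part-00000 filename under the asker's my_result folder is illustrative:

```sh
# One-time setup: install the CLI, then store your workspace URL and
# credentials (Community Edition accepts username/password here).
pip install databricks-cli
databricks configure

# Copy a single result file from DBFS to the current directory.
dbfs cp dbfs:/FileStore/my_result/part-00000 ./part-00000

# Copy the whole result folder recursively.
dbfs cp -r dbfs:/FileStore/my_result ./my_result
```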
Advanced options:

- Use the DBFS REST API's `read` call. To download a large file, you may need to issue multiple `read` calls to access chunks of the full file (see the sketch below).
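A hedged cURL sketch of the REST route, reusing the _netrc auth the asker already has: the DBFS read endpoint returns at most 1 MB per call, base64-encoded in the "data" field of a JSON response, so each chunk has to be decoded. The path and the Python 3 one-liner are illustrative; on Windows cmd, swap the \ line continuations for ^. For files over 1 MB, loop with an increasing offset until a call returns fewer bytes than requested.

```sh
# Read the first 1 MB of a part file via the DBFS REST API (-n uses _netrc),
# then decode the base64 "data" field of the JSON response into a local file.
curl -n "https://community.cloud.databricks.com/api/2.0/dbfs/read?path=/FileStore/my_result/part-00000&offset=0&length=1048576" \
  | python -c "import sys, json, base64; sys.stdout.buffer.write(base64.b64decode(json.load(sys.stdin)['data']))" \
  > part-00000
```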