File transfer from and to SURFdrive and ResearchDrive

From ALICE Documentation

This section describes how to transfer files to and from SURFdrive and ResearchDrive, the storage solutions offered by SURF. On ALICE, this is only possible via rclone, which is available as a module. The procedure described below is nearly identical for ResearchDrive and SURFdrive and is based on the documentation in the ResearchDrive wiki.

IMPORTANT: Support for rclone on ALICE is new and therefore still considered experimental. You are free to use it, and we welcome any feedback that you have.

Preparations

  • Make sure that you have an account on ResearchDrive and/or SURFdrive
  • Log in to ResearchDrive or SURFdrive
  • In ResearchDrive/SURFdrive, go to your username in the top right corner, choose "Settings" and then go to "Security"
  • In the section "WebDAV passwords", set an app name and generate a password. This will show your username and WebDAV password. Do not click on "Done" at this point, but keep it open until the setup procedure is complete.
  • Log in to ALICE

Procedure

  • After you have logged in to ALICE, load the rclone module. You have to load the module every time you want to use rclone.
 # to always load the latest version, just do:
 module load rclone
  • Now we start the configuration process of rclone with
 [me@nodelogin01 ~]$ rclone config
  • In the first configuration step, select "n" for "New Remote"
 No remotes found - make a new one
 n) New remote
 s) Set configuration password
 q) Quit config
 n/s/q> n
  • Next, rclone asks you to choose a name for the connection. The name is arbitrary and will be used later to identify the remote connection. As an example, we use "RD" for ResearchDrive and "SD" for SURFdrive here.
 name> SD
  • Next, rclone will present you with a list of connection types. Find the one that says "WebDAV" and type in the corresponding number. In this example, it is 37, but it might be another one when you run it.
 Type of storage to configure.
 Enter a string value. Press Enter for the default ("").
 Choose a number from below, or type in your own value
  1 / 1Fichier
    \ "fichier"
  2 / Alias for an existing remote
    \ "alias"
  3 / Amazon Drive
    \ "amazon cloud drive"
  4 / Amazon S3 Compliant Storage Providers including AWS, Alibaba, Ceph, Digital Ocean, Dreamhost, IBM COS, Minio, and Tencent COS
    \ "s3"
 ...
 37 / Webdav
    \ "webdav"
 38 / Yandex Disk
    \ "yandex"
 39 / Zoho
    \ "zoho"
 40 / http Connection
    \ "http"
 41 / premiumize.me
    \ "premiumizeme"
 42 / seafile
    \ "seafile"
 Storage> 37
  • In the following step, you have to enter a URL. You can find the required URL by switching back to your browser, where you still have the "Security" settings of your RD or SD account open. Copy the URL shown below "To access your files through WebDAV, please use the following URL:" and paste it into the terminal on ALICE in which you are configuring rclone.
 Storage> 37
 ** See help for webdav backend at: https://rclone.org/webdav/ **
 
 URL of http host to connect to
 Enter a string value. Press Enter for the default ("").
 Choose a number from below, or type in your own value
  1 / Connect to example.com
    \ "https://example.com"
 url> <paste_the_url_from_SD/RD_here> 
  • In the next rclone configuration step, select the corresponding number for "OwnCloud". Here it is 2, but it could be another one in your case.
 Name of the Webdav site/service/software you are using
 Enter a string value. Press Enter for the default ("").
 Choose a number from below, or type in your own value
  1 / Nextcloud
    \ "nextcloud"
  2 / Owncloud
    \ "owncloud"
  3 / Sharepoint
    \ "sharepoint"
  4 / Other site/service or software
    \ "other"
 vendor> 2 
  • Next, rclone will ask you for your SD or RD username. You can also find this in the Security settings of your RD or SD account, where you generated the WebDAV password.
 User name
 Enter a string value. Press Enter for the default ("").
 user> <insert_the_SD/RD_webdav_username_here> 
  • In the following step, you need to type in "y" to set your own password. This will be the WebDAV password that you generated in the Security settings of your RD or SD account.
 Password.
 y) Yes type in my own password
 g) Generate random password
 n) No leave this optional password blank (default)
 y/g/n> y 
  • Rclone will now ask you for a password. Copy the WebDAV password from your SD or RD account and paste it here. This is *not* your SD/RD account password, but the one that you generated in the Security settings.
 Enter the password:
 password: <insert_your_RD/SD_webdav_password_here> 
  • Finally, you can accept the default settings for the next two steps by pressing Enter for the bearer token and choosing "n" for the advanced config
 Bearer token instead of user/pass (e.g. a Macaroon)
 Enter a string value. Press Enter for the default ("").
 bearer_token>
 
 Edit advanced config? (y/n)
 y) Yes
 n) No (default)
 y/n> n 
  • Then confirm with "y" if the settings are correct
 Remote config
 --------------------
 [SD]
 type = webdav
 url = <your_webdav_url>
 vendor = owncloud
 user = <your_user_name>
 pass = *** ENCRYPTED ***
 --------------------
 y) Yes this is OK (default)
 e) Edit this remote
 d) Delete this remote
 y/e/d> y  
  • Finally, quit the configuration with "q" if everything checks out and you do not want to set up anything else:
 Current remotes:
 
 Name                 Type
 ====                 ====
 SD                   webdav
 
 e) Edit existing remote
 n) New remote
 d) Delete remote
 r) Rename remote
 c) Copy remote
 s) Set configuration password
 q) Quit config
 e/n/d/r/c/s/q> q
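
Once you have left the configuration menu, you may want to confirm that the remote was actually stored. The following is a minimal sketch, not an official ALICE tool: the helper name remote_configured is hypothetical, and it assumes that the rclone module is loaded and that the remote is called "SD" as in the example above. It relies on "rclone listremotes", which prints one remote name per line followed by a colon.

```shell
# Hypothetical helper: succeed if a remote with the given name exists
# in the rclone configuration of the current user.
remote_configured() {
    name="$1"
    # "rclone listremotes" prints lines such as "SD:"; match the whole line
    rclone listremotes | grep -qx "${name}:"
}

# Example usage (after "module load rclone"):
#   remote_configured SD && rclone config show SD
```

"rclone config show SD" prints the stored settings for that remote, which should match the summary that rclone displayed at the end of the setup.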

Testing

If the setup procedure was successful, you should now be able to access content in your SD or RD storage from ALICE and move data back and forth. Here, we have collected a few examples for you to try out.

One of the first things that you might want to try is to list the top level file content of your SD or RD storage:

 [me@nodelogin01 ~]$ rclone ls --max-depth 1 RD:

If you want to list the content of your SD account, just replace "RD" with "SD".

Whenever you want to access your RD or SD storage, you have to refer to it by the name of the remote followed by a colon, i.e. RD: or SD:

You can now go ahead and try to copy data to or from ALICE. It is best to start with a single small file. For example:

 [me@nodelogin01 ~]$ rclone copy <my_local_file> RD:

or

 [me@nodelogin01 ~]$ rclone copy RD:<my_remote_file> ./
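
The two copy commands above can be combined into a small round-trip check. This is only a sketch under assumptions: the function name roundtrip_test and the remote folder "rclone_test" are hypothetical names chosen for illustration, and the remote name you pass in must match the one you configured. The function uploads a small file, downloads it again into a separate directory, and compares the two copies.

```shell
# Hypothetical helper: round-trip a small test file through a remote.
# Usage: roundtrip_test RD     (or SD)
roundtrip_test() {
    remote="$1"
    workdir=$(mktemp -d) || return 1
    echo "rclone test $(date)" > "$workdir/rclone_test.txt"

    # Upload into the folder "rclone_test" on the remote, download the file
    # into a separate local directory, then compare both copies byte by byte
    rclone copy "$workdir/rclone_test.txt" "${remote}:rclone_test" &&
    rclone copy "${remote}:rclone_test/rclone_test.txt" "$workdir/download" &&
    cmp "$workdir/rclone_test.txt" "$workdir/download/rclone_test.txt"
    rc=$?

    rm -rf "$workdir"    # remove the local scratch directory only
    return $rc
}
```

The function returns 0 if the downloaded file is identical to the uploaded one. Note that the test file stays in the "rclone_test" folder on the remote, so you may want to delete it afterwards.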

Best Practices

Rclone has a lot of options for improving the performance of your data transfer. We recommend having a look at the ResearchDrive wiki mentioned above and at the rclone documentation.

Here, we list some best practices for ALICE, taking into account that rclone on ALICE is still experimental:

  • Despite the recommendations on the ResearchDrive wiki, DO NOT increase the number of parallel transfers (the --transfers option) beyond 8. This is because we still need to gather data on how the number of transfers affects the network for other users.
  • For ResearchDrive and SURFdrive, rclone can be used for files up to 30 GB. Above that size, transfers might become unstable. For files larger than 100 GB, other methods have to be used to copy data to ResearchDrive and SURFdrive. Please consult the ResearchDrive and SURFdrive documentation.
  • Use the --timeout option when copying files, setting it to 10m for each GB of the biggest file (for example, --timeout 50m if the biggest file is 5 GB).
  • Always use the --use-cookies option when you transfer files.
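
The recommendations above can be combined into a single copy command. The following is a sketch under stated assumptions, not an official ALICE script: the function names are hypothetical, 1 GB is taken as 2^30 bytes, and partial GBs are rounded up so that the timeout errs on the long side. The value of 4 for --transfers is just an example that stays below the limit of 8.

```shell
# Hypothetical helper: suggest a --timeout value of 10 minutes per
# started GB (assumption: 1 GB = 2^30 bytes, rounded up, minimum 10m).
timeout_for_bytes() {
    bytes="$1"
    gb=$(( (bytes + 1073741823) / 1073741824 ))
    if [ "$gb" -lt 1 ]; then gb=1; fi
    echo "$(( gb * 10 ))m"
}

# Hypothetical helper: print the size in bytes of the largest regular
# file below a directory (or of the file itself if a file is given).
largest_file_bytes() {
    find "$1" -type f -exec wc -c {} \; |
        awk '{ if ($1 > max) max = $1 } END { print max + 0 }'
}

# Hypothetical wrapper: copy with the options recommended for ALICE.
# Usage: copy_with_best_practices <local_dir_or_file> RD:<remote_dir>
copy_with_best_practices() {
    src="$1"
    dst="$2"
    t=$(timeout_for_bytes "$(largest_file_bytes "$src")")
    rclone copy --transfers 4 --use-cookies --timeout "$t" "$src" "$dst"
}
```

For a directory whose largest file is 5 GB, this sketch would run rclone copy with --timeout 50m. The wrapper only handles uploads from ALICE, because it determines the file sizes on the local file system.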