Uploading data into SWB

You can upload locally stored data into SWB using one of two methods.

Method 1: Using the user interface of a SageMaker workspace

✅ Easy point-and-click interface, especially convenient if you already plan to use SageMaker for analysis

❌ Only files smaller than 30 GB can be uploaded using this method
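
Before choosing a method, it can help to compare the file size against the limit above. The following is a minimal sketch and not part of SWB: the function name choose_upload_method is hypothetical, and it assumes the 30 GB limit means 30 × 1024³ bytes, which this page does not state explicitly.

```shell
#!/bin/sh
# Hypothetical helper (not part of SWB): suggest an upload method from file size.
# Assumption: the 30 GB limit means 30 * 1024^3 bytes.
LIMIT_BYTES=$((30 * 1024 * 1024 * 1024))

choose_upload_method() {
    # wc -c reports the size in bytes; $(( )) strips any padding wc adds.
    size=$(( $(wc -c < "$1") ))
    if [ "$size" -lt "$LIMIT_BYTES" ]; then
        echo "sagemaker"   # small enough for Method 1
    else
        echo "scp"         # too large for Method 1, use the SCP method
    fi
}

# Demo with a small temporary file standing in for real data.
sample=$(mktemp)
printf 'example data\n' > "$sample"
choose_upload_method "$sample"
rm -f "$sample"
```

For a real file, pass its path instead of the temporary sample, for example choose_upload_method /path/to/your/file.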

  1. Follow the instructions here to create and connect to a SageMaker workspace linked to a study that you have read/write access to.

  2. Navigate to the location where you would like to upload your files by clicking 'studies' and then the name of your linked study. Here, I am uploading data to the simran-NOWINDOWS study.

  3. Click Upload and use the file explorer to select the file you want to upload, then click Open.

  4. Confirm your upload by clicking Upload.

Method 2: Using the secure copy (SCP) command in a Linux workspace

✅ Upload files of any size

  1. Follow the instructions here to create a Linux workspace linked to a study that you have read/write access to.

BCH Users: You will need to widen your CIDR block range to allow access.

  1. Click the "Edit CIDRs" button.

  2. Copy the IP address that is listed, and change the number after the / to 24.

  3. Remove the old entry (click the "x" next to the IP address).

  4. Click Submit.

Example:

  • Original allowed CIDR block: 134.279.26.0/32

  • New allowed CIDR block: 134.279.26.0/24
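
The same rewrite can be sketched in the shell. This is an illustrative helper only: the function name widen_to_24 and the sample address are hypothetical, and zeroing the last octet is an assumption about the conventional way to write a /24 block rather than something SWB is known to require.

```shell
#!/bin/sh
# Hypothetical helper: rewrite an IPv4 CIDR entry so that it uses a /24 prefix.
widen_to_24() {
    entry=$1
    ip=${entry%/*}        # drop the "/NN" suffix
    network=${ip%.*}.0    # assumption: zero the last octet for the /24 base address
    printf '%s/24\n' "$network"
}

# Sample address from a range reserved for documentation.
widen_to_24 "203.0.113.57/32"   # prints 203.0.113.0/24
```

A /24 block covers the 256 addresses that share the first three numbers, so it still allows access if the last number of your IP address changes.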

For more information on changing your CIDR block, see the FAQ page here.

  1. Once your Linux workspace is listed as Available, click Connections. If this is the first time you are creating a Linux workspace, you will need to download your SSH key and save it to your computer when prompted.

  2. Keep your SWB browser window handy and open a Terminal window (Mac). In the Terminal, navigate to the directory where your SSH key is saved.

  3. Change the permissions of the key file to allow access by running the following command in the terminal:

chmod 400 <YOUR-KEY.pem>
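
OpenSSH tools such as scp generally refuse to use a private key that other users can read, which is why the key is restricted to owner read-only. The sketch below confirms the result of chmod 400 using a temporary file as a stand-in for the real key; the variable name key_file is illustrative.

```shell
#!/bin/sh
# Stand-in for your downloaded key; replace with the path to your .pem file.
key_file=$(mktemp)

chmod 400 "$key_file"    # owner may read; nobody may write or execute

# The first ten characters of `ls -l` show the file type and permission bits.
perms=$(ls -l "$key_file" | cut -c1-10)
echo "$perms"            # prints -r--------

rm -f "$key_file"
```

If the output shows any other permission bits, rerun the chmod command before continuing.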
  1. Next, modify the highlighted sections (the placeholders in angle brackets) of the command below in a text editor.

scp -i '<YOUR-KEY.pem>' <FILE-TO-UPLOAD> ec2-user@<EC2ADDRESS>:/home/ec2-user/studies/<YOUR-STUDY-FOLDER>

  • YOUR-KEY.pem should be the name of your SSH key (in the red box above).

  • FILE-TO-UPLOAD should be the full path and name of the file you want to upload.

  • EC2ADDRESS can be found in the next step, when you click "Use this SSH Key" in SWB.

  • YOUR-STUDY-FOLDER should be the name of the study you linked to your workspace.

  1. In the SWB browser window, click "Use this SSH Key". Under the SSH connections section, copy the public host address and paste it into the <EC2ADDRESS> section of the secure copy command.

In the example command shown below, I am using the SSH key 'simran-admin-2.pem' to upload the 'test.txt' file from my desktop into the 'testing-simran' study linked to my Linux workspace. This workspace can be accessed via this EC2 address: ec2-user@ec2-54-81-208-84.compute-1.amazonaws.com.

scp -i 'simran-admin-2.pem' /Users/smakwana/Desktop/test.txt ec2-user@ec2-54-81-208-84.compute-1.amazonaws.com:/home/ec2-user/studies/testing-simran/
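
If you upload often, you may prefer to assemble the command from variables instead of editing it by hand. The following is a minimal sketch using the values from the example above; the variable names and the build_scp_command function are hypothetical, and the script only prints the command (a dry run) rather than running it.

```shell
#!/bin/sh
# Values taken from the example above; replace them with your own.
KEY_FILE='simran-admin-2.pem'
FILE_TO_UPLOAD='/Users/smakwana/Desktop/test.txt'
EC2_ADDRESS='ec2-54-81-208-84.compute-1.amazonaws.com'
STUDY_FOLDER='testing-simran'

# Hypothetical helper: print the scp command instead of executing it.
build_scp_command() {
    printf "scp -i '%s' %s ec2-user@%s:/home/ec2-user/studies/%s/\n" \
        "$KEY_FILE" "$FILE_TO_UPLOAD" "$EC2_ADDRESS" "$STUDY_FOLDER"
}

build_scp_command
```

The printed line can be pasted into the terminal once the EC2 address is filled in. Paths containing spaces would need extra quoting, which this sketch does not handle.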
  1. The secure copy command is now ready to use. Paste the complete command into the Terminal window and press Enter. If the 60-second countdown lapsed before you ran the command, click "Use this SSH Key" again to restart the timer, then run the command.

You only have 60 seconds to run the command and connect to your workspace. In order to successfully transfer files, you must establish a connection during these 60 seconds. If the time has elapsed, just click "Use this SSH Key" again to restart the countdown.
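
To keep track of the countdown from the terminal, you could record the time when you click the button and check it before running the command. This is a rough sketch only: the window_open function is hypothetical, and your local clock only approximates the countdown shown in SWB.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if fewer than 60 seconds have passed
# between clicking "Use this SSH Key" (start) and the current time (now).
WINDOW_SECONDS=60

window_open() {
    start=$1
    now=$2
    [ $((now - start)) -lt "$WINDOW_SECONDS" ]
}

clicked_at=$(date +%s)    # record this right after clicking the button
# ... paste the scp command into the terminal here ...
if window_open "$clicked_at" "$(date +%s)"; then
    echo "Window still open: run the scp command now."
else
    echo "Window expired: click \"Use this SSH Key\" again."
fi
```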

  1. A successful copy will look like the image below. Your data is now in your S3 subfolder (linked study), accessible to any workspace associated with your subfolder.
