Phase 3: Data Analysis

This section covers how to use a SageMaker workspace to access saved data subsets for analysis and run analyses on the OCHIN data.

Repeat steps from Phase 2 as needed to create and/or connect to a SageMaker workspace.

Create your workspace

Step 1: Navigate to the Studies page. The organization studies are linked to Amazon S3 secure storage. This means that anything saved in these study folders will be securely saved and accessible through any workspace the study is mounted to.

For more information about studies, view the studies documentation.

Step 2: Select the Organizations tab

Step 3: Select the study with your project and user name attached

Step 4: Click Next

Step 5: Select the Sagemaker Notebook as shown below

Step 6: Click Next

Step 7: Enter a name of your choice. The name can contain only alphanumeric characters (case sensitive) and hyphens, must start with an alphabetic character, and cannot be longer than 128 characters.

No change is necessary for the Restricted CIDR field.

Step 8: Select the Project Id dropdown

Step 9: Select your AIM AHEAD affiliation, for example Research-Fellowship or Consortium-Development-Project

Step 10: Select a sagemaker-small workspace

Step 11: Enter a description of your choice for your own reference. The description must be at least 3 characters long.

Step 12: Click Create Research Workspace
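The rules from Steps 7 and 11 can be checked before you fill in the form. The sketch below is only an illustration of those stated rules. The function names and the regular expression are our own and are not taken from Service Workbench, and "alphanumeric" is assumed to mean the ASCII letters A-Z, a-z and the digits 0-9.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits only.
# First character must be a letter; up to 127 more letters, digits, or hyphens
# may follow, giving a maximum total length of 128 characters.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9-]{0,127}")

# Minimum description length stated in Step 11.
MIN_DESCRIPTION_LENGTH = 3


def is_valid_workspace_name(name: str) -> bool:
    """Return True if `name` satisfies the naming rule described in Step 7."""
    return NAME_PATTERN.fullmatch(name) is not None


def is_valid_description(description: str) -> bool:
    """Return True if `description` has at least 3 characters (Step 11)."""
    return len(description) >= MIN_DESCRIPTION_LENGTH
```

Under these assumptions, a name such as ochin-analysis-01 passes, while 1-analysis (starts with a digit) and my_workspace (contains an underscore) do not.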

Wait for your workspace to become available.

This may take 12-20 minutes.

Once your workspace is listed as AVAILABLE, you can connect to it.

Connecting your project database

Connect to your SageMaker workspace.

Step 1: Click Connections

Step 2: Click Connect. A new tab in your internet browser will open with your SageMaker workspace.

Step 3: In the newly opened browser tab, select SageMaker Examples at the top of the page.

Step 4: Under the Access to Data and Compute Using Service Workbench section, click the Use button next to one of the example notebooks, such as Connecting to OCHIN DB - R.ipynb.

Step 5: Click Create Copy on the pop-up window. NOTE: This will create a copy of ALL example notebooks listed in the Access to Data and Compute Using Service Workbench section, so you only need to do this action once to get access to all the notebooks.

Step 6: Select the newly created folder Access-to-Data-and-Compute-Using-Service-Workbench

Step 7: Select a notebook you will use to access your data. There is an example provided using R, and one using Python. It is recommended that you choose the programming language you are more comfortable with.

Using R and Python to analyze your data

Load previous data subsets
  1. If you saved your data subsets in the correct folder while using the Connecting to Ochin Data.ipynb notebook in Phase 2, you can load them from any SageMaker instance.

  2. Create a new R or Python notebook to begin your analysis using your saved data.

    1. If you need ideas on how to begin your analysis, work through an Example Notebook that uses fake test data.

The Connecting to Ochin Data.ipynb notebook contains Python code to help you open .csv files saved to your SageMaker workspace or studies folder.
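As a rough idea of what opening saved .csv files can look like in Python, here is a minimal sketch. It is not the code from the notebook. The helper names list_csv_files and load_csv are hypothetical, and the folder you pass in is whatever location you saved your subsets to, since the exact path depends on your workspace.

```python
import csv
from pathlib import Path


def list_csv_files(folder):
    """Return all .csv files under `folder` (searched recursively), sorted by path.

    Returns an empty list if the folder does not exist.
    """
    root = Path(folder).expanduser()
    if not root.is_dir():
        return []
    return sorted(root.rglob("*.csv"))


def load_csv(path):
    """Read one .csv file into a list of dictionaries keyed by column name.

    All values are returned as strings; convert them as your analysis requires.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

A typical use would be to call list_csv_files on your study or workspace folder, pick the file you need, and pass it to load_csv. If pandas is available in your notebook kernel, pandas.read_csv(path) is a common alternative that returns a DataFrame instead of a list of dictionaries.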

The Access Saved Data With R.ipynb notebook contains R code to help you open .csv files saved to your SageMaker workspace or studies folder.
