
Phase 3: Data Analysis

This section covers how to use a SageMaker workspace to access saved data subsets for analysis and run analyses on the OCHIN data.


Last updated 6 months ago

Creating a SageMaker workspace

Step 1: Navigate to the Studies page and click on the Organization tab. Mount your study(ies) to your workspace.

The organization studies are linked to Amazon S3 secure storage. This means that anything saved in these study folders is stored securely and is accessible from any workspace the study is mounted to.

To mount a study to your workspace, check the box next to the study name.

For more information about studies, view the documentation here.

Step 2: Select the SageMaker compute platform.

Step 3: Set your workspace parameters.

Name: Any name. The name can contain only alphanumeric characters (case sensitive) and hyphens. It must start with an alphabetic character and cannot be longer than 128 characters.

Restricted CIDR: No change necessary

Project ID: Select your AIM-AHEAD affiliation, for example Research-Fellowship or Consortium-Development-Project

Configuration: sagemaker-small

Description: Any description. The description must be at least 3 characters long.

Step 4: Provision your workspace.

This may take 12-20 minutes.

Once your workspace is listed as AVAILABLE, you can connect to it.

Connecting your project database

Step 1: Connect to your SageMaker workspace.

Click "Connections" and then "Connect". A new window will open with your SageMaker workspace.

Step 2: Open the SageMaker Examples tab.

Step 3: Copy the Access to Data and Compute Using Service Workbench folder.

Click "Use" next to any of the notebooks under "Access to Data and Compute Using Service Workbench" and then "Create copy". This will copy the entire example folder to your workspace.

Step 4: Move the desired example code notebooks into your study folder.

When you first copy the examples, a new window will open with the notebook you copied.

If you see a popup that says "Kernel not found", select conda_python3 from the dropdown menu and click "Set Kernel".

Close that tab and navigate back to your Home page.

Click on the Files tab of your Home page. You will notice the example code folder has been added.

Check the box next to the Access to Data and Compute Using Service Workbench folder and click "Move".

For the directory path, type /studies/ followed by the name of the study folder you linked to your workspace. Click "Move".

Using R and Python to access tables in your project database

The R and Python example notebooks contain all the steps needed to connect to the OCHIN Database.

They walk through the following steps:

  1. Install necessary drivers and packages

  2. Read and parse your DB credentials

  3. Connect to the database

  4. Query the database

When you are finished working on your workspace for the day, please STOP the workspace to avoid incurring excess cloud costs.
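
For orientation, the notebook steps listed under "Using R and Python to access tables in your project database" (read and parse credentials, connect, query) can be sketched in Python as below. This is a minimal illustration and not the code from the example notebooks. The credential field names, the patient_subset table, and the sample values are all hypothetical, and an in-memory SQLite database stands in for the real project database, so no driver installation is needed here. The actual notebooks install and use the driver appropriate to the OCHIN database.

```python
import json
import sqlite3

# Hypothetical credential fields; the real notebooks define their own format.
REQUIRED_FIELDS = {"host", "port", "database", "username", "password"}


def parse_credentials(raw_json):
    """Read and parse credentials: load JSON and check the expected fields."""
    creds = json.loads(raw_json)
    missing = REQUIRED_FIELDS - set(creds)
    if missing:
        raise ValueError(f"missing credential fields: {sorted(missing)}")
    return creds


def connect_stand_in(creds):
    """Connect to the database.

    Stand-in only: an in-memory SQLite database replaces the real server,
    so the credentials are validated above but not used to connect.
    """
    conn = sqlite3.connect(":memory:")
    # Seed a hypothetical table so the query step has something to read.
    conn.execute("CREATE TABLE patient_subset (patient_id INTEGER, age INTEGER)")
    conn.executemany(
        "INSERT INTO patient_subset VALUES (?, ?)",
        [(1, 34), (2, 61), (3, 47)],
    )
    conn.commit()
    return conn


def run_query(conn, sql, params=()):
    """Query the database with bound parameters; return rows as dictionaries."""
    cur = conn.cursor()
    try:
        cur.execute(sql, params)
        columns = [col[0] for col in cur.description]
        return [dict(zip(columns, row)) for row in cur.fetchall()]
    finally:
        cur.close()


def demo():
    """Run the parse, connect, and query steps end to end on made-up data."""
    raw = json.dumps({
        "host": "db.example.org",
        "port": 5432,
        "database": "project",
        "username": "researcher",
        "password": "not-a-real-secret",
    })
    creds = parse_credentials(raw)
    conn = connect_stand_in(creds)
    try:
        # "?" is SQLite's placeholder style; other drivers may use a different one.
        return run_query(
            conn,
            "SELECT patient_id, age FROM patient_subset "
            "WHERE age > ? ORDER BY patient_id",
            (40,),
        )
    finally:
        conn.close()


if __name__ == "__main__":
    for row in demo():
        print(row)
```

Passing values as bound parameters rather than formatting them into the SQL string keeps the query safe when filter values come from user input. Closing the connection in a finally block releases it even if the query fails.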