NHLBI BioData Catalyst® Powered by PIC-SURE

Data Organization in BDC-PIC-SURE


Last updated 7 months ago

BDC-PIC-SURE integrates clinical and genomic datasets across BDC, including TOPMed and TOPMed-related studies, COVID-19 studies, and BioLINCC studies. Each variable is organized as a concept path that contains information about the study, variable group, and variable. Though the specifics of the concept paths depend on the type of study, the overall information included is the same.

For more information about additional dbGaP, TOPMed, and PIC-SURE concept paths, refer to Appendix 1.

Table of Data Fields in BDC-PIC-SURE

| Data Organization | TOPMed & TOPMed-Related Studies | Studies where the variables are not indexed by dbGaP |
| --- | --- | --- |
| General organization | Data are organized using the format implemented by the database of Genotypes and Phenotypes (dbGaP). More information on the dbGaP data structure is available from dbGaP. Generally, a given study will have several tables, and those tables will have several variables. | Data do not follow dbGaP format; there are no phv or pht accessions. Data are organized in groups of like variables, when available. For example, variables like Age, Gender, and Race could be part of the Demographics variable group. |
| Concept path structure (flexible concept path structure) | \phs\pht\phv\variable name\ | \phs\variable name or \phs\form name\variable or \phs\form group\form name\variable group\variable |
| Variable ID | phv corresponding to the variable accession number | Equivalent to variable name |
| Variable name | Encoded variable name that was used by the original submitters of the data | Encoded variable name that was used by the original submitters of the data |
| Variable description | Description of the variable | Description of the variable, as available |
| Dataset ID | pht corresponding to the trait table accession number | Equivalent to dataset name |
| Dataset name / Form Group / Variable Group | Name of the trait table | Name of a group of like variables, as available |
| Dataset description / Form description / Variable description | Description of the trait table | Description of a group of variables, as available |
| Study ID | phs corresponding to the study accession number | phs corresponding to the study accession number |
| Study description | Description of the study from dbGaP | Description of the study from dbGaP |
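
To illustrate the dbGaP-style concept path structure (\phs\pht\phv\variable name\), the following is a minimal Python sketch that splits a concept path into its parts. It is not part of the PIC-SURE API: the function names and the example accession numbers are hypothetical, and it assumes that segments never contain a backslash themselves.

```python
from typing import Dict, List


def split_concept_path(path: str) -> List[str]:
    """Split a backslash-delimited concept path into its non-empty segments."""
    return [segment for segment in path.split("\\") if segment]


def parse_dbgap_concept_path(path: str) -> Dict[str, str]:
    """Label the four segments of a dbGaP-style concept path.

    Assumes the layout \\phs\\pht\\phv\\variable name\\ used for
    TOPMed and TOPMed-related studies.
    """
    segments = split_concept_path(path)
    if len(segments) != 4:
        raise ValueError(
            f"expected 4 segments (phs, pht, phv, variable name), got {len(segments)}"
        )
    study_id, dataset_id, variable_id, variable_name = segments
    return {
        "study_id": study_id,            # phs: study accession number
        "dataset_id": dataset_id,        # pht: trait table accession number
        "variable_id": variable_id,      # phv: variable accession number
        "variable_name": variable_name,  # encoded name from the data submitters
    }


# Hypothetical accession numbers, for illustration only.
example_path = "\\phs000000\\pht000000\\phv00000000\\AGE\\"
parsed = parse_dbgap_concept_path(example_path)
print(parsed["study_id"], parsed["variable_name"])
```

For studies that are not indexed by dbGaP, the number of segments varies, so `split_concept_path` alone is the safer starting point: the first segment is the study (phs) and the last is the variable.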

Note that there are two data types in PIC-SURE: categorical and continuous data. Categorical variables refer to any variables that have categorized values. For example, “Have you ever had asthma?” with values “Yes” and “No” is a categorical variable. Continuous variables refer to any variables that have a numeric range of values. For example, “Age” with a value range from 10 to 90 is a continuous variable. The internal PIC-SURE data load process determines the type of each variable based on the data.
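
The internal load logic is not documented here, so the following Python sketch only illustrates the idea of determining a variable's type from its data. It assumes one simple rule: a variable is continuous when every observed value parses as a number, and categorical otherwise. The function name and the rule are illustrative assumptions, not the actual PIC-SURE implementation.

```python
from typing import Iterable, Optional


def infer_variable_type(values: Iterable[Optional[str]]) -> str:
    """Classify a variable as 'continuous' or 'categorical' from its raw values.

    Assumed rule (illustrative only): all non-missing values must parse
    as numbers for the variable to count as continuous.
    """
    # Ignore missing entries (None or blank strings).
    observed = [v.strip() for v in values if v is not None and v.strip()]
    if not observed:
        raise ValueError("cannot infer a type without any observed values")
    for value in observed:
        try:
            float(value)
        except ValueError:
            # A single non-numeric value makes the variable categorical.
            return "categorical"
    return "continuous"


print(infer_variable_type(["Yes", "No", "Yes"]))   # asthma-style answers
print(infer_variable_type(["10", "45.5", "90"]))   # age-style values
```

Under this rule, a numeric column with even one text code such as "unknown" is treated as categorical, which is one reason a variable may appear with a different type than expected.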
