Glossary

Commonly used terms relating to BioData Catalyst Powered by PIC-SURE

Term
Definition

API

Application Programming Interface: in this context, using code to search and query data through PIC-SURE

BioData Catalyst, BDC

Cloud-based ecosystem of tools and resources funded by the National Heart, Lung, and Blood Institute

Browse (page)

A publicly available tool that allows anyone to search all data on BDC, apply filters, and assess the feasibility of research projects and data

Concept Path

A unique string or identifier for a variable with a tree-like structure that contains information about the study, dataset, and variable (for example: /study/dataset/variable/)

Data Dictionary

A document that defines the structure, content, and meaning of data, providing information about the datasets, variables, and variable values contained within a given study

Data Frame

A data structure organized into rows and columns, similar to a spreadsheet; commonly used in Python and R

Dataset ID

A specific identifier or string that is associated with a PIC-SURE query or data export

Data Access Request

A process by which investigators request access to participant-level data based on a research question (see dbGaP)

dbGaP

Database of Genotypes and Phenotypes: A repository that stores studies and grants access authorization for investigators

eRA Commons

An NIH account that is used to log in to BioData Catalyst

Explore (page)

A tool that allows authorized users to search data on BDC, apply filters, and export participant-level data for analysis

Facet

A search filter that helps users refine their search results by narrowing down options in multiple dimensions

Filter

The action of applying criteria to select a subcohort (for example, filtering to "females" using the sex variable, or to participants with BRCA1 high-severity genomic variants)

Obfuscation

In Browse, the process of disguising confidential or sensitive data to protect it from unauthorized access

Participant-level data

Information collected from the participants of a study: data that shows each participant's value for each variable

Personal Access Token

Refers to the PIC-SURE Personal Access Token: a unique "password" or string that is used by the PIC-SURE API to authenticate and authorize access to data

PFB

Portable Format for Biomedical Data

PIC-SURE

Patient Information Commons - Standard Unification of Research Elements: a search, query, and export platform that integrates clinical and genomic data

Query

An action that involves extracting information from a database, such as counts or participant-level information (for example, applying a filter and looking at counts of the cohort)

Search

An action that compares a term to the data dictionary and returns relevant results (for example, typing "sickle cell" into the search bar)

Seven Bridges

Analysis platform available on BioData Catalyst

Stigmatizing

Terra

Analysis platform available on BioData Catalyst

TOPMed

Trans-Omics for Precision Medicine: an NHLBI-funded program to generate and analyze -omics data for many heart, lung, blood, and sleep studies

Value

Refers to a variable value: this describes the value that may be associated with a participant for a given variable (for example: "female" for sex or "28.1" for body mass index)

Variable

Refers to a clinical or study variable: any characteristic or attribute that can be measured and can take on different values (for example: sex, body mass index, and age)

Variant

Refers to a genomic variant: a DNA sequence difference between individuals or populations; variants can be applied as filters in Explore
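
Several of the terms above (Concept Path, Participant-level data, Filter, Query) describe how data is structured and selected. The sketch below illustrates them in plain Python. It does not use the PIC-SURE API: the function names, the sample records, and the three-part /study/dataset/variable/ layout are assumptions made for illustration, taken from the example given in the Concept Path definition. Real concept paths may be structured differently.

```python
from typing import NamedTuple


class ConceptPath(NamedTuple):
    """The three parts of a concept path (illustrative layout only)."""
    study: str
    dataset: str
    variable: str


def parse_concept_path(path: str) -> ConceptPath:
    """Split a '/study/dataset/variable/' string into its parts."""
    parts = [part for part in path.split("/") if part]
    if len(parts) != 3:
        raise ValueError(f"expected /study/dataset/variable/, got {path!r}")
    return ConceptPath(*parts)


# Hypothetical participant-level data: one record per participant,
# one key per variable, each holding that participant's value.
participants = [
    {"id": 1, "sex": "female", "bmi": 28.1},
    {"id": 2, "sex": "male", "bmi": 24.3},
    {"id": 3, "sex": "female", "bmi": 31.0},
]


def apply_filter(rows, variable, value):
    """Filter: keep only the participants whose variable equals value."""
    return [row for row in rows if row.get(variable) == value]


def count_query(rows):
    """Query: return a count for the selected cohort."""
    return len(rows)


path = parse_concept_path("/study/dataset/variable/")
cohort = apply_filter(participants, "sex", "female")
print(path.variable, count_query(cohort))
```

Here the filter selects a subcohort by a variable value, and the query returns a count for that subcohort, mirroring the examples in the Filter and Query definitions.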
