Glossary
Commonly used terms relating to BioData Catalyst Powered by PIC-SURE
API
Application Programming Interface: in this context, it refers to writing code to search and query data through PIC-SURE.
BioData Catalyst, BDC
Cloud-based ecosystem of tools and resources funded by the National Heart, Lung, and Blood Institute
Browse (page)
A publicly available tool that allows anyone to search all data on BDC, apply filters, and assess the feasibility of research projects and data
Concept Path
A unique string or identifier for a variable with a tree-like structure that contains information about the study, dataset, and variable (for example: /study/dataset/variable/)
Data Dictionary
A document that defines the structure, content, and meaning of data, providing information about the datasets, variables, and variable values contained within a given study
Data Frame
A data structure organized into rows and columns, similar to a spreadsheet; the term is commonly used in Python and R
Dataset ID
A specific identifier or string that is associated with a PIC-SURE query or data export
Data Access Request
A process by which investigators request access to participant-level data based on a research question (see dbGaP)
dbGaP
Database of Genotypes and Phenotypes: A repository that stores studies and grants access authorization for investigators
eRA Commons
An NIH account that is used to log into BioData Catalyst
Explore (page)
A tool that allows authorized users to search data on BDC, apply filters, and export participant-level data for analysis
Facet
A search filter that helps users refine their search results by narrowing down options in multiple dimensions
Filter
An action that applies criteria to select a subcohort (for example, filtering to "females" using the sex variable or to those with high-severity genomic variants in BRCA1)
Obfuscation
In Browse, the process of disguising confidential or sensitive data to protect it from unauthorized access
Participant-level data
Information collected from the participants of a study, showing each participant's value for each variable
Personal Access Token
Refers to the PIC-SURE Personal Access Token: a unique "password" or string that is used by the PIC-SURE API to authenticate and authorize access to data
PFB
Portable Format for Biomedical Data
PIC-SURE
Patient Information Commons - Standard Unification of Research Elements: a search, query, and export platform that integrates clinical and genomic data
Query
An action that involves extracting information from a database, such as counts or participant-level information (for example, applying a filter and looking at counts of the cohort)
Search
An action that compares a term to the data dictionary and returns relevant results (for example, typing "sickle cell" into the search bar)
Seven Bridges
Analysis platform available on BioData Catalyst
Stigmatizing
Terra
Analysis platform available on BioData Catalyst
TOPMed
Trans-Omics for Precision Medicine: an NHLBI-funded program to generate and analyze -omics data for many heart, lung, blood, and sleep studies
Value
Refers to a variable value: the value associated with a participant for a given variable (for example: "female" for sex or "28.1" for body mass index)
Variable
Refers to a clinical or study variable: any characteristic or attribute that can be measured and can take on different values (for example: sex, body mass index, and age)
Variant
Refers to a genomic variant: DNA sequence differences between individuals or populations, which can be applied as a filter in Explore
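Illustrative example
The Concept Path and Filter entries above can be made concrete with a minimal Python sketch. It assumes a concept path has exactly the three levels shown in the glossary example (/study/dataset/variable/); real concept paths may be structured differently. The function names, the participant IDs, and the sample rows are hypothetical and exist only for illustration. This is not the PIC-SURE API.

```python
def parse_concept_path(path: str) -> dict:
    """Split a concept path of the assumed form /study/dataset/variable/."""
    # Drop the empty strings produced by the leading and trailing slashes.
    parts = [part for part in path.strip().split("/") if part]
    if len(parts) != 3:
        raise ValueError(f"expected /study/dataset/variable/, got {path!r}")
    study, dataset, variable = parts
    return {"study": study, "dataset": dataset, "variable": variable}


def filter_participants(rows: list, variable: str, value) -> list:
    """Keep only the rows whose value for `variable` equals `value`."""
    return [row for row in rows if row.get(variable) == value]


# Hypothetical participant-level data: one row per participant,
# one key per variable.
rows = [
    {"participant_id": "P1", "sex": "female", "bmi": 28.1},
    {"participant_id": "P2", "sex": "male", "bmi": 24.3},
    {"participant_id": "P3", "sex": "female", "bmi": 31.0},
]

parsed = parse_concept_path("/study/dataset/variable/")
females = filter_participants(rows, "sex", "female")

print(parsed)
print([row["participant_id"] for row in females])
```

Here "sex" is a variable, "female" is a value, and the list returned by filter_participants is the subcohort selected by the filter.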