Glossary
Commonly used terms relating to BioData Catalyst Powered by PIC-SURE
Term | Definition |
---|---|
API | Application Programming Interface: in this context, using code to search and query data through PIC-SURE |
BioData Catalyst, BDC | Cloud-based ecosystem of tools and resources funded by the National Heart, Lung, and Blood Institute |
Browse (page) | A publicly available tool that allows anyone to search all data on BDC, apply filters, and assess the feasibility of research projects and data |
Concept Path | A unique string or identifier for a variable with a tree-like structure that contains information about the study, dataset, and variable (for example: |
Data Dictionary | A document that defines the structure, content, and meaning of data, providing information about the datasets, variables, and variable values contained within a given study |
Data Frame | A data structure made up of rows and columns, similar to a spreadsheet; commonly used in Python and R |
Dataset ID | A specific identifier or string that is associated with a PIC-SURE query or data export |
Data Access Request | A process by which investigators request access to participant-level data based on a research question; see dbGaP |
dbGaP | Database of Genotypes and Phenotypes: A repository that stores studies and grants access authorization for investigators |
eRA Commons | An NIH account that is used to log into BioData Catalyst |
Explore (page) | A tool that allows authorized users to search data on BDC, apply filters, and export participant-level data for analysis |
Facet | A search filter that helps users refine their search results by narrowing down options in multiple dimensions |
Filter | The action of applying criteria to select a subcohort (for example, filtering to "females" using the sex variable, or to participants with BRCA1 high severity genomic variants) |
Obfuscation | In Browse, the process of disguising confidential or sensitive data to protect it from unauthorized access |
Participant-level data | Information collected from the participants of a study, showing each participant's value for each variable |
Personal Access Token | Refers to the PIC-SURE Personal Access Token: a unique "password" or string that is used by the PIC-SURE API to authenticate and authorize access to data |
PFB | Portable Format for Biomedical Data: |
PIC-SURE | Patient Information Commons - Standard Unification of Research Elements: a search, query, and export platform that integrates clinical and genomic data |
Query | An action that involves extracting information from a database, such as counts or participant-level information (for example, applying a filter and looking at counts of the cohort) |
Search | An action that compares a term to the data dictionary and returns relevant results (for example, typing "sickle cell" into the search bar) |
Seven Bridges | Analysis platform available on BioData Catalyst |
Stigmatizing | Potentially sensitive information that should not be publicly filterable; see this page for more information |
Terra | Analysis platform available on BioData Catalyst |
TOPMed | Trans-Omics for Precision Medicine: an NHLBI-funded program to generate and analyze -omics data for many heart, lung, blood, and sleep studies |
Value | Refers to a variable value: the value associated with a participant for a given variable (for example: "female" for sex or "28.1" for body mass index) |
Variable | Refers to a clinical or study variable: any characteristic or attribute that can be measured and can take on different values (for example: sex, body mass index, and age) |
Variant | Refers to a genomic variant: DNA sequence differences between individuals or populations, which can be applied as a filter in Explore |
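
To make a few of the terms above concrete (participant-level data, variable, value, filter, and query), here is a minimal sketch in plain Python. It does not use the PIC-SURE API; the participant records, the `participant_id` field, and the `apply_filter` helper are all hypothetical and exist only for illustration.

```python
# Hypothetical participant-level data: one record per participant,
# one key per variable. These values are invented for illustration.
participants = [
    {"participant_id": "P001", "sex": "female", "bmi": 28.1},
    {"participant_id": "P002", "sex": "male", "bmi": 24.3},
    {"participant_id": "P003", "sex": "female", "bmi": 31.7},
]


def apply_filter(rows, variable, value):
    """Return the subcohort whose `variable` equals `value`."""
    return [row for row in rows if row.get(variable) == value]


# Filter: select the subcohort with the value "female" for the variable "sex".
subcohort = apply_filter(participants, "sex", "female")

# Query: look at the count of the filtered cohort.
count = len(subcohort)
print(count)
```

In practice, the same idea would usually be expressed on a data frame in Python or R; the list of dictionaries here only stands in for that structure.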