Data organization in PIC-SURE

PIC-SURE integrates clinical and genomic datasets. Each variable is identified by a concept path that encodes the study, the variable group, and the variable. The specifics of a concept path depend on the type of study, but the information it carries is the same.

Table of Data Fields in PIC-SURE

| Field | Description |
| --- | --- |
| General organization | Data are organized in groups of like variables, when available. For example, variables like Age, Gender, and Race could be part of the Demographics variable group. |
| Concept path structure | \study\variable name |
| Variable ID | Equivalent to variable name |
| Variable name | Encoded variable name that was used by the original submitters of the data |
| Variable description | Description of the variable, as available |
| Dataset ID | Equivalent to dataset name |
| Dataset name | Name of a group of like variables, as available |
| Dataset description | Description of a group of variables, as available |
| Study ID | ID or name of the study |
| Study description | Description of the study |
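As an illustration of the concept path structure listed in the table, the sketch below splits a path into its study, optional variable group, and variable name. It is not part of PIC-SURE: `ConceptPath` and `parse_concept_path` are hypothetical names, and the code assumes that paths are backslash-delimited, that the first segment is the study and the last is the variable, and that anything in between names the variable group.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConceptPath:
    study: str
    group: Optional[str]  # None when the path carries no variable group
    variable: str


def parse_concept_path(path: str) -> ConceptPath:
    """Split a backslash-delimited concept path into its parts."""
    # Drop empty segments produced by leading or trailing backslashes.
    parts = [p for p in path.split("\\") if p]
    if len(parts) < 2:
        raise ValueError(f"expected at least \\study\\variable, got {path!r}")
    study, variable = parts[0], parts[-1]
    # Assumption: segments between study and variable name the group.
    group = "\\".join(parts[1:-1]) or None
    return ConceptPath(study, group, variable)


# Hypothetical paths, for illustration only.
flat = parse_concept_path("\\example_study\\AGE\\")
grouped = parse_concept_path("\\example_study\\Demographics\\AGE\\")
```

Because the specifics of concept paths depend on the type of study, the segment layout assumed here may need adjusting for a particular dataset.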

Note that there are two data types in PIC-SURE: categorical and continuous. A categorical variable takes one of a set of discrete values. For example, “Have you ever had asthma?” with values “Yes” and “No” is a categorical variable. A continuous variable takes numeric values over a range. For example, “Age” with a value range from 10 to 90 is a continuous variable. The internal PIC-SURE data load process determines the type of each variable based on the data.
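The exact rule used by the PIC-SURE data load process is not described here. As a rough sketch of what "determining the type based on the data" could look like, the hypothetical function below labels a variable continuous when every non-empty value parses as a number and categorical otherwise. The function name, the treatment of blanks as missing, and the default for empty columns are all assumptions made for illustration.

```python
from typing import Iterable


def infer_variable_type(values: Iterable[str]) -> str:
    """Return "continuous" if every non-empty value is numeric, else "categorical"."""
    seen_value = False
    for raw in values:
        text = raw.strip()
        if not text:
            continue  # assumption: blank entries are missing values
        seen_value = True
        try:
            float(text)
        except ValueError:
            # A single non-numeric value makes the variable categorical.
            return "categorical"
    # Assumption: a column with no observed values defaults to categorical.
    return "continuous" if seen_value else "categorical"


asthma_type = infer_variable_type(["Yes", "No", "Yes"])
age_type = infer_variable_type(["10", "42.5", "90"])
```

Under this sketch, a numerically coded categorical variable (for example, 0 and 1 for “No” and “Yes”) would be labeled continuous, so the type assigned at load time is worth checking when values are coded.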
