Open PIC-SURE Features

Open PIC-SURE allows you to search any clinical variable available in PIC-SURE. Queries return obfuscated aggregate counts per study and consent group. The features specific to Open PIC-SURE are outlined below.

A. Some variables are not filterable to protect participant data. Open PIC-SURE does not allow filtering on clinical variables that contain potentially sensitive information. These variables, known as stigmatizing variables, fall into the following categories:

  • Mental health diagnoses, history, and treatment

  • Illicit drug use history

  • Sexually transmitted disease diagnoses, history, and treatment

  • Sexual history

  • Intellectual achievement, ability, and educational attainment

  • Direct or surrogate identifiers of legal status

For more information about stigmatizing variables and the identification process, please refer to the documentation and code on the BioData Catalyst Powered by PIC-SURE Stigmatizing Variables GitHub repository.

Some additional variables are not filterable to protect participant anonymity. To request that our team review a variable's status, contact our helpdesk.

B. Data obfuscation. Because participant-level data are not available in Open PIC-SURE, the aggregate counts are obfuscated to anonymize the data. This means that:

  • If the participant count is between zero and nine, it will be shown as < 10.

  • If the participant count is ten or greater, the true count will be shown.

  • If the participant count is ten or greater, but it is made up of subgroups (such as consent groups) with counts between zero and nine, the count will be obfuscated by +/- 3.
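
The three rules above can be sketched in a few lines of Python. This is only an illustration, not the PIC-SURE implementation: the function name `obfuscate_count`, the use of a uniform random integer offset, and the reading that a single small subgroup is enough to trigger the +/- 3 rule are all assumptions made for this example.

```python
import random

THRESHOLD = 10  # counts below this value are masked
NOISE = 3       # largest offset applied in either direction

def obfuscate_count(true_count, subgroup_counts=(), rng=None):
    """Return the display string for an aggregate count (hypothetical helper)."""
    rng = rng or random.Random()
    # Rule 1: counts from zero to nine are shown as "< 10".
    if true_count < THRESHOLD:
        return "< 10"
    # Rule 3 (assumed reading): any subgroup from zero to nine
    # causes the total to be shifted by a random offset in [-3, 3].
    if any(c < THRESHOLD for c in subgroup_counts):
        return str(true_count + rng.randint(-NOISE, NOISE))
    # Rule 2: otherwise the true count is shown.
    return str(true_count)

print(obfuscate_count(4))             # masked
print(obfuscate_count(50, [30, 20]))  # true count
print(obfuscate_count(50, [45, 5]))   # shifted by up to 3
```

The sketch does not address how a shifted value that falls below ten would be displayed, since the text above does not cover that case.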

C. Tool suite. The Tool Suite contains tools that can be used to explore filtered cohorts of interest.

  • Participant Count by Study: The number of participants that match the query criteria is shown broken down by study and consent group. If no filters have been added to the query, the counts displayed reflect the total number of participants in each study. Users can also see whether they have access to specific studies.

  • Variable Distributions: View the distributions of query variables based on the filtered cohort. Note that there is a limit to the number of variable distributions that can be viewed at a given time. For categorical variables in the query with more than seven categories, seven categories are displayed in the bar chart, plus an additional "Other" bar that combines all remaining categories. The visualizations are also obfuscated to protect participant-level data, mirroring the obfuscation applied to the aggregate counts. This means that:

    • If the total count of the cohort is obfuscated by +/- 3:

      • Counts between 0 and 9 are displayed as "< 10".

      • Counts greater than or equal to 10 are obfuscated by +/- 3.
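
The bar chart behavior described above can be sketched as follows. This is a rough illustration under stated assumptions, not the actual Tool Suite code: the helper names `collapse_categories` and `bar_labels` are hypothetical, the choice to keep the seven largest categories is an assumption (the text does not say which seven are shown), and the random offset is assumed to be a uniform integer in [-3, 3].

```python
import random

MAX_BARS = 7  # categories shown before the rest are merged

def collapse_categories(counts, max_bars=MAX_BARS):
    """Keep max_bars categories and merge the remainder into "Other"."""
    if len(counts) <= max_bars:
        return dict(counts)
    # Assumption: the categories kept are the largest by count.
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    bars = dict(ranked[:max_bars])
    bars["Other"] = sum(value for _, value in ranked[max_bars:])
    return bars

def bar_labels(bars, rng=None):
    """Labels for the case where the cohort total is obfuscated by +/- 3."""
    rng = rng or random.Random()
    labels = {}
    for name, count in bars.items():
        if count < 10:
            labels[name] = "< 10"  # counts from 0 to 9 are masked
        else:
            labels[name] = str(count + rng.randint(-3, 3))
    return labels

# Nine made-up categories with counts 5, 10, ..., 45.
example = {chr(ord("A") + i): (i + 1) * 5 for i in range(9)}
print(bar_labels(collapse_categories(example)))
```

Merging before obfuscating means the "Other" bar is treated like any other bar; whether the real visualization applies the steps in this order is not stated in the text.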
