Incidence Rates

Description

A graph showing the incidence rate, optionally stratified by age (in 10-year bins), gender, and calendar year.

The incidence rate is computed as 1000 * the number of people first entering the cohort / the number of years people were eligible to enter the cohort for the first time, i.e. it is a rate per 1,000 person-years. The eligible person time is defined as the time when a person:

Was observed in the data source (based on the observation_period table).
Had the required amount of prior observation time as specified in the cohort entry event criteria. For example, if the cohort definition requires 365 days of observation prior to cohort entry, patients are not eligible to enter the cohort in the first 365 days of their observation period, and this time is not counted in the eligible time.
Had not yet entered the cohort. If the person enters the cohort, only the time up to cohort entry is counted. Because we only consider the first cohort entry, persons are no longer eligible to enter the cohort after their first entry.
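
The definition above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the tool's implementation: the function names, the tuple layout of `persons`, and the 365.25 days-per-year conversion are all assumptions made for the example.

```python
from datetime import date, timedelta

DAYS_PER_YEAR = 365.25  # assumption: the tool's own conversion may differ


def eligible_days(obs_start, obs_end, prior_obs_days, first_entry=None):
    """Days in which a person could still enter the cohort for the first time."""
    # Eligibility starts once the required prior observation has accrued.
    start = obs_start + timedelta(days=prior_obs_days)
    # Eligibility stops at the first cohort entry, or when observation ends.
    end = obs_end if first_entry is None else min(first_entry, obs_end)
    return max((end - start).days, 0)


def incidence_rate_per_1000(persons, prior_obs_days):
    """1000 * number of first entries / eligible person-years."""
    total_days = 0
    first_entries = 0
    for obs_start, obs_end, first_entry in persons:
        total_days += eligible_days(obs_start, obs_end, prior_obs_days, first_entry)
        if first_entry is not None:
            first_entries += 1
    if total_days == 0:
        raise ValueError("no eligible person time")
    return 1000 * first_entries / (total_days / DAYS_PER_YEAR)


# Hypothetical persons: (observation start, observation end, first cohort entry)
persons = [
    (date(2019, 1, 1), date(2021, 1, 1), date(2020, 7, 1)),  # enters the cohort
    (date(2019, 1, 1), date(2021, 1, 1), None),              # never enters
]
rate = incidence_rate_per_1000(persons, prior_obs_days=365)
```

With a 365-day prior observation requirement, neither person contributes eligible time during 2019, and the first person stops contributing on the day of cohort entry.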

Note: If your cohort definition has an inclusion rule that restricts persons based on prior observation time, this might lead to underestimation of the incidence rate, because the same prior observation time restriction would not be applied to the denominator. We recommend that you revise the cohort definition to make the prior observation time rule part of the entry event criteria.

Options

You can select multiple data sources in the side bar to see graphs from different data sources in the same plot.

Select the cohort to explore in the side bar.

At the top left of the plot, you can choose whether to stratify the data by age, gender, or calendar year.

At the top right of the plot, you can choose whether to use the same y-axis for all data sources.

If you move the mouse over the plot, you can see the precise value.

What to look for

Are the observed incidence rates in line with expectations? For example, if we have an estimate of the population incidence based on an external source, is the incidence rate comparable to that estimate?
Are the age and gender distributions in line with expectations? For example, are contraceptives prescribed only to women?
Is the incidence rate stable over time? If there are sudden peaks or drops, this may indicate coding issues.

Stratify by

Age Sex Calendar Year

Use same y-scale across databases

Limit y-scale range to:

Filter By Age

Filter By Sex

Minimum person years

Minimum subject count

Filter By Calendar Year

Time Distributions

Description

A boxplot and a table showing the distribution of time (in days) before and after the cohort index date (cohort start date), and the time between cohort start and end date. The information is shown for all cohort entries, so it is not limited to the first entry per person.

The boxplot shows:

Whiskers: The minimum and maximum observed number of days.
Box: The 25th to 75th percentile.
Line: The median.

The table shows the same information and more:

Average: the mean of the distribution
SD: Standard Deviation
Min: The minimum
P10: The 10th percentile
P25: The 25th percentile
Median: The median (50th percentile)
P75: The 75th percentile
P90: The 90th percentile
Max: The maximum
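
As a rough illustration of the table columns listed above, the sketch below computes them for a list of day counts using only the Python standard library. The helper names are hypothetical, and the linear-interpolation percentile and sample standard deviation used here are assumptions; the application may use different conventions.

```python
import statistics


def percentile(sorted_values, p):
    """Percentile (p in 0..100) of a pre-sorted list, by linear interpolation."""
    if not sorted_values:
        raise ValueError("no values")
    rank = (len(sorted_values) - 1) * p / 100
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def summarize_days(days):
    """Return the statistics shown in the time distribution table."""
    values = sorted(days)
    return {
        "Average": statistics.mean(values),
        # Sample SD; undefined for a single value, so report 0.0 in that case.
        "SD": statistics.stdev(values) if len(values) > 1 else 0.0,
        "Min": values[0],
        "P10": percentile(values, 10),
        "P25": percentile(values, 25),
        "Median": percentile(values, 50),
        "P75": percentile(values, 75),
        "P90": percentile(values, 90),
        "Max": values[-1],
    }


# Hypothetical days between cohort start and end for five cohort entries.
summary = summarize_days([0, 10, 20, 30, 40])
```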

Options

You can select multiple data sources in the side bar to see time distributions from different data sources in the same plot and table.

Select the cohort to explore in the side bar.

What to look for

For exposure cohorts: is there sufficient time after index (either within the cohort for on-treatment analyses, or until the end of observation for intent-to-treat type analyses) to observe the outcome of interest?
Are there many cohort entries with length = 0 when this is not expected?
Are the distributions comparable across data sources?

Time Distributions

Table Plot

View Time Measures

View Columns

Concepts in Data Source

Description

A table showing the concept ids observed in the database that are included in a concept set of the selected cohort. The Subjects column contains the number of subjects in the entire database that have the specific concept. This count is not restricted to people in the cohort; it is a database-level characterization. Source concepts are identified in the _source_concept_id fields of the Common Data Model (e.g. drug_source_concept_id) and are used to identify the specific source codes used in a database. Standard concepts are found using the _concept_id fields (e.g. drug_concept_id) and use the same coding system across all databases. Note: Per CDM conventions, standard concept ids may be used to populate _source_concept_id fields in domain tables, but non-standard concept ids may not be used to populate the standard fields in those domain tables.

Options

You can select a database in the side bar to see the concepts and counts observed in that database.

Select the cohort and the specific concept set within that cohort to explore in the side bar.

You can switch between Source Concepts and Standard Concepts at the top of the table.

What to look for

Are there source codes included that should not be? For example, in a concept set for hypertensive disorder, are hypotension codes included by accident?
Are all expected codes present? For example, if we have a list of ICD-10 codes that have been used in literature to identify a cohort, are all those codes present?

Source fields Standard fields

Both Persons Records

Orphan Concepts

Description

A table showing the concepts observed in the data source that are not included in a concept set of a cohort, but may be worth considering. The following logic is used to identify concepts that might be relevant:

Given a concept set expression, find all included concepts.
Find all names of those concepts, including synonyms, and the names of source concepts that map to them.
Search for concepts (standard and source) that contain any of those names as a substring.
Filter those concepts to those that are not in the original set of concepts (i.e. orphans).
Restrict the set of orphan concepts to those that appear in the CDM data source as either source concept or standard concept.
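
The steps above can be sketched with plain Python sets. This is a simplified illustration rather than the tool's actual query: `find_orphan_concepts` and its inputs are hypothetical, matching is assumed to be case-insensitive, and each concept's name list is assumed to already contain its synonyms and the names of source concepts that map to it.

```python
def find_orphan_concepts(included_ids, concept_names, observed_ids):
    """Return concept ids that look related to a concept set but are not in it.

    included_ids:  concept ids resolved from the concept set expression
    concept_names: dict of concept id -> list of names (assumed to include
                   synonyms and mapped source concept names)
    observed_ids:  concept ids that appear in the data source
    """
    included = set(included_ids)
    # Steps 1-2: collect every name of the included concepts.
    search_terms = {name.lower() for cid in included for name in concept_names[cid]}
    # Step 3: find concepts whose names contain any of those terms.
    candidates = {
        cid
        for cid, names in concept_names.items()
        if any(term in name.lower() for name in names for term in search_terms)
    }
    # Step 4: keep only concepts outside the original set (the orphans).
    orphans = candidates - included
    # Step 5: keep only orphans that are observed in the data source.
    return sorted(orphans & set(observed_ids))


# Hypothetical toy vocabulary; the ids are made up for the example.
concept_names = {
    1: ["Hypertensive disorder", "Hypertension"],
    2: ["Pulmonary hypertension"],
    3: ["Hypotension"],
    4: ["Malignant hypertensive disorder"],
}
orphans = find_orphan_concepts({1}, concept_names, observed_ids={1, 2, 3})
```

In this toy example concept 4 matches by name but is dropped because it never appears in the data source, while concept 3 is never a candidate because "hypotension" does not contain any of the search terms.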

The Subjects column contains the number of subjects in the entire data source that have the specific concept, i.e. it is not restricted to people in the cohort. This is a data source level characterization. Source concepts are identified in the _source_concept_id fields of the Common Data Model (e.g. drug_source_concept_id) and are used to identify the specific source codes used in a data source. Standard concepts are found using the _concept_id fields (e.g. drug_concept_id) and use the same coding system across all databases.

Options

You can select a data source in the side bar to see the concepts and counts observed in that data source.

Select the cohort and the specific concept set within that cohort to explore in the side bar.

What to look for

Are there concepts that are not included in the concept set but should be? Note that the provided list likely contains many false positives.

Filters

All Standard Only Non Standard Only

Display

All Persons Records

Index Events

Description

A table showing the concepts belonging to the concept sets in the entry event definition that are observed on the index date. In other words, the table lists the concepts that likely triggered the cohort entry. The counts indicate the number of cohort entries where the concept was observed on the index date. Note that multiple concepts can be present on the index date, so the sum of counts might be greater than the cohort entry count.
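
To make the counting rule concrete, here is a minimal sketch under assumed inputs. The function name and the tuple layouts for cohort entries and events are hypothetical and chosen only for illustration; they do not reflect the tool's internal tables.

```python
from collections import Counter, defaultdict
from datetime import date


def index_event_counts(cohort_entries, events, entry_concept_ids):
    """Count cohort entries per entry-event concept observed on the index date.

    cohort_entries:    list of (person_id, index_date)
    events:            list of (person_id, concept_id, event_date)
    entry_concept_ids: concepts belonging to the entry event concept sets
    """
    concepts_on_day = defaultdict(set)
    for person_id, concept_id, event_date in events:
        if concept_id in entry_concept_ids:
            concepts_on_day[(person_id, event_date)].add(concept_id)

    counts = Counter()
    for person_id, index_date in cohort_entries:
        # Each concept is counted at most once per cohort entry.
        counts.update(concepts_on_day.get((person_id, index_date), ()))
    return counts


index_day = date(2020, 3, 1)
entries = [(1, index_day), (2, index_day)]
events = [
    (1, 100, index_day),
    (1, 200, index_day),         # two entry concepts on the same index date
    (2, 100, index_day),
    (2, 300, index_day),         # not part of the entry event concept sets
    (2, 200, date(2020, 4, 1)),  # not on the index date
]
counts = index_event_counts(entries, events, entry_concept_ids={100, 200})
```

Here the counts sum to 3 although there are only 2 cohort entries, because person 1 has two qualifying concepts on the index date.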

Options

You can select multiple databases in the side bar to see counts from different databases side-by-side.

Select the cohort to explore in the side bar.

What to look for

Is one concept unexpectedly dominating? For example, if our cohort identifies exposure to drugs in a class, but we notice almost everyone enters the cohort based on a single drug, we may wonder whether our results will generalize to the class.
Are the highest ranking concepts different across databases? For example, is everyone in one database initiating high-dose prescriptions, and everyone in another database low-dose prescriptions?

Concept type

All Standard concepts Non Standard Concepts

Display

Both Records Persons

Show as percentage

Visit Context

Description

A table showing the relationship between the cohort start date and visits recorded in the database. For each database, the table shows:

Visits Before: the number of visits recorded before the cohort start date. Note that if a person is in the same cohort twice, visits may be counted twice.
Visits Ongoing: the number of visits that were ongoing (excluding the visit start date) when the cohort started. Note that if a person is in the same cohort twice, visits may be counted twice.
Starting Simultaneous: the number of visits that started on the same day the cohort started.
Visits After: the number of visits recorded after the cohort start date. Note that if a person is in the same cohort twice, visits may be counted twice.
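
The four categories can be read as a classification of each visit relative to the cohort start date. The sketch below is one possible reading, not the tool's code: the function name is hypothetical, and treating a visit that ends exactly on the cohort start date as ongoing is an assumption, since the description does not state how that boundary is handled.

```python
from collections import Counter
from datetime import date


def classify_visit(visit_start, visit_end, cohort_start):
    """Assign a visit to one of the four visit context categories."""
    if visit_start == cohort_start:
        return "Starting Simultaneous"
    if visit_start > cohort_start:
        return "Visits After"
    # The visit started before the cohort start date.
    if visit_end >= cohort_start:  # assumption: ending on the start date counts as ongoing
        return "Visits Ongoing"
    return "Visits Before"


cohort_start = date(2020, 1, 10)
visits = [
    (date(2020, 1, 1), date(2020, 1, 5)),    # ended before cohort start
    (date(2020, 1, 8), date(2020, 1, 12)),   # ongoing at cohort start
    (date(2020, 1, 10), date(2020, 1, 10)),  # started on the cohort start date
    (date(2020, 1, 11), date(2020, 1, 15)),  # started after cohort start
]
context = Counter(classify_visit(start, end, cohort_start) for start, end in visits)
```

Applying the same classification once per cohort entry explains why a visit may be counted twice when a person is in the same cohort twice.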

Options

You can select multiple databases in the side bar to see counts from different databases side-by-side.

Select the cohort to explore in the side bar.

What to look for

Are cohorts starting in the right context? E.g. some cohorts may be expected to start predominantly in an inpatient setting.

Display

All Before During Simultaneous After

Display

Persons Records

Cohort Overlap (subjects)

Description

Stacked bar graph showing the overlap between two cohorts, and a table listing several overlap statistics.

The stacked bar shows the overlap in terms of subjects. It shows the number of subjects that belong to each cohort and to both. The diagram does not consider whether the subjects were in the different cohorts at the same time.

The table shows the same information and more:

Subject in either cohort: The number of subjects that enter one or both cohorts. (The union)
Subject in both cohort: The number of subjects that enter both cohorts, although not necessarily at the same time. (The intersection)
Subject in target not in comparator: The number of subjects that enter the target cohort, but not the comparator cohort. (Subtracting the comparator from the target)
Subject in comparator not in target: The number of subjects that enter the comparator cohort, but not the target cohort. (Subtracting the target from the comparator)
Subject in target before comparator: The number of subjects that enter both cohorts, but enter the target cohort before entering the comparator cohort. This number considers only the first entry per cohort per person.
Subject in comparator before target: The number of subjects that enter both cohorts, but enter the comparator cohort before entering the target cohort. This number considers only the first entry per cohort per person.
Subject in target and comparator on same day: The number of subjects that enter both cohorts on the same date. This number considers only the first entry per cohort per person.
Subject having target start during comparator: The number of subjects that enter the target cohort during the comparator cohort, meaning comparator cohort start date <= target cohort start date <= comparator cohort end date. This number considers only the first entry per cohort per person.
Subject having comparator start during target: The number of subjects that enter the comparator cohort during the target cohort, meaning target cohort start date <= comparator cohort start date <= target cohort end date. This number considers only the first entry per cohort per person.
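
The statistics above map directly onto set operations and date comparisons. The following sketch assumes each cohort is given as a dictionary from subject id to the (start date, end date) of that subject's first entry; the function name, the dictionary keys, and the input layout are hypothetical.

```python
from datetime import date


def overlap_statistics(target, comparator):
    """Overlap counts for two cohorts given as {subject_id: (first_start, first_end)}."""
    t, c = set(target), set(comparator)
    both = t & c
    return {
        "either": len(t | c),           # union
        "both": len(both),              # intersection
        "target_only": len(t - c),      # target minus comparator
        "comparator_only": len(c - t),  # comparator minus target
        "target_before_comparator": sum(target[s][0] < comparator[s][0] for s in both),
        "comparator_before_target": sum(comparator[s][0] < target[s][0] for s in both),
        "same_day": sum(target[s][0] == comparator[s][0] for s in both),
        "target_start_during_comparator": sum(
            comparator[s][0] <= target[s][0] <= comparator[s][1] for s in both
        ),
        "comparator_start_during_target": sum(
            target[s][0] <= comparator[s][0] <= target[s][1] for s in both
        ),
    }


# Hypothetical first entries per subject.
target = {
    1: (date(2020, 1, 1), date(2020, 6, 1)),
    2: (date(2020, 3, 1), date(2020, 4, 1)),
    3: (date(2020, 5, 1), date(2020, 5, 31)),
}
comparator = {
    2: (date(2020, 2, 1), date(2020, 12, 31)),
    3: (date(2020, 5, 1), date(2020, 5, 10)),
    4: (date(2020, 7, 1), date(2020, 8, 1)),
}
stats = overlap_statistics(target, comparator)
```

Note that the "during" statistics use inclusive bounds, so a subject who enters both cohorts on the same day is counted in both of them.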

Options

You can select one or more databases in the side bar.

You can select the (target) cohort(s) and comparator cohort(s) in the side bar.

What to look for

Are there many people in both cohorts? For example, if we want to compare two exposures, are there many people that receive both?
Is the overlap of sufficient size for a specific research question? For example, if we wish to study the effect of an exposure on an outcome, we may require a minimum number of outcomes during exposure.

Plot
Table

Percentages Counts

Show As Percentage

Show Cohort Ids

Cohort Characterization

Description

A table showing cohort characteristics (covariates). These characteristics are captured on or before the cohort start date. There is a Pretty and a Raw version of this table.

The Pretty table shows the standard OHDSI characteristics table, which includes only covariates that were manually selected to provide a general overview of the comorbidities and medications of the cohort. These are all binary covariates, and the table shows the proportion (%) of the cohort entries having the covariate.

The Raw table shows all captured covariates. These include binary and continuous covariates (e.g. the Charlson comorbidity index). For each covariate the table lists the mean, which for binary covariates is equal to the proportion, and the standard deviation (SD).
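
The statement that the mean of a binary covariate equals its proportion can be checked with a short sketch. The function name and the covariate values are made up for the example, and the use of the population standard deviation is an assumption; the description does not say which SD formula the tool reports.

```python
import math
import statistics


def summarize_covariate(values):
    """Return (mean, standard deviation) for one covariate across cohort entries."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # assumption: population SD
    return mean, sd


# Hypothetical binary covariate: 1 if the cohort entry has the condition, else 0.
has_condition = [1, 1, 0, 0, 0]
mean, sd = summarize_covariate(has_condition)

# For a binary covariate the mean is the proportion p of entries with the
# covariate, and the population SD equals sqrt(p * (1 - p)).
proportion = sum(has_condition) / len(has_condition)
```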

Options

You can select multiple databases in the side bar to see cohort characteristics from different databases side-by-side in the same table.

Select the cohort to explore in the side bar.

Select either the Pretty or the Raw table at the top of the table.

What to look for

Are the characteristics of the cohort as expected? For example, do people have the expected comorbidities?
Do the characteristics of the cohort differ much per database?

Table type

Pretty Raw

Select Cohort

Select Database (s)

Temporal Window (s)

Analysis name

Domain name

Covariate type(s)

All Proportion Continuous

Percentage displayed where only proportional data is selected

Display

Mean and Standard Deviation Mean only

Subset to Concept Set

Group by Database
Group by Time ID

Compare Cohort Characterization

Description

A table or plot showing cohort characteristics (covariates) for two cohorts side-by-side. These characteristics are captured at different time windows that can be selected.

The plot shows all covariates, including binary and continuous covariates. The x-axis represents the mean value in the target cohort, the y-axis the mean value in the comparator cohort. Each dot represents a covariate, and the color indicates the domain of the covariate being plotted. In the plot, domains are fixed (even though additional domains may exist in the data) to ensure that the domain colors are applied consistently.

Filters may be used to limit the number of covariates being visualized or tabulated. Filters are available for analysis names and domain names.

You can select different cohorts in the same database, the same cohort in different databases, or different cohorts in different databases.

What to look for

Are there major differences between the two cohorts? For example, if we wish to compute a propensity score between two cohorts, concepts that have very high proportion in one cohort and a very low proportion in the other may lead to a perfectly predictive model.
In general, how comparable are two cohorts? If we wish to compare two exposures, but the cohorts differ over many characteristics, we may be able to fit a propensity model and compute an estimate, but we may have concerns over the generalizability of the results.

Compare cohort characterization

Target Cohort

Target Database

Comparator Cohort

Comparator Database

Temporal Window (s)

Analysis name

Domain name

Min Covariate Mean

Plot
Raw Table

Covariate Type

All Proportion Continuous

Display values

Mean Mean and Standard Deviation

Show only covariates found in target and comparator

Temporal Window

Execution meta-data

Each entry relates to an execution on a given CDM. Results are merged between executions incrementally.

Powered by OHDSI Cohort Diagnostics application. Version: 3.1.2. Application was last initiated on 2025-09-07 14:24:42 EST. Cohort Diagnostics website is at https://ohdsi.github.io/CohortDiagnostics/
