Mother-infant linkage algorithm evaluation

This research developed and evaluated an algorithm to link mothers and infants in two observational US administrative databases to facilitate research on prenatal medication exposure and infant health outcomes.


Background: Administrative healthcare claims databases are used in drug safety research but are limited for investigating the impacts of prenatal exposures on child outcomes without mother-infant pair identification.

Objective: We developed a mother-infant linkage algorithm that builds on other linkage approaches, then applied and evaluated it in two large US commercially insured populations.

Study Design: We used two US commercial health insurance claims databases covering 2000 to 2021. A mother-infant link was constructed when a person of female sex, 12-55 years of age, with a pregnancy ending in live birth was associated with a person who was 0 years of age at database entry, shared a common insurance plan ID, had overlapping observation time, and had a date of birth within ±60 days of the mother’s pregnancy episode end date. We compared the characteristics of linked vs non-linked mothers and linked vs non-linked infants to assess similarity.
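The linkage criteria described in the study design can be illustrated with a minimal Python sketch. This is not the study's implementation: the record types, field names, and the `link_mothers_infants` function are hypothetical, the point at which maternal age is measured (here, at pregnancy episode end) is an assumption, and the sketch returns every candidate pair without resolving cases where an infant matches more than one mother, since the abstract does not describe that step.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical record layouts; the real databases use their own schemas.
@dataclass(frozen=True)
class PregnancyEpisode:
    person_id: int
    plan_id: str
    sex: str                 # "F" denotes female sex (assumed coding)
    age_at_episode_end: int  # assumption: age measured at episode end
    episode_end: date
    live_birth: bool
    obs_start: date
    obs_end: date

@dataclass(frozen=True)
class Infant:
    person_id: int
    plan_id: str
    birth_date: date
    age_at_entry: int        # age in years at database entry
    obs_start: date
    obs_end: date

MAX_GAP = timedelta(days=60)  # the ±60-day window from the study design

def periods_overlap(a_start, a_end, b_start, b_end):
    """True when two closed date intervals share at least one day."""
    return a_start <= b_end and b_start <= a_end

def is_candidate_link(m: PregnancyEpisode, i: Infant) -> bool:
    """Apply each stated linkage criterion to one mother-infant pair."""
    return (
        m.sex == "F"
        and 12 <= m.age_at_episode_end <= 55
        and m.live_birth
        and i.age_at_entry == 0
        and m.plan_id == i.plan_id
        and periods_overlap(m.obs_start, m.obs_end, i.obs_start, i.obs_end)
        and abs(i.birth_date - m.episode_end) <= MAX_GAP
    )

def link_mothers_infants(episodes, infants):
    """Return (mother_id, infant_id) for every pair meeting the criteria."""
    # Group infants by plan ID so each episode is compared only within its plan.
    by_plan = {}
    for infant in infants:
        by_plan.setdefault(infant.plan_id, []).append(infant)
    return [
        (m.person_id, i.person_id)
        for m in episodes
        for i in by_plan.get(m.plan_id, [])
        if is_candidate_link(m, i)
    ]
```

Grouping infants by plan ID first keeps the comparison within each plan rather than across every possible mother-infant pair, which matters at the scale of millions of records.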

Results: The algorithm linked 3,477,960 mothers to 4,160,284 infants in the two databases. Linked mothers and linked infants comprised 73.6% of all mothers and 49.1% of all infants, respectively. Among linked infants, 94.9% had a date of birth within 4 weeks of the associated mother’s pregnancy episode end date. Linked mothers were older, had longer pregnancy episodes, and had greater post-pregnancy observation time than non-linked mothers. Linked infants had less observation time and greater healthcare utilization than non-linked infants. All other linked vs non-linked characteristics were similar in mothers and infants.

Conclusion: We applied a novel mother-infant linkage algorithm to two US commercial healthcare claims databases, achieved a high linkage proportion, and demonstrated that linked and non-linked mother and infant cohorts were similar. This enables large-scale research on exposures during pregnancy and pediatric outcomes with relevance to drug safety. Linked vs non-linked population differences may be partially attributable to shared healthcare billing practices within insurance plan IDs among linked mothers and infants.

Key findings:

  • This study created the largest known mother-infant linked cohorts.
  • Linked mothers and linked infants comprise a large proportion of all mothers and all infants, respectively.
  • Linked vs. non-linked mother and infant cohorts are similar, indicating internal validity.
  • Similarity between linked vs. non-linked mother and infant cohorts increases confidence that results from research on linked cohorts also apply to mother and infant populations that do not meet linkage algorithm criteria.

Below are links for study-related artifacts that have been made available as part of this study:

    Index: pregnancy start

    Index: pregnancy end

    Index: birth