Background
The digitization of medical data and advances in interoperability have opened opportunities for research studies to use more comprehensive, longitudinal patient data from multiple sources. As patients often interact with many providers and payers over time, collecting data across organizations may have critical implications for accuracy and bias in study results. US policy has promoted exchanging health information among providers, payers, and patients, but less attention has focused on facilitating data collection for research, which presents unique challenges.
Objective
This study aimed to identify and evaluate existing and emerging approaches for collecting comprehensive provider and payer data for research in the United States, with the goals of informing researchers of possible methods and generating evidence to inform policy initiatives. Our focus was on electronic approaches to data aggregation for studies requiring patient consent.
Methods
We conducted a landscape analysis through interviews with subject matter experts (SMEs), who were selected based on their expertise and interviewed until saturation was achieved. We created a list of evaluation criteria, identified existing and emerging approaches, and described the benefits and limitations of each approach by applying the evaluation criteria. Data collection was limited to the United States.
Results
A total of 20 SMEs helped identify 8 distinct approaches: (1) general-purpose smartphone app, (2) commercial app, (3) research community app, (4) structured data export, (5) Trusted Exchange Framework and Common Agreement Individual Access Service, (6) regional study query, (7) national study query, and (8) aggregated data source. Participant-mediated exchange approaches (1-5) leveraged patients’ right of access. Three approaches (5-7) leveraged existing data exchange services. To evaluate these approaches, we identified 12 criteria covering the perspectives of participants, research teams, and broader stakeholders. Each approach had benefits and limitations; no single approach emerged as superior for all use cases. While currently available approaches for participant-mediated exchange bypass the need for complex governance arrangements, they are limited by participant burden, the effort required of research teams, and data gaps, especially from payers. Some regional health information exchanges and aggregated data sources address governance challenges and can provide services such as preparing analytic datasets, but they are limited in geographic or data-source coverage. National networks currently do not allow queries for research and face challenges in establishing trust and enforcing compliance with data-sharing requirements among network sites.
Conclusions
Collecting comprehensive health data from multiple providers and payers in the United States is a complex and evolving process. The suitability of an approach may vary based on the needs of a study. Given the numerous barriers and the lack of a clearly dominant method, further exploration and benchmark comparisons of all identified approaches are necessary. Ongoing public policy efforts will likely play an important role in progress.

