Data Catalog
See below for a listing of the available years and data files within each data asset type.
Nationally Representative Samples
Data Asset | Description | Years Included | Details |
|---|---|---|---|
Medicare 5% National Sample, refreshed annually | Claims for a nationally representative random sample of all fee-for-service (FFS) Medicare beneficiaries. This includes Medicare Part D prescription claims for beneficiaries with Part D coverage. The Part D files do not have information on drugs given during hospitalizations or drugs paid for under other auspices (e.g., hospice). Please note that 2006 is the first calendar year for which Medicare Part D claims became available to researchers, and it is the first calendar year of Part D claims available through DPHS. | 1991 - 2022 | Medicare Part D Claims (first available to researchers in calendar year 2006) |
Medicare 100% Inpatient | Inpatient hospitalization claims and Master Beneficiary Summary Files (i.e., enrollment, chronic conditions, cost and utilization). Does not include Medicare Part D prescription claims. | 2000 - 2022 | |
Medicare ACO | Beneficiary claims aligned with providers participating in an Accountable Care Organization (ACO). This includes Medicare Part D prescription claims. The Part D files do not have information on drugs given during hospitalizations or drugs paid for under other auspices (e.g., hospice). | 2011 - 2014 | |
Medicare 5% Limited Data Set (LDS), refreshed annually | Claims for a nationally representative random sample of all Medicare beneficiaries; does not include prescription drug claims. Less stringent criteria for CMS DUA approval. | 2010 - 2021 | Note: CMS Limited Data Set Files are not available for Medicare Part D claims. |
Medicare 100% Limited Data Set (LDS), refreshed annually | Inpatient file claims for 100% of Medicare beneficiaries; does not include prescription drug claims. Less stringent criteria for CMS DUA approval. | 2010 - 2021 | |
Geographic Samples
Data Asset | Description | Years Included | Details |
|---|---|---|---|
Medicare 100% NC/SC | Claims for beneficiaries in North and South Carolina | 2013 - 2017 | This cohort includes claims files for Medicare beneficiaries who resided in North Carolina or South Carolina from 2013 to 2017. Please note that for beneficiaries who moved into North Carolina or South Carolina in calendar year 2017, we do not have prior-year claims. Beneficiaries who resided in NC or SC at any point between 2013 and 2016 and were alive in 2017 have complete data for all years (2013-2017). Data Asset Details; Medicare Part D Prescription Claims |
Medicare 100% SEDI | Claims for beneficiaries participating in the SEDI project (Durham/Cabarrus NC, Mingo WV, Quitman MS, plus border counties) | 2009 - 2014 | |
Medicare 20% Geographic Sample | Per-state claims based on a 20% random sample of beneficiaries in Florida, New York, Alabama, Tennessee, Illinois, and Louisiana | 2013 - 2016 | |
Duke EHR-SEDI Data Mart | Medicare claims linked to Duke University Health System EHR data for patients with a Durham County, NC address | 2007 - 2014 | |
Disease Cohorts
Data Asset | Description | Years Included |
|---|---|---|
100% Medicare Mitral Valve PX | Claims for beneficiaries who have undergone a mitral valve procedure | 2006 - 2014 |
Registry Linkages
Data Asset | Description | Years Included | Details |
|---|---|---|---|
GWTG-HF | Claims for beneficiaries linked to AHA's Get With the Guidelines-Heart Failure registry of patients hospitalized for heart failure | 2003 - 2016 | Get With the Guidelines Heart Failure Registry Publications: Fonarow GC, Abraham WT, Albert NM, Gattis WA, Gheorghiade M, Greenberg B, O’Connor CM, Yancy CW, Young J. Organized Program to Initiate Lifesaving Treatment in Hospitalized Patients with Heart Failure (OPTIMIZE-HF): rationale and design. Am Heart J 2004;148:43–51. Hammill BG, Hernandez AF, Peterson ED, Fonarow GC, Schulman KA, Curtis LH. Linking inpatient clinical registry data to Medicare claims data using indirect identifiers. Am Heart J 2009;157:995–1000. |
PROSPER | Claims for beneficiaries linked to AHA's Get With the Guidelines-Stroke registry of patients hospitalized for stroke | 2003 - 2015 | Publication: Schwamm LH, Fonarow GC, Reeves MJ, Pan W, Frankel MR, Smith EE, Ellrodt G, Cannon CP, Liang L, Peterson E, Labresh KA. Get With the Guidelines-Stroke is associated with sustained improvement in care for patients hospitalized with acute stroke or transient ischemic attack. Circulation 2009;119:107-115. |
Data Asset | Description | Years Included |
|---|---|---|
NC Medicaid Fee-for-service Claims | NC Medicaid fee-for-service claims data (limited data set) of payments from the NCDHHS to healthcare providers for services rendered. Additional files include member and provider files. | July 1, 2013 - March 31, 2025 |
NC Medicaid Managed Care Encounters | NC Medicaid managed care encounter data (limited data set) of payments from the NCDHHS to healthcare providers for services rendered. Additional files include member and provider files. | July 1, 2021 - March 31, 2025 |
NC Medicaid Babylove Crosswalk | Crosswalk of NC Medicaid beneficiary mother-infant dyads | January 1, 2001 - December 31, 2021 |
Data Asset | Description | Years Included |
|---|---|---|
Kids’ Inpatient Database (KID) | All-payer pediatric database of hospital stays. | 2000 - 2012 (varies by file); 2016-2022 |
Nationwide Ambulatory Surgery Sample (NASS) | “The NASS is the only all-payer ambulatory surgery database in the United States, yielding national estimates of selected therapeutic ambulatory surgery encounters performed in hospital-owned facilities.” | 2016-2022 |
Nationwide Emergency Department Sample (NEDS) | “The NEDS is the largest all-payer emergency department (ED) database in the United States, yielding national estimates of hospital-owned ED visits.” | 2016-2022 |
National Inpatient Sample (NIS) | Created from the largest publicly available all-payer inpatient health care database in the U.S. | 1994 - 2015 (varies by file); 2016-2022 |
Nationwide Readmissions Database (NRD) | A nationally representative sample of hospital readmissions for all ages. | 2016-2022 |
State Ambulatory Surgery and Services Databases (SASD) | State-specific databases of ambulatory surgery data and outpatient services data from hospital-owned facilities. | 2000 - 2023 (varies by state and file) |
State Inpatient Databases (SID) | State-specific databases of inpatient discharge records. | 2000 - 2023 (varies by state and file) |
State Emergency Department Databases (SEDD) | State-specific databases of emergency visits at hospital-affiliated emergency departments that do not result in hospitalization. | 2005 - 2023 (varies by state and file) |
Data Asset | Description | Years Included |
|---|---|---|
AHA Annual Survey | Annual AHA survey data allow researchers to understand utilization, physician arrangements, organizational structure, and more. See the AHA Annual Survey website for more information. | 2015, 2018, 2020, 2023 |
Data Asset | Description | Years Included |
|---|---|---|
GWTG-HF | Get With the Guidelines-Heart Failure (GWTG-HF) is a registry from the American Heart Association that includes data on hospital admissions for heart failure from many hospitals throughout the U.S. DataShare has a method for linking the GWTG-HF data to Medicare 100% inpatient claims using indirect identifiers, originally developed by Brad Hammill. See this paper for details. | |
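
As a rough illustration of what linking on indirect identifiers can look like, the Python sketch below joins registry and claims records on a combination of non-unique fields and keeps only one-to-one matches. The field names (`hospital_id`, `admit_date`, `discharge_date`, `birth_date`, `sex`, `id`) and the exact-match, one-to-one rule are assumptions made for this example. They are not taken from DataShare's actual linkage procedure, which is described in the paper referenced above.

```python
from collections import defaultdict

# Hypothetical field names; real registry and claims layouts will differ.
KEY_FIELDS = ("hospital_id", "admit_date", "discharge_date", "birth_date", "sex")


def link_records(registry, claims, key_fields=KEY_FIELDS):
    """Return (registry_id, claim_id) pairs whose indirect-identifier key
    matches exactly one record on each side."""

    def index(rows):
        # Group rows by the tuple of indirect identifiers.
        buckets = defaultdict(list)
        for row in rows:
            buckets[tuple(row[f] for f in key_fields)].append(row)
        return buckets

    registry_index = index(registry)
    claims_index = index(claims)

    links = []
    for key, registry_rows in registry_index.items():
        claim_rows = claims_index.get(key, [])
        # Keep only unambiguous matches; ambiguous keys are left unlinked.
        if len(registry_rows) == 1 and len(claim_rows) == 1:
            links.append((registry_rows[0]["id"], claim_rows[0]["id"]))
    return links
```

Dropping ambiguous keys trades linkage rate for precision; a real linkage would likely also need rules for near matches (for example, dates that differ by a day), which this sketch does not attempt.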
Data Asset | Description | Years Included |
|---|---|---|
Jackson Heart Study (JHS) | DataShare has linked Medicare Fee-for-Service claims data for the JHS cohort. Duke is no longer a JHS Vanguard Center (effective 8/12/2024). Researchers must obtain JHS data access from the JHS Coordinating Center (jhsccdc@umc.edu). | 2014 - 2021 |
Data Asset | Description | Years Included |
|---|---|---|
RefLib | The PopHealth DataShare Reference File Library ("RefLib") is a collection of publicly available terminologies, vocabularies, and data that can be useful additions to healthcare data analyses. For example, the collection includes ICD-9 and ICD-10 terminologies as well as CPT, NDC, and NPI Taxonomy. RefLib contents are sourced from multiple entities, including the Unified Medical Language System (UMLS) and the Agency for Healthcare Research and Quality (AHRQ). | See the RefLib Documentation (pdf) |
Data Asset | Description | Years Included |
|---|---|---|
The SEER-Medicare data reflect the linkage of two large population-based sources of data that provide detailed information about Medicare beneficiaries with cancer. The data come from the
|