Data Catalog

See below for a listing of the available years and data files within each data asset type.

 

Nationally Representative Samples

Data Asset

Description

Years Included

Details

Medicare 5% National Sample (refreshed annually)

 

Claims for a nationally representative random sample of all fee-for-service (FFS) Medicare beneficiaries. This includes Medicare Part D prescription claims for beneficiaries with Part D coverage. The Part D files do not include drugs administered during hospitalizations or drugs paid for under other auspices (e.g., hospice). Note that 2006 is the first calendar year for which Medicare Part D claims became available to researchers, and it is the first calendar year of Part D claims available through DPHS.

1991 - 2022

 

Medicare Part D Claims (first available to researchers in calendar year 2006):

 

Medicare 100% Inpatient (refreshed annually)

 

Inpatient hospitalization claims and Master Beneficiary Summary Files (i.e., enrollment, chronic conditions, and cost and utilization). Does not include Medicare Part D prescription claims.

2000 - 2022

 

 

Medicare ACO

Claims for beneficiaries aligned with providers participating in an Accountable Care Organization (ACO). This includes Medicare Part D prescription claims. The Part D files do not include drugs administered during hospitalizations or drugs paid for under other auspices (e.g., hospice).

2011 - 2014

 

 

Medicare 5% Limited Data Set (LDS) (refreshed annually)

Claims for a nationally representative random sample of all Medicare beneficiaries; does not include prescription drug claims. Criteria for CMS DUA approval are less stringent.

2010 - 2021

 

Note: CMS Limited Data Set Files are not available for Medicare Part D claims.

 

Medicare 100% Limited Data Set (LDS) (refreshed annually)

Inpatient file claims for 100% of Medicare beneficiaries; does not include prescription drug claims. Criteria for CMS DUA approval are less stringent.

2010 - 2021

 

  • 2010-2021 100% FFS Inpatient

  • 2010-2015 100% Denominator

  • 2016-2021 100% Master Beneficiary Summary file (MBSF): Base segment

 

Geographic Samples

Data Asset

Description

Years Included

Details

Medicare 100% NC/SC 

Claims for beneficiaries in North and South Carolina

2013 - 2017

 

This cohort includes claims files for Medicare beneficiaries who resided in North Carolina or South Carolina from 2013 to 2017. Please note that for beneficiaries who moved into North Carolina or South Carolina in calendar year 2017, we do not have prior-year claims. Beneficiaries who resided in NC or SC at any point between 2013 and 2016 and were alive in 2017 have complete data for all years (2013-2017).

Data Asset Details

Medicare Part D Prescription Claims

 

Medicare 100% SEDI

Claims for beneficiaries participating in the SEDI project (Durham/Cabarrus NC, Mingo WV, Quitman MS, plus border counties)

2009 - 2014

 

Medicare 20% Geographic Sample

Per-state claims based on a 20% random sample of beneficiaries in Florida, New York, Alabama, Tennessee, Illinois, and Louisiana

2013 - 2016

 

Duke EHR-SEDI Data Mart

Medicare claims linked to Duke University Health System EHR data for patients with a Durham County, NC address

2007 - 2014

 

Disease Cohorts

Data Asset

Description

Years Included

100% Medicare Mitral Valve PX

Claims for beneficiaries who have undergone a mitral valve procedure

2006 - 2014

Registry Linkages

Data Asset

Description

Years Included

Details

GWTG-HF

Claims for beneficiaries linked to AHA's Get With the Guidelines-Heart Failure registry of patients hospitalized for heart failure

2003 - 2016

 

Get With the Guidelines Heart Failure Registry

Publications

Fonarow GC, Abraham WT, Albert NM, Gattis WA, Gheorghiade M, Greenberg B, O’Connor CM, Yancy CW, Young J. Organized Program to Initiate Lifesaving Treatment in Hospitalized Patients with Heart Failure (OPTIMIZE-HF): rationale and design. Am Heart J 2004;148:43–51.

Hammill BG, Hernandez AF, Peterson ED, Fonarow GC, Schulman KA, Curtis LH. Linking inpatient clinical registry data to Medicare claims data using indirect identifiers. Am Heart J 2009;157:995–1000.

 

PROSPER

Claims for beneficiaries linked to AHA's Get With the Guidelines-Stroke registry of patients hospitalized for stroke

2003 - 2015

 

Publication

Schwamm LH, Fonarow GC, Reeves MJ, Pan W, Frankel MR, Smith EE, Ellrodt G, Cannon CP, Liang L, Peterson E, Labresh KA. Get With the Guidelines-Stroke is associated with sustained improvement in care for patients hospitalized with acute stroke or transient ischemic attack. Circulation 2009;119:107–115.

 

 

 

 

Data Asset

Description

Years Included

NC Medicaid Fee-for-service Claims

NC Medicaid fee-for-service claims data (limited data set) of payments from the NCDHHS to healthcare providers for services rendered. Additional files include member and provider files.

July 1, 2013 - March 31, 2025

NC Medicaid Managed Care Encounters

NC Medicaid managed care encounter data (limited data set) of payments from the NCDHHS to healthcare providers for services rendered. Additional files include member and provider files.

July 1, 2021 - March 31, 2025

NC Medicaid Babylove Crosswalk

Crosswalk of NC Medicaid beneficiary mother-infant dyads

January 1, 2001 - December 31, 2021

NC Medicaid Data Request Process

NC Medicaid Data Dashboard

 

 

Data Asset

Description

Years Included

Kids’ Inpatient Database (KID)

All-payer pediatric database of hospital stays.

2000 - 2012 (varies by file); 2016 - 2022

Nationwide Ambulatory Surgery Sample (NASS)

“The NASS is the only all-payer ambulatory surgery database in the United States, yielding national estimates of selected therapeutic ambulatory surgery encounters performed in hospital-owned facilities.”

2016 - 2022

Nationwide Emergency Department Sample (NEDS)

“The NEDS is the largest all-payer emergency department (ED) database in the United States, yielding national estimates of hospital-owned ED visits.”

2016 - 2022

National Inpatient Sample (NIS)

Created from the largest publicly available all-payer inpatient health care database in the U.S.

1994 - 2015 (varies by file); 2016 - 2022

Nationwide Readmissions Database (NRD)

NRD is a nationally representative sample of hospital readmissions for all ages.

2016 - 2022

State Ambulatory Surgery and Services Databases (SASD)

State-specific database of ambulatory surgery data and outpatient services data from hospital-owned facilities.

2000 - 2023 (varies by state and file)

State Inpatient Databases (SID)

State-specific databases of inpatient discharge records.

2000 - 2023 (varies by state and file)

State Emergency Department Databases (SEDD)

State-specific databases of emergency visits at hospital-affiliated emergency departments that do not result in hospitalization.

2005 - 2023 (varies by state and file)

 

 

 

Data Asset

Description

Years Included

American Hospital Association (AHA) Survey

Annual AHA survey data allow researchers to understand utilization, physician arrangements, organizational structure and more. See the AHA Annual Survey Website for more information.

2015, 2018, 2020, 2023

 

 

 

Data Asset

Description

Years Included

Get With the Guidelines–Heart Failure (GWTG-HF) Registry

Get With the Guidelines-Heart Failure (GWTG-HF) is a registry from the American Heart Association that includes data on hospital admissions for heart failure from many hospitals throughout the U.S. 

DataShare has a method for linking the GWTG-HF data to Medicare 100% inpatient claims using indirect identifiers, originally developed by Brad Hammill. See Hammill et al., Am Heart J 2009;157:995–1000, for details.

See GWTG-HF Data Documentation

 

 

 

Data Asset

Description

Years Included

Jackson Heart Study (JHS)

DataShare has linked Medicare Fee-for-Service claims data for the JHS cohort. Duke is no longer a JHS Vanguard Center (effective 8/12/2024). Researchers must obtain JHS data access from the JHS Coordinating Center (jhsccdc@umc.edu).

 

2014 - 2021

 

 

 

Data Asset

Description

Years Included

Reference File Library - RefLib

The PopHealth DataShare Reference File Library ("RefLib") is a collection of publicly available terminologies, vocabularies, and data that can be useful additions to healthcare data analyses. For example, the collection includes ICD-9 and ICD-10 terminologies as well as CPT, NDC, and NPI Taxonomy.

RefLib contents are sourced from multiple entities including the Unified Medical Language System (UMLS) and the Agency for Healthcare Research and Quality (AHRQ).

See the RefLib Documentation (pdf)

 

 

 

Data Asset

Description

Years Included

SEER-Medicare Linked Database

The SEER-Medicare data reflect the linkage of two large population-based sources of data that provide detailed information about Medicare beneficiaries with cancer. The data come from the Surveillance, Epidemiology and End Results (SEER) program of cancer registries, which collect clinical, demographic, and cause-of-death information for persons with cancer, and from the Medicare claims for covered health care services from the time of a person's Medicare eligibility until death.


The linkage of these two data sources results in a unique population-based source of information that can be used for an array of epidemiological and health services research. For example, investigators using this combined dataset have conducted studies on patterns of care for persons with cancer before a cancer diagnosis, over the period of initial diagnosis and treatment, and during long-term follow-up. Investigators have also examined the use of cancer tests and procedures and the costs of cancer treatment.

See SEER Website