Introduction

This page describes the PCORNet CDM refresh processes for both the quarterly main extract and the weekly mini-CDM extract. Queries submitted by the PCORNet Coordinating Center run against the data from both extracts.

Background

Originally, the STAR CRN had only one ETL process: a quarterly refresh that runs every three months, in January, April, July, and October. Once the local process finishes, the data is submitted to the PCORNet Coordinating Center for review. Once approved, the latest data is released to the general public.

The COVID19 pandemic created a need for a subset of data pertaining to COVID19, and any related data, to be available faster than a quarterly refresh allows. We redesigned the ETL process so that it can run weekly, using a smaller cohort of data related to COVID19 diagnoses.
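The quarterly cadence described above can be sketched in a few lines of Python. This is only an illustration of the January/April/July/October schedule, not the actual scheduling code: the names `QUARTERLY_REFRESH_MONTHS`, `is_quarterly_refresh_month`, and `next_quarterly_refresh_month` are hypothetical, and the day of the month on which the refresh starts is not stated on this page, so the sketch works at month granularity only.

```python
from datetime import date

# Months in which the quarterly refresh runs, as listed in the text above.
QUARTERLY_REFRESH_MONTHS = (1, 4, 7, 10)


def is_quarterly_refresh_month(day: date) -> bool:
    """Return True if `day` falls in a month with a quarterly refresh."""
    return day.month in QUARTERLY_REFRESH_MONTHS


def next_quarterly_refresh_month(day: date) -> tuple:
    """Return (year, month) of the first refresh month on or after `day`."""
    for month in QUARTERLY_REFRESH_MONTHS:
        if month >= day.month:
            return (day.year, month)
    # Past October: the next refresh is in January of the following year.
    return (day.year + 1, QUARTERLY_REFRESH_MONTHS[0])
```

For example, `next_quarterly_refresh_month(date(2021, 2, 15))` gives April 2021, while any date in November or December rolls over to January of the following year.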

Details

Use Cases

The PCORNet CDM data is being used for multiple projects. These projects include:

Technical Information

Programming Languages

Most of the code is written in Python and runs under the Airflow scheduling platform.

Connection Information



| | Development | Production |
| Airflow Server Name | https://plt-nosql01.dhe.duke.edu:8083/admin/airflow/ | https://plp-nosql01.dhe.duke.edu:8083/admin/airflow/ |
| Exadata DB Environment | eclrrel1 | eclrprd1 |
| Caboodle DB Environment | mccs-rel.dhe.duke.edu | caboodleprod.dhe.duke.edu |
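Since the code is Python, one way to keep the connection details listed above in a single place is a small lookup keyed by environment. The sketch below is an assumption about how that might look, not the configuration actually used by the ETL code: the dictionary name `CONNECTIONS`, its key names, and the helper `get_connection_info` are all hypothetical. Only the server and database values come from this page.

```python
# Values copied from the connection details above.
# The dictionary layout and key names are illustrative assumptions.
CONNECTIONS = {
    "development": {
        "airflow_url": "https://plt-nosql01.dhe.duke.edu:8083/admin/airflow/",
        "exadata_db": "eclrrel1",
        "caboodle_db": "mccs-rel.dhe.duke.edu",
    },
    "production": {
        "airflow_url": "https://plp-nosql01.dhe.duke.edu:8083/admin/airflow/",
        "exadata_db": "eclrprd1",
        "caboodle_db": "caboodleprod.dhe.duke.edu",
    },
}


def get_connection_info(environment: str) -> dict:
    """Return the connection settings for 'development' or 'production'."""
    key = environment.strip().lower()
    if key not in CONNECTIONS:
        valid = ", ".join(sorted(CONNECTIONS))
        raise ValueError(f"Unknown environment {environment!r}; expected one of: {valid}")
    return CONNECTIONS[key]
```

Looking up by environment name, rather than hard-coding a host in each process, makes it harder to point a development run at a production database by accident.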

Processes

  • COVID19 Keys Update
    • Git Location:
  • STAR_ETL_Master
    • Git Location:
  • STAR_Inc_Load
    • Git Location:
  • STAR_Post_ETL
    • Git Location:

References

Troubleshooting / FAQ
