This page describes the PCORNet CDM refresh processes for the quarterly main extract and the weekly mini-CDM extract. Queries submitted by the PCORNet Coordinating Center run against the data from both extracts.
Background
Originally, the STAR CRN had a single ETL process: a quarterly refresh that runs every three months, in January, April, July, and October. Once the local process finishes, the data is submitted to the PCORNet Coordinating Center for review. Once approved, the latest data is released to the general public.
With the COVID-19 pandemic, a subset of COVID-19 and related data needed to be available faster than the quarterly refresh allowed. We redesigned the ETL process so that it can also run weekly, using a smaller cohort of data related to COVID-19 diagnoses.
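The quarterly cadence described above can be expressed as a small date check. The following is a minimal sketch, not the production scheduling code: the function names are hypothetical, and only the four refresh months come from this page (the day of the month on which a run starts is not stated here, so the sketch works at month granularity).

```python
from datetime import date
from typing import Tuple

# Refresh months stated on this page: January, April, July, October.
QUARTERLY_REFRESH_MONTHS = (1, 4, 7, 10)


def is_quarterly_refresh_month(day: date) -> bool:
    """Return True if the given date falls in a quarterly refresh month."""
    return day.month in QUARTERLY_REFRESH_MONTHS


def next_quarterly_refresh_month(day: date) -> Tuple[int, int]:
    """Return (year, month) of the first refresh month on or after the date's month.

    Hypothetical helper for illustration only.
    """
    for month in QUARTERLY_REFRESH_MONTHS:
        if month >= day.month:
            return (day.year, month)
    # Past October: the next refresh month is January of the following year.
    return (day.year + 1, QUARTERLY_REFRESH_MONTHS[0])
```

For example, `next_quarterly_refresh_month(date(2021, 11, 1))` rolls over to January of the following year, while any date in April resolves to April of the same year.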
Details
Use Cases
The PCORNet CDM data is being used for multiple projects. These projects include:
Technical Information
Programming Languages
Most of the code is written in Python and runs under the Airflow scheduling platform.
Connection Information
Setting | Development | Production |
---|---|---|
Airflow Server Name | https://plt-nosql01.dhe.duke.edu:8083/admin/airflow/ | https://plp-nosql01.dhe.duke.edu:8083/admin/airflow/ |
Exadata DB Environment | eclrrel1 | eclrprd1 |
Caboodle DB Environment | mccs-rel.dhe.duke.edu | caboodleprod.dhe.duke.edu |
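
For scripts that need to switch between environments, the table above could be mirrored in a small lookup. This is a sketch under stated assumptions: the dictionary layout, key names, and the `get_connection_info` helper are hypothetical and not part of the existing codebase; only the server and database values are taken from the table.

```python
# Values copied from the connection table; key names are hypothetical.
CONNECTIONS = {
    "development": {
        "airflow_url": "https://plt-nosql01.dhe.duke.edu:8083/admin/airflow/",
        "exadata_db": "eclrrel1",
        "caboodle_db": "mccs-rel.dhe.duke.edu",
    },
    "production": {
        "airflow_url": "https://plp-nosql01.dhe.duke.edu:8083/admin/airflow/",
        "exadata_db": "eclrprd1",
        "caboodle_db": "caboodleprod.dhe.duke.edu",
    },
}


def get_connection_info(environment: str) -> dict:
    """Return the connection settings for 'development' or 'production'."""
    key = environment.strip().lower()
    if key not in CONNECTIONS:
        valid = ", ".join(sorted(CONNECTIONS))
        raise ValueError(f"Unknown environment {environment!r}; expected one of: {valid}")
    return CONNECTIONS[key]
```

Keeping the values in one place means a job can select its targets with a single argument, for example `get_connection_info("production")["exadata_db"]`, and an unrecognized environment name fails loudly instead of silently pointing at the wrong database.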