The Registry of cCREs provides a unified framework for interpreting noncoding regulatory variation by integrating chromatin accessibility, transcription factor binding, functional assays, gene expression, and 3D genome organization. Together with SCREEN, this framework enables systematic exploration of regulatory landscapes and supports hypothesis-driven prioritization of genes and variants at trait-associated loci. To illustrate this approach, we dissected a red blood cell (RBC) trait locus spanning the RTBDN–MAST1 region, demonstrating how a cCRE-centric workflow can clarify candidate genes and nominate variants for functional follow-up.
The Registry is widely used as a shared regulatory reference for diverse analyses. Common applications include selecting regulatory regions as features for predictive models, benchmarking computational methods, anchoring study-specific datasets (e.g., DNA methylation, single-cell ATAC-seq, chromatin states), and guiding experimental design. In the context of human genetics, cCREs are frequently used to interpret GWAS results, prioritize noncoding variants, and aggregate rare variants within functionally relevant regulatory regions. SCREEN enables these analyses by providing interactive access to cCREs and their associated annotations across biosamples.
Identifying relevant cellular contexts is a critical step in interpreting trait-associated loci. To address this, we previously developed a systematic framework, Variant Enrichment and Sample Prioritization Analysis (VESPA), that quantifies enrichment of GWAS variants within cCREs active in specific biosamples relative to matched genomic controls. Applying this approach to red blood cell traits revealed strong enrichment in K562 cCREs, consistent with the erythroid-like properties of this cell line. This type of enrichment analysis helps nominate relevant tissues or cell types for downstream regulatory and functional interpretation. We have computed cell type enrichments for hundreds of GWAS, all of which are available on SCREEN.
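The logic of comparing GWAS variants against matched controls can be illustrated with a minimal sketch. This is not the VESPA implementation: the function names, the single-chromosome simplification, the fold-enrichment statistic, and the empirical p-value with a pseudocount are all assumptions chosen for illustration, and the coordinates are made up.

```python
from bisect import bisect_right


def build_index(intervals):
    """Sort and merge half-open (start, end) cCRE intervals on one chromosome."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [s for s, _ in merged], [e for _, e in merged]


def count_overlaps(positions, index):
    """Count positions that fall inside any merged interval."""
    starts, ends = index
    hits = 0
    for pos in positions:
        i = bisect_right(starts, pos) - 1  # last interval starting at or before pos
        if i >= 0 and pos < ends[i]:
            hits += 1
    return hits


def enrichment(gwas_positions, control_sets, ccre_intervals):
    """Return (observed overlaps, fold enrichment, empirical p-value).

    Assumption: each control set is a list of positions matched to the GWAS
    variants (for example on allele frequency and LD), prepared elsewhere.
    """
    index = build_index(ccre_intervals)
    observed = count_overlaps(gwas_positions, index)
    control_counts = [count_overlaps(c, index) for c in control_sets]
    mean_control = sum(control_counts) / len(control_counts)
    fold = observed / mean_control if mean_control > 0 else float("inf")
    # Fraction of control sets with at least as many overlaps, with a +1 pseudocount.
    p_value = (1 + sum(c >= observed for c in control_counts)) / (1 + len(control_counts))
    return observed, fold, p_value


# Toy example with invented coordinates (not real data).
ccres = [(100, 200), (150, 300), (500, 600)]
gwas = [120, 250, 550, 700]
controls = [[10, 400, 560, 900], [110, 20, 30, 40], [1, 2, 3, 4], [310, 450, 610, 99]]
observed, fold, p_value = enrichment(gwas, controls, ccres)
```

Running the same calculation once per biosample, using the cCREs active in that biosample, would yield a ranked list of candidate cellular contexts. In practice many more control sets would be needed for a meaningful empirical p-value.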
KLF1 emerged as the top candidate gene through multiple independent lines of evidence. Among all genes linked to trait-associated cCREs via proximity, 3D chromatin interactions, or CRISPR perturbation data, KLF1 is the only gene with highly specific expression in erythroid contexts. It is a well-established master regulator of erythroid differentiation, and coding variants in KLF1 are known to cause red blood cell disorders. In addition, multiple cCREs overlapping RBC-associated variants are linked to KLF1 and show allele-specific chromatin accessibility in blood cells, further supporting a regulatory connection. Together, these features make KLF1 the most parsimonious candidate for the causal gene underlying the observed trait associations.
The Registry enables variant prioritization by anchoring multiple types of evidence on cCREs. At this locus, we combined information on coding impact, overlap with regulatory elements, allele-specific chromatin accessibility, transcription factor binding, 3D chromatin interactions, and links to prioritized genes. By aggregating these signals using a rank-averaging approach, we identified a small set of variants with strong support for functional relevance, including variants affecting enhancer activity and chromatin architecture. This strategy provides a systematic path from GWAS signal to experimentally testable hypotheses.
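Rank averaging can be sketched as follows. This is one plausible reading of the approach and not necessarily the exact procedure used here: the evidence names, the scores, the variant labels, the tie handling (tied variants share the mean rank), and the treatment of missing evidence as a score of zero are all assumptions made for illustration.

```python
def average_ranks(values):
    """Rank values so the highest gets rank 1; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_average(scores):
    """Map {variant: {evidence: score}} to [(variant, mean rank)], best first."""
    variants = sorted(scores)
    evidence_types = sorted({e for per_variant in scores.values() for e in per_variant})
    totals = {v: 0.0 for v in variants}
    for evidence in evidence_types:
        # Assumption: a variant with no value for an evidence type scores 0.
        column = [scores[v].get(evidence, 0.0) for v in variants]
        for variant, rank in zip(variants, average_ranks(column)):
            totals[variant] += rank
    n = len(evidence_types)
    return sorted(((v, totals[v] / n) for v in variants), key=lambda t: (t[1], t[0]))


# Hypothetical variants and binary evidence flags (not real data).
scores = {
    "variant_a": {"coding": 1, "ccre_overlap": 1, "allele_specific": 1},
    "variant_b": {"coding": 1, "ccre_overlap": 1, "allele_specific": 0},
    "variant_c": {"coding": 0, "ccre_overlap": 0, "allele_specific": 0},
}
ranked = rank_average(scores)
```

Averaging ranks instead of raw scores keeps evidence types with different scales from dominating the result, at the cost of discarding the magnitude of each signal.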