Reproduce the applied example in A translational framework of genoproteomic studies for cardiovascular drug discovery
YANG, Zhao
2024-08-15 14:36:09
Codes_for_Paper_1.Rmd
This document contains the materials required to reproduce the results of the applied example in Yang, Z., Zhao, J.V., Qi, Y. et al. A translational framework of genoproteomic studies for cardiovascular drug discovery. npj Cardiovasc Health 1, 12 (2024). For more details on the translational framework, please refer to our manuscript.
Note: To run the applied example successfully, you must edit the scripts to use your own data, parameters, and file paths.
1. Prerequisites
1.1 Packages
To run the code for the applied example, install the drugTargetScreen package in R from GitHub and load the saved results, as follows:
# devtools::install_github("yangzhao98/drugTargetScreen",force=TRUE)
library(drugTargetScreen)
path.res <- getwd()
load(paste0(path.res,"/",grep("RealExample_1_causal_bio",list.files(path.res),value=TRUE)))
load(paste0(path.res,"/",grep("RealExample_2_discovery_",list.files(path.res),value=TRUE)))
load(paste0(path.res,"/",grep("RealExample_3_side_",list.files(path.res),value=TRUE)))
load(paste0(path.res,"/",grep("RealExample_4_target",list.files(path.res),value=TRUE)))
In addition, LD clumping may require a local PLINK installation, with the ancestry-specific 1000 Genomes Project panel as the reference, available at MRCIEU.
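To make the clumping step concrete, the following base-R sketch applies greedy LD clumping to a toy LD matrix. It is an illustration only, not the PLINK procedure used by the scripts: the function `clump_greedy()`, the SNP labels, the p-values, and the \(r^2\) values are all hypothetical, and the distance window that PLINK also applies is omitted.

```r
# Greedy LD clumping on a toy LD matrix (illustration only; hypothetical data).
# PLINK performs the same idea at scale against a reference panel and also
# restricts comparisons to a distance window, which is omitted here.
clump_greedy <- function(snp, pval, r2, r2_threshold = 0.3, p_threshold = 5e-8) {
  stopifnot(length(snp) == length(pval), all(dim(r2) == length(snp)))
  dimnames(r2) <- list(snp, snp)
  # Candidates: genome-wide significant SNPs, most significant first
  sig <- pval < p_threshold
  candidates <- snp[sig][order(pval[sig])]
  kept <- character(0)
  while (length(candidates) > 0) {
    index <- candidates[1]
    kept <- c(kept, index)
    # Drop the index SNP and every candidate in LD with it at or above the threshold
    in_ld <- r2[index, candidates] >= r2_threshold
    candidates <- candidates[!in_ld]
  }
  kept
}

# Toy input: snp1/snp2 and snp3/snp4 are in LD; snp5 is not significant
snp  <- paste0("snp", 1:5)
pval <- c(1e-20, 1e-12, 1e-9, 1e-10, 0.01)
r2 <- matrix(0.05, 5, 5)
diag(r2) <- 1
r2[1, 2] <- r2[2, 1] <- 0.8
r2[3, 4] <- r2[4, 3] <- 0.5

kept <- clump_greedy(snp, pval, r2, r2_threshold = 0.3)
print(kept)
```

In this toy input the most significant SNP of each correlated pair is retained and the non-significant SNP never enters the candidate set.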
1.2 Data sources
- GWAS summary statistics for LDL-C are available at the Global Lipids Genetics Consortium
- GWAS summary statistics for CAD are available at the CARDIoGRAMplusC4D Consortium
- GWAS summary statistics for 78 clinical endpoints and LDL-C in the UK Biobank are available at Neale lab
- Summary statistics of cis- and trans-eQTLs are available at the eQTLGen phase 1
2. Reproduce the results of the applied example
This applied example replicates three promising LDL-C-lowering therapeutic targets encoded by the APOB (Ference et al., 2019), LDLR (Ference et al., 2019), and LPA (Trinder et al., 2020) loci, which are under development in current clinical trials (O’Donoghue et al., 2022).
datPosCtrl <- data.frame(
`Druggable genes`= c("APOB","LDLR","LPA"),
## positions listed in the same order as the genes: LDLR is on chr19, LPA on chr6
`chr:pos (GRCh37/hg19)`=c("2:21224301-21266945",
"19:11200038-11244492",
"6:160952515-161087407"),
`OR (95% CI)`=c("1.28 (1.22,1.29)",
"1.45 (1.41,1.50)",
"1.30 (1.25,1.33)"),
## keep the column names exactly as written instead of converting them to syntactic names
check.names=FALSE
)
knitr::kable(
datPosCtrl,
caption="Genetically mimicked LDL-C-lowering effects on CAD")
Genetically mimicked LDL-C-lowering effects on CAD
Druggable genes | chr:pos (GRCh37/hg19) | OR (95% CI) |
---|---|---|
APOB | 2:21224301-21266945 | 1.28 (1.22,1.29) |
LDLR | 19:11200038-11244492 | 1.45 (1.41,1.50) |
LPA | 6:160952515-161087407 | 1.30 (1.25,1.33) |
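For readers unfamiliar with how an "OR (95% CI)" entry is usually derived from an MR estimate on the log-odds scale, here is a minimal sketch. The helper `format_or()` and its input values are hypothetical and are not taken from the paper or from the drugTargetScreen package.

```r
# Turn a log-odds-ratio estimate and its standard error into an
# "OR (lower,upper)" string; the inputs below are made-up numbers.
format_or <- function(log_or, se, level = 0.95, digits = 2) {
  z <- qnorm(1 - (1 - level) / 2)            # about 1.96 for a 95% interval
  fmt <- function(x) formatC(exp(x), format = "f", digits = digits)
  paste0(fmt(log_or), " (", fmt(log_or - z * se), ",", fmt(log_or + z * se), ")")
}

or_text <- format_or(log_or = log(1.20), se = 0.03)
print(or_text)
```

The interval is symmetric on the log scale, so after exponentiation the point estimate sits at the geometric mean of the two limits rather than at their midpoint.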
2.1 The translational framework for cardiovascular drug discovery
Briefly, this framework consists of four main stages:
- Stage 1. Causal biomarkers screen
- Stage 2. Drug-targeting protein discovery and verification
  - Elucidation of possible heterogeneous genetic effects across the polygenic IVs selected genome-wide for the biomarker, using methods such as Genome-wide mR Analysis under Pervasive PLEiotropy (GRAPPLE) and MR-PATH
  - Exploration of the putative protein-disease association using cis-acting pQTLs as IVs, supplemented by trans-acting variants and cis-position, trans-acting variants, as appropriate
  - Verification of putative drug-targeting proteins by examining whether the cis-acting variants share the same causal variants with the upstream regulators or downstream intermediates, using colocalization based on gene expression databases or cell and animal experiments
- Stage 3. Drug target relevant side effects exploration
- Stage 4. Therapeutic efficacy and safety emulation
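The cis-acting instruments in Stage 2 are defined by genomic position. As a rough sketch of that selection step, the code below parses a `chr:start-end` string and keeps variants inside the region plus a flanking window. The helpers `parse_region()` and `select_cis()`, the 100 kb flank, and the toy variant table are assumptions for illustration; the package's own selection rules may differ.

```r
# Parse a "chr:start-end" string into its components
parse_region <- function(region) {
  parts <- regmatches(region, regexec("^([^:]+):([0-9]+)-([0-9]+)$", region))[[1]]
  if (length(parts) != 4) stop("region must look like 'chr:start-end'")
  list(chr = parts[2], start = as.numeric(parts[3]), end = as.numeric(parts[4]))
}

# Keep variants on the same chromosome within the region +/- a flanking window
select_cis <- function(gwas, region, flank = 1e5) {
  r <- parse_region(region)
  keep <- as.character(gwas$chr) == r$chr &
    gwas$pos >= r$start - flank &
    gwas$pos <= r$end + flank
  gwas[keep, , drop = FALSE]
}

# Toy variant table (hypothetical SNPs); the region is the APOB locus listed above
gwas <- data.frame(
  snp = c("snpA", "snpB", "snpC", "snpD"),
  chr = c("2", "2", "2", "19"),
  pos = c(21230000, 21300000, 25000000, 11210000),
  stringsAsFactors = FALSE
)
cis <- select_cis(gwas, "2:21224301-21266945", flank = 1e5)
print(cis)
```

Variants outside the window on the same chromosome, or on any other chromosome, would be candidates for trans-acting instruments instead.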
2.2 Implementations
## preparation
source(paste0(path.res,"/0_precondition_update.R"))
source(paste0(path.res,"/utilities_update.R"))
## code for stage 1. Causal biomarkers screen
source(paste0(path.res,"/1_cause_biomarkers_screen_update.R"))
## code for stage 2. Drug-targeting protein discovery and verification
source(paste0(path.res,"/2_discovery_verification_update.R"))
## code for stage 3. Drug target relevant side effects exploration
source(paste0(path.res,"/3_side_effects_update.R"))
## code for stage 4. Therapeutic efficacy and safety emulation
source(paste0(path.res,"/4_target_trial_emulation_update.R"))
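Because each `source()` call stops with an error if its file is absent, it can help to check the working directory first. The following is an optional sketch; the helper `missing_scripts()` is hypothetical and only the script names come from the calls above.

```r
# Script names taken from the source() calls above
scripts <- c(
  "0_precondition_update.R",
  "utilities_update.R",
  "1_cause_biomarkers_screen_update.R",
  "2_discovery_verification_update.R",
  "3_side_effects_update.R",
  "4_target_trial_emulation_update.R"
)

# Return the names of the scripts that are not present in `dir`
missing_scripts <- function(dir, files) {
  files[!file.exists(file.path(dir, files))]
}

path.res <- getwd()
not_found <- missing_scripts(path.res, scripts)
if (length(not_found) > 0) {
  message("Scripts not found in ", path.res, ": ", paste(not_found, collapse = ", "))
}
```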
2.3 Results of the applied example
2.3.1 Results from Stage 1
Scatter plots of genetic effects on LDL-C and CAD, using (i) all genome-wide significant SNPs after LD clumping at \(r^2<0.3\) and (ii) the SNPs within the APOB, LDLR, or LPA loci.
plt.woCor
plt.4Gene
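One common way to summarise such a scatter is the fixed-effect inverse-variance weighted (IVW) estimate, which is the slope of a weighted regression through the origin of the SNP-CAD effects on the SNP-LDL-C effects. The sketch below is generic and uses made-up effect sizes; it does not claim to reproduce how `plt.woCor` or `plt.4Gene` were built.

```r
# Fixed-effect IVW estimate from SNP-level summary statistics
ivw <- function(beta_x, beta_y, se_y) {
  w <- 1 / se_y^2                              # inverse-variance weights
  est <- sum(w * beta_x * beta_y) / sum(w * beta_x^2)
  se <- sqrt(1 / sum(w * beta_x^2))
  c(estimate = est, se = se)
}

# Toy data in which every SNP implies the same ratio beta_y / beta_x = 0.5
beta_x <- c(0.10, 0.20, 0.30)   # hypothetical SNP effects on LDL-C
beta_y <- 0.5 * beta_x          # hypothetical SNP effects on CAD (log odds)
se_y <- rep(0.01, 3)

fit <- ivw(beta_x, beta_y, se_y)
print(fit)
```

With correlated instruments, such as SNPs clumped at a lenient \(r^2\) threshold, this simple form understates the uncertainty, and the weights would need to account for the LD between variants.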
2.3.2 Results from Stage 2
- Pleiotropic effects detected using profile likelihood under the GRAPPLE framework
plt.GRAPPLE.profile
- Forest plot of MR estimates obtained using different types of IVs
plt.Forest
- Colocalization of LDL-C, eQTL, and CAD
ggpubr::ggarrange(
plt.4Gene.CAD.runColoc[[1]],
plt.4Gene.CAD.runColoc[[2]],
plt.4Gene.CAD.runColoc[[3]],
labels=c("2d","2e","2f"),nrow=1)
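For intuition about what a colocalization analysis computes, the sketch below implements the approximate Bayes factor calculation for a pair of traits in base R, following the approach popularised by the coloc package. It is a simplified illustration: the function names, the prior standard deviations (0.15 and 0.2), the prior probabilities, and the toy summary statistics are assumptions, and the scripts for this example may use a different implementation or settings.

```r
# Numerically stable log(sum(exp(x))) and log(exp(x) - exp(y))
log_sum <- function(x) { m <- max(x); m + log(sum(exp(x - m))) }
log_diff <- function(x, y) { m <- max(x, y); m + log(exp(x - m) - exp(y - m)) }

# Wakefield's approximate Bayes factor (log scale) for each SNP
log_abf <- function(beta, se, prior_sd) {
  z <- beta / se
  r <- prior_sd^2 / (prior_sd^2 + se^2)
  0.5 * (log(1 - r) + r * z^2)
}

# Posterior probabilities of the five colocalization hypotheses
coloc_sketch <- function(l1, l2, p1 = 1e-4, p2 = 1e-4, p12 = 1e-5) {
  l12 <- l1 + l2
  lH <- c(
    H0 = 0,                                    # no association with either trait
    H1 = log(p1) + log_sum(l1),                # trait 1 only
    H2 = log(p2) + log_sum(l2),                # trait 2 only
    H3 = log(p1) + log(p2) +                   # both traits, different causal variants
      log_diff(log_sum(l1) + log_sum(l2), log_sum(l12)),
    H4 = log(p12) + log_sum(l12)               # both traits, one shared causal variant
  )
  exp(lH - log_sum(lH))
}

# Toy region of five SNPs where the third SNP drives both signals (made-up values)
l1 <- log_abf(beta = c(0.01, 0.02, 0.30, 0.015, -0.01), se = rep(0.05, 5), prior_sd = 0.15)
l2 <- log_abf(beta = c(0.005, -0.02, 0.25, 0.03, 0.01), se = rep(0.05, 5), prior_sd = 0.20)
pp <- coloc_sketch(l1, l2)
print(round(pp, 4))
```

A high posterior probability for H4 supports a shared causal variant between the two traits in the region, which is the pattern the verification step in Stage 2 looks for.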
2.3.3 Results from Stage 3
Phenome-wide MR estimates of genetically mimicked LDL-C-lowering effects encoded by APOB, LDLR, and LPA loci on 78 clinical endpoints with ICD-10 diagnoses in the UK Biobank.
ggpubr::ggarrange(
plt.4Gene.Volcano[[1]],
plt.4Gene.Volcano[[2]],
plt.4Gene.Volcano[[3]],
labels=c("3a","3b","3c"),nrow=1)
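With 78 clinical endpoints tested per target, some correction for multiple testing is usually applied when reading such volcano plots. The sketch below shows two standard options in base R; the endpoint labels and p-values are hypothetical, and the correction actually used for the figures is described in the manuscript rather than assumed here.

```r
n_tests <- 78                           # number of clinical endpoints per target
bonferroni_alpha <- 0.05 / n_tests      # per-test threshold under Bonferroni

# Hypothetical p-values for four of the 78 endpoints
p <- c(endpoint_a = 1e-6, endpoint_b = 4e-4, endpoint_c = 0.03, endpoint_d = 0.5)

# Adjust as if all 78 tests had been run (n may exceed length(p))
p_bonf <- p.adjust(p, method = "bonferroni", n = n_tests)
p_fdr <- p.adjust(p, method = "BH", n = n_tests)

significant <- names(p_bonf)[p_bonf < 0.05]
print(significant)
```

Bonferroni controls the family-wise error rate and is conservative when endpoints are correlated; the Benjamini-Hochberg adjustment controls the false discovery rate instead.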