Aim: run partial correlations and multiple regression on select LD blocks
Given that bivariate local \(r_{g}\) analyses revealed several LD blocks where significant \(r_{g}\)'s were found between multiple trait pairs, we followed up with partial local \(r_{g}\) analyses and multiple regression to obtain conditional genetic associations.
For LD blocks with >= 2 significant bivariate results between the three neuropsychiatric phenotypes, we hypothesised that a shared genetic signal with a third neuropsychiatric phenotype could account for some of the overlap between two neuropyschiatric phenotypes (e.g. local \(r_{g}\) between BIP and SCZ could be partially accounted for by a shared genetic signal with MDD). Thus, we used partial correlations to compute the partial genetic correlation between two phenotypes while accounting for the third.
For select LD blocks with significant bivariate results between a phenotype and at least 2 others, multiple regression was used to determine the extent to which the genetic component of the outcome phenotype could be explained by the genetic components of multiple predictor phenotypes. This permitted exploration of (i) the independent effects of predictor phenotypes on the outcome phenotype and (ii) collinearity between predictor phenotypes.
Following section includes any intermediary code used in this .Rmd
.
This can either be sourced directly:
source(here::here("scripts", "03a_partial_corr.R"))
Alternatively, for many phenotypes/loci it can be worth running this script directly on the cluster, using nohup
, as performed below:
# Have to navigate to root project folder for script to work (as it uses here package)
cd /home/rreynolds/misc_projects/neurodegen-psych-local-corr
nohup Rscript \
/home/rreynolds/misc_projects/neurodegen-psych-local-corr/scripts/03a_run_partial_corr.R \
&>/home/rreynolds/misc_projects/neurodegen-psych-local-corr/logs/03a_run_partial_corr.log&
This can either be sourced directly:
source(here::here("scripts", "03b_run_multi_reg.R"))
Alternatively, for many phenotypes/loci it can be worth running this script directly on the cluster, using nohup
, as performed below:
# Have to navigate to root project folder for script to work (as it uses here package)
cd /home/rreynolds/misc_projects/neurodegen-psych-local-corr
nohup Rscript \
/home/rreynolds/misc_projects/neurodegen-psych-local-corr/scripts/03b_run_multi_reg.R \
&>/home/rreynolds/misc_projects/neurodegen-psych-local-corr/logs/03b_run_multi_reg.log&
Once run, results can be loaded using the following code chunk:
pcorr <-
list.files(
here::here(
"results",
"03_partial_corr_multi_reg"
),
pattern = ".partcorr.lava.rds",
full.names = T
) %>%
readRDS() %>%
lapply(., function(list){
list %>%
qdapTools::list_df2df(col1 = "list_name")
}) %>%
qdapTools::list_df2df(col1 = "list_name_2") %>%
dplyr::select(-contains("list_name")) %>%
as_tibble()
multireg <-
list.files(
here::here(
"results",
"03_partial_corr_multi_reg"
),
pattern = ".multireg.lava.rds",
full.names = T
) %>%
readRDS() %>%
lapply(., function(x){
x %>%
lapply(., function(y){
y %>%
lapply(., function(z){
z %>%
qdapTools::list_df2df(col1 = "model_number")
}) %>%
qdapTools::list_df2df(col1 = "list_name")
}) %>%
qdapTools::list_df2df()
}) %>%
qdapTools::list_df2df(col1 = "locus") %>%
dplyr::select(-X1) %>%
dplyr::mutate(
model_type =
case_when(
locus == 1719 & list_name == "L1" ~ "intermediate",
locus == 1719 & list_name == "L2" ~ "full",
TRUE ~ "full"
)
) %>%
dplyr::select(
locus, contains("model"),
outcome, predictors,
everything(), -contains("list")
)
print("Partial correlation column descriptions:")
## [1] "Partial correlation column descriptions:"
tibble(
column = colnames(pcorr),
description =
c(
"LD block ID",
"LD block chromosome",
"LD block start base pair",
"LD block end base pair",
"The number of SNPs within the LD block",
"The number of PCs retained within the LD block",
"Phenotype 1",
"Phenotype 2",
"Phenotype(s) which the genetic correlation between the target phenotypes were conditioned on",
"The proportion of genetic signal in phenotype 1 explained by z",
"The proportion of genetic signal in phenotype 2 explained by z",
"The partial genetic correlation between phenotype 1 and 2 conditioned on z",
"Lower bound of 95% confidence interval for the partial genetic correlation",
"Upper bound of 95% confidence interval for the partial genetic correlation",
"Simulation p-values for the partial genetic correlation"
)
) %>%
DT::datatable(rownames = FALSE,
options = list(scrollX = TRUE),
class = 'white-space: nowrap')
print("Partial correlations (all results):")
## [1] "Partial correlations (all results):"
pcorr %>%
DT::datatable(rownames = FALSE,
options = list(scrollX = TRUE),
class = 'white-space: nowrap')
bivar_results <-
readRDS(
list.files(
here::here("results", "02_univar_bivar_test"),
pattern = "bivar.lava.rds",
full.names = T
)
) %>%
purrr::discard(is.null) %>%
qdapTools::list_df2df() %>%
dplyr::select(-X1)
data_to_plot <-
pcorr %>%
dplyr::filter(!is.na(pcor)) %>%
dplyr::select(
locus, phen1, phen2, pcor, p_cond = p
) %>%
dplyr::inner_join(
bivar_results %>%
dplyr::select(
locus, contains("phen"), rho, p_uncond = p
),
by = c("locus", "phen1", "phen2")
) %>%
tidyr::pivot_longer(
cols = c("pcor", "p_cond", "rho", "p_uncond")
) %>%
dplyr::mutate(
model =
case_when(
name %in% c("pcor", "p_cond") ~ "Conditioned",
TRUE ~ "Unconditioned"
) %>%
fct_rev(),
coef =
case_when(
name %in% c("pcor", "rho") ~ "corr",
TRUE ~ "p"
)
) %>%
dplyr::select(-name) %>%
tidyr::pivot_wider(
names_from = c("coef"),
values_from = c("value")
)
data_to_plot %>%
dplyr::mutate(
rg_fill =
dplyr::case_when(
# Different p-value cut-offs depending on conditioning
p < 0.05 & model == "Conditioned" ~ round(corr, 2),
p < 0.05/nrow(bivar_results) & model == "Unconditioned" ~ round(corr, 2)
),
p1 = phen1 %>%
stringr::str_replace_all("[:digit:]", "") %>%
stringr::str_remove("\\..*"),
p2 = phen2 %>%
stringr::str_replace_all("[:digit:]", "") %>%
stringr::str_remove("\\..*")
) %>%
ggplot2::ggplot(
ggplot2::aes(
x = p1,
y = p2 %>% fct_rev(),
fill = rg_fill,
label = round(corr, 2)
)
) +
ggplot2::geom_tile(colour = "black") +
ggplot2::geom_text(
size = 3
) +
ggplot2::facet_grid(rows = vars(model), cols = vars(locus)) +
ggplot2::coord_fixed() +
ggplot2::labs(x = "", y = "", fill = "Local genetic correlation (rg)") +
ggplot2::scale_fill_distiller(
palette = "RdYlBu",
limits = c(-1, 1)
) +
ggplot2::scale_colour_manual(values = c("black", "white")) +
ggplot2::guides(colour = F) +
theme_rhr +
ggplot2::theme(
panel.border = element_blank(),
panel.grid = element_blank(),
axis.ticks = element_blank()
)
print("Multiple regression column descriptions:")
## [1] "Multiple regression column descriptions:"
tibble(
column = colnames(multireg),
description =
c(
"LD block ID",
"For multiple regressions with multiple intermediate models, this is a model identifier",
"Whether the regression model is an intermediate or full model",
"Outcome phenotype",
"Predictor phenotype",
"Standardised multiple regression coefficient",
"Lower bound of 95% confidence interval for gamma",
"Upper bound of 95% confidence interval for gamma",
"Proportion of variance in genetic signal for the outcome phenotype explained by all predictor phenotypes simultaneously",
"Lower bound of 95% confidence interval for r2",
"Upper bound of 95% confidence interval for r2",
"Simulation p-values for the gammas"
)
) %>%
DT::datatable(rownames = FALSE,
options = list(scrollX = TRUE),
class = 'white-space: nowrap')
print("Multiple regression (all results):")
## [1] "Multiple regression (all results):"
multireg %>%
DT::datatable(rownames = FALSE,
options = list(scrollX = TRUE),
class = 'white-space: nowrap')
multireg_w_signif_pred <-
multireg %>%
dplyr::filter(
p < 0.05,
# locus %in% c(2281,2351)
) %>%
dplyr::select(locus, contains("model"), outcome) %>%
dplyr::distinct()
data_to_plot <-
multireg_w_signif_pred %>%
dplyr::inner_join(
multireg,
c("locus", "model_number", "model_type", "outcome")
) %>%
dplyr::mutate(
predictors = predictors %>%
stringr::str_replace_all("[:digit:]", "") %>%
stringr::str_remove("\\..*"),
outcome = outcome %>%
stringr::str_replace_all("[:digit:]", "") %>%
stringr::str_remove("\\..*"),
p_text = case_when(
p < 0.001 ~ "***",
p < 0.01 ~ "**",
p < 0.05 ~ "*"
),
locus =
as.numeric(locus)
)
data_to_plot <-
data_to_plot %>%
dplyr::inner_join(
data_to_plot %>%
dplyr::group_by(outcome, locus) %>%
dplyr::summarise(predictors = str_c(predictors, collapse = " + ")) %>%
dplyr::mutate(model = str_c(outcome, " ~ ", predictors)) %>%
dplyr::select(outcome, locus, model)
)
a <-
data_to_plot %>%
ggplot2::ggplot(
ggplot2::aes(
x = predictors,
y = gamma,
ymin = gamma.lower,
ymax = gamma.upper
)
) +
ggplot2::geom_pointrange(
colour = "#888888", fatten = 0.25
) +
ggplot2::geom_hline(yintercept = 0, linetype = "dashed") +
ggplot2::geom_text(
aes(
y = gamma.upper + (0.1 * gamma.upper),
label = p_text
)
) +
ggplot2::facet_wrap(
vars(locus, model),
scales = "free_x",
nrow = 1
) +
ggplot2::labs(x = "Predictors", y = "Standardised multiple\nregression coefficient") +
theme_rhr
b <-
data_to_plot %>%
dplyr::mutate(
locus_model =
str_c(locus, model, sep = "\n")
) %>%
dplyr::distinct(
locus, locus_model, r2, r2.lower, r2.upper
) %>%
ggplot2::ggplot(
ggplot2::aes(
x = locus_model %>%
fct_reorder(., .x = as.numeric(locus), .fun = max),
y = r2,
ymin = r2.lower,
ymax = r2.upper
)
) +
geom_col() +
geom_errorbar(width=.2) +
labs(
x = "Locus and model",
y = "Multivariate r2"
) +
theme_rhr
plot <-
cowplot::plot_grid(
a,b,
ncol = 1,
axis = "lr",
align = "v",
labels = letters[1:2]
)
plot
Show/hide
## ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
## setting value
## version R version 4.0.5 (2021-03-31)
## os Ubuntu 16.04.7 LTS
## system x86_64, linux-gnu
## ui X11
## language (EN)
## collate en_GB.UTF-8
## ctype en_GB.UTF-8
## tz Europe/London
## date 2022-04-26
##
## ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
## ! package * version date lib source
## P assertthat 0.2.1 2019-03-21 [?] CRAN (R 4.0.5)
## P backports 1.3.0 2021-10-27 [?] CRAN (R 4.0.5)
## P bitops 1.0-7 2021-04-24 [?] CRAN (R 4.0.5)
## P bookdown 0.22 2021-04-22 [?] CRAN (R 4.0.5)
## P broom 0.7.10 2021-10-31 [?] CRAN (R 4.0.5)
## P bslib 0.3.1 2021-10-06 [?] CRAN (R 4.0.5)
## P cellranger 1.1.0 2016-07-27 [?] CRAN (R 4.0.5)
## P chron 2.3-56 2020-08-18 [?] CRAN (R 4.0.5)
## P cli 3.1.0 2021-10-27 [?] CRAN (R 4.0.5)
## P colorspace 2.0-2 2021-06-24 [?] CRAN (R 4.0.5)
## P cowplot 1.1.1 2020-12-30 [?] CRAN (R 4.0.5)
## P crayon 1.4.2 2021-10-29 [?] CRAN (R 4.0.5)
## P crosstalk 1.1.1 2021-01-12 [?] CRAN (R 4.0.5)
## P data.table 1.14.2 2021-09-27 [?] CRAN (R 4.0.5)
## P DBI 1.1.1 2021-01-15 [?] CRAN (R 4.0.5)
## P dbplyr 2.1.1 2021-04-06 [?] CRAN (R 4.0.5)
## P digest 0.6.29 2021-12-01 [?] CRAN (R 4.0.5)
## P dplyr * 1.0.7 2021-06-18 [?] CRAN (R 4.0.5)
## P DT 0.19 2021-09-02 [?] CRAN (R 4.0.5)
## P ellipsis 0.3.2 2021-04-29 [?] CRAN (R 4.0.5)
## P evaluate 0.14 2019-05-28 [?] CRAN (R 4.0.5)
## P fansi 0.5.0 2021-05-25 [?] CRAN (R 4.0.5)
## P farver 2.1.0 2021-02-28 [?] CRAN (R 4.0.5)
## P fastmap 1.1.0 2021-01-25 [?] CRAN (R 4.0.5)
## P forcats * 0.5.1 2021-01-27 [?] CRAN (R 4.0.5)
## P fs 1.5.1 2021-11-30 [?] CRAN (R 4.0.5)
## P generics 0.1.1 2021-10-25 [?] CRAN (R 4.0.5)
## P ggplot2 * 3.3.5 2021-06-25 [?] CRAN (R 4.0.5)
## P glue 1.5.1 2021-11-30 [?] CRAN (R 4.0.5)
## P gtable 0.3.0 2019-03-25 [?] CRAN (R 4.0.5)
## P haven 2.4.3 2021-08-04 [?] CRAN (R 4.0.5)
## P here 1.0.1 2020-12-13 [?] CRAN (R 4.0.5)
## P highr 0.9 2021-04-16 [?] CRAN (R 4.0.5)
## P hms 1.1.1 2021-09-26 [?] CRAN (R 4.0.5)
## P htmltools 0.5.2 2021-08-25 [?] CRAN (R 4.0.5)
## P htmlwidgets 1.5.4 2021-09-08 [?] CRAN (R 4.0.5)
## P httr 1.4.2 2020-07-20 [?] CRAN (R 4.0.5)
## P jquerylib 0.1.4 2021-04-26 [?] CRAN (R 4.0.5)
## P jsonlite 1.7.2 2020-12-09 [?] CRAN (R 4.0.5)
## P knitr 1.36 2021-09-29 [?] CRAN (R 4.0.5)
## P labeling 0.4.2 2020-10-20 [?] CRAN (R 4.0.5)
## P lifecycle 1.0.1 2021-09-24 [?] CRAN (R 4.0.5)
## P lubridate 1.8.0 2021-10-07 [?] CRAN (R 4.0.5)
## P magrittr 2.0.1 2020-11-17 [?] CRAN (R 4.0.5)
## P modelr 0.1.8 2020-05-19 [?] CRAN (R 4.0.5)
## P munsell 0.5.0 2018-06-12 [?] CRAN (R 4.0.5)
## P pillar 1.6.4 2021-10-18 [?] CRAN (R 4.0.5)
## P pkgconfig 2.0.3 2019-09-22 [?] CRAN (R 4.0.5)
## P purrr * 0.3.4 2020-04-17 [?] CRAN (R 4.0.5)
## P qdapTools 1.3.5 2020-04-17 [?] CRAN (R 4.0.5)
## P R6 2.5.1 2021-08-19 [?] CRAN (R 4.0.5)
## P RColorBrewer 1.1-2 2014-12-07 [?] CRAN (R 4.0.5)
## P Rcpp 1.0.7 2021-07-07 [?] CRAN (R 4.0.5)
## P RCurl 1.98-1.5 2021-09-17 [?] CRAN (R 4.0.5)
## P readr * 2.0.2 2021-09-27 [?] CRAN (R 4.0.5)
## P readxl 1.3.1 2019-03-13 [?] CRAN (R 4.0.5)
## P reprex 2.0.0 2021-04-02 [?] CRAN (R 4.0.5)
## P rlang 0.4.12 2021-10-18 [?] CRAN (R 4.0.5)
## P rmarkdown 2.11 2021-09-14 [?] CRAN (R 4.0.5)
## P rprojroot 2.0.2 2020-11-15 [?] CRAN (R 4.0.5)
## P rstudioapi 0.13 2020-11-12 [?] CRAN (R 4.0.5)
## P rvest 1.0.1 2021-07-26 [?] CRAN (R 4.0.5)
## P sass 0.4.0 2021-05-12 [?] CRAN (R 4.0.5)
## P scales 1.1.1 2020-05-11 [?] CRAN (R 4.0.5)
## P sessioninfo * 1.1.1 2018-11-05 [?] CRAN (R 4.0.5)
## P stringi 1.7.6 2021-11-29 [?] CRAN (R 4.0.5)
## P stringr * 1.4.0 2019-02-10 [?] CRAN (R 4.0.5)
## P tibble * 3.1.6 2021-11-07 [?] CRAN (R 4.0.5)
## P tidyr * 1.1.4 2021-09-27 [?] CRAN (R 4.0.5)
## P tidyselect 1.1.1 2021-04-30 [?] CRAN (R 4.0.5)
## P tidyverse * 1.3.1 2021-04-15 [?] CRAN (R 4.0.5)
## P tzdb 0.2.0 2021-10-27 [?] CRAN (R 4.0.5)
## P utf8 1.2.2 2021-07-24 [?] CRAN (R 4.0.5)
## P vctrs 0.3.8 2021-04-29 [?] CRAN (R 4.0.5)
## P withr 2.4.3 2021-11-30 [?] CRAN (R 4.0.5)
## P xfun 0.27 2021-10-18 [?] CRAN (R 4.0.5)
## P xml2 1.3.2 2020-04-23 [?] CRAN (R 4.0.5)
## P yaml 2.2.1 2020-02-01 [?] CRAN (R 4.0.5)
##
## [1] /home/rreynolds/misc_projects/neurodegen-psych-local-corr/renv/library/R-4.0/x86_64-pc-linux-gnu
## [2] /opt/R/4.0.5/lib/R/library
##
## P ── Loaded and on-disk path mismatch.