Last updated: 2023-08-22
Checks: 6 1
Knit directory: m6A_in_disease_genetics/
This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.
The command set.seed(20230331)
was run prior to running
the code in the R Markdown file. Setting a seed ensures that any results
that rely on randomness, e.g. subsampling or permutations, are
reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Using absolute paths to the files within your workflowr project makes it difficult for you and others to run your code on a different machine. Change the absolute path(s) below to the suggested relative path(s) to make your code more reproducible.
absolute | relative |
---|---|
~/projects/m6A_in_disease_genetics/code/ctwas/ctwas_config_b37.R | code/ctwas/ctwas_config_b37.R |
~/projects/m6A_in_disease_genetics/code/ctwas/qiansheng/locus_plot.R | code/ctwas/qiansheng/locus_plot.R |
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version fb910fa. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for
the analysis have been committed to Git prior to generating the results
(you can use wflow_publish
or
wflow_git_commit
). workflowr only checks the R Markdown
file, but you know if there are other scripts or data files that it
depends on. Below is the status of the Git repository when the results
were generated:
Ignored files:
Ignored: .ipynb_checkpoints/
Ignored: analysis/m6A_switch_to_disease_h2g.nb.html
Ignored: data/plots/
Untracked files:
Untracked: HMGCR_locus_gene_tracks.pdf
Untracked: Rplots.pdf
Untracked: analysis/.ipynb_checkpoints/
Untracked: analysis/CD_m6A_output_hg19.Rmd
Untracked: analysis/IBD_E_S_m6A.Rmd
Untracked: analysis/IBD_E_S_m6A_output.Rmd
Untracked: analysis/LDL_E_S_m6A.Rmd
Untracked: analysis/LDL_m6A_output.Rmd
Untracked: analysis/RA_m6A_output.Rmd
Untracked: analysis/WhiteBlood_WholeBlood_E_M.Rmd
Untracked: analysis/asthma_m6A_output_hg19.Rmd
Untracked: analysis/identify_m6A_mechanisms_with_finemapping.Rmd
Untracked: analysis/lymph_m6A_output.Rmd
Untracked: analysis/pre_weights_m6AQTL.txt
Untracked: analysis/rbc_E_S_m6A_output.Rmd
Untracked: analysis/rbc_m6A_output.Rmd
Untracked: analysis/summarize_ctwas_m6A_results.Rmd
Untracked: analysis/wbc_E_S_m6A_output.Rmd
Untracked: code/.ipynb_checkpoints/
Untracked: code/all_m6a_sites_with_paired_cisNATs_summary.csv
Untracked: code/annotating_fine-mapped_m6A_QTLs.Rmd
Untracked: code/check_double_strand.ipynb
Untracked: code/check_double_strand_v2.ipynb
Untracked: code/ctwas/
Untracked: code/figure/
Untracked: code/learn_gviz.Rmd
Untracked: code/learn_gviz.html
Untracked: code/learn_gviz.nb.html
Untracked: code/m6AQTL_finemapping.Rmd
Untracked: code/plot_genomic_tracks_gviz.ipynb
Untracked: code/summary_TWAS_coloc_m6A_2023.Rmd
Untracked: code/test_gviz.ipynb
Untracked: code/twas_genes_PP4_0.3_immune_traits_trackplots.pdf
Untracked: data/.ipynb_checkpoints/
Untracked: data/ADCY7_gwas_input.tsv
Untracked: data/ADCY7_qtl_input.tsv
Untracked: data/Allergy_full_coloc.txt
Untracked: data/Asthma_full_coloc.txt
Untracked: data/CAD_full_coloc.txt
Untracked: data/Eosinophil_count_full_coloc.txt
Untracked: data/GSE125377_jointPeakReadCount.txt
Untracked: data/G_list.Rd
Untracked: data/HMGCR_ctwas_dat.Rd
Untracked: data/IBD_full_coloc.txt
Untracked: data/JointPeaks.bed
Untracked: data/Li2022_dsRNAs.xlsx
Untracked: data/Lupus_full_coloc.txt
Untracked: data/RA_full_coloc.txt
Untracked: data/TABLE1_hg19.txt
Untracked: data/TABLE1_hg19.txt.zip
Untracked: data/__MACOSX/
Untracked: data/coloc_blood_traits.csv
Untracked: data/crohns_disease_full_coloc.txt
Untracked: data/ctwas_m6a_joint_top_PIP.txt
Untracked: data/edit_sites_and_GE_neg_correlated.txt
Untracked: data/edit_sites_and_GE_pos_correlated.txt
Untracked: data/features
Untracked: data/human_EERs.csv
Untracked: data/human_EERs.txt
Untracked: data/lymph_full_coloc.txt
Untracked: data/m6A_TWAS_results.csv
Untracked: data/m6a_TWAS_genes.txt
Untracked: data/m6a_joint_calling_peaks.csv
Untracked: data/nasser_2021_ABC_IBD_genes.txt
Untracked: data/nat_sense_pairs.csv
Untracked: data/plt_full_coloc.txt
Untracked: data/rbc_full_coloc.txt
Untracked: data/rdw_full_coloc.txt
Untracked: data/reported_AS_targets_S1.txt
Untracked: data/reported_AS_wanowska.txt
Untracked: data/sig_coloc_results/
Untracked: data/test_locuscomparer.pdf
Untracked: data/ulcerative_colitis_full_coloc.txt
Untracked: data/wbc_full_coloc.txt
Untracked: data/zhao_silver_genes.csv
Untracked: output/.ipynb_checkpoints/
Untracked: output/HMGCR_gene_track_plot.pdf
Untracked: output/HMGCR_locus_plot.pdf
Untracked: output/IBD_DHX38_plot.pdf
Untracked: output/IBD_DHX38_plot_genetrack.pdf
Untracked: output/all_m6a_sites_with_cisNATs.csv
Untracked: output/all_m6a_sites_with_paired_cisNATs_summary.csv
Untracked: output/all_m6a_sites_with_paired_cisNATs_summary_PP40.3.csv
Untracked: output/all_m6a_sites_with_paired_cisNATs_summary_PP40.5.csv
Untracked: output/all_m6a_sites_with_paired_cis_NATs.csv
Untracked: output/fine_mapped_m6AQTLs_TWAS_genes_highPP4.rds
Untracked: output/gene_summary.csv
Untracked: output/immune_related_m6A_targets.csv
Untracked: output/lupus_MIR210HG_plot.pdf
Untracked: output/lupus_MIR210HG_plot_genetrack.pdf
Untracked: output/m6aQTL_dsRNAs_PPP2R3C_PRORP.pdf
Untracked: output/m6a_QTL_genes.csv
Untracked: output/m6a_genes_PIP_0.6_blood_immune.csv
Untracked: output/m6a_genes_PIP_0.6_blood_immune.txt
Untracked: output/m6a_peaks_nearby_dsRNAs.csv
Untracked: output/m6a_sites_near_all_dsRNAs_twas.csv
Untracked: output/m6a_sites_near_dsRNAs_coloc.csv
Untracked: output/m6a_sites_near_dsRNAs_twas.csv
Untracked: output/m6a_sites_near_dsRNAs_twas_summary.csv
Untracked: output/m6a_sites_overlapping_NAT_twas.csv
Untracked: output/m6a_sites_overlapping_dsRNAs_coloc.csv
Untracked: output/m6a_sites_overlapping_dsRNAs_twas.csv
Untracked: output/m6a_sites_overlapping_dsRegions.csv
Untracked: output/m6a_sites_overlapping_dsRegions_coloc.csv
Untracked: output/negatively_correlated_genes.txt
Untracked: output/postively_correlated_genes.txt
Untracked: output/rs1806261_RABEP1-NUP88_focused_locusview.pdf
Untracked: output/rs1806261_RABEP1-NUP88_locusview.pdf
Untracked: output/rs3177647_MAPKAPK5-AS1-MAPKAPK5_locusview.pdf
Untracked: output/rs3204541_DDX55-EIF2B1_locusview.pdf
Untracked: output/rs7184802_ADCY7-BRD7_locusview.pdf
Untracked: output/rs7184802_ADCY7_locuscompare.pdf
Untracked: output/twas_genes_PP4_0.3_immune_traits_trackplots.pdf
Untracked: output/twas_genes_PP4_0.5_blood_traits_trackplots.pdf
Untracked: output/twas_m6a_sites_with_all_cisNATs.RDS
Untracked: output/twas_m6a_sites_with_cisNATs_range.RDS
Untracked: output/twas_m6a_sites_with_the_nearest_cisNAT.RDS
Untracked: twas_genes_PP4_0.3_immune_traits_trackplots.pdf
Unstaged changes:
Deleted: analysis/learn_ctwas.Rmd
Modified: analysis/lymph_m6A_output_hg19.Rmd
Modified: analysis/m6A_switch_to_disease_h2g.Rmd
Modified: analysis/rbc_m6A_output_hg19.Rmd
Modified: analysis/wbc_m6A_output.Rmd
Modified: analysis/wbc_m6A_output_hg19.Rmd
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were
made to the R Markdown
(analysis/eosinoph_m6A_output_hg19.Rmd
) and HTML
(docs/eosinoph_m6A_output_hg19.html
) files. If you’ve
configured a remote Git repository (see ?wflow_git_remote
),
click on the hyperlinks in the table below to view the files as they
were in that past version.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | fb910fa | Jing Gu | 2023-08-22 | blood traits |
# top 1 method
res <- impute_expr_z(z_snp, weight = weight, ld_R_dir = ld_R_dir,
method = NULL, outputdir = outputdir, outname = outname.e,
harmonize_z = T, harmonize_wgt = T, scale_by_ld_variance=F,
strand_ambig_action_z = "recover",
recover_strand_ambig_wgt = T
# lasso/elastic-net method
res <- impute_expr_z(z_snp, weight = weight, ld_R_dir = ld_R_dir,
method = NULL, outputdir = outputdir, outname = outname.e,
harmonize_z = T, harmonize_wgt = T, scale_by_ld_variance=F,
strand_ambig_action_z = "none",
recover_strand_ambig_wgt = F
GWAS: UK Biobank GWAS summary statistics - European individuals
Weights: FUSION weights using top1, lasso, or elastic-net models were converted into PredictDB format and were not needed to do scaling when running ctwas.
cTWAS analysis on m6A alone
Joint analysis of expression, splicing and m6A
[1] "Check convergence for the lasso model when jointly analyzing expression, splicing and m6A:"
[1] "Table of group size before/after matching with UKBB SNPs:"
SNP eQTL sQTL m6AQTL
prior_group_size 9.324e+06 2005.0000 2191.0000 918.0000
group_size 8.604e+06 1998.0000 2173.0000 912.0000
percent_of_overlaps 9.228e-01 0.9965 0.9918 0.9935
SNP eQTL sQTL m6AQTL
estimated_group_prior 9.284e-05 0.01161 0.017250 0.027690
estimated_group_prior_var 2.043e+01 21.82962 7.252449 8.885392
estimated_group_pve 9.474e-02 0.00294 0.001578 0.001302
attributable_group_pve 9.421e-01 0.02923 0.015692 0.012952
$lasso
top1 model
Summing up PIPs for m6A peaks located in the same gene
Top m6A PIPs by genes
For m6A or splicing QTLs, they are assigned to the nearest genes (m6A needs to be confirmed with Kevin).
Top SNPs or genes with PIP > 0.6
$eQTL
genename susie_pip group region_tag
1934 EFEMP2 0.9973 eQTL 11_36
1907 ENSG00000180139 0.9243 eQTL 10_57
1948 RPS26 0.8111 eQTL 12_36
1864 CDPF1 0.7878 eQTL 22_21
1378 POLR2C 0.7516 eQTL 16_31
1459 ENSG00000264538 0.7162 eQTL 17_18
35 FUCA1 0.6637 eQTL 1_17
1367 TBX6 0.6361 eQTL 16_24
56 PHC2 0.6203 eQTL 1_21
635 MDN1 0.6106 eQTL 6_61
$m6AQTL
genename susie_pip group region_tag
5050 ZER1 0.9781 m6AQTL 9_66
5042 PTK2B 0.9591 m6AQTL 8_27
5059 SLC43A3 0.9219 m6AQTL 11_33
5044 SH2D3C 0.7874 m6AQTL 9_66
4487 MEPCE 0.7451 m6AQTL 7_61
4207 TRMT13 0.6013 m6AQTL 1_62
$sQTL
genename susie_pip group region_tag
4059 STAT1 0.9202 sQTL 2_113
4144 BAX 0.8734 sQTL 19_34
2641 IRF1 0.7121 sQTL 5_79
2538 RHOH 0.6456 sQTL 4_33
3868 C19orf54 0.6387 sQTL 19_28
3599 NLRC5 0.6058 sQTL 16_31
genename region_tag susie_pip z
1 ZER1 9_66 0.9781 -4.523
2 PTK2B 8_27 0.9591 5.080
3 SLC43A3 11_33 0.9219 -5.139
4 SH2D3C 9_66 0.7874 3.962
5 MEPCE 7_61 0.7451 3.425
6 TRMT13 1_62 0.6013 -3.538
7 SRPK1 6_29 0.5901 -5.543
8 FBXO34 14_25 0.5776 -5.179
9 RRP1B 21_23 0.5604 -3.859
10 WAC-AS1 10_20 0.5411 6.687
Summing up PIPs for m6A peaks located in the same gene
Top 10 m6A PIPs by genes
# A tibble: 819 × 2
genename total_susie_pip
<chr> <dbl>
1 ZER1 0.978
2 SH2D3C 0.975
3 PTK2B 0.959
4 SLC43A3 0.922
5 MEPCE 0.856
6 RRP1B 0.751
7 WAC-AS1 0.619
8 TRMT13 0.601
9 SRPK1 0.590
10 FBXO34 0.578
# ℹ 809 more rows
peak_id genename pos region_tag susie_pip z
1 chr2:191851794-191855954 STAT1 191770613 2_113 0.9202 -5.123
2 chr19:49458856-49459455 BAX 49442933 19_34 0.8734 -6.336
3 chr5:131823717-131825084 IRF1 131741498 5_79 0.7121 6.240
4 chr4:40198920-40236797 RHOH 40224043 4_33 0.6456 3.582
5 chr19:41247344-41249820 C19orf54 41168196 19_28 0.6387 -3.815
6 chr16:57079404-57081457 NLRC5 57069650 16_31 0.6058 3.446
7 chr17:46211178-46214591 SKAP1 46123642 17_28 0.5769 -4.738
8 chr19:42626728-42628331 POU2F2 42542466 19_29 0.5392 -4.750
9 chr11:65770541-65770706 BANF1 65714925 11_36 0.4871 7.791
10 chr7:75625917-75633076 STYXL1 75588366 7_48 0.4407 4.938
Summing up PIPs for spliced introns located in the same gene
Top 10 splicing PIPs by genes
# A tibble: 10 × 2
genename total_susie_pip
<chr> <dbl>
1 SP140 1.12
2 STAT1 1.04
3 STYXL1 0.930
4 BAX 0.901
5 WARS1 0.897
6 IFI44L 0.880
7 IRF1 0.875
8 MCOLN2 0.866
9 POU2AF1 0.851
10 BCL2A1 0.841
genename combined_pip expression_pip splicing_pip m6A_pip
2799 SP140 1.120 0.000e+00 1.1200 0.0000
2381 PTK2B 1.102 0.000e+00 0.1434 0.9591
2850 STAT1 1.040 0.000e+00 1.0397 0.0000
801 EFEMP2 1.016 9.973e-01 0.0186 0.0000
1552 IFI44L 1.007 0.000e+00 0.8801 0.1265
3301 ZER1 0.978 0.000e+00 0.0000 0.9781
2670 SH2D3C 0.975 0.000e+00 0.0000 0.9751
2870 STYXL1 0.930 1.401e-05 0.9304 0.0000
1790 MCOLN2 0.928 6.141e-02 0.8661 0.0000
874 ENSG00000180139 0.924 9.243e-01 0.0000 0.0000
2729 SLC43A3 0.922 0.000e+00 0.0000 0.9219
3239 WARS1 0.922 2.449e-02 0.8975 0.0000
283 BAX 0.901 0.000e+00 0.9013 0.0000
2294 POU2AF1 0.883 3.193e-02 0.8508 0.0000
1593 IRF1 0.875 1.091e-05 0.8750 0.0000
1813 MEPCE 0.856 0.000e+00 0.0000 0.8555
2295 POU2F2 0.849 3.097e-01 0.5392 0.0000
291 BCL2A1 0.841 0.000e+00 0.8411 0.0000
1518 HNRNPK 0.841 0.000e+00 0.7180 0.1230
3001 TMEM156 0.839 2.107e-01 0.6282 0.0000
2105 OAS1 0.838 0.000e+00 0.6796 0.1588
2577 RPS26 0.811 8.111e-01 0.0000 0.0000
region_tag
2799 2_135
2381 8_27
2850 2_113
801 11_36
1552 1_48
3301 9_66
2670 9_66
2870 7_48
1790 1_52
874 10_57
2729 11_33
3239 14_52
283 19_34
2294 11_66
1593 5_79
1813 7_61
2295 19_29
291 15_37
1518 9_41
3001 4_32
2105 12_68
2577 12_36
Loading required package: grid
Warning: replacing previous import 'utils::download.file' by
'restfulr::download.file' when loading 'rtracklayer'
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS/LAPACK: /software/openblas-0.3.13-el7-x86_64/lib/libopenblas_haswellp-r0.3.13.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=C
[4] LC_COLLATE=C LC_MONETARY=C LC_MESSAGES=C
[7] LC_PAPER=C LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=C LC_IDENTIFICATION=C
attached base packages:
[1] grid stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] biomaRt_2.52.0 Gviz_1.40.1 cowplot_1.1.1
[4] ggplot2_3.4.3 GenomicRanges_1.48.0 GenomeInfoDb_1.32.2
[7] IRanges_2.30.1 S4Vectors_0.34.0 BiocGenerics_0.42.0
[10] ctwas_0.1.38 dplyr_1.1.2 workflowr_1.7.0
loaded via a namespace (and not attached):
[1] colorspace_2.1-0 deldir_1.0-6
[3] rjson_0.2.21 rprojroot_2.0.3
[5] biovizBase_1.44.0 htmlTable_2.4.0
[7] XVector_0.36.0 base64enc_0.1-3
[9] fs_1.6.3 dichromat_2.0-0.1
[11] rstudioapi_0.15.0 farver_2.1.1
[13] bit64_4.0.5 AnnotationDbi_1.58.0
[15] fansi_1.0.4 xml2_1.3.3
[17] codetools_0.2-18 logging_0.10-108
[19] cachem_1.0.8 knitr_1.39
[21] Formula_1.2-4 jsonlite_1.8.7
[23] Rsamtools_2.12.0 cluster_2.1.3
[25] dbplyr_2.3.3 png_0.1-7
[27] compiler_4.2.0 httr_1.4.7
[29] backports_1.4.1 lazyeval_0.2.2
[31] Matrix_1.6-1 fastmap_1.1.1
[33] cli_3.6.1 later_1.3.0
[35] htmltools_0.5.2 prettyunits_1.1.1
[37] tools_4.2.0 gtable_0.3.3
[39] glue_1.6.2 GenomeInfoDbData_1.2.8
[41] rappdirs_0.3.3 Rcpp_1.0.11
[43] Biobase_2.56.0 jquerylib_0.1.4
[45] vctrs_0.6.3 Biostrings_2.64.0
[47] rtracklayer_1.56.0 iterators_1.0.14
[49] xfun_0.30 stringr_1.5.0
[51] ps_1.7.0 lifecycle_1.0.3
[53] ensembldb_2.20.2 restfulr_0.0.14
[55] XML_3.99-0.14 getPass_0.2-2
[57] zlibbioc_1.42.0 scales_1.2.1
[59] BSgenome_1.64.0 VariantAnnotation_1.42.1
[61] ProtGenerics_1.28.0 hms_1.1.3
[63] promises_1.2.0.1 MatrixGenerics_1.8.0
[65] parallel_4.2.0 SummarizedExperiment_1.26.1
[67] AnnotationFilter_1.20.0 RColorBrewer_1.1-3
[69] yaml_2.3.5 curl_5.0.2
[71] memoise_2.0.1 gridExtra_2.3
[73] sass_0.4.1 rpart_4.1.16
[75] latticeExtra_0.6-30 stringi_1.7.12
[77] RSQLite_2.3.1 highr_0.9
[79] BiocIO_1.6.0 foreach_1.5.2
[81] checkmate_2.1.0 GenomicFeatures_1.48.4
[83] filelock_1.0.2 BiocParallel_1.30.3
[85] rlang_1.1.1 pkgconfig_2.0.3
[87] matrixStats_0.62.0 bitops_1.0-7
[89] evaluate_0.15 lattice_0.20-45
[91] htmlwidgets_1.5.4 GenomicAlignments_1.32.0
[93] labeling_0.4.2 bit_4.0.5
[95] processx_3.8.0 tidyselect_1.2.0
[97] magrittr_2.0.3 R6_2.5.1
[99] generics_0.1.3 Hmisc_5.1-0
[101] DelayedArray_0.22.0 DBI_1.1.3
[103] pgenlibr_0.3.6 pillar_1.9.0
[105] whisker_0.4 foreign_0.8-82
[107] withr_2.5.0 KEGGREST_1.36.2
[109] RCurl_1.98-1.7 nnet_7.3-17
[111] tibble_3.2.1 crayon_1.5.2
[113] interp_1.1-4 utf8_1.2.3
[115] BiocFileCache_2.4.0 rmarkdown_2.14
[117] jpeg_0.1-10 progress_1.2.2
[119] data.table_1.14.8 blob_1.2.4
[121] callr_3.7.3 git2r_0.30.1
[123] digest_0.6.33 httpuv_1.6.5
[125] munsell_0.5.0 bslib_0.3.1