Last updated: 2023-08-22

Checks: 6 1

Knit directory: m6A_in_disease_genetics/

This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

The command set.seed(20230331) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Using absolute paths to the files within your workflowr project makes it difficult for you and others to run your code on a different machine. Change the absolute path(s) below to the suggested relative path(s) to make your code more reproducible.

absolute relative
~/projects/m6A_in_disease_genetics/code/ctwas/ctwas_config_b37.R code/ctwas/ctwas_config_b37.R
~/projects/m6A_in_disease_genetics/code/ctwas/qiansheng/locus_plot.R code/ctwas/qiansheng/locus_plot.R

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version fb910fa. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .ipynb_checkpoints/
    Ignored:    analysis/m6A_switch_to_disease_h2g.nb.html
    Ignored:    data/plots/

Untracked files:
    Untracked:  HMGCR_locus_gene_tracks.pdf
    Untracked:  Rplots.pdf
    Untracked:  analysis/.ipynb_checkpoints/
    Untracked:  analysis/CD_m6A_output_hg19.Rmd
    Untracked:  analysis/IBD_E_S_m6A.Rmd
    Untracked:  analysis/IBD_E_S_m6A_output.Rmd
    Untracked:  analysis/LDL_E_S_m6A.Rmd
    Untracked:  analysis/LDL_m6A_output.Rmd
    Untracked:  analysis/RA_m6A_output.Rmd
    Untracked:  analysis/WhiteBlood_WholeBlood_E_M.Rmd
    Untracked:  analysis/asthma_m6A_output_hg19.Rmd
    Untracked:  analysis/identify_m6A_mechanisms_with_finemapping.Rmd
    Untracked:  analysis/lymph_m6A_output.Rmd
    Untracked:  analysis/pre_weights_m6AQTL.txt
    Untracked:  analysis/rbc_E_S_m6A_output.Rmd
    Untracked:  analysis/rbc_m6A_output.Rmd
    Untracked:  analysis/summarize_ctwas_m6A_results.Rmd
    Untracked:  analysis/wbc_E_S_m6A_output.Rmd
    Untracked:  code/.ipynb_checkpoints/
    Untracked:  code/all_m6a_sites_with_paired_cisNATs_summary.csv
    Untracked:  code/annotating_fine-mapped_m6A_QTLs.Rmd
    Untracked:  code/check_double_strand.ipynb
    Untracked:  code/check_double_strand_v2.ipynb
    Untracked:  code/ctwas/
    Untracked:  code/figure/
    Untracked:  code/learn_gviz.Rmd
    Untracked:  code/learn_gviz.html
    Untracked:  code/learn_gviz.nb.html
    Untracked:  code/m6AQTL_finemapping.Rmd
    Untracked:  code/plot_genomic_tracks_gviz.ipynb
    Untracked:  code/summary_TWAS_coloc_m6A_2023.Rmd
    Untracked:  code/test_gviz.ipynb
    Untracked:  code/twas_genes_PP4_0.3_immune_traits_trackplots.pdf
    Untracked:  data/.ipynb_checkpoints/
    Untracked:  data/ADCY7_gwas_input.tsv
    Untracked:  data/ADCY7_qtl_input.tsv
    Untracked:  data/Allergy_full_coloc.txt
    Untracked:  data/Asthma_full_coloc.txt
    Untracked:  data/CAD_full_coloc.txt
    Untracked:  data/Eosinophil_count_full_coloc.txt
    Untracked:  data/GSE125377_jointPeakReadCount.txt
    Untracked:  data/G_list.Rd
    Untracked:  data/HMGCR_ctwas_dat.Rd
    Untracked:  data/IBD_full_coloc.txt
    Untracked:  data/JointPeaks.bed
    Untracked:  data/Li2022_dsRNAs.xlsx
    Untracked:  data/Lupus_full_coloc.txt
    Untracked:  data/RA_full_coloc.txt
    Untracked:  data/TABLE1_hg19.txt
    Untracked:  data/TABLE1_hg19.txt.zip
    Untracked:  data/__MACOSX/
    Untracked:  data/coloc_blood_traits.csv
    Untracked:  data/crohns_disease_full_coloc.txt
    Untracked:  data/ctwas_m6a_joint_top_PIP.txt
    Untracked:  data/edit_sites_and_GE_neg_correlated.txt
    Untracked:  data/edit_sites_and_GE_pos_correlated.txt
    Untracked:  data/features
    Untracked:  data/human_EERs.csv
    Untracked:  data/human_EERs.txt
    Untracked:  data/lymph_full_coloc.txt
    Untracked:  data/m6A_TWAS_results.csv
    Untracked:  data/m6a_TWAS_genes.txt
    Untracked:  data/m6a_joint_calling_peaks.csv
    Untracked:  data/nasser_2021_ABC_IBD_genes.txt
    Untracked:  data/nat_sense_pairs.csv
    Untracked:  data/plt_full_coloc.txt
    Untracked:  data/rbc_full_coloc.txt
    Untracked:  data/rdw_full_coloc.txt
    Untracked:  data/reported_AS_targets_S1.txt
    Untracked:  data/reported_AS_wanowska.txt
    Untracked:  data/sig_coloc_results/
    Untracked:  data/test_locuscomparer.pdf
    Untracked:  data/ulcerative_colitis_full_coloc.txt
    Untracked:  data/wbc_full_coloc.txt
    Untracked:  data/zhao_silver_genes.csv
    Untracked:  output/.ipynb_checkpoints/
    Untracked:  output/HMGCR_gene_track_plot.pdf
    Untracked:  output/HMGCR_locus_plot.pdf
    Untracked:  output/IBD_DHX38_plot.pdf
    Untracked:  output/IBD_DHX38_plot_genetrack.pdf
    Untracked:  output/all_m6a_sites_with_cisNATs.csv
    Untracked:  output/all_m6a_sites_with_paired_cisNATs_summary.csv
    Untracked:  output/all_m6a_sites_with_paired_cisNATs_summary_PP40.3.csv
    Untracked:  output/all_m6a_sites_with_paired_cisNATs_summary_PP40.5.csv
    Untracked:  output/all_m6a_sites_with_paired_cis_NATs.csv
    Untracked:  output/fine_mapped_m6AQTLs_TWAS_genes_highPP4.rds
    Untracked:  output/gene_summary.csv
    Untracked:  output/immune_related_m6A_targets.csv
    Untracked:  output/lupus_MIR210HG_plot.pdf
    Untracked:  output/lupus_MIR210HG_plot_genetrack.pdf
    Untracked:  output/m6aQTL_dsRNAs_PPP2R3C_PRORP.pdf
    Untracked:  output/m6a_QTL_genes.csv
    Untracked:  output/m6a_genes_PIP_0.6_blood_immune.csv
    Untracked:  output/m6a_genes_PIP_0.6_blood_immune.txt
    Untracked:  output/m6a_peaks_nearby_dsRNAs.csv
    Untracked:  output/m6a_sites_near_all_dsRNAs_twas.csv
    Untracked:  output/m6a_sites_near_dsRNAs_coloc.csv
    Untracked:  output/m6a_sites_near_dsRNAs_twas.csv
    Untracked:  output/m6a_sites_near_dsRNAs_twas_summary.csv
    Untracked:  output/m6a_sites_overlapping_NAT_twas.csv
    Untracked:  output/m6a_sites_overlapping_dsRNAs_coloc.csv
    Untracked:  output/m6a_sites_overlapping_dsRNAs_twas.csv
    Untracked:  output/m6a_sites_overlapping_dsRegions.csv
    Untracked:  output/m6a_sites_overlapping_dsRegions_coloc.csv
    Untracked:  output/negatively_correlated_genes.txt
    Untracked:  output/postively_correlated_genes.txt
    Untracked:  output/rs1806261_RABEP1-NUP88_focused_locusview.pdf
    Untracked:  output/rs1806261_RABEP1-NUP88_locusview.pdf
    Untracked:  output/rs3177647_MAPKAPK5-AS1-MAPKAPK5_locusview.pdf
    Untracked:  output/rs3204541_DDX55-EIF2B1_locusview.pdf
    Untracked:  output/rs7184802_ADCY7-BRD7_locusview.pdf
    Untracked:  output/rs7184802_ADCY7_locuscompare.pdf
    Untracked:  output/twas_genes_PP4_0.3_immune_traits_trackplots.pdf
    Untracked:  output/twas_genes_PP4_0.5_blood_traits_trackplots.pdf
    Untracked:  output/twas_m6a_sites_with_all_cisNATs.RDS
    Untracked:  output/twas_m6a_sites_with_cisNATs_range.RDS
    Untracked:  output/twas_m6a_sites_with_the_nearest_cisNAT.RDS
    Untracked:  twas_genes_PP4_0.3_immune_traits_trackplots.pdf

Unstaged changes:
    Deleted:    analysis/learn_ctwas.Rmd
    Modified:   analysis/lymph_m6A_output_hg19.Rmd
    Modified:   analysis/m6A_switch_to_disease_h2g.Rmd
    Modified:   analysis/rbc_m6A_output_hg19.Rmd
    Modified:   analysis/wbc_m6A_output.Rmd
    Modified:   analysis/wbc_m6A_output_hg19.Rmd

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/granulo_m6A_output_hg19.Rmd) and HTML (docs/granulo_m6A_output_hg19.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd fb910fa Jing Gu 2023-08-22 blood traits

Load ctwas results

# top 1 method
res <- impute_expr_z(z_snp, weight = weight, ld_R_dir = ld_R_dir,
                         method = NULL, outputdir = outputdir, outname = outname.e,
                         harmonize_z = T, harmonize_wgt = T, scale_by_ld_variance=F,
                         strand_ambig_action_z = "recover", 
                         recover_strand_ambig_wgt = T
# lasso/elastic-net method
res <- impute_expr_z(z_snp, weight = weight, ld_R_dir = ld_R_dir,
                         method = NULL, outputdir = outputdir, outname = outname.e,
                         harmonize_z = T, harmonize_wgt = T, scale_by_ld_variance=F,
                         strand_ambig_action_z = "none", 
                         recover_strand_ambig_wgt = F

GWAS: UK Biobank GWAS summary statistics - European individuals

Weights: FUSION weights using top1, lasso, or elastic-net models were converted into PredictDB format and were not needed to do scaling when running ctwas.

Check convergence of parameters

cTWAS analysis on m6A alone

Joint analysis of expression, splicing and m6A

[1] "Check convergence for the lasso model when jointly analyzing expression, splicing and m6A:"
[1] "Table of group size before/after matching with UKBB SNPs:"
                          SNP      eQTL      sQTL   m6AQTL
prior_group_size    9.324e+06 2005.0000 2191.0000 918.0000
group_size          8.604e+06 1998.0000 2173.0000 912.0000
percent_of_overlaps 9.228e-01    0.9965    0.9918   0.9935
                                SNP     eQTL     sQTL    m6AQTL
estimated_group_prior     1.305e-04 0.020898 0.016981 1.505e-03
estimated_group_prior_var 1.376e+01 8.665924 8.302996 7.340e+00
estimated_group_pve       9.096e-02 0.002131 0.001804 5.934e-05
attributable_group_pve    9.579e-01 0.022440 0.019001 6.250e-04
$lasso

cTWAS results for individual analysis with m6A

top1 model

Summing up PIPs for m6A peaks located in the same gene

Top m6A PIPs by genes

cTWAS results for joint analysis using a lasso model

Top m6A modification pip

Top expression/splicing/m6A units

For m6A or splicing QTLs, they are assigned to the nearest genes (m6A needs to be confirmed with Kevin).

Top SNPs or genes with PIP > 0.6

$eQTL
            genename susie_pip group region_tag
1953         CSNK1G1    0.9704  eQTL      15_29
1945 ENSG00000256092    0.8922  eQTL      12_75
1968          TRIM47    0.8243  eQTL      17_42
1980           CCDC9    0.7865  eQTL      19_34
1914 ENSG00000224032    0.7738  eQTL       5_66
251           TGOLN2    0.7546  eQTL       2_54
225             XPO1    0.7397  eQTL       2_40
272             KYNU    0.6859  eQTL       2_85
1507          RNF135    0.6708  eQTL      17_18
143           NDUFS2    0.6626  eQTL       1_81

$m6AQTL
[1] genename   susie_pip  group      region_tag
<0 rows> (or 0-length row.names)

$sQTL
     genename susie_pip group region_tag
4083    MYO1G    0.8975  sQTL       7_33
4122 C15orf40    0.7757  sQTL      15_38
3199    FNBP4    0.7561  sQTL      11_29
3093   PDLIM1    0.7198  sQTL      10_61
2370    ARPC2    0.6074  sQTL      2_129

Top m6A modification pip

   genename region_tag susie_pip      z
1     TAPBP       6_28   0.55284 -5.227
2     S1PR2      19_10   0.23315  5.169
3    AKAP13      15_39   0.17981  4.118
4    TGOLN2       2_54   0.13742 -4.766
5    DENND3       8_92   0.11619  4.474
6  KIAA0556      16_23   0.10229  4.080
7    SQSTM1      5_108   0.10079 -3.661
8      GFI1       1_56   0.09763  4.444
9     TAPBP       6_28   0.07979  4.769
10  ZKSCAN5       7_61   0.07455  3.258

Summing up PIPs for m6A peaks located in the same gene

Top 10 m6A PIPs by genes

# A tibble: 819 × 2
   genename total_susie_pip
   <chr>              <dbl>
 1 TAPBP             0.633 
 2 S1PR2             0.233 
 3 AKAP13            0.185 
 4 TGOLN2            0.137 
 5 DENND3            0.116 
 6 KIAA0556          0.102 
 7 SQSTM1            0.101 
 8 GFI1              0.0976
 9 ZKSCAN5           0.0746
10 TRAF2             0.0676
# ℹ 809 more rows

Top splicing PIPs

                    peak_id genename       pos region_tag susie_pip      z
1    chr7:45009474-45009639    MYO1G  44925489       7_33    0.8975 -6.709
2   chr15:83660798-83677300 C15orf40  83585402      15_38    0.7757 -3.758
3   chr11:47761655-47765505    FNBP4  47684908      11_29    0.7561  5.384
4   chr10:97007123-97023621   PDLIM1  97001124      10_61    0.7198 -4.569
5  chr2:219093573-219110143    ARPC2 219007060      2_129    0.6074  6.548
6    chr9:86593367-86595418   HNRNPK  86592026       9_41    0.3987  4.652
7   chr21:35276325-35284593   ATP5PO  35268649      21_15    0.3821  3.888
8  chr3:119582452-119624602    GSK3B 119503971       3_74    0.3786  3.340
9   chr13:76143643-76164074    UCHL3  76084334      13_37    0.3730  3.170
10 chr7:105902153-105903838    NAMPT 105865405       7_66    0.3640  3.468

Summing up PIPs for spliced introns located in the same gene

Top 10 splicing PIPs by genes

# A tibble: 10 × 2
   genename total_susie_pip
   <chr>              <dbl>
 1 SYNCRIP            0.972
 2 CD46               0.913
 3 MYO1G              0.898
 4 MCOLN2             0.867
 5 C15orf40           0.776
 6 HNRNPK             0.762
 7 FNBP4              0.756
 8 NCAPG2             0.747
 9 PDLIM1             0.720
10 WARS1              0.688

Top genes by combined PIP

            genename combined_pip expression_pip splicing_pip m6A_pip
624          CSNK1G1        0.977        0.97038      0.00000 0.00645
2887         SYNCRIP        0.972        0.00000      0.97221 0.00000
1790          MCOLN2        0.966        0.09932      0.86690 0.00000
471             CD46        0.913        0.00000      0.91258 0.00000
1957           MYO1G        0.898        0.00000      0.89752 0.00000
1095 ENSG00000256092        0.892        0.89223      0.00000 0.00000
2955          TGOLN2        0.892        0.75461      0.00000 0.13742
252           ATP5PO        0.845        0.19256      0.65272 0.00000
3092          TRIM47        0.824        0.82429      0.00000 0.00000
437            CCDC9        0.786        0.78649      0.00000 0.00000
1518          HNRNPK        0.778        0.00000      0.76220 0.01584
351         C15orf40        0.776        0.00000      0.77573 0.00000
935  ENSG00000224032        0.774        0.77375      0.00000 0.00000
1340           FNBP4        0.756        0.00000      0.75607 0.00000
1987          NCAPG2        0.747        0.00000      0.74699 0.00000
3239           WARS1        0.741        0.05314      0.68821 0.00000
3268            XPO1        0.740        0.73973      0.00000 0.00000
1663            KYNU        0.727        0.68590      0.04079 0.00000
2182          PDLIM1        0.720        0.00000      0.71980 0.00000
2066            NSA2        0.703        0.20269      0.50080 0.00000
     region_tag
624       15_29
2887       6_58
1790       1_52
471       1_107
1957       7_33
1095      12_75
2955       2_54
252       21_15
3092      17_42
437       19_34
1518       9_41
351       15_38
935        5_66
1340      11_29
1987       7_99
3239      14_52
3268       2_40
1663       2_85
2182      10_61
2066       5_44
Loading required package: grid
Warning: replacing previous import 'utils::download.file' by
'restfulr::download.file' when loading 'rtracklayer'

Locus plots for specific examples


R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /software/openblas-0.3.13-el7-x86_64/lib/libopenblas_haswellp-r0.3.13.so

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C         LC_TIME=C           
 [4] LC_COLLATE=C         LC_MONETARY=C        LC_MESSAGES=C       
 [7] LC_PAPER=C           LC_NAME=C            LC_ADDRESS=C        
[10] LC_TELEPHONE=C       LC_MEASUREMENT=C     LC_IDENTIFICATION=C 

attached base packages:
[1] grid      stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] biomaRt_2.52.0       Gviz_1.40.1          cowplot_1.1.1       
 [4] ggplot2_3.4.3        GenomicRanges_1.48.0 GenomeInfoDb_1.32.2 
 [7] IRanges_2.30.1       S4Vectors_0.34.0     BiocGenerics_0.42.0 
[10] ctwas_0.1.38         dplyr_1.1.2          workflowr_1.7.0     

loaded via a namespace (and not attached):
  [1] colorspace_2.1-0            deldir_1.0-6               
  [3] rjson_0.2.21                rprojroot_2.0.3            
  [5] biovizBase_1.44.0           htmlTable_2.4.0            
  [7] XVector_0.36.0              base64enc_0.1-3            
  [9] fs_1.6.3                    dichromat_2.0-0.1          
 [11] rstudioapi_0.15.0           farver_2.1.1               
 [13] bit64_4.0.5                 AnnotationDbi_1.58.0       
 [15] fansi_1.0.4                 xml2_1.3.3                 
 [17] codetools_0.2-18            logging_0.10-108           
 [19] cachem_1.0.8                knitr_1.39                 
 [21] Formula_1.2-4               jsonlite_1.8.7             
 [23] Rsamtools_2.12.0            cluster_2.1.3              
 [25] dbplyr_2.3.3                png_0.1-7                  
 [27] compiler_4.2.0              httr_1.4.7                 
 [29] backports_1.4.1             lazyeval_0.2.2             
 [31] Matrix_1.6-1                fastmap_1.1.1              
 [33] cli_3.6.1                   later_1.3.0                
 [35] htmltools_0.5.2             prettyunits_1.1.1          
 [37] tools_4.2.0                 gtable_0.3.3               
 [39] glue_1.6.2                  GenomeInfoDbData_1.2.8     
 [41] rappdirs_0.3.3              Rcpp_1.0.11                
 [43] Biobase_2.56.0              jquerylib_0.1.4            
 [45] vctrs_0.6.3                 Biostrings_2.64.0          
 [47] rtracklayer_1.56.0          iterators_1.0.14           
 [49] xfun_0.30                   stringr_1.5.0              
 [51] ps_1.7.0                    lifecycle_1.0.3            
 [53] ensembldb_2.20.2            restfulr_0.0.14            
 [55] XML_3.99-0.14               getPass_0.2-2              
 [57] zlibbioc_1.42.0             scales_1.2.1               
 [59] BSgenome_1.64.0             VariantAnnotation_1.42.1   
 [61] ProtGenerics_1.28.0         hms_1.1.3                  
 [63] promises_1.2.0.1            MatrixGenerics_1.8.0       
 [65] parallel_4.2.0              SummarizedExperiment_1.26.1
 [67] AnnotationFilter_1.20.0     RColorBrewer_1.1-3         
 [69] yaml_2.3.5                  curl_5.0.2                 
 [71] memoise_2.0.1               gridExtra_2.3              
 [73] sass_0.4.1                  rpart_4.1.16               
 [75] latticeExtra_0.6-30         stringi_1.7.12             
 [77] RSQLite_2.3.1               highr_0.9                  
 [79] BiocIO_1.6.0                foreach_1.5.2              
 [81] checkmate_2.1.0             GenomicFeatures_1.48.4     
 [83] filelock_1.0.2              BiocParallel_1.30.3        
 [85] rlang_1.1.1                 pkgconfig_2.0.3            
 [87] matrixStats_0.62.0          bitops_1.0-7               
 [89] evaluate_0.15               lattice_0.20-45            
 [91] htmlwidgets_1.5.4           GenomicAlignments_1.32.0   
 [93] labeling_0.4.2              bit_4.0.5                  
 [95] processx_3.8.0              tidyselect_1.2.0           
 [97] magrittr_2.0.3              R6_2.5.1                   
 [99] generics_0.1.3              Hmisc_5.1-0                
[101] DelayedArray_0.22.0         DBI_1.1.3                  
[103] pgenlibr_0.3.6              pillar_1.9.0               
[105] whisker_0.4                 foreign_0.8-82             
[107] withr_2.5.0                 KEGGREST_1.36.2            
[109] RCurl_1.98-1.7              nnet_7.3-17                
[111] tibble_3.2.1                crayon_1.5.2               
[113] interp_1.1-4                utf8_1.2.3                 
[115] BiocFileCache_2.4.0         rmarkdown_2.14             
[117] jpeg_0.1-10                 progress_1.2.2             
[119] data.table_1.14.8           blob_1.2.4                 
[121] callr_3.7.3                 git2r_0.30.1               
[123] digest_0.6.33               httpuv_1.6.5               
[125] munsell_0.5.0               bslib_0.3.1