Software Resources

In our recent work, "From Chemoproteomic-Detected Amino Acids to Genomic Coordinates: Insights into Precise Multi-omic Data Integration", we investigated potential sources of mapping errors in large-scale data integration pipelines and developed guidelines for future chemoproteomic-genetic studies. With an optimized workflow in hand, we then mapped publicly available Chemoproteomic-Detected Amino Acids (CpDAA) and equivalent undetected amino acids in 3,840 proteins to genome-based predictions of missense deleteriousness and to known disease-associated mutations from the ClinVar database. Our analysis revealed that detected lysines are enriched for harmful mutations compared to undetected lysines, and that the opposite is true for cysteine residues. Interestingly, more reactive cysteines were associated with higher deleteriousness scores than less reactive CpD cysteines. Lastly, functional validation with the cysteine protease caspase-8 showcases how chemoproteomic measurements can complement genetic-based annotations to accurately identify functional amino acids in the human proteome.
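As a rough illustration of what mapping an amino acid to genomic coordinates involves, the sketch below converts a 1-based residue position into the three genomic positions of its codon, given the coding segments of a transcript. It is not the workflow from the manuscript: the function name `residue_to_genomic`, its parameters, and the exon coordinates are hypothetical, and it ignores the error sources the manuscript investigates.

```python
from typing import List, Tuple


def residue_to_genomic(
    residue: int,
    cds_exons: List[Tuple[int, int]],
    strand: str = "+",
) -> List[int]:
    """Return the three genomic positions of the codon encoding `residue`.

    Hypothetical helper for illustration only.
    residue:   1-based amino acid position in the protein.
    cds_exons: coding segments as (start, end), 1-based inclusive,
               listed in ascending genomic order.
    strand:    "+" or "-".
    """
    if residue < 1:
        raise ValueError("residue must be >= 1")
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")

    # Flatten every coding base into one list in ascending genomic order.
    positions = [p for start, end in cds_exons for p in range(start, end + 1)]

    # On the minus strand the transcript reads in descending genomic order.
    if strand == "-":
        positions.reverse()

    # Codon i occupies coding offsets 3*(i-1) .. 3*(i-1)+2.
    first = 3 * (residue - 1)
    codon = positions[first:first + 3]
    if len(codon) < 3:
        raise ValueError("residue lies beyond the end of the coding sequence")
    return codon


# Made-up coordinates: two coding segments, 12 bases, 4 codons.
exons = [(100, 104), (200, 206)]
print(residue_to_genomic(2, exons))        # codon spans the exon junction
print(residue_to_genomic(1, exons, "-"))   # minus strand starts at the far end
```

Note that a codon can straddle two exons, so its three bases are not always contiguous in the genome; returning explicit positions rather than a (start, end) range keeps that case visible.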

Final manuscript available here:

https://www.biorxiv.org/content/10.1101/2020.07.03.186007v2

To explore the data:

http://mfpalafox.shinyapps.io/CpDAA

GitHub: https://mfpfox.github.io/MAPPING/


We developed DMRscaler, a method to identify differentially methylated regions (DMRs) across the full range of genomic scale. Biologically relevant epigenetic features occur at every scale, from modifications of single basepairs that affect transcription factor binding to genome-wide epigenetic effects in gametogenesis and early development, and anywhere between these extremes. We study rare genetic diseases caused by mutations in chromatin modifier genes, where the size of the downstream genomic region of effect is potentially hypervariable. Methods that accurately identify the scale of differential epigenetic features are therefore critical to developing our understanding of these diseases. In our recent work, "DMRscaler: A Scale-Aware Method to Identify Regions of Differential DNA Methylation Spanning Basepair to Multi-Megabase Features", we demonstrate in simulated and real data that DMRscaler outperforms existing methods at identifying the scale of DMRs ranging from < 100 bp to > 100 Mb in length. In analyses of KAT6A, Sotos, and Weaver syndromes, we show how this can be leveraged to identify higher-level features of genome organization, such as gene clusters, that act as the unit of altered epigenetic state.

Preprint: https://www.biorxiv.org/content/10.1101/2021.02.03.428187v2

Code: https://leroybondhus.github.io/DMRscaler/