MalariaSED (malaria parasite sequence decoder) is a sequence-based deep learning (DL) framework for malaria parasites that helps explain how noncoding variants contribute to epigenetic profiles. The current version predicts variant effects on eight chromatin profiles: open chromatin accessibility, H3K9ac, and six transcription factors (PfAP2-G, PfAP2-I, PfBDP1, PfAP2-G5, PbAP2-O, and PbAP2-G2). These profiles cover different parasite environments, such as the mosquito host, the human liver, and human blood cells.

We provide two input formats for users to compute the chromatin profile changes:

The VCF format requires that the first four columns of the input file contain the chromosome ID, the genomic variant position, the reference nucleotide (or the nucleotide sequence, for an insertion or deletion), and the alternative nucleotide ('*' for a deletion). Each nucleotide sequence should be shorter than 1 kb, and at most 200 rows are supported per submission.
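The VCF constraints above can be checked locally before submitting a file. The following is a minimal sketch, not part of MalariaSED: the function name check_variant_rows and the constants are hypothetical, and it assumes tab-separated columns, treats lines starting with '#' as headers, and uses a placeholder chromosome ID in the usage note.

```python
MAX_ROWS = 200          # "we only support 200 rows each time"
MAX_ALLELE_LEN = 1000   # alleles must be shorter than 1 kb


def check_variant_rows(text):
    """Parse the first four columns of each row and enforce the stated limits.

    Returns a list of (chrom, pos, ref, alt) tuples, or raises ValueError.
    Assumes tab-separated columns; '#' header lines and blank lines are skipped.
    """
    rows = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            raise ValueError(f"line {line_no}: expected at least 4 columns")
        chrom, pos, ref, alt = (f.strip() for f in fields[:4])
        if not pos.isdigit():
            raise ValueError(f"line {line_no}: position must be an integer")
        for allele in (ref, alt):
            if len(allele) >= MAX_ALLELE_LEN:
                raise ValueError(f"line {line_no}: allele is not shorter than 1 kb")
        rows.append((chrom, int(pos), ref, alt))
    if len(rows) > MAX_ROWS:
        raise ValueError(f"{len(rows)} rows given; at most {MAX_ROWS} are supported")
    return rows
```

For example, check_variant_rows("chr1\t1000\tA\tT\nchr1\t2000\tAT\t*\n") returns two tuples, the second one describing a deletion marked with '*'.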

Alternatively, users can upload two FASTA files, one containing the reference sequences and one containing the alternative sequences. Multiple sequences are allowed; MalariaSED calculates the chromatin effects between sequences that share the same ID in the reference and alternative files. Every sequence in both files should be exactly 1 kb long, and at most 200 FASTA sequences are allowed per submission.
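The FASTA pairing rule can be illustrated with a short pre-submission check. This is a sketch under stated assumptions rather than MalariaSED's own code: read_fasta and pair_sequences are hypothetical helpers, the sequence ID is taken to be the first word of each '>' header, and the 200-sequence limit is assumed to apply per file.

```python
SEQ_LEN = 1000       # each sequence should be exactly 1 kb
MAX_SEQUENCES = 200  # assumed to apply to each file


def read_fasta(text):
    """Return {sequence_id: sequence} from FASTA text (ID = first header word)."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            if not parts:
                raise ValueError("FASTA header without an ID")
            name = parts[0]
            if name in records:
                raise ValueError(f"duplicate sequence ID: {name}")
            records[name] = []
        elif name is None:
            raise ValueError("sequence data before the first header")
        else:
            records[name].append(line)
    return {seq_id: "".join(chunks) for seq_id, chunks in records.items()}


def pair_sequences(ref_text, alt_text):
    """Match reference and alternative sequences by ID and check the limits."""
    ref, alt = read_fasta(ref_text), read_fasta(alt_text)
    if set(ref) != set(alt):
        raise ValueError("reference and alternative files have different IDs")
    if len(ref) > MAX_SEQUENCES:
        raise ValueError(f"at most {MAX_SEQUENCES} sequences are allowed")
    for seq_id in ref:
        if len(ref[seq_id]) != SEQ_LEN or len(alt[seq_id]) != SEQ_LEN:
            raise ValueError(f"{seq_id}: both sequences must be {SEQ_LEN} bp long")
    # Pairs are returned in the order the IDs appear in the reference file.
    return [(seq_id, ref[seq_id], alt[seq_id]) for seq_id in ref]
```

Matching by ID rather than by position in the file means the two files do not have to list their sequences in the same order.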

A command-line version is available on GitHub (https://github.com/CharleyWang/MalariaSED).