Mapping to concatenated samples

If you have created a concatenated FASTA file as in the main tutorial (using SemiBin2 concatenate_fasta), you can map each sample's reads to it with whichever mapping pipeline you prefer.

For completeness, we demonstrate here how to do it with NGLess.

The script is provided (map-to-concatenated.ngl), so you can just use it.

  1. Install NGLess:
conda install -c conda-forge -c bioconda ngless
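  2. Run it 5 times

We need to run it once per sample, because each invocation processes a single sample (run_for_all selects one that has not been processed yet):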

for i in $(seq 5) ; do
    ngless map-to-concatenated.ngl
done
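The loop above keeps going even if one of the runs fails. As a minimal sketch (the helper name run_n_times is hypothetical, not part of NGLess or SemiBin2), the same loop can be wrapped so that it stops at the first failing run:

```shell
# Hypothetical helper: run a command N times, stopping at the first failure.
# Usage: run_n_times N command [args...]
run_n_times() {
    n=$1
    shift
    i=1
    while [ "$i" -le "$n" ]; do
        # "$@" is the command to repeat, e.g. ngless map-to-concatenated.ngl
        "$@" || { echo "run $i failed: $*" >&2; return 1; }
        i=$((i + 1))
    done
}
```

With this helper defined, `run_n_times 5 ngless map-to-concatenated.ngl` would replace the loop for the five samples.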

The map-to-concatenated.ngl script should be self-explanatory:

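ngless "1.5"
import "parallel" version "1.1"
import "samtools" version "1.0"

# Concatenated FASTA (the mapping reference) and the samples to map
FAFILE = "multi_output/concatenated.fa.gz"
SAMPLES = ["S1", "S2", "S3", "S4", "S5"]

# Each run of the script processes one sample from SAMPLES
sample = run_for_all(SAMPLES)

# Load the reads for this sample and trim low-quality bases
input = fastq("multi_sample_binning" </> sample + ".fq.gz")
input = preprocess(input) using |read|:
    read = substrim(read, min_quality=25)

# Map the trimmed reads against the concatenated FASTA
mapped = map(input, fafile=FAFILE)

# Sort the alignments and write one sorted BAM file per sample
sorted_mapped = samtools_sort(mapped)

write(sorted_mapped, ofile="multi_output/mapped_" + sample + ".sorted.bam")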
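
Once all runs have finished, there should be one sorted BAM file per sample under multi_output, following the ofile pattern in the script. A minimal sketch for checking this, assuming that naming pattern (the helper name missing_outputs is hypothetical):

```shell
# Hypothetical helper: print every expected BAM file that is missing or empty.
# Usage: missing_outputs OUTPUT_DIR SAMPLE [SAMPLE...]
missing_outputs() {
    outdir=$1
    shift
    for s in "$@"; do
        # Same pattern as the ofile argument in map-to-concatenated.ngl
        f="$outdir/mapped_$s.sorted.bam"
        # -s is true only if the file exists and is non-empty
        [ -s "$f" ] || echo "$f"
    done
}
```

Calling `missing_outputs multi_output S1 S2 S3 S4 S5` prints nothing when all five files are present; any path it prints belongs to a sample that still needs to be mapped.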