Homework_2020Bioinformatics

Mapping

Posted by Yajin on October 6, 2020

YajinMo_Homework_2020/09/28

Task 01

将 THA2.fa map 到 BowtieIndex/YeastGenome 上,得到 THA2.sam; 将 e_coli_500.fq map 到 bowtie-src/indexes/e_coli 上,得到 e_coli_500.sam

将作业文件和比对数据以及其他需要文件拷贝至个人文件夹下:

1
2
3
$ ls
BowtieIndex  e_coli_500_1.sam  sam2bed.pl  THA2.sam   wt1X_thout  wt2X_thout
bowtie-src   e_coli_500.fq     THA2.fa     wt1_thout  wt2_thout

对THA2.fa进行mapping得到对应.sam文件

1
2
3
4
5
6
$ bowtie -v 2 -m 10 --best --strata BowtieIndex/YeastGenome -f THA2.fa -S THA2.sam
# reads processed: 1250
# reads with at least one reported alignment: 1158 (92.64%)
# reads that failed to align: 77 (6.16%)
# reads with alignments suppressed due to -m: 15 (1.20%)
Reported 1158 alignments to 1 output stream(s)

同样的,对e_coli_500.fq进行mapping得到对应的.sam文件

1
2
3
4
5
6
$bowtie -v 1 -m 10 --best --strata bowtie-src/indexes/e_coli -q e_coli_500.fq -S e_coli_500.sam
# reads processed: 500
# reads with at least one reported alignment: 499 (99.80%)
# reads that failed to align: 0 (0.00%)
# reads with alignments suppressed due to -m: 1 (0.20%)
Reported 499 alignments to 1 output stream(s)

Task 02

将上面两个结果均转换为 .bed 文件

1
2
$ perl sam2bed.pl THA2.sam > THA2.bed
$ perl sam2bed.pl e_coli_500.sam > e_coli_500.bed

Task 03

从上一步得到的 THA2.bed 中筛选出一个不含chromosome V 的文件和一个只有chromosome XII 的文件

1
2
3
4
5
6
7
$ grep -v $'chrV\t' THA2.bed > THA2_withoutchrV.bed    
$ grep $'chrXII\t' THA2.bed > THA2_onlychrXII.bed    

$ wc -l THA2_withoutchrV.bed
1125 THA2_withoutchrV.bed
$ wc -l THA2_onlychrXII.bed
169 THA2_onlychrXII.bed