Genomic Data Analysis Software: Whole-Genome Sequencing Analysis Workflow

Genomic Data Processing 9: BWA Test on Small Datasets (Succeeded)

1. The first 20 lines of the FASTQ file, i.e. 5 reads (4 lines per read):

hadoop@Mcnode1:~/cloud/adam/xubo/data/data_HDFS/GRCH38/GCA_000001405.15_GRCh38/test20160310$ cat SRR003161.fastq |head -20 >SRR003161h20.fastq
hadoop@Mcnode1:~/cloud/adam/xubo/data/data_HDFS/GRCH38/GCA_000001405.15_GRCh38/test20160310$ bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h20.fastq >SRR003161h20.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 0.01 sec
[bwa_aln_core] write to the disk... 0.00 sec
[bwa_aln_core] 5 sequences have been processed.
[main] Version: 0.7.12-r1039
[main] CMD: bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h20.fastq
[main] Real time: 824.956 sec; CPU: 8.711 sec
hadoop@Mcnode1:~/cloud/adam/xubo/data/data_HDFS/GRCH38/GCA_000001405.15_GRCh38/test20160310$ wc -l SRR003161h20.*
  20 SRR003161h20.fastq
   2 SRR003161h20.sai
  22 total
hadoop@Mcnode1:~/cloud/adam/xubo/data/data_HDFS/GRCH38/GCA_000001405.15_GRCh38/test20160310$ cat SRR003161h20.sai 
(binary output omitted; it begins with the bytes "SAI" followed by unprintable characters)

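But the .sai file has only two lines? (It is a binary file, so wc -l only counts newline bytes in it and does not reflect the number of reads.)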
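
Since cat on the .sai file prints unreadable bytes, a safer way to look at it is a byte dump. The sketch below is only an illustration: the function names `looks_like_sai` and `peek_bytes` are made up for this note, and the "SAI" prefix check is based solely on the bytes visible in the terminal output above, not on a format specification.

```shell
# Return success if the file starts with the three bytes "SAI",
# as the terminal output above suggests. Only the leading bytes are
# checked; nothing else about the file is validated.
looks_like_sai() {
    local file="$1"
    [ "$(head -c 3 "$file")" = "SAI" ]
}

# Dump the first bytes of a file as characters/escapes with decimal
# offsets, instead of sending raw binary to the terminal with cat.
peek_bytes() {
    local file="$1" count="${2:-64}"
    head -c "$count" "$file" | od -A d -c
}
```

For example, `peek_bytes SRR003161h20.sai 32` would show the first 32 bytes without disturbing the terminal session.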

2. The first 1000 lines of the FASTQ file, i.e. 250 reads:
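
The subset files in these tests are named after their line count (h20, h1000, h10000), while BWA reports reads. A minimal sketch of the conversion, assuming plain 4-line FASTQ records (header, sequence, "+", quality) with no wrapped sequence lines; `fastq_head_reads` and `fastq_count_reads` are hypothetical helper names, not part of any tool:

```shell
# Print the first N reads of a FASTQ file, assuming exactly 4 lines per read.
fastq_head_reads() {
    local fastq="$1" reads="$2"
    head -n "$((reads * 4))" "$fastq"
}

# Count the reads in a FASTQ file under the same 4-lines-per-read assumption.
fastq_count_reads() {
    local fastq="$1" lines
    lines=$(wc -l < "$fastq")
    echo "$((lines / 4))"
}
```

Under that assumption, `fastq_head_reads SRR003161.fastq 250 > SRR003161h1000.fastq` would produce a 1000-line subset like the one used here.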

hadoop@Mcnode1:~/cloud/adam/xubo/data/data_HDFS/GRCH38/GCA_000001405.15_GRCh38/test20160310$ bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h1000.fastq >SRR003161h1000.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 1.04 sec
[bwa_aln_core] write to the disk... 0.00 sec
[bwa_aln_core] 250 sequences have been processed.
[main] Version: 0.7.12-r1039
[main] CMD: bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h1000.fastq
[main] Real time: 327.850 sec; CPU: 5.880 sec
hadoop@Mcnode1:~/cloud/adam/xubo/data/data_HDFS/GRCH38/GCA_000001405.15_GRCh38/test20160310$ wc -l SRR003161h1000*
  10000 SRR003161h10000.fastq
      0 SRR003161h10000.sai
   1000 SRR003161h1000.fastq
      2 SRR003161h1000.sai
  11002 total
hadoop@Mcnode1:~/cloud/adam/xubo/data/data_HDFS/GRCH38/GCA_000001405.15_GRCh38/test20160310$ cat SRR003161h1000.sai 
(binary output omitted; it begins with the bytes "SAI" followed by unprintable characters)

2500 reads, with 4 threads (-t 4):

hadoop@Mcnode1:~/cloud/adam/xubo/data/data_HDFS/GRCH38/GCA_000001405.15_GRCh38/test20160310$ bwa aln -t 4 GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h10000.fastq >SRR003161h10000.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 14.03 sec
[bwa_aln_core] write to the disk... 0.00 sec
[bwa_aln_core] 2500 sequences have been processed.
[main] Version: 0.7.12-r1039
[main] CMD: bwa aln -t 4 GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h10000.fastq
[main] Real time: 1443.877 sec; CPU: 19.653 sec
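
In each run on this node the wall-clock time is far larger than the CPU time (for the 5-read run, 824.956 sec against 8.711 sec). That suggests, without proving it, that the process spent most of its time waiting rather than aligning. A small sketch for pulling this ratio out of a saved log; `bwa_time_ratio` is a hypothetical helper name, and it assumes the "[main] Real time: X sec; CPU: Y sec" line format shown above:

```shell
# Read BWA log text on stdin. For every "[main] Real time: ..." line,
# print the wall-clock time, the CPU time, and their ratio.
bwa_time_ratio() {
    awk '/^\[main\] Real time:/ {
        real = $4; cpu = $7   # fields: [main] Real time: X sec; CPU: Y sec
        if (cpu > 0) printf "real=%s cpu=%s ratio=%.1f\n", real, cpu, real / cpu
    }'
}
```

Because BWA writes these messages to standard error, they have to be captured first, for example with `2> aln.log`, and then passed through `bwa_time_ratio < aln.log`.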

On the other node (149) the same runs are fast:

5 reads:

hadoop@Master:~/cloud/adam/xubo/data/test20160310$ bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h20.fastq >SRR003161h20.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 0.00 sec
[bwa_aln_core] write to the disk... 0.00 sec
[bwa_aln_core] 5 sequences have been processed.
[main] Version: 0.7.13-r1126
[main] CMD: bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h20.fastq
[main] Real time: 47.452 sec; CPU: 3.616 sec
hadoop@Master:~/cloud/adam/xubo/data/test20160310$ bwa samse GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h20.sai SRR003161h20.fastq >SRR003161h20bwa.sam
[bwa_aln_core] convert to sequence coordinate... 5.85 sec
[bwa_aln_core] refine gapped alignments... 0.66 sec
[bwa_aln_core] print alignments... 0.00 sec
[bwa_aln_core] 5 sequences have been processed.
[main] Version: 0.7.13-r1126
[main] CMD: bwa samse GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h20.sai SRR003161h20.fastq
[main] Real time: 100.443 sec; CPU: 6.523 sec

250 reads:

hadoop@Master:~/cloud/adam/xubo/data/test20160310$ bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h1000.fastq >SRR003161h1000.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 0.19 sec
[bwa_aln_core] write to the disk... 0.00 sec
[bwa_aln_core] 250 sequences have been processed.
[main] Version: 0.7.13-r1126
[main] CMD: bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h1000.fastq
[main] Real time: 40.316 sec; CPU: 4.745 sec

hadoop@Master:~/cloud/adam/xubo/data/test20160310$ bwa samse GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h1000.sai SRR003161h1000.fastq >SRR003161h1000bwa.sam
[bwa_aln_core] convert to sequence coordinate... 5.96 sec
[bwa_aln_core] refine gapped alignments... 0.73 sec
[bwa_aln_core] print alignments... 0.00 sec
[bwa_aln_core] 250 sequences have been processed.
[main] Version: 0.7.13-r1126
[main] CMD: bwa samse GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h1000.sai SRR003161h1000.fastq
[main] Real time: 156.192 sec; CPU: 6.695 sec

2500 reads:

hadoop@Master:~/cloud/adam/xubo/data/test20160310$ bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h10000.fastq >SRR003161h10000.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 6.23 sec
[bwa_aln_core] write to the disk... 0.00 sec
[bwa_aln_core] 2500 sequences have been processed.
[main] Version: 0.7.13-r1126
[main] CMD: bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h10000.fastq
[main] Real time: 43.496 sec; CPU: 10.163 sec
hadoop@Master:~/cloud/adam/xubo/data/test20160310$ ll -h
total 2.2G
drwxrwxr-x 3 hadoop hadoop 4.0K  3月 13 14:45 ./
drwxrwxr-x 3 hadoop hadoop 4.0K  3月 12 14:51 ../
drwxrwxr-x 2 hadoop hadoop 4.0K  3月 12 15:24 GCA_000001405.15_GRCh38/
-rw-rw-r-- 1 hadoop hadoop    0  3月 12 15:49 SRR003161a.sam
-rw-rw-r-- 1 hadoop hadoop  12K  3月 12 16:15 SRR003161b.sam
-rw-rw-r-- 1 hadoop hadoop    0  3月 12 22:50 SRR003161c.sam
-rw-rw-r-- 1 hadoop hadoop 1.6G  3月 12 15:49 SRR003161.fastq
-rw-rw-r-- 1 hadoop hadoop 527M  3月 12 16:10 SRR003161.fastq.gz
-rw-rw-r-- 1 hadoop hadoop 3.1M  3月 12 22:50 SRR003161h10000.fastq
-rw-rw-r-- 1 hadoop hadoop  11K  3月 13 14:46 SRR003161h10000.sai
-rw-rw-r-- 1 hadoop hadoop 3.3M  3月 13 00:50 SRR003161h10000.sam
-rw-rw-r-- 1 hadoop hadoop 336K  3月 12 22:08 SRR003161h1000.fastq
-rw-rw-r-- 1 hadoop hadoop 1.1K  3月 13 14:41 SRR003161h1000.sai
-rw-rw-r-- 1 hadoop hadoop    0  3月 13 14:45 SRR003161h1000.sam
-rw-rw-r-- 1 hadoop hadoop 5.7K  3月 12 21:56 SRR003161h20.fastq
-rw-rw-r-- 1 hadoop hadoop  25K  3月 12 22:02 SRR003161h20.sam
-rw-rw-r-- 1 hadoop hadoop 1.1M  3月 12 15:49 SRR003161.sai

hadoop@Master:~/cloud/adam/xubo/data/test20160310$ bwa samse GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h10000.sai SRR003161h10000.fastq >SRR003161h10000bwa.sam
[bwa_aln_core] convert to sequence coordinate... 6.17 sec
[bwa_aln_core] refine gapped alignments... 0.70 sec
[bwa_aln_core] print alignments... 0.01 sec
[bwa_aln_core] 2500 sequences have been processed.
[main] Version: 0.7.13-r1126
[main] CMD: bwa samse GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h10000.sai SRR003161h10000.fastq
[main] Real time: 131.808 sec; CPU: 6.896 sec

25000 reads:

hadoop@Master:~/cloud/adam/xubo/data/test20160310$ bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h100000.fastq >SRR003161h100000.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 65.66 sec
[bwa_aln_core] write to the disk... 0.00 sec
[bwa_aln_core] 25000 sequences have been processed.
[main] Version: 0.7.13-r1126
[main] CMD: bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h100000.fastq
[main] Real time: 105.888 sec; CPU: 70.249 sec
hadoop@Master:~/cloud/adam/xubo/data/test20160310$ bwa samse GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h100000.sai SRR003161h100000.fastq >SRR003161h100000bwa.sam
[bwa_aln_core] convert to sequence coordinate... 6.44 sec
[bwa_aln_core] refine gapped alignments... 0.89 sec
[bwa_aln_core] print alignments... 0.07 sec
[bwa_aln_core] 25000 sequences have been processed.
[main] Version: 0.7.13-r1126
[main] CMD: bwa samse GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h100000.sai SRR003161h100000.fastq
[main] Real time: 177.881 sec; CPU: 7.552 sec
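
Every test above repeats the same pair of steps: bwa aln writes a .sai file, then bwa samse turns it into SAM. The sketch below wraps that pair for one subset. It is an illustration only: `bwa_single_end` and the `RUN` environment variable are hypothetical names introduced here, and the file naming (prefix.fastq, prefix.sai, prefixbwa.sam) simply mirrors the transcripts.

```shell
# For one single-end subset, print the two BWA commands used above.
# With RUN=1 (a hypothetical switch for this sketch) the commands are
# executed instead of printed.
bwa_single_end() {
    local ref="$1" prefix="$2"
    local aln="bwa aln $ref $prefix.fastq > $prefix.sai"
    local samse="bwa samse $ref $prefix.sai $prefix.fastq > ${prefix}bwa.sam"
    if [ "${RUN:-0}" = "1" ]; then
        # Run samse only if aln succeeded.
        bwa aln "$ref" "$prefix.fastq" > "$prefix.sai" &&
        bwa samse "$ref" "$prefix.sai" "$prefix.fastq" > "${prefix}bwa.sam"
    else
        printf '%s\n' "$aln" "$samse"
    fi
}
```

Called in a loop over the prefixes SRR003161h20, SRR003161h1000, SRR003161h10000 and SRR003161h100000, this would reproduce the sequence of runs shown on this node.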

More testing tomorrow.
