Download the test data

curl -L -o pacbio.fastq http://gembox.cbcb.umd.edu/mhap/raw/ecoli_p6_25x.filtered.fastq
curl -L -o oxford.fasta http://nanopore.s3.climb.ac.uk/MAP006-PCR-1_2D_pass.fasta
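Before starting an assembly it can be worth confirming that both downloads completed. The sketch below counts the records in each file; the helper names `count_fastq_reads` and `count_fasta_reads` are hypothetical, and the FASTQ count assumes the common four-lines-per-record layout.

```shell
# Hypothetical helpers, not part of Canu.

# Count FASTQ records, assuming exactly 4 lines per record.
count_fastq_reads() {
  awk 'END { print int(NR / 4) }' "$1"
}

# Count FASTA records by counting header lines (lines starting with ">").
count_fasta_reads() {
  awk '/^>/ { n++ } END { print n + 0 }' "$1"
}

# Report counts only for files that are actually present.
if [ -f pacbio.fastq ]; then
  echo "pacbio.fastq: $(count_fastq_reads pacbio.fastq) reads"
fi
if [ -f oxford.fasta ]; then
  echo "oxford.fasta: $(count_fasta_reads oxford.fasta) reads"
fi
```

A count of zero, or a FASTQ line count that is not a multiple of four, usually points to a truncated or failed download.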

Run everything in one step

canu \
 -p ecoli -d ecoli-pacbio \
 genomeSize=4.8m \
 -pacbio-raw pacbio.fastq

Canu can also be run step by step

Correct

canu -correct \
  -p ecoli -d ecoli \
  genomeSize=4.8m \
  -pacbio-raw  pacbio.fastq

Trim

canu -trim \
  -p ecoli -d ecoli \
  genomeSize=4.8m \
  -pacbio-corrected ecoli/ecoli.correctedReads.fasta.gz

Assemble

canu -assemble \
  -p ecoli -d ecoli-erate-0.039 \
  genomeSize=4.8m \
  correctedErrorRate=0.039 \
  -pacbio-corrected ecoli/ecoli.trimmedReads.fasta.gz

canu -assemble \
  -p ecoli -d ecoli-erate-0.075 \
  genomeSize=4.8m \
  correctedErrorRate=0.075 \
  -pacbio-corrected ecoli/ecoli.trimmedReads.fasta.gz

Parameters that should be added

useGrid=false \

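gnuplotImageFormat=canvas \
## or
gnuplotTested=true ## skip the gnuplot check; gnuplot is then no longer used to generate images during the run

## Without one of the gnuplot parameters above, the self-check before startup tends to fail with the errors below
## (this problem appears to have been fixed in version 1.7.1)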
##ERROR:  Failed to detect a suitable output format for gnuplot.
##ERROR:  Looked for png, svg and gif, found none of them.

## Java SE 8
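Put together with the one-step command from earlier, the extra parameters might be combined as in the sketch below. The wrapper `run_canu` and the `RUN_CANU` switch are hypothetical conveniences: the function prints the full command so it can be checked first, and only executes it when `RUN_CANU=1` is set. It uses gnuplotTested=true; gnuplotImageFormat=canvas would be the alternative.

```shell
# Hypothetical wrapper: $1 is the raw PacBio read file.
run_canu() {
  # Build the argument list in the positional parameters.
  set -- canu \
    -p ecoli -d ecoli-pacbio \
    genomeSize=4.8m \
    useGrid=false \
    gnuplotTested=true \
    -pacbio-raw "$1"

  # Always show the command; run it only when explicitly requested.
  echo "$*"
  if [ "${RUN_CANU:-0}" = "1" ]; then
    "$@"
  fi
}

run_canu pacbio.fastq
```

Running it as `RUN_CANU=1 sh script.sh` (script name assumed) would launch the assembly; without the variable it is a dry run.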

Consensus Accuracy

Canu consensus sequences are typically well above 99% identity for PacBio datasets. Nanopore accuracy varies depending on pore and basecaller version, but is typically above 98% for recent data. Accuracy can be improved by polishing the contigs with tools developed specifically for that task. We recommend Quiver for PacBio and Nanopolish for Oxford Nanopore data. When Illumina reads are available, Pilon can be used to polish either PacBio or Oxford Nanopore assemblies.

For high coverage:

  • For more than 60X coverage, decrease the allowed difference in overlaps (from 4.5% to 4.0% with correctedErrorRate=0.040 for PacBio, from 14.4% to 12% with correctedErrorRate=0.12 for Nanopore), so that only the better corrected reads are used. This is primarily an optimization for speed and generally does not change assembly continuity.
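To decide whether the high-coverage advice applies, coverage can be estimated as total read bases divided by genome size. The sketch below does this for a FASTQ file and applies the 60X threshold for PacBio quoted above; both function names are hypothetical, and the FASTQ is assumed to use four lines per record.

```shell
# Hypothetical helper: $1 = FASTQ file, $2 = genome size in bases.
# Prints the estimated coverage with one decimal place.
estimate_coverage() {
  LC_ALL=C awk -v g="$2" '
    NR % 4 == 2 { bases += length($0) }   # sequence line of each record
    END { printf "%.1f\n", bases / g }
  ' "$1"
}

# Hypothetical helper: $1 = coverage. Prints the PacBio setting suggested
# for more than 60X coverage, or nothing if the default should be kept.
suggest_pacbio_error_rate() {
  LC_ALL=C awk -v c="$1" 'BEGIN { if (c > 60) print "correctedErrorRate=0.040" }'
}

if [ -f pacbio.fastq ]; then
  cov=$(estimate_coverage pacbio.fastq 4800000)
  echo "estimated coverage: ${cov}X"
  suggest_pacbio_error_rate "$cov"
fi
```

For the 25X test data set used here the second function should print nothing, so the default error rate stays in place.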

References

http://canu.readthedocs.io/en/latest/faq.html#what-parameters-should-i-use-for-my-reads

https://github.com/marbl/canu/issues/939