TotalOmics
De novo genome bioinformatics analysis
1. Data filtering: Raw reads are filtered to remove adaptor sequences, contamination and low-quality reads.
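The filtering step can be illustrated with a minimal sketch. It is not the pipeline used by the service: the adaptor sequence, the quality and length thresholds, and the function names are all assumptions chosen for illustration, and contamination is detected here by simple substring matching rather than alignment.

```python
from typing import Iterable, Iterator, Sequence, Tuple

Read = Tuple[str, str, str]  # (name, sequence, Phred+33 quality string)

# All values below are illustrative assumptions, not values from the service.
ADAPTER = "AGATCGGAAGAGC"   # example adaptor prefix; use the library kit's real sequence
MIN_MEAN_QUALITY = 20       # hypothetical mean Phred score cutoff
MIN_LENGTH = 30             # hypothetical minimum read length after trimming
PHRED_OFFSET = 33


def mean_quality(qual: str) -> float:
    """Mean Phred score of a Phred+33 encoded quality string."""
    if not qual:
        return 0.0
    return sum(ord(c) - PHRED_OFFSET for c in qual) / len(qual)


def trim_adapter(seq: str, qual: str, adapter: str = ADAPTER) -> Tuple[str, str]:
    """Cut the read at the first exact adaptor match, if any."""
    pos = seq.find(adapter)
    if pos == -1:
        return seq, qual
    return seq[:pos], qual[:pos]


def filter_reads(reads: Iterable[Read],
                 contaminants: Sequence[str] = ()) -> Iterator[Read]:
    """Yield reads that survive adaptor trimming, quality and contamination checks."""
    for name, seq, qual in reads:
        seq, qual = trim_adapter(seq, qual)
        if len(seq) < MIN_LENGTH:
            continue  # too short after trimming
        if mean_quality(qual) < MIN_MEAN_QUALITY:
            continue  # low-quality read
        if any(c in seq for c in contaminants):
            continue  # contains a known contaminant sequence
        yield name, seq, qual
```

In practice a dedicated trimming tool and an alignment-based contamination screen would replace the exact string matching shown here.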
2. Genome survey: The genome is sequenced to a depth of no less than 20X and examined by K-mer analysis to obtain genome features such as GC content, repeat content, genome size and heterozygosity rate. These results are used to evaluate how difficult the species will be to sequence and to plan the subsequent work.
Common genome: haploid or homozygous diploid (or diploid with heterozygosity < 0.5%), repeat content < 50%, GC content between 35% and 65%.
Complex genome: diploid with a heterozygosity rate > 0.5%, polyploid, repeat content > 50%, GC content in the abnormal range (< 35% or > 65%), or an oversized animal or plant genome.
If the heterozygosity rate is < 0.5%, the Whole Genome Sequencing (WGS) strategy will be chosen;
If the heterozygosity rate is > 0.5%, the WGS plus BAC-to-BAC/Fosmid-to-Fosmid strategy will be used.
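The classification and strategy rules above can be written down as a short sketch. The `SurveyResult` type and both function names are hypothetical. The text does not say how values exactly at a threshold (for example 0.5% heterozygosity or 50% repeats) are handled, nor what size counts as "oversized", so this sketch treats boundary values as the common case and leaves the oversize flag to the caller.

```python
from dataclasses import dataclass


@dataclass
class SurveyResult:
    """Genome survey summary (hypothetical container for illustration)."""
    ploidy: int                 # 1 = haploid, 2 = diploid, >2 = polyploid
    heterozygosity_pct: float   # heterozygosity rate in percent
    repeat_pct: float           # repeat content in percent
    gc_pct: float               # GC content in percent
    oversize: bool = False      # set by the caller; no size cutoff is given in the text


def is_complex(s: SurveyResult) -> bool:
    """True if any 'complex genome' condition listed above holds."""
    return (
        s.ploidy > 2
        or (s.ploidy == 2 and s.heterozygosity_pct > 0.5)
        or s.repeat_pct > 50
        or s.gc_pct < 35
        or s.gc_pct > 65
        or s.oversize
    )


def sequencing_strategy(s: SurveyResult) -> str:
    """Pick the strategy from the heterozygosity rate alone, as the text does."""
    if s.heterozygosity_pct > 0.5:
        return "WGS + BAC-to-BAC/Fosmid-to-Fosmid"
    return "WGS"  # assumption: exactly 0.5% also falls back to plain WGS
```

For example, `sequencing_strategy(SurveyResult(2, 1.2, 30, 41))` selects the combined strategy, while a polyploid genome with low heterozygosity is flagged as complex but still maps to plain WGS under the two rules as written.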
3. Genome assembly:
   1. Genome assembly
   2. Analysis of genome GC content
   3. Sequence depth distributions
   4. Evaluation of assembled genome
      4.1 Evaluation of autosomal regional coverage
      4.2 Evaluation of gene region coverage
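Two of the assembly statistics listed above, GC content and region coverage, reduce to simple arithmetic. The sketch below assumes the assembly is available as a plain sequence string and that per-base depths have already been computed by a read mapper; the function names and the 1000 bp default window are illustrative choices, not part of the service description.

```python
from typing import List, Sequence


def gc_content(seq: str) -> float:
    """GC percentage over unambiguous bases (N and other symbols are ignored)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return 100.0 * gc / total if total else 0.0


def windowed_gc(seq: str, window: int = 1000) -> List[float]:
    """GC percentage in consecutive non-overlapping windows (last one may be shorter)."""
    return [gc_content(seq[i:i + window]) for i in range(0, len(seq), window)]


def coverage_pct(depths: Sequence[int], min_depth: int = 1) -> float:
    """Percentage of positions in a region covered by at least min_depth reads."""
    if not depths:
        return 0.0
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)
```

Plotting the `windowed_gc` values against the mean depth of the same windows gives the usual GC versus depth view used to spot sequencing bias or contamination.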
4. Standard bioinformatics analysis:
   1. Annotation
      1.1 Repeat annotation
      1.2 Gene prediction
      1.3 Gene function
      1.4 ncRNA annotation
   2. Evolution
      2.1 Orthologous gene clusters (clustering tools that may be used include TreeFam, OrthoMCL, TribeMCL, InParanoid and MultiParanoid)
      2.2 Phylogenetic analysis
      2.3 Estimation of divergence time and substitution rate
      2.4 Whole genome alignment (genome synteny)
      2.5 Segmental duplication
      2.6 Conserved elements
5. Draft map criteria
Libraries with different insert sizes are constructed, the whole genome is sequenced to a depth of no less than 40X, and the genome is assembled to obtain a draft map. The euchromatic region coverage must be over 90%, the gene region coverage over 95%, contig N50 > 5 Kb and scaffold N50 > 20 Kb.
6. Fine map criteria
Libraries with different insert sizes are constructed, the whole genome is sequenced to a depth of no less than 60X, and the genome is assembled to obtain a fine map. The euchromatic region coverage must be over 95%, the gene region coverage over 98%, contig N50 > 20 Kb and scaffold N50 > 300 Kb.
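The draft and fine map thresholds can be checked mechanically once the assembly statistics are known. The sketch below computes N50 in the usual way (the length of the sequence at which the running total of lengths, sorted longest first, reaches half of the assembly size) and compares the results with the criteria quoted above. The `CRITERIA` table, the function names and the reading of 1 Kb as 1000 bp are assumptions made for illustration.

```python
from typing import Dict, Sequence


def n50(lengths: Sequence[int]) -> int:
    """Length L such that sequences of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty assembly


# Thresholds transcribed from the draft and fine map criteria (1 Kb taken as 1000 bp).
CRITERIA: Dict[str, Dict[str, float]] = {
    "draft": {"depth": 40, "euchromatin": 90, "gene": 95,
              "contig_n50": 5_000, "scaffold_n50": 20_000},
    "fine": {"depth": 60, "euchromatin": 95, "gene": 98,
             "contig_n50": 20_000, "scaffold_n50": 300_000},
}


def meets_criteria(level: str, depth: float, euchromatin_pct: float, gene_pct: float,
                   contig_lengths: Sequence[int],
                   scaffold_lengths: Sequence[int]) -> bool:
    """Check one assembly against the 'draft' or 'fine' thresholds."""
    c = CRITERIA[level]
    return (
        depth >= c["depth"]                      # "no less than" 40X / 60X
        and euchromatin_pct > c["euchromatin"]   # "over" 90% / 95%
        and gene_pct > c["gene"]                 # "over" 95% / 98%
        and n50(contig_lengths) > c["contig_n50"]
        and n50(scaffold_lengths) > c["scaffold_n50"]
    )
```

An assembly sequenced at 45X can therefore pass the draft check but will fail the fine check on depth alone, whatever its contiguity.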
7. Personalized analysis
We can also perform customized analyses to meet the requirements of specific projects.
