# Whole-Genome Sequencing Data Quality Control Report Based on the Chinese Quartet (中华家系1号)

### 1. Basic information

**Institution:** Fudan University

**Library preparation:** TruSeq

**Sequencer:** Illumina NovaSeq

**Read length:** paired-end 150 bp

**Reference genome:** GRCh38

**Data received:**

| Sample | Fastq | Bam | VCF |
| :------: | :---: | :--: | :--: |
| Fudan D5 | 3 | 3 | 3 |
| Fudan D6 | 3 | 3 | 3 |
| Fudan F7 | 3 | 3 | 3 |
| Fudan M8 | 3 | 3 | 3 |

**Analysis pipeline:**

| Step | Software | Version |
| :---------------: | :---------: | :----: |
| Preprocessing | Trimmomatic | |
| Mapping | BWA-MEM | |
| Remove Duplicates | Picard | |
| BQSR | GATK | |
| Calling | Haplotyper | |
| Filtration | VQSR | |

### 2. Raw data quality assessment

| Item | Fudan D5 | Fudan D6 | Fudan F7 | Fudan M8 | ... | Reference value |
| :-------------------: | :------: | :------: | :------: | :------: | :--: | ------ |
| Raw data volume (Million) | | | | | | |
| Read duplication rate (%) | | | | | | |
| Raw sequencing depth | | | | | | |
| Base quality | | | | | | |
| ATGC content (%) | | | | | | |
| GC content distribution (%) | | | | | | |
| Repeated sequences (%) | | | | | | |
| Adapter content (%) | | | | | | |
| Contaminants | | | | | | |

### 3. Post-alignment data quality assessment

| Item | Fudan D5 | Fudan D6 | Fudan F7 | Fudan M8 | ... | Reference value |
| :--------------------: | :------: | :------: | :------: | :------: | :--: | ------ |
| Mapping rate (%) | | | | | | |
| High-quality read mapping rate (%) | | | | | | |
| Mismatch rate (%) | | | | | | |
| Sequencing depth after deduplication | | | | | | |
| Insert size (bp) | | | | | | |

### 4. Variant data quality assessment

#### (1) Variant statistics

| Sample | Fudan D5 | Fudan D6 | Fudan F7 | Fudan M8 | ... |
| :---------------------------: | :------: | :------: | :------: | :------: | :--: |
| Total variants | | | | | |
| SNVs | | | | | |
| Insertions | | | | | |
| Deletions | | | | | |
| SNV Transitions/Transversions | | | | | |
| Total Het/Hom ratio | | | | | |
| SNV Het/Hom ratio | | | | | |
| Insertions Het/Hom ratio | | | | | |
| Deletions Het/Hom ratio | | | | | |

#### (2) Technical replicate concordance

| Technical replicate concordance | Fudan D5 | Fudan D6 | Fudan F7 | Fudan M8 | ... |
| ---------------- | -------- | -------- | -------- | -------- | ---- |
| SNVs | | | | | |
| Indels | | | | | |

![](./pictures/density.png)

#### (3) Variant calling accuracy (Precision, recall, F1-score)

| Genomic context | Type | Fudan D5 | Fudan D6 | Fudan F7 | Fudan M8 |
| ---------------- | ----- | -------- | -------- | -------- | -------- |
| All | SNV | | | | |
| | Indel | | | | |
| Not in homopolymers or TRs | SNV | | | | |
| | Indel | | | | |
| In homopolymers or TRs | SNV | | | | |
| | Indel | | | | |
| GC content | SNV | | | | |
| | Indel | | | | |
| Tier 1 (supported by all replicates, sequencing sites, platforms and bioinformatic pipelines) | SNV | | | | |
| | Indel | | | | |
| Tier 2 (supported by majority of replicates, sequencing sites, platforms and bioinformatic pipelines) | SNV | | | | |
| | Indel | | | | |
| Tier 3 (supported by only one platform and multiple bioinformatic pipelines) | SNV | | | | |
| | Indel | | | | |
| In high confidence bed | SNV | | | | |
| | Indel | | | | |
| Not in high confidence bed | SNV | | | | |
| | Indel | | | | |
| In structural variation regions | SNV | | | | |
| | Indel | | | | |
| Clinically relevant mutations | SNV | | | | |
| | Indel | | | | |
| De novo | SNV | | | | |
| | Indel | | | | |

![](./pictures/Screen Shot 2019-07-30 at 12.14.00 AM.png)

#### Download

**SNV**

- True Positive
- True Negative
- False Positive
- False Negative

**INDEL**

- True Positive
- True Negative
- False Positive
- False Negative

### 5. Disclaimer

This quality control report applies only to the test data from this experiment and does not represent an evaluation of the sequencing provider's professional competence. It is intended for scientific research only and must not be used for clinical or commercial purposes. This institution accepts no financial or legal liability for any gain or loss (direct or indirect) incurred by any organization or individual as a result of using the results in this report.
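
For reference, the precision, recall and F1-score in section 4 (3) are conventionally derived from the true positive, false positive and false negative counts of the kind listed under the download section. The sketch below uses those standard definitions; it is not taken from this report's pipeline. The `Counts` container, the function names, and the example numbers are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Variant counts against a truth set (hypothetical container)."""
    tp: int  # true positives: called variants that are in the truth set
    fp: int  # false positives: called variants that are not in the truth set
    fn: int  # false negatives: truth-set variants that were not called


def precision(c: Counts) -> float:
    """Fraction of called variants that are correct: TP / (TP + FP)."""
    called = c.tp + c.fp
    return c.tp / called if called else 0.0


def recall(c: Counts) -> float:
    """Fraction of truth-set variants that were recovered: TP / (TP + FN)."""
    truth = c.tp + c.fn
    return c.tp / truth if truth else 0.0


def f1_score(c: Counts) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Made-up counts for illustration only; these are not results from this report.
example = Counts(tp=90, fp=10, fn=30)

if __name__ == "__main__":
    print(f"precision={precision(example):.4f}")
    print(f"recall={recall(example):.4f}")
    print(f"F1={f1_score(example):.4f}")
```

True negatives do not enter any of the three formulas, which is why these metrics are commonly preferred for variant calling, where non-variant positions vastly outnumber variant ones. Returning 0.0 for an empty denominator is a design choice of this sketch; a real report might leave such cells blank instead.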