Germline & Somatic short variant discovery (SNVs + Indels) for WGS & WES.

  1. ### Variant Calling
  2. This APP was developed for germline and somatic short variant discovery (SNVs + Indels).
  3. **Accepted data**
  4. * Tumor-normal (TN) matched or tumor-only WES/WGS for somatic variant calling
  5. * Normal-only WES/WGS for germline variant calling
  6. The data type (WES or WGS) is determined by whether a BED file is set (i.e. `regions` in the inputs).
  7. * Both <u>**FASTQ and BAM data are acceptable**</u>.
  8. * Please set the parameter `input_fastq` to `true` when the input is FASTQ, or set the parameter `input_bam` to `true` when the input is BAM.
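A minimal sketch of how the two input flags could be checked before submission. It assumes that `input_fastq` and `input_bam` are mutually exclusive and that their values arrive as the strings `true`/`false`; the helper `input_mode` is hypothetical and not part of the APP.

```python
def input_mode(params: dict) -> str:
    """Return "fastq" or "bam" depending on which input flag is true.

    Assumption: exactly one of the two flags should be set to true.
    """
    fastq = str(params.get("input_fastq", "false")).lower() == "true"
    bam = str(params.get("input_bam", "false")).lower() == "true"
    if fastq == bam:
        raise ValueError("set exactly one of input_fastq / input_bam to true")
    return "fastq" if fastq else "bam"


if __name__ == "__main__":
    print(input_mode({"input_fastq": "true", "input_bam": "false"}))
```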
  9. **Supported variant callers and annotation tools**
  10. * Variant calling: `haplotyper`, `pindel` (germline); `tnseq`, `tnscope`, `varscan` (somatic; `varscan` does not support tumor-only data).
  11. * Annotation: `annovar`, `vep`.
  12. * The above tools are <u>**not activated by default**</u>, i.e. each one defaults to `false`. To use a tool, set it to `true` in the submitted `samples.csv`.
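The caller rules above can be sketched as a small pre-submission check. This is an illustration only: the function `check_callers` and its arguments are hypothetical, and it encodes just what this README states (which callers are germline or somatic, that `varscan` needs a matched normal, and that nothing is enabled by default).

```python
GERMLINE_CALLERS = {"haplotyper", "pindel"}
SOMATIC_CALLERS = {"tnseq", "tnscope", "varscan"}


def check_callers(enabled: set, has_tumor: bool, has_normal: bool) -> list:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if not enabled:
        problems.append("no caller enabled; all callers default to false")
    unknown = sorted(enabled - GERMLINE_CALLERS - SOMATIC_CALLERS)
    if unknown:
        problems.append("unknown caller(s): " + ", ".join(unknown))
    if enabled & SOMATIC_CALLERS and not has_tumor:
        problems.append("somatic callers need tumor data")
    if "varscan" in enabled and has_tumor and not has_normal:
        problems.append("varscan does not support tumor-only data")
    return problems


if __name__ == "__main__":
    # Tumor-only sample with varscan enabled: one problem is reported.
    for problem in check_callers({"varscan"}, has_tumor=True, has_normal=False):
        print(problem)
```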
  13. ### New Releases
  14. * Two annotation tools have been added.
  15. * TNhaplotyper, named TNseq in `v0.1.0`, has been replaced by TNhaplotyper2.
  16. * The `corealigner` step has been removed.
  17. * Some parameters have changed; for example, `interval_list` has been renamed to `interval`.
  18. ### Getting Started
  19. We recommend using the choppy system together with the Aliyun OSS service. The commands look like this:
  20. ```
  21. # Activate the choppy environment
  22. $ open-choppy-env
  23. # Install the APP
  24. $ choppy install YaqingLiu/variant-calling [-f]
  25. # List the parameters
  26. $ choppy samples YaqingLiu/variant-calling-latest [--no-default]
  27. # Submit your task with the samples.csv file and a project name
  28. $ choppy batch YaqingLiu/variant-calling-latest samples.csv -p Project [-l project:Label]
  29. # Query the status of all tasks in the project
  30. $ choppy query -L project:Label | grep "status"
  31. ```
  32. **Please note:** The values in `defaults` are overridden by the settings in `samples.csv`, so there is no need to contact me about changing a default.
  33. The required columns are: `sample_id`, `normal_fastq_1`, `normal_fastq_2`, `tumor_fastq_1`, `tumor_fastq_2`.
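As a hedged illustration, a `samples.csv` with the required columns could be generated like this. The OSS paths, the sample name, and the extra `tnscope` column are made-up placeholders, and `build_samples_csv` is a hypothetical helper. Run `choppy samples ... [--no-default]` to see the real parameter list for your installed version.

```python
import csv
import io

REQUIRED = [
    "sample_id",
    "normal_fastq_1",
    "normal_fastq_2",
    "tumor_fastq_1",
    "tumor_fastq_2",
]


def build_samples_csv(rows: list) -> str:
    """Serialize rows to CSV text: required columns first, then extras sorted."""
    missing = [c for c in REQUIRED if c not in rows[0]]
    if missing:
        raise ValueError("missing required column(s): " + ", ".join(missing))
    extras = sorted(k for k in rows[0] if k not in REQUIRED)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=REQUIRED + extras, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    row = {
        "sample_id": "sample01",  # placeholder values throughout
        "normal_fastq_1": "oss://my-bucket/sample01_N_R1.fastq.gz",
        "normal_fastq_2": "oss://my-bucket/sample01_N_R2.fastq.gz",
        "tumor_fastq_1": "oss://my-bucket/sample01_T_R1.fastq.gz",
        "tumor_fastq_2": "oss://my-bucket/sample01_T_R2.fastq.gz",
        "tnscope": "true",  # assumed column: enables one somatic caller
    }
    with open("samples.csv", "w", newline="") as handle:
        handle.write(build_samples_csv([row]))
```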
  34. **Please carefully check**
  35. * whether the reference genome you want to use is hg38 or hg19.
  36. * the BED file.
  37. * the caller you want to use.
  38. * PoN VCFs for TNseq and TNscope are supported, but they need to be generated in advance.
  39. * interval padding defaults to 0; you can change it.
  40. * one annotation tool is usually sufficient.
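To illustrate what interval padding means in general, the sketch below extends each BED interval by a fixed number of bases on both sides. It shows the concept only and is not how the APP implements padding; `pad_interval` is a hypothetical helper, and clamping to the chromosome end is omitted.

```python
def pad_interval(start: int, end: int, padding: int = 0) -> tuple:
    """Pad a 0-based, half-open BED interval on both sides.

    The start is clamped at 0; the chromosome length is not checked here.
    """
    return max(0, start - padding), end + padding


if __name__ == "__main__":
    # With the default padding of 0 the interval is unchanged.
    print(pad_interval(1000, 1200))
    # A padding of 100 widens the interval by 100 bases on each side.
    print(pad_interval(1000, 1200, padding=100))
```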