### Variant Calling
This APP was developed for germline and somatic short variant discovery (SNVs and indels).
**Accepted data**
* Tumor-normal (TN) matched or tumor-only WES/WGS for somatic variant calling
* Normal-only WES/WGS for germline variant calling
Whether the data are treated as WES or WGS is determined by whether a BED file is set (i.e. `regions` in the inputs).
* Both **FASTQ and BAM data are acceptable**.
* Please set the parameter `input_fastq` to `true` when the input is FASTQ, or set the parameter `input_bam` to `true` when the input is BAM.
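
As a minimal sketch of the rule above, the two flags could be derived from the input file name. The helper `input_flags`, the list of recognised extensions, and the choice to set the other flag to `false` explicitly are assumptions for illustration, not part of the APP.

```python
def input_flags(path: str) -> dict:
    """Guess `input_fastq` / `input_bam` from a file name (hypothetical helper)."""
    name = path.lower()
    # Assumed FASTQ extensions; adjust to your own naming scheme.
    if name.endswith((".fastq", ".fq", ".fastq.gz", ".fq.gz")):
        return {"input_fastq": "true", "input_bam": "false"}
    if name.endswith(".bam"):
        return {"input_fastq": "false", "input_bam": "true"}
    raise ValueError(f"cannot tell whether {path!r} is FASTQ or BAM")


# Placeholder path, not a real bucket.
flags = input_flags("oss://my-bucket/sample_R1.fastq.gz")
print(flags)
```

The resulting values can then be copied into the matching columns of `samples.csv`.
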
**Supported variant callers and annotation tools**
* Variant calling: `haplotyper`, `pindel` (germline); `tnseq`, `tnscope`, `varscan` (somatic; `varscan` does not support tumor-only data).
* Annotation: `annovar`, `vep`.
* The above tools are **not activated by default**: each one defaults to `false`. To use a tool, set it to `true` in the submitted `samples.csv`.
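
The sketch below shows one way to prepare the tool switches for a sample before writing them to `samples.csv`. It assumes that each tool is controlled by a column with the same name as the tool. The helper `tool_flags` and its `tumor_only` argument are hypothetical and only mirror the restrictions listed above.

```python
GERMLINE_CALLERS = ("haplotyper", "pindel")
SOMATIC_CALLERS = ("tnseq", "tnscope", "varscan")
ANNOTATORS = ("annovar", "vep")
ALL_TOOLS = GERMLINE_CALLERS + SOMATIC_CALLERS + ANNOTATORS


def tool_flags(enabled, tumor_only=False):
    """Return a tool -> "true"/"false" mapping; every tool is off unless listed."""
    enabled = set(enabled)
    unknown = enabled - set(ALL_TOOLS)
    if unknown:
        raise ValueError(f"unknown tool(s): {sorted(unknown)}")
    # varscan cannot be used with tumor-only data.
    if tumor_only and "varscan" in enabled:
        raise ValueError("varscan does not support tumor-only data")
    return {tool: ("true" if tool in enabled else "false") for tool in ALL_TOOLS}


flags = tool_flags(["tnscope", "vep"], tumor_only=True)
print(flags)
```

Listing every tool explicitly, including those left at `false`, makes the submitted settings easy to audit later.
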
### New Releases
* Two annotation tools have been added.
* TNhaplotyper, named TNseq in `v0.1.0`, has been replaced by TNhaplotyper2.
* The `corealigner` step has been removed.
* Some parameters have changed; for example, `interval_list` has been renamed to `interval`.
### Getting Started
We recommend using the choppy system with the Aliyun OSS service. The commands look like this:
```
# Activate the choppy environment
$ open-choppy-env
# Install the APP
$ choppy install YaqingLiu/variant-calling [-f]
# List the parameters
$ choppy samples YaqingLiu/variant-calling-latest [--no-default]
# Submit your task with the `samples.csv` file and a project name
$ choppy batch YaqingLiu/variant-calling-latest samples.csv -p Project [-l project:Label]
# Query the status of all tasks in the project
$ choppy query -L project:Label | grep "status"
```
**Please note:** The `defaults` are overridden by the settings in `samples.csv`, so there is no need to contact me about changing them.
The required parameters are: `sample_id`, `normal_fastq_1`, `normal_fastq_2`, `tumor_fastq_1`, `tumor_fastq_2`.
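
To illustrate the required columns, here is a minimal sketch that renders a `samples.csv` body and checks an existing one for missing columns. The helpers `render_samples` and `missing_columns` are hypothetical, and the OSS paths are placeholders rather than real locations.

```python
import csv
import io

REQUIRED = ["sample_id", "normal_fastq_1", "normal_fastq_2",
            "tumor_fastq_1", "tumor_fastq_2"]


def render_samples(rows):
    """Render rows (dicts keyed by the required columns) as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=REQUIRED, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def missing_columns(csv_text):
    """List required columns that are absent from the CSV header."""
    header = csv.DictReader(io.StringIO(csv_text)).fieldnames or []
    return [col for col in REQUIRED if col not in header]


text = render_samples([{
    "sample_id": "S1",
    "normal_fastq_1": "oss://my-bucket/S1_N_R1.fastq.gz",  # placeholder path
    "normal_fastq_2": "oss://my-bucket/S1_N_R2.fastq.gz",
    "tumor_fastq_1": "oss://my-bucket/S1_T_R1.fastq.gz",
    "tumor_fastq_2": "oss://my-bucket/S1_T_R2.fastq.gz",
}])
print(text)
print(missing_columns(text))
```

In practice, `choppy samples` generates the full header, so a check like `missing_columns` is mainly useful after editing the file by hand.
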
**Please carefully check**
* whether the reference genome you want to use is hg38 or hg19.
* the BED file.
* the caller(s) you want to use.
* PoN (panel of normals) VCFs are supported for TNseq and TNscope, but must be generated in advance.
* interval padding defaults to 0, and you can change it.
* usually one annotation tool is sufficient.
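
Part of this checklist can be automated before submission. The following is only a sketch: the key names `genome` and `interval_padding` are assumptions (check the header produced by `choppy samples` for the real parameter names), and `checklist_warnings` is a hypothetical helper.

```python
def checklist_warnings(row):
    """Return human-readable warnings for one samples.csv row (a dict)."""
    warnings = []
    # `genome` is an assumed column name for the reference genome.
    if row.get("genome") not in ("hg38", "hg19"):
        warnings.append("reference genome should be hg38 or hg19")
    # `interval_padding` is an assumed column name; the default padding is 0.
    padding = str(row.get("interval_padding", "0"))
    if not padding.isdigit():
        warnings.append("interval padding should be a non-negative integer")
    annotators = [t for t in ("annovar", "vep") if row.get(t) == "true"]
    if len(annotators) > 1:
        warnings.append("usually one annotation tool is sufficient")
    return warnings


print(checklist_warnings({"genome": "hg38", "vep": "true"}))
```

The BED file and PoN VCFs still need to be checked by hand, since their validity depends on the capture kit and cohort.
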