### NGSCheckMate

A C program, `ngscheckmate_fastq`, can be called directly to generate a VAF file from one FASTQ file (single-end sequencing) or two FASTQ files (paired-end sequencing). Another script, `vaf_ncm.py`, then reads a set of VAF files to complete the downstream analysis. When you need to analyze many FASTQ files, the first step (VAF file generation with `ngscheckmate_fastq`) can be parallelized.

**If you want to analyze the correlation of samples across multiple runs, we suggest saving the historical `vaf` files, downloading NGSCheckMate from https://github.com/parklab/NGSCheckMate, and running `vaf_ncm.py` locally.**

### Getting Started

We recommend using the choppy system and the Aliyun OSS service. The commands look like this:

```
# Activate the choppy environment
$ open-choppy-env

# Install the APP
$ choppy install YaqingLiu/NGSCheckMate_parallel-latest [-f]

# List the parameters
$ choppy samples YaqingLiu/NGSCheckMate_parallel-latest [--no-default]

# Submit your task with the `samples.json` file and the project name
$ choppy batch YaqingLiu/NGSCheckMate_parallel-latest samples.json -p Project [-l project:Label]

# Query the status of all tasks in the project
$ choppy query -L project:Label | grep "status"
```

#### samples.json

```json
{
  "sample_id": "test",
  "fastq1": ["fq1_1", "fq1_2", ..., "fq1_n"],
  "fastq2": ["fq2_1", "fq2_2", ..., "fq2_n"],
  "output_id": ["out_id1", "out_id2", ..., "out_idn"]
}
```

The `fastq1`, `fastq2`, and `output_id` lists are matched by position: the i-th entry of each list describes the same sample.

#### other parameters

- `subsampling_rate`: The default subsampling rate is 1. Processing at the default rate is not particularly slow.
- `-f` in `vaf_ncm.wdl`: Use strict VAF correlation cutoffs. Recommended when your data may include related individuals (parent-child, siblings).
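
For projects with many FASTQ pairs, writing `samples.json` by hand is error-prone. The following is a minimal Python sketch that assembles the layout described in the samples.json section. It assumes the three lists must have equal length and be matched by position, which the example layout suggests but does not state outright. The helper name `build_samples` and the `oss://my-bucket/...` paths are hypothetical and used only for illustration.

```python
import json


def build_samples(sample_id, fastq1, fastq2, output_id):
    """Return a dict in the samples.json layout used by this app.

    Assumption: fastq1, fastq2 and output_id are parallel lists,
    i.e. the i-th entries of each list belong to the same sample.
    """
    if not (len(fastq1) == len(fastq2) == len(output_id)):
        raise ValueError("fastq1, fastq2 and output_id must have the same length")
    return {
        "sample_id": sample_id,
        "fastq1": list(fastq1),
        "fastq2": list(fastq2),
        "output_id": list(output_id),
    }


if __name__ == "__main__":
    # Hypothetical OSS paths; replace with your own files.
    samples = build_samples(
        sample_id="test",
        fastq1=["oss://my-bucket/s1_R1.fastq.gz", "oss://my-bucket/s2_R1.fastq.gz"],
        fastq2=["oss://my-bucket/s1_R2.fastq.gz", "oss://my-bucket/s2_R2.fastq.gz"],
        output_id=["s1", "s2"],
    )
    with open("samples.json", "w") as fh:
        json.dump(samples, fh, indent=2)
```

The resulting `samples.json` can then be passed to `choppy batch` as shown in Getting Started.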