Generates a VAF file from each FASTQ file in parallel, then reads the resulting set of VAF files with vaf_ncm.py.

NGSCheckMate

A C program, ngscheckmate_fastq, can be called directly to generate a VAF file from one FASTQ file (single-end sequencing) or two FASTQ files (paired-end sequencing).

A second script, vaf_ncm.py, then reads the set of VAF files to complete the downstream analysis. When you need to analyze many FASTQ files, the first step, VAF file generation with ngscheckmate_fastq, can be parallelized.
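The parallel VAF generation step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used by this app: the helpers `build_command`, `run_to_vaf` and `run_all` are hypothetical, the `-1`/`-2` flags, the trailing pattern-file argument and the VAF being written to standard output are assumed from the upstream NGSCheckMate usage text and should be checked against your build, and `SNP.pt` is a placeholder file name.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_command(fastq1, fastq2=None, pattern_file="SNP.pt"):
    """Build an argv list for one ngscheckmate_fastq call (flags are assumed)."""
    cmd = ["ngscheckmate_fastq", "-1", fastq1]
    if fastq2 is not None:  # paired-end: add the second FASTQ file
        cmd += ["-2", fastq2]
    cmd.append(pattern_file)  # placeholder pattern file name
    return cmd


def run_to_vaf(cmd, vaf_path):
    """Run one command and write its standard output to vaf_path (assumed behavior)."""
    with open(vaf_path, "w") as out:
        subprocess.run(cmd, stdout=out, check=True)
    return vaf_path


def run_all(jobs, runner=run_to_vaf, max_workers=4):
    """Run (cmd, vaf_path) jobs concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(runner, cmd, vaf) for cmd, vaf in jobs]
        return [f.result() for f in futures]
```

Threads are enough here because each job spends its time in an external process. A call such as `run_all([(build_command("a_1.fq", "a_2.fq"), "a.vaf")])` would produce one VAF file per job, provided the assumed command line matches your installation.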

If you want to analyze the correlation of samples across multiple runs, I suggest that you save the historical VAF files, download NGSCheckMate from https://github.com/parklab/NGSCheckMate, and run vaf_ncm.py locally.
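One way to keep historical VAF files usable for a cross-run comparison is to gather them into a single directory before the local analysis. The sketch below assumes that each run's VAF files sit in their own directory, that they use a `.vaf` extension, and that the local analysis is pointed at one input directory; `collect_vaf_files` is a hypothetical helper, not part of NGSCheckMate.

```python
import shutil
from pathlib import Path


def collect_vaf_files(run_dirs, dest_dir):
    """Copy every *.vaf file from each run directory into dest_dir.

    Each copy is prefixed with its run directory name so that samples with
    the same file name in different runs do not overwrite each other.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for run_dir in map(Path, run_dirs):
        for vaf in sorted(run_dir.glob("*.vaf")):
            target = dest / f"{run_dir.name}_{vaf.name}"
            shutil.copy2(vaf, target)  # keep timestamps of the original file
            copied.append(target.name)
    return copied
```

Prefixing with the run name is a design choice for this sketch; any naming scheme that keeps file names unique across runs would serve the same purpose.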

Getting Started

We recommend using the choppy system and the Aliyun OSS service. The commands look like this:

# Activate the choppy environment
$ open-choppy-env

# Install the APP
$ choppy install YaqingLiu/NGSCheckMate_parallel-latest [-f]

# List the parameters
$ choppy samples YaqingLiu/NGSCheckMate_parallel-latest [--no-default]

# Submit your task with the `samples.json` file and a project name
$ choppy batch YaqingLiu/NGSCheckMate_parallel-latest samples.json -p Project [-l project:Label]

# Query the status of all tasks in the project
$ choppy query -L project:Label | grep "status"

samples.json

{
  "sample_id": "test", 
  "fastq1": ["fq1_1", "fq1_2", ..., "fq1_n"],
  "fastq2": ["fq2_1", "fq2_2", ..., "fq2_n"],
  "output_id": ["out_id1", "out_id2", ..., "out_idn"]
}
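A file of this shape can be generated with a short script. The sketch below assumes that `fastq1`, `fastq2` and `output_id` are parallel lists, so that the i-th entries belong to the same sample, as the `_1 ... _n` naming above suggests; `build_samples` and `write_samples` are hypothetical helpers and not part of choppy.

```python
import json


def build_samples(sample_id, fastq1, fastq2, output_id):
    """Return a samples.json-style dict after checking the lists line up."""
    if not (len(fastq1) == len(fastq2) == len(output_id)):
        raise ValueError("fastq1, fastq2 and output_id must have the same length")
    return {
        "sample_id": sample_id,
        "fastq1": list(fastq1),
        "fastq2": list(fastq2),
        "output_id": list(output_id),
    }


def write_samples(samples, path="samples.json"):
    """Write the dict as indented JSON."""
    with open(path, "w") as handle:
        json.dump(samples, handle, indent=2)
```

Checking the list lengths before writing catches a mismatched pair of FASTQ files before the task is submitted.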

Other parameters

  • subsampling_rate: The subsampling rate. The default is 1, meaning all reads are used; even at this setting the run is not very slow.
  • -f in vaf_ncm.wdl: Use strict VAF correlation cutoffs. Recommended when your data may include related individuals (parent-child, siblings).