@@ -46,7 +46,7 @@ graph LR;

### Code and parameters
### Code parameters

```shell
Usage: Rscript RNAseq_1_ballgown.R [options]
@@ -121,7 +121,7 @@ Rscript RNAseq_1_ballgown.R -o /home/yuying/rnaseqreport_test -i ./ballgown/ -l

### Code and parameters
### Code parameters

```shell
Usage: Rscript RNAseq_2_pca.R [options]
@@ -200,7 +200,7 @@ Rscript RNAseq_2_pca.R -o -i ballgown_geneexp_log2fpkm_floor0p01_c3r58395_2019-0

### Code and parameters
### Code parameters

```shell
Usage: Rscript RNAseq_3_cor.R [options]
@@ -397,6 +397,90 @@ Rscript RNAseq_4_pwDEG.R -i example_geneexp_log2fpkm_floor0p01_c13r58395_2019-0

## RNAseq_5_pwGSEA.R

### Overview

Runs GSEA pathway analysis on the genes of each group comparison, using the fgsea package. For the principle behind GSEA, see <https://www.pnas.org/content/102/43/15545>

### Code parameters

```shell
Usage: Rscript RNAseq_5_pwGSEA.R [options]


Options:
    -o OUT_DIR, --out_dir=OUT_DIR
        The output directory [default ./]

    -i INPUT, --input=INPUT
        The input expression files. Required!

    -e TYPE_GENE_ID, --type_gene_id=TYPE_GENE_ID
        The type of gene symbol. Could be either of EnsemblGID/EntrezID/GeneSymbol [default: EnsemblGID]

    -g SAMPLE_GROUP, --sample_group=SAMPLE_GROUP
        File for sample group infomation.The input file containing sample name and group infomation. note colname must be like: sample group1 group2... Required!

    -f NUMBER, --padjvalueCutoff=NUMBER
        Cutoff value of adjusted p value. [default: 0.2]

    -p PROJECT_CODE, --project_code=PROJECT_CODE
        Project code, which is used as prefix of output file. [default: rnaseq]

    -h, --help
        Show this help message and exit
```

| Parameter                                    | Value type | Description                                                  | Example     |
| -------------------------------------------- | ---------- | ------------------------------------------------------------ | ----------- |
| -o OUT_DIR, --out_dir=OUT_DIR                | character  | Output directory; defaults to ./. A trailing "/" is optional. | ./          |
| -i INPUT, --input=INPUT                      | character  | Input file name. **Required.** The input must be a log-scaled, tab-delimited expression table; the output file of RNAseq_1_ballgown.R can be used. | example.txt |
| -e TYPE_GENE_ID, --type_gene_id=TYPE_GENE_ID | character  | Gene ID type: Ensembl gene ID (EnsemblGID), Entrez Gene ID (EntrezID), or Gene Symbol (GeneSymbol). [default: EnsemblGID] | EnsemblGID  |
| -g SAMPLE_GROUP, --sample_group=SAMPLE_GROUP | character  | Tab-delimited sample grouping information. **Required.** Format: one sample per row, one grouping per column; multiple grouping columns are allowed. | group.txt   |
| -q NUMBER, --padjvalueCutoff=NUMBER          | number     | Adjusted p-value cutoff for the enrichment analysis          | 0.2         |
| -p PROJECT_CODE, --project_code=PROJECT_CODE | character  | Project code, used as the prefix of output files; defaults to rnaseq | rnaseq      |
| -h, --help                                   |            | Show the help message and exit                               | -h          |
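
The layout of the `-g` sample group file can be sketched as follows. This is a minimal illustration, assuming a header row of `sample group1 group2` as the help text suggests; the sample names (`S1` to `S4`) and the group labels are hypothetical placeholders, not values taken from the pipeline.

```shell
# Hypothetical sample names and group labels; replace them with your own.
group_file="$(mktemp)"
printf 'sample\tgroup1\tgroup2\n' >  "$group_file"
printf 'S1\ttest1\tbatchA\n'      >> "$group_file"
printf 'S2\ttest1\tbatchB\n'      >> "$group_file"
printf 'S3\ttest2\tbatchA\n'      >> "$group_file"
printf 'S4\ttest2\tbatchB\n'      >> "$group_file"

# Sanity check: every row should have as many tab-separated fields as the header.
bad_rows="$(awk -F'\t' 'NR == 1 { n = NF } NF != n { c++ } END { print c + 0 }' "$group_file")"
echo "rows with a wrong column count: $bad_rows"
```

A check like this can catch rows that were accidentally separated by spaces instead of tabs before the file is passed to `-g`.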

### Example run

```shell
Rscript RNAseq_5_pwGSEA.R -o /home/yuying/rnaseqreport_test -i ballgown_geneexp_log2fpkm_floor0p01_c3r58395_2019-04-29.txt -g group13_2.txt
```

### Output

1. rnaseq_gsea_curatedgenesets.csv: GSEA enrichment results based on curated gene sets.
2. rnaseq_gsea_go.csv: GSEA enrichment results based on GO terms.

> "","versus","pathway","pval","padj","ES","NES","nMoreExtreme","size","leadingEdge"
> "1","test1 vs test2","GO_REGULATION_OF_CELL_ACTIVATION",0.001007,0.1232,-0.4512,-1.458,0,458,"2302, 3127, 3123, 6352, 3119, 912, 972, 2625, 3122, 958, 8808, 53833, 284021, 10451, 154, 301, 55024, 3109, 923, 348, 8456, 84433, 51237, 4323, 7409, 919, 11005, 3606, 727897, 3929, 7056, 114548, 857, 10673, 695, 163747, 3120, 683, 124912, 29126, 114771, 2150, 3113, 3956, 22914, 4773, 83639, 634, 1029, 1236, 29108, 90865, 441478, 3600, 5896, 51083, 89780, 10148, 3965, 9173, 2323, 84106, 51744, 282618, 2852, 2056, 1269, 5592, 5724, 3623, 3567, 11314, 23529, 7474, 558, 10461, 283234, 11148, 5341, 3273, 9466, 22890, 22806, 917, 8832, 5588, 3952, 282616, 2207, 3111, 3574, 84807, 2268, 282617, 6869, 3659, 6441, 8772, 3575, 8546, 1948, 246778, 1178, 5585, 84959, 2064"
> "2","test1 vs test2","GO_IMMUNE_EFFECTOR_PROCESS",0.001007,0.1232,-0.4688,-1.515,0,455,"5473, 6374, 3127, 1380, 3123, 8519, 3119, 972, 2625, 10581, 958, 722, 284021, 3437, 3627, 8284, 56892, 10451, 3075, 1755, 3428, 51191, 4353, 644150, 28984, 4939, 55061, 114836, 717, 117157, 9245, 8809, 566, 7409, 919, 154064, 3929, 23705, 340061, 55601, 114548, 4599, 9844, 730, 2633, 3433, 10410, 3764, 2150, 7098, 91607, 60489, 3956, 3439, 3426, 29108, 90865, 10584, 725, 10417, 3078, 715, 3383, 84871, 3434, 10964, 51744, 64218, 1621, 282618, 23586, 3665, 4600, 78989, 710, 733, 3815, 3936, 1636"

Meaning of each column:

A table with GSEA results. Each row corresponds to a tested pathway. The columns are the following:

- versus – the groups being compared;
- pathway – name of the pathway as in 'names(pathway)';
- pval – an enrichment p-value;
- padj – a BH-adjusted p-value;
- ES – enrichment score, same as in Broad GSEA implementation;
- NES – enrichment score normalized to mean enrichment of random samples of the same size;
- nMoreExtreme – the number of times a random gene set had a more extreme enrichment score value;
- size – size of the pathway after removing genes not present in 'names(stats)';
- leadingEdge – vector with indexes of leading edge genes that drive the enrichment, see <http://software.broadinstitute.org/gsea/doc/GSEAUserGuideTEXT.htm#_Running_a_Leading>.
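
The columns above can be used to pull out the significant pathways from a result file. The following is a rough sketch, not part of the pipeline: `filter_padj` is a hypothetical helper, and the sample file reuses the two example rows shown earlier with `leadingEdge` shortened to three IDs for readability. It assumes that only the last column contains commas inside its quotes, so a plain comma split is safe for columns 1 to 9.

```shell
# Small sample in the same layout as rnaseq_gsea_go.csv (leadingEdge abbreviated).
results="$(mktemp)"
cat > "$results" <<'EOF'
"","versus","pathway","pval","padj","ES","NES","nMoreExtreme","size","leadingEdge"
"1","test1 vs test2","GO_REGULATION_OF_CELL_ACTIVATION",0.001007,0.1232,-0.4512,-1.458,0,458,"2302, 3127, 3123"
"2","test1 vs test2","GO_IMMUNE_EFFECTOR_PROCESS",0.001007,0.1232,-0.4688,-1.515,0,455,"5473, 6374, 3127"
EOF

# Hypothetical helper: print pathway, padj and NES for rows with padj below a cutoff.
# $1 = result file, $2 = cutoff. Column 5 is padj, column 7 is NES.
filter_padj() {
  awk -F',' -v cutoff="$2" '
    NR > 1 && ($5 + 0) < (cutoff + 0) {
      gsub(/"/, "", $3)            # strip the quotes around the pathway name
      print $3 "\t" $5 "\t" $7
    }' "$1"
}

filter_padj "$results" 0.2
```

For files where other columns may contain embedded commas, a real CSV parser would be the safer choice.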
## RNAseq_6_enrichfunc.R
@@ -407,7 +491,11 @@ Rscript RNAseq_4_pwDEG.R -i example_geneexp_log2fpkm_floor0p01_c13r58395_2019-0

Input: the differentially expressed gene list output by RNAseq_4_pwDEG.R (PROJECT_CODE_deg_acrossgroups.csv).

#Note: an internet connection is required
*Note: an internet connection is required*

This step is slow: each group takes about 5 minutes to analyze.

### Code parameters
@@ -460,6 +548,11 @@ Options:

GO and KEGG pathway results.

The contents look like this:

> "","versus","ID","Description","GeneRatio","BgRatio","pvalue","p.adjust","qvalue","geneID","Count"
> "1","A vs B","hsa05168","Herpes simplex virus 1 infection","39/185","492/7847",9.625e-12,2.117e-09,2.057e-09,"256051/684/7568/10172/84765/80095/162963/84436/55762/55769/55786/90594/148268/342908/30832/348327/100129543/81931/390927/147837/57573/388566/91120/113835/163059/84671/65251/79973/126017/147949/374900/7594/3111/3135/728927/100129842/59348/26974/7772",39
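
The example row above can be checked for internal consistency with a few lines of shell. This is only a sketch: it assumes that `geneID` is a `/`-separated list and that `GeneRatio` is a plain numerator/denominator pair, as the row suggests, and the variable names are illustrative.

```shell
# Values copied from the example row above.
gene_ratio="39/185"
count=39
gene_ids="256051/684/7568/10172/84765/80095/162963/84436/55762/55769/55786/90594/148268/342908/30832/348327/100129543/81931/390927/147837/57573/388566/91120/113835/163059/84671/65251/79973/126017/147949/374900/7594/3111/3135/728927/100129842/59348/26974/7772"

# Number of IDs in the geneID column; this should match the Count column.
n_ids="$(printf '%s\n' "$gene_ids" | awk -F'/' '{ print NF }')"

# GeneRatio expressed as a decimal fraction.
ratio="$(printf '%s\n' "$gene_ratio" | awk -F'/' '{ printf "%.4f", $1 / $2 }')"

echo "geneID entries: $n_ids (Count column: $count)"
echo "GeneRatio as a fraction: $ratio"
```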

### Example run