6 years ago · 25fcba7e8a
--- a/README.md
+++ b/README.md
@@ -46,7 +46,7 @@ graph LR;



 ### Code and parameters
 ### Code parameters

 ```shell
 Usage: Rscript RNAseq_1_ballgown.R [options]
@@ -121,7 +121,7 @@ Rscript RNAseq_1_ballgown.R -o /home/yuying/rnaseqreport_test -i ./ballgown/ -l



 ### Code and parameters
 ### Code parameters

 ```shell
 Usage: Rscript RNAseq_2_pca.R [options]
@@ -200,7 +200,7 @@ Rscript RNAseq_2_pca.R -o -i ballgown_geneexp_log2fpkm_floor0p01_c3r58395_2019-0



 ### Code and parameters
 ### Code parameters

 ```shell
 Usage: Rscript RNAseq_3_cor.R [options]
@@ -397,6 +397,90 @@ Rscript RNAseq_4_pwDEG.R  -i example_geneexp_log2fpkm_floor0p01_c13r58395_2019-0

 ## RNAseq_5_pwGSEA.R

 ### Overview

 Uses the fgsea package to run GSEA pathway analysis on the genes of each comparison. For the principles behind GSEA, see: <https://www.pnas.org/content/102/43/15545>



 ### Code parameters

 ```shell
 Usage: Rscript RNAseq_5_pwGSEA.R [options]


 Options:
 	-o OUT_DIR, --out_dir=OUT_DIR
 		The output directory [default ./]

 	-i INPUT, --input=INPUT
 		The input expression files. Required!

 	-e TYPE_GENE_ID, --type_gene_id=TYPE_GENE_ID
 		The type of gene symbol. Could be either of EnsemblGID/EntrezID/GeneSymbol [default: EnsemblGID]

 	-g SAMPLE_GROUP, --sample_group=SAMPLE_GROUP
 		File for sample group infomation.The input file containing sample name and group infomation. note colname must be like: sample	group1	group2... Required! 

 	-f NUMBER, --padjvalueCutoff=NUMBER
 		Cutoff value of adjusted p value. [default: 0.2]

 	-p PROJECT_CODE, --project_code=PROJECT_CODE
 		Project code, which is used as prefix of output file. [default: rnaseq]

 	-h, --help
 		Show this help message and exit
 ```



 | Parameter                                    | Type      | Description                                                  | Example     |
 | -------------------------------------------- | --------- | ------------------------------------------------------------ | ----------- |
 | -o OUT_DIR, --out_dir=OUT_DIR                | character | Output directory, ./ by default. A trailing "/" is optional. | ./          |
 | -i INPUT, --input=INPUT                      | character | Input file name. **Required.** The expression matrix must be log-scaled and tab-delimited; the output file of RNAseq_1_ballgown.R can be used. | example.txt |
 | -e TYPE_GENE_ID, --type_gene_id=TYPE_GENE_ID | character | Gene ID type. One of: Ensembl gene ID (EnsemblGID), Entrez Gene ID (EntrezID), or Gene Symbol (GeneSymbol). [default: EnsemblGID] | EnsemblGID  |
 | -g SAMPLE_GROUP, --sample_group=SAMPLE_GROUP | character | Tab-delimited sample grouping file. **Required.** One sample per row, one grouping per column; multiple grouping columns are allowed. | group.txt   |
 | -q NUMBER, --padjvalueCutoff=NUMBER          | number    | Adjusted p-value cutoff for the enrichment analysis          | 0.2         |
 | -p PROJECT_CODE, --project_code=PROJECT_CODE | character | Project code, used as the prefix of output files; rnaseq by default | rnaseq      |
 | -h, --help                                   |           | Show the help message and exit                               | -h          |
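
The sample grouping file passed with `-g` is easy to get wrong, so a quick pre-flight check can help. The sketch below is not part of the pipeline; it assumes the first column is literally named `sample` (as the help text suggests) and that every row has one value per grouping column. The function name `check_group_file` and the example content are hypothetical.

```python
import csv
import io


def check_group_file(handle):
    """Return (samples, group_columns); raise ValueError if the layout looks wrong."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, None)
    # Assumption: the header is "sample", followed by at least one grouping column.
    if not header or header[0] != "sample" or len(header) < 2:
        raise ValueError("expected header: sample<TAB>group1[<TAB>group2...]")
    samples = []
    for line_no, fields in enumerate(reader, start=2):
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(header):
            raise ValueError(
                f"line {line_no}: expected {len(header)} fields, got {len(fields)}"
            )
        samples.append(fields[0])
    if len(set(samples)) != len(samples):
        raise ValueError("duplicate sample names")
    return samples, header[1:]


# Hypothetical file content, for illustration only.
EXAMPLE = "sample\tgroup1\tgroup2\nS1\ttest1\tbatchA\nS2\ttest2\tbatchA\n"
samples, groups = check_group_file(io.StringIO(EXAMPLE))
print(samples, groups)
```

For a real file, `check_group_file(open("group.txt", newline=""))` would run the same check.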



 ### Example run

 ```shell
 Rscript RNAseq_5_pwGSEA.R -o /home/yuying/rnaseqreport_test -i ballgown_geneexp_log2fpkm_floor0p01_c3r58395_2019-04-29.txt  -g group13_2.txt 
 ```



 ### Output



 1. rnaseq_gsea_curatedgenesets.csv: GSEA enrichment results based on curated gene sets.

 2. rnaseq_gsea_go.csv: GSEA enrichment results based on GO terms.


 > "","versus","pathway","pval","padj","ES","NES","nMoreExtreme","size","leadingEdge"
 > "1","test1 vs test2","GO_REGULATION_OF_CELL_ACTIVATION",0.001007,0.1232,-0.4512,-1.458,0,458,"2302, 3127, 3123, 6352, 3119, 912, 972, 2625, 3122, 958, 8808, 53833, 284021, 10451, 154, 301, 55024, 3109, 923, 348, 8456, 84433, 51237, 4323, 7409, 919, 11005, 3606, 727897, 3929, 7056, 114548, 857, 10673, 695, 163747, 3120, 683, 124912, 29126, 114771, 2150, 3113, 3956, 22914, 4773, 83639, 634, 1029, 1236, 29108, 90865, 441478, 3600, 5896, 51083, 89780, 10148, 3965, 9173, 2323, 84106, 51744, 282618, 2852, 2056, 1269, 5592, 5724, 3623, 3567, 11314, 23529, 7474, 558, 10461, 283234, 11148, 5341, 3273, 9466, 22890, 22806, 917, 8832, 5588, 3952, 282616, 2207, 3111, 3574, 84807, 2268, 282617, 6869, 3659, 6441, 8772, 3575, 8546, 1948, 246778, 1178, 5585, 84959, 2064"
 > "2","test1 vs test2","GO_IMMUNE_EFFECTOR_PROCESS",0.001007,0.1232,-0.4688,-1.515,0,455,"5473, 6374, 3127, 1380, 3123, 8519, 3119, 972, 2625, 10581, 958, 722, 284021, 3437, 3627, 8284, 56892, 10451, 3075, 1755, 3428, 51191, 4353, 644150, 28984, 4939, 55061, 114836, 717, 117157, 9245, 8809, 566, 7409, 919, 154064, 3929, 23705, 340061, 55601, 114548, 4599, 9844, 730, 2633, 3433, 10410, 3764, 2150, 7098, 91607, 60489, 3956, 3439, 3426, 29108, 90865, 10584, 725, 10417, 3078, 715, 3383, 84871, 3434, 10964, 51744, 64218, 1621, 282618, 23586, 3665, 4600, 78989, 710, 733, 3815, 3936, 1636"

 Meaning of each column:

 A table with GSEA results. Each row corresponds to a tested pathway. The columns are the following:

 - versus – the compared groups;
 - pathway – name of the pathway as in 'names(pathway)';
 - pval – an enrichment p-value;
 - padj – a BH-adjusted p-value;
 - ES – enrichment score, same as in Broad GSEA implementation;
 - NES – enrichment score normalized to mean enrichment of random samples of the same size;
 - nMoreExtreme – a number of times a random gene set had a more extreme enrichment score value;
 - size – size of the pathway after removing genes not present in 'names(stats)';
 - leadingEdge – vector with indexes of leading edge genes that drive the enrichment, see <http://software.broadinstitute.org/gsea/doc/GSEAUserGuideTEXT.htm#_Running_a_Leading>.
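
To make the column layout concrete, here is a minimal Python sketch that reads a result table of this shape, keeps rows below an adjusted p-value cutoff, and sorts them by NES. It is not part of the pipeline. The function name `load_gsea` is hypothetical, the strict `<` comparison against the cutoff is an assumption, and the `leadingEdge` lists in the embedded rows are shortened for readability.

```python
import csv
import io

# Two rows shaped like the example output; the leadingEdge lists are shortened.
SAMPLE = '''"","versus","pathway","pval","padj","ES","NES","nMoreExtreme","size","leadingEdge"
"1","test1 vs test2","GO_REGULATION_OF_CELL_ACTIVATION",0.001007,0.1232,-0.4512,-1.458,0,458,"2302, 3127, 3123"
"2","test1 vs test2","GO_IMMUNE_EFFECTOR_PROCESS",0.001007,0.1232,-0.4688,-1.515,0,455,"5473, 6374, 3127"
'''


def load_gsea(handle, padj_cutoff=0.2):
    """Return rows with padj below the cutoff, sorted by NES in ascending order."""
    rows = []
    for row in csv.DictReader(handle):
        padj = float(row["padj"])
        if padj < padj_cutoff:  # assumption: strict comparison
            rows.append({
                "versus": row["versus"],
                "pathway": row["pathway"],
                "padj": padj,
                "NES": float(row["NES"]),
                # leadingEdge is one quoted, comma-separated field
                "leadingEdge": [g.strip() for g in row["leadingEdge"].split(",")],
            })
    return sorted(rows, key=lambda r: r["NES"])


hits = load_gsea(io.StringIO(SAMPLE))
for hit in hits:
    print(hit["versus"], hit["pathway"], hit["NES"], len(hit["leadingEdge"]))
```

The same function should work on a real file via `load_gsea(open("rnaseq_gsea_go.csv", newline=""))`, since the `csv` module handles the quoted `leadingEdge` field.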



 ## RNAseq_6_enrichfunc.R
@@ -407,7 +491,11 @@ Rscript RNAseq_4_pwDEG.R  -i example_geneexp_log2fpkm_floor0p01_c13r58395_2019-0

 Input: the differentially expressed gene list produced by RNAseq_4_pwDEG.R (PROJECT_CODE_deg_acrossgroups.csv).

 #Requires an internet connection
 *Requires an internet connection*

 This step is slow: each group takes about 5 minutes to analyze.



 ### Code parameters

@@ -460,6 +548,11 @@ Options:

 GO and KEGG pathway results.

 The content looks like this:

 > "","versus","ID","Description","GeneRatio","BgRatio","pvalue","p.adjust","qvalue","geneID","Count"
 > "1","A vs B","hsa05168","Herpes simplex virus 1 infection","39/185","492/7847",9.625e-12,2.117e-09,2.057e-09,"256051/684/7568/10172/84765/80095/162963/84436/55762/55769/55786/90594/148268/342908/30832/348327/100129543/81931/390927/147837/57573/388566/91120/113835/163059/84671/65251/79973/126017/147949/374900/7594/3111/3135/728927/100129842/59348/26974/7772",39
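
The `GeneRatio` and `BgRatio` columns are stored as fraction strings, and `geneID` is a "/"-separated list. The Python sketch below shows one way to parse them, using the example row above. It is not part of the pipeline. The function names are hypothetical, and the fold enrichment (GeneRatio divided by BgRatio) is a derived value that does not appear in the output file.

```python
import csv
import io

# Header and row copied from the example output.
SAMPLE = '''"","versus","ID","Description","GeneRatio","BgRatio","pvalue","p.adjust","qvalue","geneID","Count"
"1","A vs B","hsa05168","Herpes simplex virus 1 infection","39/185","492/7847",9.625e-12,2.117e-09,2.057e-09,"256051/684/7568/10172/84765/80095/162963/84436/55762/55769/55786/90594/148268/342908/30832/348327/100129543/81931/390927/147837/57573/388566/91120/113835/163059/84671/65251/79973/126017/147949/374900/7594/3111/3135/728927/100129842/59348/26974/7772",39
'''


def parse_ratio(text):
    """Turn a string such as '39/185' into a float."""
    numerator, denominator = text.split("/")
    return int(numerator) / int(denominator)


def summarize(handle):
    """Parse each enrichment row and add a derived fold enrichment."""
    results = []
    for row in csv.DictReader(handle):
        genes = row["geneID"].split("/")
        results.append({
            "versus": row["versus"],
            "ID": row["ID"],
            "p.adjust": float(row["p.adjust"]),
            # Derived value, not a column of the output file.
            "fold_enrichment": parse_ratio(row["GeneRatio"]) / parse_ratio(row["BgRatio"]),
            "genes": genes,
            # Sanity check: Count should equal the number of listed gene IDs.
            "count_matches": len(genes) == int(row["Count"]),
        })
    return results


results = summarize(io.StringIO(SAMPLE))
for r in results:
    print(r["ID"], round(r["fold_enrichment"], 2), r["count_matches"])
```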



 ### Example run