
Upload files to '.'

master
yingyu 6 years ago
commit 25fcba7e8a
1 changed files with 97 additions and 4 deletions
README.md

@@ -46,7 +46,7 @@ graph LR;
### Script parameters
```shell
Usage: Rscript RNAseq_1_ballgown.R [options]
@@ -121,7 +121,7 @@ Rscript RNAseq_1_ballgown.R -o /home/yuying/rnaseqreport_test -i ./ballgown/ -l
### Script parameters
```shell
Usage: Rscript RNAseq_2_pca.R [options]
@@ -200,7 +200,7 @@ Rscript RNAseq_2_pca.R -o -i ballgown_geneexp_log2fpkm_floor0p01_c3r58395_2019-0
### Script parameters
```shell
Usage: Rscript RNAseq_3_cor.R [options]
@@ -397,6 +397,90 @@ Rscript RNAseq_4_pwDEG.R -i example_geneexp_log2fpkm_floor0p01_c13r58395_2019-0
## RNAseq_5_pwGSEA.R
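### Overview
Uses the fgsea package to run GSEA pathway analysis on the genes from each pairwise comparison. For the principles behind GSEA, see: <https://www.pnas.org/content/102/43/15545>
### Script parameters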
```shell
Usage: Rscript RNAseq_5_pwGSEA.R [options]
Options:
    -o OUT_DIR, --out_dir=OUT_DIR
        The output directory [default ./]
    -i INPUT, --input=INPUT
        The input expression files. Required!
    -e TYPE_GENE_ID, --type_gene_id=TYPE_GENE_ID
        The type of gene symbol. Could be either of EnsemblGID/EntrezID/GeneSymbol [default: EnsemblGID]
    -g SAMPLE_GROUP, --sample_group=SAMPLE_GROUP
        File for sample group infomation.The input file containing sample name and group infomation. note colname must be like: sample group1 group2... Required!
    -f NUMBER, --padjvalueCutoff=NUMBER
        Cutoff value of adjusted p value. [default: 0.2]
    -p PROJECT_CODE, --project_code=PROJECT_CODE
        Project code, which is used as prefix of output file. [default: rnaseq]
    -h, --help
        Show this help message and exit
```
| Parameter | Type | Description | Example |
| -------------------------------------------- | --------- | ------------------------------------------------------------ | ----------- |
| -o OUT_DIR, --out_dir=OUT_DIR | character | Output directory; defaults to ./. A trailing "/" is optional. | ./ |
| -i INPUT, --input=INPUT | character | Input file name. **Required.** The input must be a log-scaled, tab-delimited expression table; it can be an output file of RNAseq_1_ballgown.R. | example.txt |
| -e TYPE_GENE_ID, --type_gene_id=TYPE_GENE_ID | character | Gene ID type. One of: Ensembl gene ID (EnsemblGID), Entrez Gene ID (EntrezID), or Gene Symbol (GeneSymbol). [default: EnsemblGID] | EnsemblGID |
| -g SAMPLE_GROUP, --sample_group=SAMPLE_GROUP | character | Tab-delimited sample grouping information. **Required.** Format: one sample per row, one grouping per column. Multiple grouping columns are allowed. | group.txt |
| -q NUMBER, --padjvalueCutoff=NUMBER | number | Cutoff for the adjusted p-value of the enrichment analysis. | 0.2 |
| -p PROJECT_CODE, --project_code=PROJECT_CODE | character | Project code, used as the prefix of the output files; defaults to rnaseq. | rnaseq |
| -h, --help | | Show the help message and exit. | -h |
### Example run
```shell
Rscript RNAseq_5_pwGSEA.R -o /home/yuying/rnaseqreport_test -i ballgown_geneexp_log2fpkm_floor0p01_c3r58395_2019-04-29.txt -g group13_2.txt
```
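The `-g` file is described only as tab-delimited, with a header of the form `sample group1 group2...`. The sketch below builds such a file and checks that every row has the same number of columns as the header. The file name `group_example.txt`, the sample names, and the group labels are invented for illustration, and the check is not part of the pipeline.

```shell
# Hypothetical sample-group file: one sample per row, one grouping per column.
# Sample names and group labels are invented for illustration.
group_file="group_example.txt"
printf 'sample\tgroup1\tgroup2\n'  > "$group_file"
printf 'S1\ttest1\tbatchA\n'      >> "$group_file"
printf 'S2\ttest1\tbatchB\n'      >> "$group_file"
printf 'S3\ttest2\tbatchA\n'      >> "$group_file"
printf 'S4\ttest2\tbatchB\n'      >> "$group_file"

# Count rows whose number of tab-separated fields differs from the header's.
bad_rows=$(awk -F'\t' 'NR == 1 { n = NF } NF != n { c++ } END { print c + 0 }' "$group_file")
echo "rows with a wrong column count: $bad_rows"
```

A non-zero count usually means a stray space was used instead of a tab, or a group label is missing for some sample.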
### Output
1. rnaseq_gsea_curatedgenesets.csv: GSEA enrichment results based on curated gene sets.
2. rnaseq_gsea_go.csv: GSEA enrichment results based on GO terms.
> "","versus","pathway","pval","padj","ES","NES","nMoreExtreme","size","leadingEdge"
> "1","test1 vs test2","GO_REGULATION_OF_CELL_ACTIVATION",0.001007,0.1232,-0.4512,-1.458,0,458,"2302, 3127, 3123, 6352, 3119, 912, 972, 2625, 3122, 958, 8808, 53833, 284021, 10451, 154, 301, 55024, 3109, 923, 348, 8456, 84433, 51237, 4323, 7409, 919, 11005, 3606, 727897, 3929, 7056, 114548, 857, 10673, 695, 163747, 3120, 683, 124912, 29126, 114771, 2150, 3113, 3956, 22914, 4773, 83639, 634, 1029, 1236, 29108, 90865, 441478, 3600, 5896, 51083, 89780, 10148, 3965, 9173, 2323, 84106, 51744, 282618, 2852, 2056, 1269, 5592, 5724, 3623, 3567, 11314, 23529, 7474, 558, 10461, 283234, 11148, 5341, 3273, 9466, 22890, 22806, 917, 8832, 5588, 3952, 282616, 2207, 3111, 3574, 84807, 2268, 282617, 6869, 3659, 6441, 8772, 3575, 8546, 1948, 246778, 1178, 5585, 84959, 2064"
> "2","test1 vs test2","GO_IMMUNE_EFFECTOR_PROCESS",0.001007,0.1232,-0.4688,-1.515,0,455,"5473, 6374, 3127, 1380, 3123, 8519, 3119, 972, 2625, 10581, 958, 722, 284021, 3437, 3627, 8284, 56892, 10451, 3075, 1755, 3428, 51191, 4353, 644150, 28984, 4939, 55061, 114836, 717, 117157, 9245, 8809, 566, 7409, 919, 154064, 3929, 23705, 340061, 55601, 114548, 4599, 9844, 730, 2633, 3433, 10410, 3764, 2150, 7098, 91607, 60489, 3956, 3439, 3426, 29108, 90865, 10584, 725, 10417, 3078, 715, 3383, 84871, 3434, 10964, 51744, 64218, 1621, 282618, 23586, 3665, 4600, 78989, 710, 733, 3815, 3936, 1636"
Meaning of each column:
A table with GSEA results. Each row corresponds to a tested pathway. The columns are the following:
- versus – the two groups being compared;
- pathway – name of the pathway as in 'names(pathway)';
- pval – an enrichment p-value;
- padj – a BH-adjusted p-value;
- ES – enrichment score, same as in Broad GSEA implementation;
- NES – enrichment score normalized to mean enrichment of random samples of the same size;
- nMoreExtreme – a number of times a random gene set had a more extreme enrichment score value;
- size – size of the pathway after removing genes not present in 'names(stats)'.
- leadingEdge – vector with indexes of leading edge genes that drive the enrichment, see <http://software.broadinstitute.org/gsea/doc/GSEAUserGuideTEXT.htm#_Running_a_Leading>.
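To pull the significant pathways out of a result file, the `padj` column can be compared against the same cutoff passed to the script. The sketch below reuses the two example rows above, with `leadingEdge` shortened to its first three gene IDs. The file name `gsea_example.csv` and the helper `filter_padj` are hypothetical, and the strict `<` comparison is an assumption, since the text does not say how the script applies its cutoff.

```shell
# Two rows copied from the example output; leadingEdge is shortened to its
# first three gene IDs to keep the sketch readable.
cat > gsea_example.csv <<'EOF'
"","versus","pathway","pval","padj","ES","NES","nMoreExtreme","size","leadingEdge"
"1","test1 vs test2","GO_REGULATION_OF_CELL_ACTIVATION",0.001007,0.1232,-0.4512,-1.458,0,458,"2302, 3127, 3123"
"2","test1 vs test2","GO_IMMUNE_EFFECTOR_PROCESS",0.001007,0.1232,-0.4688,-1.515,0,455,"5473, 6374, 3127"
EOF

# Hypothetical helper: print pathway, padj and NES for rows with padj < cutoff.
# Assumes the first five fields contain no embedded commas, so a plain comma
# split is reliable up to padj (field 5) and NES (field 7).
filter_padj() {
  awk -F',' -v cutoff="$1" \
    'NR > 1 && $5 + 0 < cutoff + 0 { gsub(/"/, "", $3); print $3, $5, $7 }' "$2"
}

filter_padj 0.2 gsea_example.csv
```

A plain comma split is enough here only because the comma-containing `leadingEdge` field comes last. A real CSV parser is the safer choice if pathway names might contain commas.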
## RNAseq_6_enrichfunc.R
@@ -407,7 +491,11 @@ Rscript RNAseq_4_pwDEG.R -i example_geneexp_log2fpkm_floor0p01_c13r58395_2019-0
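Input: the table of differentially expressed genes output by RNAseq_4_pwDEG.R (PROJECT_CODE_deg_acrossgroups.csv).
#Note: requires an internet connection
*Note: requires an internet connection*
This step is slow; the analysis of each group takes about 5 minutes.
### Script parameters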
@@ -460,6 +548,11 @@ Options:
GO and KEGG pathway results.
The contents look like this:
> "","versus","ID","Description","GeneRatio","BgRatio","pvalue","p.adjust","qvalue","geneID","Count"
> "1","A vs B","hsa05168","Herpes simplex virus 1 infection","39/185","492/7847",9.625e-12,2.117e-09,2.057e-09,"256051/684/7568/10172/84765/80095/162963/84436/55762/55769/55786/90594/148268/342908/30832/348327/100129543/81931/390927/147837/57573/388566/91120/113835/163059/84671/65251/79973/126017/147949/374900/7594/3111/3135/728927/100129842/59348/26974/7772",39
### Example run
