First Commit.

tags/v0.1.0
YJC, 3 years ago
Current commit: 4beb5d1062
9 files changed, with 774 insertions and 0 deletions
  1. .gitignore (+11, -0)
  2. LICENSE.md (+277, -0)
  3. README.md (+42, -0)
  4. project.clj (+29, -0)
  5. resources/quartet_rnaseq_report.yaml (+9, -0)
  6. resources/tservice-plugin.yaml (+15, -0)
  7. src/tservice/plugins/quartet_rnaseq_report.clj (+322, -0)
  8. src/tservice/plugins/quartet_rnaseq_report/exp2qcdt.clj (+33, -0)
  9. src/tservice/plugins/quartet_rnaseq_report/merge_exp.clj (+36, -0)

.gitignore (+11, -0)

@@ -0,0 +1,11 @@
/target
/classes
/checkouts
pom.xml
pom.xml.asc
*.jar
*.class
/.lein-*
/.nrepl-port
.hgignore
.hg/

LICENSE.md (+277, -0)

@@ -0,0 +1,277 @@
Eclipse Public License - v 2.0

THE ACCOMPANYING PROGRAM IS PROVIDED UNDER THE TERMS OF THIS ECLIPSE
PUBLIC LICENSE ("AGREEMENT"). ANY USE, REPRODUCTION OR DISTRIBUTION
OF THE PROGRAM CONSTITUTES RECIPIENT'S ACCEPTANCE OF THIS AGREEMENT.

1. DEFINITIONS

"Contribution" means:

a) in the case of the initial Contributor, the initial content
Distributed under this Agreement, and

b) in the case of each subsequent Contributor:
i) changes to the Program, and
ii) additions to the Program;
where such changes and/or additions to the Program originate from
and are Distributed by that particular Contributor. A Contribution
"originates" from a Contributor if it was added to the Program by
such Contributor itself or anyone acting on such Contributor's behalf.
Contributions do not include changes or additions to the Program that
are not Modified Works.

"Contributor" means any person or entity that Distributes the Program.

"Licensed Patents" mean patent claims licensable by a Contributor which
are necessarily infringed by the use or sale of its Contribution alone
or when combined with the Program.

"Program" means the Contributions Distributed in accordance with this
Agreement.

"Recipient" means anyone who receives the Program under this Agreement
or any Secondary License (as applicable), including Contributors.

"Derivative Works" shall mean any work, whether in Source Code or other
form, that is based on (or derived from) the Program and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship.

"Modified Works" shall mean any work in Source Code or other form that
results from an addition to, deletion from, or modification of the
contents of the Program, including, for purposes of clarity any new file
in Source Code form that contains any contents of the Program. Modified
Works shall not include works that contain only declarations,
interfaces, types, classes, structures, or files of the Program solely
in each case in order to link to, bind by name, or subclass the Program
or Modified Works thereof.

"Distribute" means the acts of a) distributing or b) making available
in any manner that enables the transfer of a copy.

"Source Code" means the form of a Program preferred for making
modifications, including but not limited to software source code,
documentation source, and configuration files.

"Secondary License" means either the GNU General Public License,
Version 2.0, or any later versions of that license, including any
exceptions or additional permissions as identified by the initial
Contributor.

2. GRANT OF RIGHTS

a) Subject to the terms of this Agreement, each Contributor hereby
grants Recipient a non-exclusive, worldwide, royalty-free copyright
license to reproduce, prepare Derivative Works of, publicly display,
publicly perform, Distribute and sublicense the Contribution of such
Contributor, if any, and such Derivative Works.

b) Subject to the terms of this Agreement, each Contributor hereby
grants Recipient a non-exclusive, worldwide, royalty-free patent
license under Licensed Patents to make, use, sell, offer to sell,
import and otherwise transfer the Contribution of such Contributor,
if any, in Source Code or other form. This patent license shall
apply to the combination of the Contribution and the Program if, at
the time the Contribution is added by the Contributor, such addition
of the Contribution causes such combination to be covered by the
Licensed Patents. The patent license shall not apply to any other
combinations which include the Contribution. No hardware per se is
licensed hereunder.

c) Recipient understands that although each Contributor grants the
licenses to its Contributions set forth herein, no assurances are
provided by any Contributor that the Program does not infringe the
patent or other intellectual property rights of any other entity.
Each Contributor disclaims any liability to Recipient for claims
brought by any other entity based on infringement of intellectual
property rights or otherwise. As a condition to exercising the
rights and licenses granted hereunder, each Recipient hereby
assumes sole responsibility to secure any other intellectual
property rights needed, if any. For example, if a third party
patent license is required to allow Recipient to Distribute the
Program, it is Recipient's responsibility to acquire that license
before distributing the Program.

d) Each Contributor represents that to its knowledge it has
sufficient copyright rights in its Contribution, if any, to grant
the copyright license set forth in this Agreement.

e) Notwithstanding the terms of any Secondary License, no
Contributor makes additional grants to any Recipient (other than
those set forth in this Agreement) as a result of such Recipient's
receipt of the Program under the terms of a Secondary License
(if permitted under the terms of Section 3).

3. REQUIREMENTS

3.1 If a Contributor Distributes the Program in any form, then:

a) the Program must also be made available as Source Code, in
accordance with section 3.2, and the Contributor must accompany
the Program with a statement that the Source Code for the Program
is available under this Agreement, and informs Recipients how to
obtain it in a reasonable manner on or through a medium customarily
used for software exchange; and

b) the Contributor may Distribute the Program under a license
different than this Agreement, provided that such license:
i) effectively disclaims on behalf of all other Contributors all
warranties and conditions, express and implied, including
warranties or conditions of title and non-infringement, and
implied warranties or conditions of merchantability and fitness
for a particular purpose;

ii) effectively excludes on behalf of all other Contributors all
liability for damages, including direct, indirect, special,
incidental and consequential damages, such as lost profits;

iii) does not attempt to limit or alter the recipients' rights
in the Source Code under section 3.2; and

iv) requires any subsequent distribution of the Program by any
party to be under a license that satisfies the requirements
of this section 3.

3.2 When the Program is Distributed as Source Code:

a) it must be made available under this Agreement, or if the
Program (i) is combined with other material in a separate file or
files made available under a Secondary License, and (ii) the initial
Contributor attached to the Source Code the notice described in
Exhibit A of this Agreement, then the Program may be made available
under the terms of such Secondary Licenses, and

b) a copy of this Agreement must be included with each copy of
the Program.

3.3 Contributors may not remove or alter any copyright, patent,
trademark, attribution notices, disclaimers of warranty, or limitations
of liability ("notices") contained within the Program from any copy of
the Program which they Distribute, provided that Contributors may add
their own appropriate notices.

4. COMMERCIAL DISTRIBUTION

Commercial distributors of software may accept certain responsibilities
with respect to end users, business partners and the like. While this
license is intended to facilitate the commercial use of the Program,
the Contributor who includes the Program in a commercial product
offering should do so in a manner which does not create potential
liability for other Contributors. Therefore, if a Contributor includes
the Program in a commercial product offering, such Contributor
("Commercial Contributor") hereby agrees to defend and indemnify every
other Contributor ("Indemnified Contributor") against any losses,
damages and costs (collectively "Losses") arising from claims, lawsuits
and other legal actions brought by a third party against the Indemnified
Contributor to the extent caused by the acts or omissions of such
Commercial Contributor in connection with its distribution of the Program
in a commercial product offering. The obligations in this section do not
apply to any claims or Losses relating to any actual or alleged
intellectual property infringement. In order to qualify, an Indemnified
Contributor must: a) promptly notify the Commercial Contributor in
writing of such claim, and b) allow the Commercial Contributor to control,
and cooperate with the Commercial Contributor in, the defense and any
related settlement negotiations. The Indemnified Contributor may
participate in any such claim at its own expense.

For example, a Contributor might include the Program in a commercial
product offering, Product X. That Contributor is then a Commercial
Contributor. If that Commercial Contributor then makes performance
claims, or offers warranties related to Product X, those performance
claims and warranties are such Commercial Contributor's responsibility
alone. Under this section, the Commercial Contributor would have to
defend claims against the other Contributors related to those performance
claims and warranties, and if a court requires any other Contributor to
pay any damages as a result, the Commercial Contributor must pay
those damages.

5. NO WARRANTY

EXCEPT AS EXPRESSLY SET FORTH IN THIS AGREEMENT, AND TO THE EXTENT
PERMITTED BY APPLICABLE LAW, THE PROGRAM IS PROVIDED ON AN "AS IS"
BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, EITHER EXPRESS OR
IMPLIED INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OR CONDITIONS OF
TITLE, NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR
PURPOSE. Each Recipient is solely responsible for determining the
appropriateness of using and distributing the Program and assumes all
risks associated with its exercise of rights under this Agreement,
including but not limited to the risks and costs of program errors,
compliance with applicable laws, damage to or loss of data, programs
or equipment, and unavailability or interruption of operations.

6. DISCLAIMER OF LIABILITY

EXCEPT AS EXPRESSLY SET FORTH IN THIS AGREEMENT, AND TO THE EXTENT
PERMITTED BY APPLICABLE LAW, NEITHER RECIPIENT NOR ANY CONTRIBUTORS
SHALL HAVE ANY LIABILITY FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING WITHOUT LIMITATION LOST
PROFITS), HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OR DISTRIBUTION OF THE PROGRAM OR THE
EXERCISE OF ANY RIGHTS GRANTED HEREUNDER, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.

7. GENERAL

If any provision of this Agreement is invalid or unenforceable under
applicable law, it shall not affect the validity or enforceability of
the remainder of the terms of this Agreement, and without further
action by the parties hereto, such provision shall be reformed to the
minimum extent necessary to make such provision valid and enforceable.

If Recipient institutes patent litigation against any entity
(including a cross-claim or counterclaim in a lawsuit) alleging that the
Program itself (excluding combinations of the Program with other software
or hardware) infringes such Recipient's patent(s), then such Recipient's
rights granted under Section 2(b) shall terminate as of the date such
litigation is filed.

All Recipient's rights under this Agreement shall terminate if it
fails to comply with any of the material terms or conditions of this
Agreement and does not cure such failure in a reasonable period of
time after becoming aware of such noncompliance. If all Recipient's
rights under this Agreement terminate, Recipient agrees to cease use
and distribution of the Program as soon as reasonably practicable.
However, Recipient's obligations under this Agreement and any licenses
granted by Recipient relating to the Program shall continue and survive.

Everyone is permitted to copy and distribute copies of this Agreement,
but in order to avoid inconsistency the Agreement is copyrighted and
may only be modified in the following manner. The Agreement Steward
reserves the right to publish new versions (including revisions) of
this Agreement from time to time. No one other than the Agreement
Steward has the right to modify this Agreement. The Eclipse Foundation
is the initial Agreement Steward. The Eclipse Foundation may assign the
responsibility to serve as the Agreement Steward to a suitable separate
entity. Each new version of the Agreement will be given a distinguishing
version number. The Program (including Contributions) may always be
Distributed subject to the version of the Agreement under which it was
received. In addition, after a new version of the Agreement is published,
Contributor may elect to Distribute the Program (including its
Contributions) under the new version.

Except as expressly stated in Sections 2(a) and 2(b) above, Recipient
receives no rights or licenses to the intellectual property of any
Contributor under this Agreement, whether expressly, by implication,
estoppel or otherwise. All rights in the Program not expressly granted
under this Agreement are reserved. Nothing in this Agreement is intended
to be enforceable by any entity that is not a Contributor or Recipient.
No third-party beneficiary rights are created under this Agreement.

Exhibit A - Form of Secondary Licenses Notice

"This Source Code may also be made available under the following
Secondary Licenses when the conditions for such availability set forth
in the Eclipse Public License, v. 2.0 are satisfied: {name license(s),
version(s), and exceptions or additional permissions here}."

Simply including a copy of this Agreement, including this Exhibit A
is not sufficient to license the Source Code under Secondary Licenses.

If it is not possible or desirable to put the notice in a particular
file, then You may include the notice in a location (such as a LICENSE
file in a relevant directory) where a recipient would be likely to
look for such a notice.

You may add additional accurate notices of copyright ownership.

README.md (+42, -0)

@@ -0,0 +1,42 @@
# quartet-rnaseq-report

FIXME: description

## Installation

Download from http://example.com/FIXME.

## Usage

FIXME: explanation

## Options

FIXME: listing of options this app accepts.

## Examples

...

### Bugs

...

### Any Other Sections
### That You Think
### Might be Useful

## License

Copyright © 2021 FIXME

This program and the accompanying materials are made available under the
terms of the Eclipse Public License 2.0 which is available at
http://www.eclipse.org/legal/epl-2.0.

This Source Code may also be made available under the following Secondary
Licenses when the conditions for such availability set forth in the Eclipse
Public License, v. 2.0 are satisfied: GNU General Public License as published by
the Free Software Foundation, either version 2 of the License, or (at your
option) any later version, with the GNU Classpath Exception which is available
at https://www.gnu.org/software/classpath/license.html.

project.clj (+29, -0)

@@ -0,0 +1,29 @@
(defproject tservice/quartet-rnaseq-report "v0.1.0"
  :description "FIXME: write description"
  :url "http://example.com/FIXME"
  :license {:name "Eclipse Public License 2.0"
            :url "http://www.eclipse.org/legal/epl-2.0"}
  :min-lein-version "2.5.0"
  :deployable false

  :dependencies
  [[org.clojure/data.csv "1.0.0"]
   [me.raynes/fs "1.4.6"]
   [org.clojure/tools.logging "1.1.0"]
   [org.clojure/core.async "0.4.500"
    :exclusions [org.clojure/tools.reader]]]

  :profiles
  {:provided
   {:dependencies
    [[org.clojure/clojure "1.10.1"]
     [tservice "0.3.1"]]}

   :uberjar
   {:auto-clean true
    :aot :all
    :omit-source true
    :javac-options ["-target" "1.8", "-source" "1.8"]
    :target-path "target/%s"
    :resource-paths ["resources"]
    :uberjar-name "quartet-rnaseq-report.tservice-plugin.jar"}})

resources/quartet_rnaseq_report.yaml (+9, -0)

@@ -0,0 +1,9 @@
run_modules:
- rnaseq_data_generation_information
- rnaseq_performance_assessment
- rnaseq_raw_qc
- rnaseq_post_alignment_qc
- rnaseq_quantification_qc
- rnaseq_supplementary

skip_generalstats: true

resources/tservice-plugin.yaml (+15, -0)

@@ -0,0 +1,15 @@
info:
  name: "FIXME: write name"
  version: v0.1.0
  description: "FIXME: write description"
plugin:
  name: quartet-rnaseq-report
  display-name: "FIXME: write display name"
  lazy-load: false
init:
  - step: load-namespace
    namespace: tservice.plugins.quartet-rnaseq-report
  - step: register-plugin
    entrypoint: tservice.plugins.quartet-rnaseq-report/metadata
  - step: init-event
    entrypoint: tservice.plugins.quartet-rnaseq-report/events-init

src/tservice/plugins/quartet_rnaseq_report.clj (+322, -0)

@@ -0,0 +1,322 @@
(ns tservice.plugins.quartet-rnaseq-report
(:require [clojure.core.async :as async]
[clojure.data.json :as json]
[clojure.spec.alpha :as s]
[clojure.tools.logging :as log]
[tservice.lib.commons :as comm :refer [get-path-variable]]
[tservice.plugins.quartet-rnaseq-report.exp2qcdt :as exp2qcdt]
[tservice.plugins.quartet-rnaseq-report.merge-exp :as me]
[spec-tools.core :as st]
[spec-tools.json-schema :as json-schema]
[tservice.config :refer [get-workdir env]]
[tservice.events :as events]
[tservice.lib.filter-files :as ff]
[tservice.lib.fs :as fs-lib]
[tservice.util :as u]
[tservice.util.files :as files]
[clojure.java.io :as io]
[tservice.db.handler :as db-handler]
[clojure.java.shell :as shell :refer [sh]]
[tservice.vendor.multiqc :as mq]))

;;; ------------------------------------------------ Event Specs ------------------------------------------------
(s/def ::name
(st/spec
{:spec string?
:type :string
:description "The name of the report"
:swagger/default ""
:reason "Not a valid report name"}))

(s/def ::description
(st/spec
{:spec string?
:type :string
:description "Description of the report"
:swagger/default ""
:reason "Not a valid description."}))

(s/def ::project_id
(st/spec
{:spec #(some? (re-matches #"[a-f0-9]{8}(-[a-f0-9]{4}){4}[a-f0-9]{8}" %))
:type :string
:description "project-id"
:swagger/default "40644dec-1abd-489f-a7a8-1011a86f40b0"
:reason "Not a valid project-id."}))

(s/def ::filepath
(st/spec
{:spec (s/and string? #(re-matches #"^[a-zA-Z0-9]+:\/\/(\/|\.\/)?[a-zA-Z0-9_\/]+.*" %))
:type :string
:description "File path for the converter."
:swagger/default nil
:reason "The filepath must be a string."}))

(s/def ::group
(st/spec
{:spec string?
:type :string
:description "A group name which is matched with library."
:swagger/default []
:reason "The group must be a string."}))

(s/def ::library
(st/spec
{:spec string?
:type :string
:description "A library name."
:swagger/default []
:reason "The library must be a string."}))

(s/def ::sample
(st/spec
{:spec string?
:type :string
:description "A sample name."
:swagger/default []
:reason "The sample name must be a string."}))

(s/def ::metadata-item
(s/keys :req-un [::library
::group
::sample]))

(s/def ::metadata
(s/coll-of ::metadata-item))

(s/def ::lab
(st/spec
{:spec string?
:type :string
:description "Lab name."
:swagger/default []
:reason "The lab_name must be a string."}))

(s/def ::sequencing_platform
(st/spec
{:spec string?
:type :string
:description "Sequencing Platform."
:swagger/default []
:reason "The sequencing_platform must be a string."}))

(s/def ::sequencing_method
(st/spec
{:spec string?
:type :string
:description "Sequencing Method"
:swagger/default []
:reason "The sequencing_method must be a string."}))

(s/def ::library_protocol
(st/spec
{:spec string?
:type :string
:description "Library protocol."
:swagger/default []
:reason "The library_protocol must be a string."}))

(s/def ::library_kit
(st/spec
{:spec string?
:type :string
:description "Library kit."
:swagger/default []
:reason "The library_kit must be a string."}))

(s/def ::read_length
(st/spec
{:spec string?
:type :string
:description "Read length"
:swagger/default []
:reason "The read_length must be a string."}))

(s/def ::date
(st/spec
{:spec string?
:type :string
:description "Date"
:swagger/default []
:reason "The date must be a string."}))

(s/def ::parameters
(s/keys :req-un [::lab
::sequencing_platform
::sequencing_method
::library_protocol
::library_kit
::read_length
::date]))

(def quartet-rna-report-params-body
"A spec for the body parameters."
(s/keys :req-un [::name ::description ::filepath ::metadata ::parameters]
:opt-un [::project_id]))

;;; ------------------------------------------------ Event Metadata ------------------------------------------------
(def metadata
{:route ["/report/quartet-rnaseq-report"
{:tags ["Report"]
:post {:summary "Parse the results of the quartet-rnaseq-qc app and generate the report."
:parameters {:body quartet-rna-report-params-body}
:responses {201 {:body {:results string? :log string? :report string? :id string?}}}
:handler (fn [{{{:keys [name description project_id filepath metadata parameters]} :body} :parameters}]
(let [workdir (get-workdir)
from-path (u/replace-path filepath workdir)
uuid (u/uuid)
relative-dir (fs-lib/join-paths "download" uuid)
to-dir (fs-lib/join-paths workdir relative-dir)
log-path (fs-lib/join-paths to-dir "log")]
(fs-lib/create-directories! to-dir)
(spit log-path (json/write-str {:status "Running" :msg ""}))
(events/publish-event! :quartet_rnaseq_report-convert
{:datadir from-path
:parameters parameters
:metadata metadata
:dest-dir to-dir})
(db-handler/create-report! {:id uuid
:report_name name
:project_id project_id
:app_name "quartet_rnaseq_report"
:description description
:report_path (fs-lib/join-paths relative-dir "multiqc_report.html")
:started_time (u/time->int (u/now))
:finished_time nil
:archived_time nil
:report_type "multireport"
:status "Started"
:log (fs-lib/join-paths relative-dir "log")})
{:status 201
:body {:results (fs-lib/join-paths relative-dir)
:report (fs-lib/join-paths relative-dir "multiqc_report.html")
:log (fs-lib/join-paths relative-dir "log")
:id uuid}}))}
:get {:summary "A JSON schema for quartet-rnaseq-report."
:parameters {}
:responses {200 {:body map?}}
:handler (fn [_]
{:status 200
:body (json-schema/transform quartet-rna-report-params-body)})}}]
:manifest {:description "Parse the results of the quartet-rnaseq-qc app and generate the report."
:category "Report"
:home "https://github.com/clinico-omics/quartet-rnaseq-report"
:name "Quartet RNA-Seq Report"
:source "PGx"
:short_name "quartet-rnaseq-report"
:icons [{:src "", :type "image/png", :sizes "192x192"}
{:src "", :type "image/png", :sizes "192x192"}]
:author "Jun Shang"
:hidden false
:id "f65d87fd3dd2213d91bb15900ba57c11"
:app_name "shangjun/quartet-rnaseq-report"}})

(def ^:const quartet-rnaseq-report-topics
"The `Set` of event topics which are subscribed to for use in quartet-rnaseq-report tracking."
#{:quartet_rnaseq_report-convert})

(def ^:private quartet-rnaseq-report-channel
"Channel for receiving the quartet-rnaseq-report events we subscribe to."
(async/chan))

;;; ------------------------------------------------ Event Processing ------------------------------------------------

(defn- decompression-tar
  "Extract the tar archive at `filepath` into its parent directory."
  [filepath]
  (shell/with-sh-env {:PATH   (get-path-variable)
                      :LC_ALL "en_US.utf-8"
                      :LANG   "en_US.utf-8"}
    ;; Pass each argument to tar directly (no `bash -c`), so paths containing
    ;; spaces or shell metacharacters are handled safely.
    (let [result (sh "tar" "-xvf" (str filepath) "-C" (str (fs-lib/parent-path filepath)))
          status (if (zero? (:exit result)) "Success" "Error")
          msg    (str (:out result) "\n" (:err result))]
      {:status status :msg msg})))

(defn- filter-mkdir-copy
[fmc-datadir fmc-patterns fmc-destdir fmc-newdir]
(let [files-keep (ff/batch-filter-files fmc-datadir fmc-patterns)
files-keep-dir (fs-lib/join-paths fmc-destdir fmc-newdir)]
(fs-lib/create-directories! files-keep-dir)
(ff/copy-files! files-keep files-keep-dir {:replace-existing true})))

(defn- extract-config-file
[filename dest-path]
(when (io/resource filename)
(files/with-open-path-to-resource [filepath filename]
(files/copy-files! filepath dest-path))))

(defn- quartet-rnaseq-report!
"Chaining Pipeline: filter-files -> copy-files -> merge_exp_file -> exp2qcdt -> multiqc."
[datadir parameters metadata dest-dir]
(log/info "Generate quartet rnaseq report: " datadir parameters metadata dest-dir)
(let [metadata-file (fs-lib/join-paths dest-dir
"results"
"metadata.csv")
parameters-file (fs-lib/join-paths dest-dir
"results"
"general-info.json")
ballgown-dir (fs-lib/join-paths dest-dir "ballgown")
count-dir (fs-lib/join-paths dest-dir "count")
exp-fpkm-filepath (fs-lib/join-paths dest-dir "fpkm.csv")
exp-count-filepath (fs-lib/join-paths dest-dir "count.csv")
result-dir (fs-lib/join-paths dest-dir "results")
log-path (fs-lib/join-paths dest-dir "log")
config-path (fs-lib/join-paths dest-dir "results", "quartet_rnaseq_report.yaml")]
(try
(fs-lib/create-directories! result-dir)
(log/info "Merge these files")
(log/info "Merge gene expression files from the ballgown directory into an expression table: " ballgown-dir exp-fpkm-filepath)
(log/info "Merge these files")
(log/info "Merge gene expression files from the count directory into an expression table: " count-dir exp-count-filepath)
(filter-mkdir-copy datadir [".*call-ballgown/.*.txt"] dest-dir "ballgown")
(filter-mkdir-copy datadir [".*call-count/.*gene_count_matrix.csv"] dest-dir "count")
(filter-mkdir-copy datadir [".*call-qualimapBAMqc/.*tar.gz"] dest-dir "results/post_alignment_qc/bam_qc")
(filter-mkdir-copy datadir [".*call-qualimapRNAseq/.*tar.gz"] dest-dir "results/post_alignment_qc/rnaseq_qc")
(filter-mkdir-copy datadir [".*call-fastqc/.*.zip"] dest-dir "results/rawqc/fastqc")
(filter-mkdir-copy datadir [".*call-fastqscreen/.*.txt"] dest-dir "results/rawqc/fastq_screen")
(me/merge-exp-files! (ff/list-files ballgown-dir {:mode "file"}) exp-fpkm-filepath)
(me/merge-exp-files! (ff/list-files count-dir {:mode "file"}) exp-count-filepath)
(spit parameters-file (json/write-str parameters))
(comm/write-csv! metadata-file metadata)
;;(decompression-tar files-qualimap-bam)
;;(decompression-tar files-qualimap-RNA)
(doseq [files-qualimap-bam-tar (ff/batch-filter-files (fs-lib/join-paths dest-dir "results/post_alignment_qc/bam_qc") [".*tar.gz"])]
(decompression-tar files-qualimap-bam-tar))
(doseq [files-qualimap-RNA-tar (ff/batch-filter-files (fs-lib/join-paths dest-dir "results/post_alignment_qc/rnaseq_qc") [".*tar.gz"])]
(decompression-tar files-qualimap-RNA-tar))
(extract-config-file "quartet_rnaseq_report.yaml" config-path)
(let [exp2qcdt-result (exp2qcdt/call-exp2qcdt! exp-fpkm-filepath exp-count-filepath metadata-file result-dir)
multiqc-result (if (= (:status exp2qcdt-result) "Success")
(mq/multiqc result-dir dest-dir {:config config-path :template "default" :title "Quartet RNA report"})
;;(mq/multiqc result-dir)
exp2qcdt-result)
result {:status (:status multiqc-result)
:msg (:msg multiqc-result)}
log (json/write-str result)]
(log/info "Status: " result)
(spit log-path log))
(catch Exception e
(let [log (json/write-str {:status "Error" :msg (.toString e)})]
(log/info "Status: " log)
(spit log-path log))))))

(defn- process-quartet-rnaseq-report-event!
"Handle processing for a single event notification received on the quartet-rnaseq-report-channel"
[quartet-rnaseq-report-event]
;; try/catch here to prevent individual topic processing exceptions from bubbling up. better to handle them here.
(try
(when-let [{topic :topic object :item} quartet-rnaseq-report-event]
;; TODO: only if the definition changed??
(case (events/topic->model topic)
"quartet_rnaseq_report" (quartet-rnaseq-report! (:datadir object) (:parameters object) (:metadata object) (:dest-dir object))))
(catch Throwable e
(log/warn (format "Failed to process quartet-rnaseq-report event. %s" (:topic quartet-rnaseq-report-event)) e))))

;;; --------------------------------------------------- Lifecycle ----------------------------------------------------

(defn events-init
"Automatically called during startup; start event listener for quartet-rnaseq-report events."
[]
(events/start-event-listener! quartet-rnaseq-report-topics quartet-rnaseq-report-channel process-quartet-rnaseq-report-event!))
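
The `::project_id` and `::filepath` specs in this file both come down to a regular-expression check. The sketch below shows that check in isolation, using only `clojure.spec.alpha` and leaving out the `spec-tools` metadata wrapper. The spec names `::project-id` and `::file-uri` and the sample values are hypothetical, and the `string?` guard on the project id is an addition for the sketch, not something the plugin does.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical stand-in for ::project_id: 8-4-4-4-12 lowercase hex digits.
;; The string? guard keeps non-string input from reaching re-matches.
(s/def ::project-id
  (s/and string?
         #(some? (re-matches #"[a-f0-9]{8}(-[a-f0-9]{4}){4}[a-f0-9]{8}" %))))

;; Hypothetical stand-in for ::filepath: a scheme, "://", then a path.
(s/def ::file-uri
  (s/and string?
         #(some? (re-matches #"^[a-zA-Z0-9]+://(/|\./)?[a-zA-Z0-9_/]+.*" %))))

(s/valid? ::project-id "40644dec-1abd-489f-a7a8-1011a86f40b0") ; => true
(s/valid? ::project-id "not-a-uuid")                           ; => false
(s/valid? ::file-uri "file:///data/quartet")                   ; => true
(s/valid? ::file-uri "/data/quartet")                          ; => false
```

A bare path without a scheme is rejected, so callers are expected to send values such as `file://...` rather than plain filesystem paths.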

src/tservice/plugins/quartet_rnaseq_report/exp2qcdt.clj (+33, -0)

@@ -0,0 +1,33 @@
(ns tservice.plugins.quartet-rnaseq-report.exp2qcdt
"A wrapper for exp2qcdt tool."
(:require [tservice.lib.commons :refer [get-path-variable]]
[tservice.config :refer [get-plugin-dir]]
[tservice.lib.fs :as fs]
[clojure.java.shell :as shell :refer [sh]]))

(defn add-env-to-path
"Add the path of the plugin's environment into the PATH and return the PATH"
[]
(let [env-bin-path (fs/join-paths (get-plugin-dir) "plugins" "repository" "quartet-rnaseq-report" "bin")
path (get-path-variable)]
(str env-bin-path ":" path)))

(defn call-exp2qcdt!
"Call the exp2qcdt bash script.
exp-file: FPKM file; each row holds the expression values of a gene and each column is a library.
cnt-file: Count file; each row holds the expression values of a gene and each column is a library.
meta-file: Must contain three columns (library, group, and sample); the library names must match the column names in `exp-file`.
result-dir: A directory for the result files.
"
[exp-file cnt-file meta-file result-dir]
(shell/with-sh-env {:PATH (add-env-to-path)
:LC_ALL "en_US.utf-8"
:LANG "en_US.utf-8"}
(let [command ["bash" "-c"
(format "exp2qcdt -e %s -c %s -m %s -o %s" exp-file cnt-file meta-file result-dir)]
result (apply sh command)
status (if (= (:exit result) 0) "Success" "Error")
msg (str (:out result) "\n" (:err result))]
{:status status
:msg msg})))
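
Both `call-exp2qcdt!` and `decompression-tar` map a `clojure.java.shell` result onto a `{:status :msg}` map. The sketch below pulls that mapping into a pure function so it can be checked without running a command. The names `sh-result->status` and `run-tool` are hypothetical and not part of the plugin; `run-tool` passes each argument as a separate string rather than formatting a `bash -c` command line.

```clojure
(require '[clojure.java.shell :refer [sh]])

(defn sh-result->status
  "Turn a clojure.java.shell result map into a {:status :msg} map."
  [{:keys [exit out err]}]
  {:status (if (zero? exit) "Success" "Error")
   :msg    (str out "\n" err)})

(defn run-tool
  "Hypothetical helper: run `cmd` with every argument passed as its own string."
  [cmd & args]
  (sh-result->status (apply sh cmd (map str args))))

(sh-result->status {:exit 0 :out "done" :err ""})
;; => {:status "Success", :msg "done\n"}
(sh-result->status {:exit 2 :out "" :err "missing -m"})
;; => {:status "Error", :msg "\nmissing -m"}
```

Under these assumptions, a call such as `(run-tool "exp2qcdt" "-e" exp-file "-c" cnt-file "-m" meta-file "-o" result-dir)` would avoid quoting problems with paths that contain spaces.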


src/tservice/plugins/quartet_rnaseq_report/merge_exp.clj (+36, -0)

@@ -0,0 +1,36 @@
(ns tservice.plugins.quartet-rnaseq-report.merge-exp
"Merge expression files."
(:require [tservice.lib.commons :refer [read-csv write-csv! vec-remove write-csv-by-cols!]]))

(set! *warn-on-reflection* true)

(defn sort-exp-data
[coll]
(sort-by :GENE_ID coll))

(defn read-csvs
[files]
(map #(sort-exp-data (read-csv %)) files))

(defn reorder
[data]
(let [cols (vec (sort (keys (first data))))]
(cons :GENE_ID (vec-remove (.indexOf cols :GENE_ID) cols))))

(defn merge-exp
"Merge the rows of all tables position by position. Example input: [[{:GENE_ID 'XXX0' :YYY0 1.2} {:GENE_ID 'XXX1' :YYY1 1.3}]
[{:GENE_ID 'XXX0' :YYY2 1.2} {:GENE_ID 'XXX1' :YYY3 1.3}]]"
[all-exp-data]
(apply map merge all-exp-data))

(defn write-csv-by-ordered-cols!
[path row-data]
(let [cols (reorder row-data)]
(write-csv-by-cols! path row-data cols)))

(defn merge-exp-files!
"Assumes all files contain the same set of GENE_ID values, in any order."
[files path]
(->> (read-csvs files)
(merge-exp)
(write-csv-by-ordered-cols! path)))
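
The merge in this file sorts every table by `:GENE_ID` and then merges rows position by position. The sketch below reproduces that idea on in-memory data with only `clojure.core`. The tables `lib1` and `lib2` and the helpers `merge-tables` and `ordered-cols` are hypothetical stand-ins; the real code reads CSV files through `tservice.lib.commons`.

```clojure
;; Two hypothetical per-library tables, already parsed into row maps.
(def lib1 [{:GENE_ID "G2" :lib1 2.0} {:GENE_ID "G1" :lib1 1.0}])
(def lib2 [{:GENE_ID "G1" :lib2 3.0} {:GENE_ID "G2" :lib2 4.0}])

(defn merge-tables
  "Sort each table by :GENE_ID, then merge the rows position by position."
  [tables]
  (apply map merge (map #(sort-by :GENE_ID %) tables)))

(defn ordered-cols
  "Put :GENE_ID first, followed by the remaining columns in sorted order."
  [rows]
  (cons :GENE_ID (remove #{:GENE_ID} (sort (keys (first rows))))))

(def merged (merge-tables [lib1 lib2]))
;; merged => ({:GENE_ID "G1", :lib1 1.0, :lib2 3.0} {:GENE_ID "G2", :lib1 2.0, :lib2 4.0})
;; (ordered-cols merged) => (:GENE_ID :lib1 :lib2)
```

Because `map` stops at the shortest input and pairs rows purely by position, tables with different gene sets would be truncated or misaligned without any error. That is why the identical-gene-list assumption matters.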
