Alignment QC
rnaseq-alignment-qc-flow
This flow evaluates BAM-level RNA-seq evidence with SAMtools, RSeQC, Qualimap, and MultiQC. It answers one question: is the alignment branch technically trustworthy enough for BAM-derived results to be interpreted?
Typical command
taf-rnaseq-alignment-qc-flow \
--bams align-out/04_reports/bam_files.tsv \
--gtf ref-out/03_results/annotation/genes.gtf \
--outdir alignment-qc-out \
--threads 8 \
--sequencing-protocol non-strand-specific
Input requirements
Use the BAM table from alignment-flow and the matching GTF from index-flow. The BAM table must contain the columns sample_id and bam. A bai column is recommended; without it, an existing BAM.bai index must already be present next to each input BAM. If you already have a curated RSeQC BED, pass it with --annotation-bed; otherwise the flow derives one from the GTF.
bam_files.tsv
sample_id bam bai
WT_01 bam/WT_01.sorted.bam bam/WT_01.sorted.bam.bai
SNF2_01 bam/SNF2_01.sorted.bam bam/SNF2_01.sorted.bam.bai
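The table requirements above can be checked before launching the flow. The following is a minimal sketch, not part of the flow itself: check_bam_table is a hypothetical helper, and it assumes the table is tab-separated, that relative paths resolve against the current working directory, and that "BAM.bai" means the BAM path with .bai appended.

```python
import csv
from pathlib import Path


def check_bam_table(table_path):
    """Return a list of problems found in a tab-separated BAM sample table.

    Hypothetical pre-flight helper; the flow performs its own validation.
    """
    problems = []
    with open(table_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        columns = reader.fieldnames or []
        # sample_id and bam are the two required columns.
        missing = [name for name in ("sample_id", "bam") if name not in columns]
        if missing:
            return [f"missing required column(s): {', '.join(missing)}"]
        for line_no, row in enumerate(reader, start=2):
            sample = (row.get("sample_id") or "").strip()
            bam = (row.get("bam") or "").strip()
            bai = (row.get("bai") or "").strip()
            if not sample or not bam:
                problems.append(f"line {line_no}: empty sample_id or bam")
                continue
            if not Path(bam).is_file():
                problems.append(f"{sample}: BAM not found: {bam}")
            # Assumption: with no bai value, the index sits at <bam>.bai.
            index = bai or bam + ".bai"
            if not Path(index).is_file():
                problems.append(f"{sample}: index not found: {index}")
    return problems
```

An empty list means every row names an existing BAM and a usable index; any other result lists what to fix before running the flow.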
Annotation inputs
--gtf ref-out/03_results/annotation/genes.gtf
# optional:
--annotation-bed curated_rseqc_genes.bed
Complete parameter reference
| Parameter | Required | Default | Meaning and when to change it |
|---|---|---|---|
| --bams | yes | none | BAM sample table with sample_id and bam; bai is recommended. |
| --gtf | yes | none | Gene annotation in GTF format. Qualimap uses it directly; the RSeQC BED can be derived from it. |
| --outdir, -o | yes | none | Dedicated output directory. Existing directories are refused unless --force is used. |
| --annotation-bed | no | derived | Optional BED gene model for RSeQC. Provide a curated BED when automatic GTF conversion is not suitable. |
| --threads, -t | no | 1 | Recorded thread count. In r1, per-sample QC commands still run serially. |
| --java-mem-size | no | 4G | Qualimap Java memory setting. Increase for large BAM files when Qualimap reports memory errors. |
| --mapq | no | 30 | MAPQ cutoff for RSeQC bam_stat.py and infer_experiment.py. |
| --infer-sample-size | no | 200000 | Number of reads sampled by RSeQC strandedness inference. Increase for noisy or very large datasets. |
| --sequencing-protocol | no | non-strand-specific | Qualimap protocol: non-strand-specific, strand-specific-forward, or strand-specific-reverse. Match the library protocol. |
| --paired | no | off | Enable Qualimap paired-end mode when the BAMs came from paired-end reads. |
| --force | no | off | Replace standard outputs inside an existing output directory. |
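When the library protocol is not documented, the fractions reported by RSeQC infer_experiment.py can suggest a value for --sequencing-protocol. The sketch below is illustrative only: qualimap_protocol is a hypothetical helper, the 0.8 / 0.2 and 0.4 to 0.6 cutoffs are assumptions rather than values used by the flow, and the mapping assumes the usual correspondence in which reads explained by "1++,1--,2+-,2-+" (or "++,--" for single-end data) indicate a forward-stranded library.

```python
def qualimap_protocol(sense_fraction, antisense_fraction):
    """Suggest a --sequencing-protocol value from infer_experiment.py fractions.

    sense_fraction: reads explained by "1++,1--,2+-,2-+" (or "++,--").
    antisense_fraction: reads explained by "1+-,1-+,2++,2--" (or "+-,-+").
    Returns None when the evidence is ambiguous and needs manual review.
    """
    total = sense_fraction + antisense_fraction
    if total <= 0:
        return None
    # Share of strand-informative reads that agree with the gene strand.
    share = sense_fraction / total
    if share >= 0.8:  # assumed cutoff
        return "strand-specific-forward"
    if share <= 0.2:  # assumed cutoff
        return "strand-specific-reverse"
    if 0.4 <= share <= 0.6:  # assumed band for unstranded libraries
        return "non-strand-specific"
    return None


suggestion = qualimap_protocol(0.02, 0.96)
print(suggestion or "ambiguous: inspect the library protocol manually")
```

Returning None for intermediate values is deliberate: a partial strand bias is better resolved against the library preparation records than by picking the nearest label.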
How it connects
report-flow collects this flow's output through --alignment-qc-out alignment-qc-out. In standard-flow, this flow runs only when --route both is enabled.
Key outputs and limits
Outputs include SAMtools stats, RSeQC results, Qualimap HTML reports, rnaseq_qc_summary.tsv, and a MultiQC report. This flow does not quantify expression; it reports how reliable the alignment evidence is.