Gene counting
rnaseq-count-flow
This flow uses featureCounts to convert aligned BAM files into a gene-level count matrix. It belongs to the optional alignment/count route, and its matrix feeds DESeq2 only when that source is explicitly selected.
Typical command
taf-rnaseq-count-flow \
--bams align-out/04_reports/bam_files.tsv \
--annotation ref-out/03_results/annotation/genes.gtf \
--outdir count-out \
--threads 8 \
--strand 0
Input requirements
--bams should be the bam_files.tsv written by alignment-flow. The annotation should match the reference used to build the HISAT2 index; in most runs this is ref-out/03_results/annotation/genes.gtf. BAM files should be coordinate-sorted, and BAI indexes are recommended.
bam_files.tsv
sample_id bam bai
WT_01 bam/WT_01.sorted.bam bam/WT_01.sorted.bam.bai
SNF2_01 bam/SNF2_01.sorted.bam bam/SNF2_01.sorted.bam.bai
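The coordinate-sort requirement stated above can be checked before counting. The following is a minimal sketch, assuming samtools is installed; header_is_coordinate_sorted is a hypothetical helper name and is not part of the flow. It reads a SAM header and succeeds only when the @HD line declares SO:coordinate.

```shell
# Sketch: read a SAM/BAM header on stdin and succeed only when the @HD line
# declares SO:coordinate. header_is_coordinate_sorted is a hypothetical helper.
header_is_coordinate_sorted() {
  awk -F'\t' '
    $1 == "@HD" {
      # Header fields are TAG:VALUE pairs; look for the sort-order tag.
      for (i = 2; i <= NF; i++) if ($i == "SO:coordinate") ok = 1
    }
    END { exit (ok ? 0 : 1) }
  '
}

# Usage (needs samtools on PATH):
#   samtools view -H bam/WT_01.sorted.bam | header_is_coordinate_sorted \
#     || echo "WT_01 is not coordinate-sorted" >&2
```

A header that lacks an @HD line is treated as not sorted, which is the conservative choice for a pre-flight check.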
Optional sample columns
sample_id bam condition batch strandedness
WT_01 bam/WT_01.sorted.bam WT b1 unstranded
SNF2_01 bam/SNF2_01.sorted.bam SNF2KO b1 unstranded
Relative BAM and BAI paths are resolved against the directory that contains bam_files.tsv. The flow does not modify input BAM files; generated summaries and temporary files are written under --outdir.
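The path rule above can be illustrated with a short pre-flight helper. This is a sketch under stated assumptions rather than the flow's own code: it assumes bam_files.tsv is tab-separated with a header row, and resolve_bams is a hypothetical function name. It checks that the required sample_id and bam columns exist and prints each BAM path resolved against the table's directory.

```shell
# Sketch: print "sample_id<TAB>resolved_bam_path" for each row of a BAM table.
# Assumes a tab-separated file with a header row; resolve_bams is a
# hypothetical helper, not part of the flow.
resolve_bams() {
  table=$1
  # Absolute directory that contains the table; relative paths hang off it.
  table_dir=$(cd "$(dirname "$table")" && pwd) || return 1
  awk -F'\t' -v base="$table_dir" '
    NR == 1 {
      # Map column names to positions so column order does not matter.
      for (i = 1; i <= NF; i++) col[$i] = i
      if (!("sample_id" in col) || !("bam" in col)) {
        print "bam table needs sample_id and bam columns" > "/dev/stderr"
        exit 1
      }
      next
    }
    NF > 0 {
      path = $(col["bam"])
      # Leave absolute paths alone; prefix relative ones with the table directory.
      if (path !~ /^\//) path = base "/" path
      print $(col["sample_id"]) "\t" path
    }
  ' "$table"
}

# Usage:
#   resolve_bams align-out/04_reports/bam_files.tsv
```

Looking columns up by header name keeps the helper working when the optional bai, condition, batch, or strandedness columns are present.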
Complete parameter reference
| Parameter | Required | Default | Meaning and when to change it |
|---|---|---|---|
| --bams | yes | none | BAM sample table with sample_id and bam columns; optional bai, condition, batch, and strandedness columns are accepted. |
| --annotation | yes | none | GTF/GFF annotation for featureCounts. Use the same reference release as the alignments. |
| --outdir, -o | yes | none | Dedicated output directory. An existing directory is refused unless --force is used. |
| --threads, -t | no | 1 | featureCounts thread count. Increase for many BAM files or large genomes. |
| --strand | no | 0 | featureCounts strand mode: 0 unstranded, 1 stranded, 2 reversely stranded. Match the library protocol. |
| --feature-type | no | exon | Feature type passed to featureCounts -t. For gene-level RNA-seq, exon is the usual choice. |
| --attribute | no | gene_id | Annotation attribute passed to featureCounts -g. Change it only when the annotation uses another gene identifier key. |
| --min-mapq | no | 0 | Minimum MAPQ for featureCounts; the filter is applied only when the value is greater than zero. |
| --paired | no | off | Enable paired-end counting with -p --countReadPairs. Use only for paired-end BAMs when fragment-level counting is desired. |
| --min-assigned-reads | no | 0 | Fail if total assigned reads are below this value. Useful for smoke tests or strict QC gates. |
| --force | no | off | Replace standard outputs inside an existing output directory. |
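To make the parameter table concrete, the sketch below shows how these settings could map onto a featureCounts command line. The -t, -g, and -p --countReadPairs mappings come from the table; the use of -a, -o, -T, -s, and -Q is an assumption based on the standard featureCounts options, and both the function name featurecounts_cmd and the output path count-out/featurecounts.txt are hypothetical. The flow's real invocation may differ.

```shell
# Sketch: print (not run) a featureCounts command for the given settings.
# featurecounts_cmd is a hypothetical helper; paths containing spaces are not
# handled because the command is assembled as a plain string for display.
# Arguments: annotation out_file threads strand feature_type attribute
#            min_mapq paired(0|1) bam...
featurecounts_cmd() {
  annotation=$1; out_file=$2; threads=$3; strand=$4
  feature_type=$5; attribute=$6; min_mapq=$7; paired=$8
  shift 8
  cmd="featureCounts -a $annotation -o $out_file -T $threads -s $strand"
  cmd="$cmd -t $feature_type -g $attribute"
  # The MAPQ filter is added only for a positive threshold.
  if [ "$min_mapq" -gt 0 ]; then cmd="$cmd -Q $min_mapq"; fi
  # Paired-end mode counts fragments (read pairs) rather than single reads.
  if [ "$paired" -eq 1 ]; then cmd="$cmd -p --countReadPairs"; fi
  # Remaining arguments are the BAM files.
  printf '%s\n' "$cmd $*"
}

# Defaults from the table, with 8 threads as in the typical command:
featurecounts_cmd ref-out/03_results/annotation/genes.gtf \
  count-out/featurecounts.txt 8 0 exon gene_id 0 0 \
  bam/WT_01.sorted.bam bam/SNF2_01.sorted.bam
```

Printing the command instead of executing it keeps the sketch safe to run anywhere and makes the effect of each setting easy to inspect.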
How it connects
taf-rnaseq-de-flow \
--counts count-out/03_results/matrices/gene_counts.tsv \
--metadata metadata.tsv \
--design '~ condition' \
--contrast condition:treated:control \
--outdir de-out
In standard-flow, use --route both --de-source featurecounts to select this matrix for differential expression (DE) analysis.
Key outputs and limits
The main output is 03_results/matrices/gene_counts.tsv, accompanied by featureCounts summaries and read-assignment summaries. Counts depend strongly on the annotation choice, the strandedness mode, and the paired-end settings.
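A gate like --min-assigned-reads can be approximated by hand from a featureCounts summary file, which lists one status per row (Assigned, Unassigned_NoFeatures, and so on) and one column per BAM. The sketch below is an illustration under assumptions: check_min_assigned is a hypothetical helper, "total" is taken to mean the sum of the Assigned row across all samples, and the location of the summary file inside --outdir is not specified here.

```shell
# Sketch: sum the Assigned row of a featureCounts .summary file across all
# samples and fail when the total is below a threshold.
# check_min_assigned is a hypothetical helper, not part of the flow.
check_min_assigned() {
  summary=$1
  min_reads=$2
  total=$(awk -F'\t' '
    $1 == "Assigned" { for (i = 2; i <= NF; i++) t += $i }
    END { printf "%.0f\n", t }
  ' "$summary") || return 1
  if [ "$total" -lt "$min_reads" ]; then
    echo "assigned reads $total below threshold $min_reads" >&2
    return 1
  fi
  # On success, report the total so callers can log it.
  echo "$total"
}

# Usage (the summary path is a placeholder):
#   check_min_assigned path/to/featurecounts.txt.summary 1000
```

Summing across samples is only one reading of the threshold; a per-sample minimum would need the loop to track each column separately.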