Inputs
Prepare analysis-ready inputs
The sample table names each biological sample and points to its local FASTQ files. The same table feeds the reference-based expression, reference-based alignment, and de novo assembly/expression routes.
sample_id read1 read2 condition
WT_01 reads/WT_01_R1.fq.gz reads/WT_01_R2.fq.gz control
KO_01 reads/KO_01_R1.fq.gz reads/KO_01_R2.fq.gz treated
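A sample table like the one above can be checked before any route is launched. The following is a minimal sketch, not the pipeline's own loader: it assumes the table is whitespace-delimited with the four column names shown, and the function names `read_sample_table` and `missing_fastqs` are hypothetical.

```python
from pathlib import Path

# Column names taken from the example table; the delimiter is an assumption.
REQUIRED_COLUMNS = ("sample_id", "read1", "read2", "condition")


def read_sample_table(text):
    """Parse a whitespace-delimited sample table into a list of row dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("sample table is empty")
    header = lines[0].split()
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows, seen = [], set()
    # Row numbers count non-blank lines, with the header as row 1.
    for number, line in enumerate(lines[1:], start=2):
        fields = line.split()
        if len(fields) != len(header):
            raise ValueError(
                f"row {number}: expected {len(header)} fields, got {len(fields)}"
            )
        row = dict(zip(header, fields))
        if row["sample_id"] in seen:
            raise ValueError(f"row {number}: duplicate sample_id {row['sample_id']}")
        seen.add(row["sample_id"])
        rows.append(row)
    return rows


def missing_fastqs(rows, base_dir="."):
    """Return every read1/read2 path that is not a file under base_dir."""
    base = Path(base_dir)
    return [
        row[col]
        for row in rows
        for col in ("read1", "read2")
        if not (base / row[col]).is_file()
    ]
```

Calling `missing_fastqs(read_sample_table(text))` from the project directory returns an empty list when every FASTQ path resolves. Splitting on whitespace means paths containing spaces are rejected as malformed rows.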
Metadata
Metadata drives the statistical design. Keep sample IDs identical to the sample names in the expression matrix, and state the contrast explicitly (for example, treated versus control).
sample condition batch
WT_01 control A
KO_01 treated A
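The two requirements on metadata (matching sample IDs and an explicit contrast) can be verified mechanically. This is a hedged sketch: `check_design` is a hypothetical helper, and the contrast is represented as an assumed `(factor, test_level, reference_level)` tuple rather than any format the pipeline is known to accept.

```python
def check_design(metadata_rows, matrix_samples, contrast):
    """Return a list of human-readable problems; an empty list means consistent.

    metadata_rows  : list of dicts with a "sample" key plus design factors
    matrix_samples : column names of the expression matrix
    contrast       : (factor, test_level, reference_level), an assumed layout
    """
    problems = []
    meta_ids = {row["sample"] for row in metadata_rows}
    matrix_ids = set(matrix_samples)

    # Sample IDs must agree in both directions.
    for sample in sorted(meta_ids - matrix_ids):
        problems.append(f"{sample}: in metadata but not in expression matrix")
    for sample in sorted(matrix_ids - meta_ids):
        problems.append(f"{sample}: in expression matrix but not in metadata")

    # Both sides of the contrast must be actual levels of the named factor.
    factor, test_level, reference_level = contrast
    levels = {row.get(factor) for row in metadata_rows}
    for level in (test_level, reference_level):
        if level not in levels:
            problems.append(f"contrast level {level!r} not found in {factor!r}")
    return problems


metadata = [
    {"sample": "WT_01", "condition": "control", "batch": "A"},
    {"sample": "KO_01", "condition": "treated", "batch": "A"},
]
print(check_design(metadata, ["WT_01", "KO_01"], ("condition", "treated", "control")))
```

With the example metadata and a matrix whose columns are `WT_01` and `KO_01`, the call prints an empty list. A misspelled column such as `KO_1` would be reported in both directions.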
Reference genome and annotation
Use a genome FASTA and an annotation from the same release. Sequence IDs in the annotation must match the FASTA headers used by the indexing flow. These files are used only in reference mode.
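A mismatch between annotation sequence IDs and FASTA headers (a common case is `chr1` in one file and `1` in the other) can be caught with a quick comparison. The sketch below assumes a GTF or GFF3 annotation, where the sequence ID is the first tab-delimited column, and takes the FASTA ID to be the first whitespace-delimited token after `>`. The function names are hypothetical.

```python
def fasta_ids(lines):
    """Collect sequence IDs: the first token after '>' on each header line."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.add(tokens[0])
    return ids


def annotation_seqids(lines):
    """Collect column-1 sequence IDs from GTF/GFF3 feature lines."""
    ids = set()
    for line in lines:
        if line.startswith("##FASTA"):
            break  # GFF3 may embed sequences after this directive
        if line.startswith("#") or not line.strip():
            continue  # comments, directives, blank lines
        ids.add(line.split("\t", 1)[0])
    return ids


def unmatched_seqids(fasta_lines, annotation_lines):
    """Return annotation sequence IDs that have no FASTA header."""
    return sorted(annotation_seqids(annotation_lines) - fasta_ids(fasta_lines))


fasta = [">chr1 primary assembly", "ACGTACGT", ">chr2", "ACGT"]
gtf = [
    "#!genome-build example",
    'chr1\tsrc\tgene\t1\t8\t.\t+\t.\tgene_id "g1";',
    '1\tsrc\tgene\t1\t4\t.\t+\t.\tgene_id "g2";',
]
print(unmatched_seqids(fasta, gtf))
```

For real files, pass open file handles instead of lists. The functions only iterate over lines, so large genomes do not need to be loaded into memory.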
Gene sets and background
Reference-mode enrichment uses offline GMT files. For over-representation analysis (ORA), a background gene list is recommended so that the tested universe matches the experiment and annotation space.
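To illustrate why the background matters for ORA, the sketch below parses GMT lines (tab-delimited: set name, description, then genes) and computes a textbook one-sided hypergeometric p-value in which the query, the gene set, and the universe are all restricted to the background. It is an assumption that the pipeline's ORA resembles this. `parse_gmt` and `ora_pvalue` are hypothetical names.

```python
from math import comb


def parse_gmt(lines):
    """Map each gene-set name to its set of genes (GMT columns 3 onward)."""
    gene_sets = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # not a valid GMT record
        gene_sets[fields[0]] = {gene for gene in fields[2:] if gene}
    return gene_sets


def ora_pvalue(query, gene_set, background):
    """P(overlap >= observed) under the hypergeometric null.

    Everything is intersected with the background first, so genes that
    could never have been detected do not inflate or deflate the result.
    """
    universe = set(background)
    hits = set(query) & universe
    members = set(gene_set) & universe
    observed = len(hits & members)
    n_universe, n_members, n_hits = len(universe), len(members), len(hits)
    # Sum the upper tail as exact integers, then divide once.
    tail = sum(
        comb(n_members, i) * comb(n_universe - n_members, n_hits - i)
        for i in range(observed, min(n_members, n_hits) + 1)
    )
    return tail / comb(n_universe, n_hits)
```

Shrinking or growing `background` changes `n_universe` and therefore the p-value for the same overlap. This is why the tested universe should reflect the genes that were actually measurable and annotated.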
De novo protein database and GO map
De novo analysis does not receive known gene models from a genome annotation, so functional interpretation depends on homology resources selected by the analyst. --protein-db should be a local protein FASTA. --go-map is a two-column or tabular mapping from protein IDs to GO IDs; it is used to build transcript-space GMT and background files after best-hit annotation.
Protein FASTA
>P12345
MSTNPKPQRKTKRNTNRRPQDVKFPGGGQIVGGVYLLPR...
GO map
protein_id go_id
P12345 GO:0006412
P12345 GO:0005737
These files are not raw sequencing outputs. They are biological reference resources, so their species, database version, license, and provenance should be recorded in the project notes.
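The step from a protein-level GO map to transcript-space GMT and background files can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's implementation: best hits are given as a hypothetical `transcript_id -> protein_id` dict, the GO map is whitespace-delimited with GO IDs in the second column, the GMT description field is filled with the placeholder `na`, and the background is taken to be every transcript that received at least one GO term.

```python
from collections import defaultdict


def read_go_map(lines):
    """Map protein_id -> set of GO IDs, skipping the header and blank lines."""
    mapping = defaultdict(set)
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or not fields[1].startswith("GO:"):
            continue
        mapping[fields[0]].add(fields[1])
    return mapping


def transcript_gmt(best_hits, go_map):
    """Transfer GO terms from best-hit proteins to transcripts.

    Returns (gmt_lines, background): one tab-delimited GMT line per GO term,
    and the sorted list of transcripts that carry at least one term.
    """
    term_to_transcripts = defaultdict(set)
    for transcript, protein in best_hits.items():
        for go_id in go_map.get(protein, ()):
            term_to_transcripts[go_id].add(transcript)
    gmt_lines = [
        "\t".join([go_id, "na", *sorted(transcripts)])
        for go_id, transcripts in sorted(term_to_transcripts.items())
    ]
    background = sorted(
        {t for transcripts in term_to_transcripts.values() for t in transcripts}
    )
    return gmt_lines, background


go_map = read_go_map(["protein_id go_id", "P12345 GO:0006412", "P12345 GO:0005737"])
# Hypothetical best hits: tx2 matches a protein absent from the GO map.
gmt, background = transcript_gmt({"tx1": "P12345", "tx2": "Q00000"}, go_map)
print(gmt)
print(background)
```

Transcripts whose best hit has no GO annotation (here `tx2`) drop out of both outputs. Whether they should instead remain in the background is a design choice the analyst should record alongside the database provenance.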