[Single-Cell Analysis Methods] VeTra: A Trajectory Inference Tool Based on RNA Velocity

VeTra: a tool for trajectory inference based on RNA velocity


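Single-cell trajectories can reveal how gene regulation controls cell fate: most cell-state transitions, whether in development, reprogramming, or disease, are characterized by cascades of gene expression changes.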

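Biological assumptions behind pseudotime analysis of single-cell data

Single-cell and spatial omics data are always a mixture of cells captured at different stages. Even cells of a single type appear in single-cell data as a mixture of cells at different stages or states. In spatial data, the mixture may well consist of the same cell type in both its normal and its cancerous state.

This is because cells are not perfectly synchronized in many biological processes. In single-cell expression studies of processes such as differentiation, the captured cells may be spread widely along the course of differentiation. To describe how this mixed population changes, we need pseudotime techniques that recover the trajectory of state change from single-cell or spatial data.

Trajectory inference (TI) orders single cells along a trajectory according to the similarity of expression patterns among the sequenced cells (each of which is a snapshot of a transient state), thereby modeling the dynamic process of cellular change. In other words, it reconstructs the differentiation trajectory, or pseudotime axis.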
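
The ordering idea can be illustrated with a minimal sketch. It is not the method of any particular TI tool: it assumes a known root cell, links each cell to its nearest neighbors by expression distance, and takes the shortest-path distance from the root as pseudotime. The function name `pseudotime_order`, the parameter `k`, and the toy data are all hypothetical.

```python
import heapq
import math

def pseudotime_order(expr, root, k=2):
    """Order cells by shortest-path distance from a root cell.

    expr : list of per-cell expression vectors (toy input)
    root : index of the cell assumed to be the starting state
    k    : number of nearest neighbours linked to each cell
    """
    n = len(expr)
    # Build a symmetric k-nearest-neighbour graph on expression distance.
    adj = {i: {} for i in range(n)}
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: math.dist(expr[i], expr[j]))
        for j in others[:k]:
            d = math.dist(expr[i], expr[j])
            adj[i][j] = d
            adj[j][i] = d
    # Dijkstra: pseudotime is the graph distance from the root.
    pt = [math.inf] * n
    pt[root] = 0.0
    heap = [(0.0, root)]
    while heap:
        d, i = heapq.heappop(heap)
        if d > pt[i]:
            continue  # stale heap entry
        for j, w in adj[i].items():
            if d + w < pt[j]:
                pt[j] = d + w
                heapq.heappush(heap, (pt[j], j))
    order = sorted(range(n), key=lambda i: pt[i])
    return order, pt

# Five toy cells lying roughly along one axis, listed out of order.
cells = [[0.0, 0.0], [3.0, 0.1], [1.0, 0.0], [2.0, 0.1], [4.0, 0.0]]
order, pt = pseudotime_order(cells, root=0)
print(order)  # [0, 2, 3, 1, 4]
```

A similarity-only ordering like this needs the root to be supplied by the user, which is the gap that velocity information is meant to fill.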

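RNA velocity

Trajectory inference based on RNA velocity rests on the following observation:

Transcriptional induction of a gene leads to an increase in (newly transcribed) unspliced precursor mRNA, while repression or absence of transcription leads to a decrease in unspliced mRNA.

Therefore, by distinguishing unspliced mRNA from mature spliced mRNA, the change in mRNA abundance can be approximated. The time derivative of mRNA abundance is the RNA velocity. The combination of velocities across the different genes of the genome can be used to classify the state of an individual cell.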
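
As a rough illustration of the derivative described above, the steady-state model of La Manno et al. (2018) writes ds/dt = u - gamma * s for a single gene, where u and s are the unspliced and spliced abundances and the splicing rate is scaled to 1. The sketch below fits gamma by least squares through the origin over all cells, which is a simplification (the original method fits on extreme quantiles). The function names and the counts are made up for illustration.

```python
def fit_gamma(unspliced, spliced):
    """Least-squares slope through the origin for u ~ gamma * s."""
    num = sum(u * s for u, s in zip(unspliced, spliced))
    den = sum(s * s for s in spliced)
    return num / den if den > 0 else 0.0

def rna_velocity(unspliced, spliced):
    """Per-cell velocity of one gene: v = u - gamma * s."""
    gamma = fit_gamma(unspliced, spliced)
    velocity = [u - gamma * s for u, s in zip(unspliced, spliced)]
    return gamma, velocity

# Toy counts for one gene in four cells (made-up numbers).
u = [1.0, 2.0, 3.0, 1.0]
s = [2.0, 4.0, 2.0, 6.0]
gamma, v = rna_velocity(u, s)

# Cell 2 has excess unspliced mRNA (v > 0, gene being induced);
# cell 3 has a deficit (v < 0, gene being repressed).
print(round(gamma, 4), [round(x, 4) for x in v])
```

Repeating this for every gene gives each cell a high-dimensional velocity vector, one component per gene.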

Computational principles of VeTra

Guangzheng Weng, Junil Kim, Kyoung Jae Won, VeTra: a tool for trajectory inference based on RNA velocity, Bioinformatics, Volume 37, Issue 20, 15 October 2021, Pages 3509–3513, https://doi.org/10.1093/bioinformatics/btab364

VeTra groups the cells belonging to the same stream of trajectory

VeTra performs lineage tracing from the root to the terminal states by grouping cells based on the similarity in direction of cell transition.

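Building on the cell-to-cell differences in RNA velocity described above, VeTra models trajectories by measuring how similar the cells' transition directions are.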

This enables VeTra to perform TI without prior knowledge or predefined lineage topology.

1. Algorithm description

VeTra reconstructs the pseudo-temporal order of cells based on the coordinates and the velocity vector of cells in the low-dimensional embedding.

The velocity vectors are estimated by extrapolating the spliced/unspliced read ratio to the local neighboring cells (La Manno et al., 2018).

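In other words, the velocity of a cell is first estimated gene by gene from its spliced and unspliced mRNA counts. The resulting high-dimensional velocity is then projected onto the embedding by comparing the cell's extrapolated state with its nearest neighbors.

Nearest neighbors are defined as follows: after the single-cell data are reduced to two dimensions with UMAP, the state of each cell is represented by a coordinate in the plane spanned by the two UMAP components, and a KNN search on this low-dimensional embedding returns the nearest neighbors B of a given cell A.

For each cell, this yields a two-dimensional vector anchored at the cell's embedding coordinate. This vector is the RNA velocity vector that VeTra works with.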

Given velocity vectors, VeTra reconstructs multiple directed graphs. To link cells based on transition, k nearest neighbors of a cell with similar direction are selected using cosine similarity (cos1). Among them, the nearby cell located upstream with the highest cosine similarity (cos2) is selected.

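Cosine similarity can therefore be computed between any two two-dimensional RNA velocity vectors. A KNN search is then run with cosine similarity as the distance measure, and the cells it returns form a directed graph whose edges point from the query cell to its K nearest neighbors.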
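
The linking step can be sketched as follows. This is a simplified reading of the description above, not VeTra's actual code: it assumes cos1 is the cosine between the velocity vectors of a cell and a neighbor, and cos2 is the cosine between the cell's velocity and the displacement toward that neighbor. It also links each cell to the neighbor it appears to be moving toward. The function names, the `k` default, and the `min_cos1` threshold are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two 2D vectors (0.0 if either is zero)."""
    na, nb = math.hypot(*a), math.hypot(*b)
    if na == 0 or nb == 0:
        return 0.0
    return (a[0] * b[0] + a[1] * b[1]) / (na * nb)

def next_transitions(coords, velocities, k=3, min_cos1=0.7):
    """Return directed edges (i, j): cell i is linked to cell j."""
    n = len(coords)
    edges = []
    for i in range(n):
        # k nearest neighbours of cell i in the embedding.
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: math.dist(coords[i], coords[j]))
        best, best_cos2 = None, 0.0
        for j in others[:k]:
            # cos1: keep only neighbours moving in a similar direction.
            if cosine(velocities[i], velocities[j]) < min_cos1:
                continue
            # cos2: how well the step i -> j follows cell i's velocity.
            step = (coords[j][0] - coords[i][0], coords[j][1] - coords[i][1])
            cos2 = cosine(velocities[i], step)
            if cos2 > best_cos2:
                best, best_cos2 = j, cos2
        if best is not None:
            edges.append((i, best))
    return edges

# Two toy streams: one moving right along y = 0, one moving up along x = 0.
coords = [(0, 0), (1, 0), (2, 0), (0, 5), (0, 6), (0, 7)]
velocities = [(1, 0), (1, 0), (1, 0), (0, 1), (0, 1), (0, 1)]
edges = next_transitions(coords, velocities, k=2)
print(edges)  # [(0, 1), (1, 2), (3, 4), (4, 5)]
```

The last cell of each stream gets no outgoing edge because none of its neighbors lies ahead of it, so each stream ends at a terminal cell.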

Once all cells were investigated for their next transition, multiple directed graphs are obtained. To find a coarse-grained structure of the directed graph, VeTra identifies WCCs where every cell is reachable from every other cell regardless of the direction of relationships.

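Building on the directed graph obtained above, the algorithm identifies its weakly connected components (WCCs). This partitions the network into communities, so the whole network can be divided into clusters, each drawn in a different color.

The clusters found in the network are then related to one another by hierarchical clustering. The relationships between clusters are the cell-state transitions, that is, the pseudotime ordering.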
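
Finding weakly connected components amounts to ignoring edge directions and collecting the cells that are joined by any chain of edges. A minimal union-find sketch is shown below. The function name and the toy edge list are illustrative only, and VeTra itself may rely on a graph library for this step.

```python
def weakly_connected_components(n_cells, edges):
    """Group cells joined by edges, ignoring edge direction."""
    parent = list(range(n_cells))

    def find(x):
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra  # merge the two components

    groups = {}
    for cell in range(n_cells):
        groups.setdefault(find(cell), []).append(cell)
    return sorted(groups.values())

# Directed edges from a toy transition graph; cell 6 has no edges.
edges = [(0, 1), (1, 2), (3, 4), (4, 5)]
wccs = weakly_connected_components(7, edges)
print(wccs)  # [[0, 1, 2], [3, 4, 5], [6]]
```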

The WCCs are then grouped using a hierarchical clustering algorithm.
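
The grouping step can be sketched as a small agglomerative procedure. The distance VeTra uses between WCCs is not stated in this post, so the sketch assumes single linkage on embedding coordinates (the smallest distance between any two member cells). The function name `group_wccs`, the `n_groups` stopping rule, and the toy data are all hypothetical.

```python
import math

def group_wccs(coords, wccs, n_groups):
    """Merge the closest pair of groups until n_groups remain."""
    groups = [list(w) for w in wccs]

    def gap(a, b):
        # Single linkage: smallest distance between any two member cells.
        return min(math.dist(coords[i], coords[j]) for i in a for j in b)

    while len(groups) > n_groups:
        pairs = [(gap(groups[p], groups[q]), p, q)
                 for p in range(len(groups))
                 for q in range(p + 1, len(groups))]
        _, p, q = min(pairs)
        groups[p] = sorted(groups[p] + groups[q])
        del groups[q]  # q > p, so index p is unaffected
    return groups

# Three toy WCCs: the first two are adjacent, the third is far away.
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (0, 9), (1, 9)]
wccs = [[0, 1], [2, 3], [4, 5]]
grouped = group_wccs(coords, wccs, n_groups=2)
print(grouped)  # [[0, 1, 2, 3], [4, 5]]
```

Position alone is used here for brevity. A measure that also accounts for the direction of the velocity vectors would fit the description in this section more closely.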

2. VeTra reconstructs single-cell trajectories for multiple cell lineages

Author: 谢桂纲, Senior Data Scientist at 苏州帕诺米克




