Bioinformatics

Date created: April 23, 2024

Original Paper

Simulation of Cell Identity Transition in CellOracle

Objective: Understand the algorithm and implementation of CellOracle.

Simulation of cell identity transition accompanying perturbation of regulatory genes:
1. Data preprocessing
2. Signal propagation within the network
3. Estimation of transition probabilities
4. Analysis of cell identity transition simulation
The simulation is carried out through these four steps.

Data preprocessing:
Preprocessing was performed with files modified from Velocyto.py.

Signal propagation within the network:
This step estimates the effect of a TF perturbation on cell identity. CellOracle simulates how a shift in the expression of an input TF leads to shifts in the expression of its target genes, using the partial derivative $\frac{\partial x_{j}}{\partial x_{i}}$. Because the model is linear, this derivative is a constant. When gene j is directly regulated by gene i, it equals the coefficient $b_{i, j}$, which was already computed in the previous step:

$\frac{\partial x_{j}}{\partial x_{i}} = b_{i, j} .$

The shift in the target gene in response to a shift in the regulatory gene is then:

$\Delta x_{j} = \frac{\partial x_{j}}{\partial x_{i}} \Delta x_{i} = b_{i, j} \Delta x_{i} .$

For indirectly connected nodes, the chain of network edges between them is a composite function of linear models, which is itself differentiable. The partial derivative of the target gene can therefore be calculated with the chain rule even when the nodes are only indirectly connected:

$\frac{\partial x_{j}}{\partial x_{i}} = \prod_{k = 0}^{n} \frac{\partial x_{k + 1}}{\partial x_{k}} = \prod_{k = 0}^{n} b_{k, k + 1},$

Here the genes on the path from gene i to gene j are indexed so that $x_{0} = x_{i}$ and $x_{n + 1} = x_{j}$. For example, for the path from gene 0 to gene 1 and then to gene 2:

$\frac{\partial x_{2}}{\partial x_{0}} = \frac{\partial x_{1}}{\partial x_{0}} \times \frac{\partial x_{2}}{\partial x_{1}} = b_{0, 1} \times b_{1, 2}$

$\Delta x_{2} = \frac{\partial x_{2}}{\partial x_{0}} \Delta x_{0} = b_{0, 1} b_{1, 2} \Delta x_{0}$

A small shift in a target gene can thus be formulated as the product of just two factors: the coefficients $b_{i, j}$ of the GRN model and the shift in the input TF.
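The propagation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not CellOracle's actual implementation: the function name `propagate_shift`, the `coef[i][j]` layout for $b_{i, j}$, and the choice to hold the perturbed gene at its input shift during each step are all hypothetical. Summing over all regulators of a gene reduces to the product $b_{0, 1} b_{1, 2}$ when the network is a single chain.

```python
def propagate_shift(coef, delta_input, n_steps=3):
    """Propagate an expression shift through a linear GRN (illustrative sketch).

    coef[i][j] holds b_{i,j}, the coefficient of regulator i on target j.
    delta_input holds the imposed shift per gene (0.0 for unperturbed genes).
    """
    n = len(delta_input)
    # Genes with a nonzero input shift are treated as the perturbed TFs.
    perturbed = [d != 0 for d in delta_input]
    delta = list(delta_input)
    for _ in range(n_steps):
        # One step of signal propagation: dx_j = sum_i b_{i,j} * dx_i
        new = [sum(delta[i] * coef[i][j] for i in range(n)) for j in range(n)]
        # Assumption: the perturbed TF keeps its imposed shift at every step.
        delta = [delta_input[j] if perturbed[j] else new[j] for j in range(n)]
    return delta


# Chain gene 0 -> gene 1 -> gene 2 with b_{0,1} = 0.5 and b_{1,2} = -2.0
coef = [
    [0.0, 0.5, 0.0],
    [0.0, 0.0, -2.0],
    [0.0, 0.0, 0.0],
]
shifts = propagate_shift(coef, [1.0, 0.0, 0.0], n_steps=2)
print(shifts)  # [1.0, 0.5, -1.0]
```

Two steps are needed here because the signal has to cross two edges; the shift of gene 2 equals $b_{0, 1} b_{1, 2} \Delta x_{0} = 0.5 \times (-2.0) \times 1.0 = -1.0$, matching the worked example.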
Point: CellOracle focuses on the gradient of the gene expression equations rather than on absolute expression values. This way it does not need to model the error term or the intercept, which may contain factors that cannot be observed in scRNA-seq data.
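A short worked derivation may help show why working with shifts sidesteps these terms. It is only a sketch: the intercept $c_{j}$ and the error term $\varepsilon_{j}$ are notation assumed here for illustration, and both are assumed to stay unchanged when the input TF is shifted. Writing the linear model for a directly regulated gene as

$x_{j} = b_{i, j} x_{i} + c_{j} + \varepsilon_{j}$

the shift caused by changing $x_{i}$ to $x_{i} + \Delta x_{i}$ is

$\Delta x_{j} = \left( b_{i, j} (x_{i} + \Delta x_{i}) + c_{j} + \varepsilon_{j} \right) - \left( b_{i, j} x_{i} + c_{j} + \varepsilon_{j} \right) = b_{i, j} \Delta x_{i}$

Under these assumptions $c_{j}$ and $\varepsilon_{j}$ cancel, so only the coefficient $b_{i, j}$ and the input shift are needed.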

Code Availability: https://github.com/morris-lab/CellOracle
