Python-based single-cell transcription factor analysis

pyscenic


Preface

The workflow is very simple and poses almost no difficulty.


Main

Install pyscenic

Attention: Python version >= 3.7 is required.

pip install pyscenic
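
To confirm that the interpreter meets the version requirement before installing, a check along these lines can be run first. This is only a minimal sketch; `python_ok` is a hypothetical helper name, not part of pySCENIC.

```python
import sys


def python_ok(version_info=sys.version_info, minimum=(3, 7)):
    """Return True if the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if not python_ok():
    print("Python >= 3.7 is required; found", sys.version.split()[0])
```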

Download reference data

wget -c https://github.com/aertslab/pySCENIC/archive/refs/heads/master.zip
unzip master.zip
cd pySCENIC-master
mv resources/* ../../
wget -c https://resources.aertslab.org/cistarget/motif2tf/motifs-v9-nr.hgnc-m0.001-o0.0.tbl

wget -c https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/tc_v1/gene_based/encode_20190621__ChIP_seq_transcription_factor.hg19-tss-centered-5kb.max.feather
wget -c https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/tc_v1/gene_based/encode_20190621__ChIP_seq_transcription_factor.hg19-500bp-upstream.max.feather
wget -c https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/tc_v1/gene_based/encode_20190621__ChIP_seq_transcription_factor.hg19-tss-centered-10kb.max.feather
wget -c https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg38/refseq_r80/tc_v1/gene_based/encode_20190621__ChIP_seq_transcription_factor.hg38__refseq-r80__10kb_up_and_down_tss.max.feather
wget -c https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg38/refseq_r80/tc_v1/gene_based/encode_20190621__ChIP_seq_transcription_factor.hg38__refseq-r80__500bp_up_and_100bp_down_tss.max.feather
wget -c https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc8nr/gene_based/hg19-500bp-upstream-7species.mc8nr.feather
wget -c https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/hg19-500bp-upstream-7species.mc9nr.feather
wget -c https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc8nr/gene_based/hg19-500bp-upstream-10species.mc8nr.feather
wget -c https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/hg19-500bp-upstream-10species.mc9nr.feather
wget -c https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg38/refseq_r80/mc9nr/gene_based/hg38__refseq-r80__500bp_up_and_100bp_down_tss.mc9nr.feather
wget -c https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc8nr/gene_based/hg19-tss-centered-10kb-7species.mc8nr.feather
wget -c https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/hg19-tss-centered-10kb-7species.mc9nr.feather
wget -c https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg38/refseq_r80/mc9nr/gene_based/hg38__refseq-r80__10kb_up_and_down_tss.mc9nr.feather
wget -c https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc8nr/gene_based/hg19-tss-centered-10kb-10species.mc8nr.feather
wget -c https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/hg19-tss-centered-10kb-10species.mc9nr.feather
wget -c https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc8nr/gene_based/hg19-tss-centered-5kb-7species.mc8nr.feather
wget -c https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/hg19-tss-centered-5kb-7species.mc9nr.feather
wget -c https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc8nr/gene_based/hg19-tss-centered-5kb-10species.mc8nr.feather
wget -c https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/hg19-tss-centered-5kb-10species.mc9nr.feather
wget -c https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc8nr/region_based/hg19-regions-9species.all_regions.mc8nr.feather
wget -c https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc9nr/region_based/hg19-regions-9species.all_regions.mc9nr.feather
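
With this many large downloads, it can help to check which files actually arrived. The sketch below assumes the files were saved in the current directory under the last path component of each URL, as `wget` does by default. `local_name` and `missing_files` are hypothetical helper names, and the `urls` list holds only the three files used in Step.2. A non-empty file is not proof of a complete download, so this is a first sanity check only.

```python
import os
from urllib.parse import urlparse


def local_name(url):
    """File name that wget saves the URL under (last path component)."""
    return os.path.basename(urlparse(url).path)


def missing_files(urls, directory="."):
    """Return the URLs whose local files are absent or empty in `directory`."""
    missing = []
    for url in urls:
        path = os.path.join(directory, local_name(url))
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(url)
    return missing


# Only the files needed for Step.2; extend the list as required.
urls = [
    "https://resources.aertslab.org/cistarget/motif2tf/motifs-v9-nr.hgnc-m0.001-o0.0.tbl",
    "https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/hg19-tss-centered-5kb-7species.mc9nr.feather",
    "https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/hg19-tss-centered-10kb-7species.mc9nr.feather",
]

for url in missing_files(urls):
    print("missing or empty:", local_name(url))
```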

The pySCENIC pipeline has only 3 steps

Step.1


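# Input: count matrix, csv (rows=cells x columns=genes) or loom (rows=genes x columns=cells).
# Comments must stay outside the command: text after a trailing backslash breaks the line continuation.
pyscenic grn \
        --num_workers 6 \
        -o /data/expr_mat.adjacencies.tsv \
        /data/expr_mat.tsv \
        /data/allTFs_hg38.txt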
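
The `grn` step can only use transcription factors whose names also appear as genes in the expression matrix, so a quick overlap check before a long run may save time. The sketch below assumes a cells x genes text matrix whose header row holds gene names and whose first column holds cell identifiers, plus a TF list with one name per line. All function names are hypothetical.

```python
import csv


def read_tf_names(path):
    """Read a TF list with one name per line, skipping blank lines."""
    with open(path) as handle:
        return {line.strip() for line in handle if line.strip()}


def genes_in_header(path, delimiter="\t"):
    """Gene names from the header row of a cells x genes matrix.

    Assumes the first column holds cell identifiers.
    """
    with open(path, newline="") as handle:
        header = next(csv.reader(handle, delimiter=delimiter))
    return set(header[1:])


def tf_overlap(matrix_path, tf_path, delimiter="\t"):
    """Sorted list of TFs that are also present as genes in the matrix."""
    return sorted(read_tf_names(tf_path) & genes_in_header(matrix_path, delimiter))
```

Calling `tf_overlap("/data/expr_mat.tsv", "/data/allTFs_hg38.txt")` would list the usable TFs. An empty result usually points to a gene-naming mismatch (for example symbols versus Ensembl IDs) or a transposed matrix.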

Step.2

# First argument: the output file of Step.1.
# --expression_mtx_fname: the same expression matrix used as input in Step.1.
pyscenic ctx \
        /data/expr_mat.adjacencies.tsv \
        /data/hg19-tss-centered-5kb-7species.mc9nr.feather \
        /data/hg19-tss-centered-10kb-7species.mc9nr.feather \
        --annotations_fname /data/motifs-v9-nr.hgnc-m0.001-o0.0.tbl \
        --expression_mtx_fname /data/expr_mat.tsv \
        --mode "dask_multiprocessing" \
        --output /data/regulons.csv \
        --num_workers 6
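
The adjacency table passed to `ctx` as its first argument is a tab-separated file with the columns `TF`, `target` and `importance`. To glance at the strongest links per TF, a sketch like the following may be enough; `top_targets` is a hypothetical helper, and it loads the whole table into memory, which can be heavy for large datasets.

```python
import csv
from collections import defaultdict


def top_targets(adjacencies_path, n=5):
    """Map each TF to its n targets with the highest importance score."""
    by_tf = defaultdict(list)
    with open(adjacencies_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            by_tf[row["TF"]].append((float(row["importance"]), row["target"]))
    return {
        tf: [target for _, target in sorted(links, reverse=True)[:n]]
        for tf, links in by_tf.items()
    }
```

For example, `top_targets("/data/expr_mat.adjacencies.tsv", n=10)` returns a dictionary keyed by TF name.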

Step.3

pyscenic aucell \
        /data/expr_mat.tsv \
        /data/regulons.csv \
        -o /data/auc_mtx.csv \
        --num_workers 6
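
Once the AUC matrix is written, the most active regulons per cell can be read off directly. The sketch below assumes `auc_mtx.csv` is comma-separated, with one row per cell, one column per regulon, and the cell identifier in the first column. `top_regulons_per_cell` is a hypothetical helper name.

```python
import csv


def top_regulons_per_cell(auc_path, n=3):
    """Map each cell to its n regulons with the highest AUC value."""
    result = {}
    with open(auc_path, newline="") as handle:
        reader = csv.reader(handle)
        regulons = next(reader)[1:]  # first column holds the cell identifiers
        for row in reader:
            scores = sorted(zip(map(float, row[1:]), regulons), reverse=True)
            result[row[0]] = [name for _, name in scores[:n]]
    return result
```

For example, `top_regulons_per_cell("/data/auc_mtx.csv")` gives a quick per-cell summary before any clustering or plotting.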
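
### Tools and methods for cell communication and transcription factor analysis

#### Cell communication and transcription factor research with single-cell technology

Single-cell technology lets researchers work at the level of individual cells and study gene expression patterns, developmental trajectories, and cell-cell interactions in depth. In complex biological systems such as the tumor microenvironment, how cells respond to external signals and change their own transcriptional state has become an important research direction[^1].

#### Applications of SCENIC and its Python version, pySCENIC

For transcription factor analysis of single-cell data, SCENIC is a widely accepted choice. To improve computational efficiency, the Python implementation pySCENIC is increasingly used in practice. These tools help identify the transcription factors that are active under specific conditions and build the corresponding regulatory networks.

#### GENIE3/GRNBoost for inferring transcriptional regulatory relationships

As part of the SCENIC workflow, GENIE3 or GRNBoost uses tree-based machine learning (random forests in GENIE3, gradient boosting in GRNBoost) to score how strongly each transcription factor influences each target gene. This quantifies the strength of the association between the two and suggests potential regulatory mechanisms[^3].

#### Overview of the steps in a joint cell communication and transcription factor analysis

An integrated bioinformatics analysis of this kind usually involves the following work:

- **Obtain a high-quality dataset**: make sure the single-cell sequencing data have sufficient coverage and accuracy.
- **Preprocessing**: quality control (QC), normalization, batch correction, and similar operations that improve downstream analysis.
- **Estimate TF activity with a dedicated package**: for example SCENIC or its Python version pySCENIC, as described above.
- **Exploratory data analysis (EDA)**: visualize the results and look for meaningful trends or patterns.
- **Functional enrichment testing**: determine which pathways may be significantly affected.
- **Validate hypotheses and confirm the findings experimentally.**

```python
import scanpy as sc
from pyscenic.cli.utils import load_signatures
from pyscenic.aucell import aucell

# Load the AnnData object
adata = sc.read_h5ad('path_to_your_data.h5ad')

# Load the regulons / gene signatures, e.g. the regulons.csv written by `pyscenic ctx`
# (the loader picks a parser from the file extension, so a .txt file is not accepted)
signatures = load_signatures('path_to_signature_file.csv')

# aucell expects a cells x genes pandas DataFrame rather than an AnnData object
exp_mtx = adata.to_df()

# Compute the AUC score matrix (a cells x regulons DataFrame)
auc_mtx = aucell(exp_mtx, signatures)

# Store the AUC scores on the original AnnData object, aligned to its cell order
adata.obsm['AUC'] = auc_mtx.loc[adata.obs_names].to_numpy()
```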