Tree inference for single-cell data: part 2

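Convergent evolution and single-cell tumor data: reconstructing mutation histories and the effect of sequencing errors
These notes discuss how the possibility of convergent evolution is restricted in single-cell tumor data. Using an example with 712 SNVs and 18 cancer-related mutation sites, they look at how homozygous mutations conflict with the infinite sites model. The methods part covers the augmented ancestor matrix and MCMC, and how the method is applied to real tumor samples of differing data quality.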

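A short aside on convergent evolution:
infinite sites assumption => each site mutates at most once => all identical variants share a common origin (otherwise a single site could carry more than one mutation event) => convergent evolution is unlikely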
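
To make the consequence of the assumption concrete: if every site mutates at most once and nothing is lost, then in error-free binary data the sets of cells carrying any two mutations must be either nested or disjoint. The check below is only an illustrative sketch of that condition. The function name and the 0/1 matrix layout (rows are mutations, columns are cells) are assumptions made for this example, and real data with sequencing errors will not pass it exactly.

```python
def is_perfect_phylogeny(D):
    """Check the nested-or-disjoint condition on an error-free binary matrix.

    D[i][j] == 1 means mutation i is observed in cell j (hypothetical layout).
    """
    # For each mutation, collect the set of cells that carry it.
    cell_sets = [{j for j, x in enumerate(row) if x == 1} for row in D]
    for a in range(len(cell_sets)):
        for b in range(a + 1, len(cell_sets)):
            s, t = cell_sets[a], cell_sets[b]
            # Overlapping sets that are not nested cannot come from one tree
            # in which each mutation arises exactly once.
            if s & t and not (s <= t or t <= s):
                return False
    return True


nested = [[1, 1, 0], [1, 0, 0]]    # second mutation sits below the first
conflict = [[1, 1, 0], [0, 1, 1]]  # overlap without nesting
print(is_perfect_phylogeny(nested), is_perfect_phylogeny(conflict))
```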

reconstructing mutation histories from real tumor data

The method is applied to three real single-cell tumor data sets of differing data quality.

JAK2-negative myeloproliferative neoplasm

712 SNVs, 58 tumor cells, 18 cancer-related mutation sites. The error rates are known, and the mutation matrix distinguishes three observed states: normal, heterozygous mutation, and homozygous mutation.

This part says that if a homozygous mutation appears, it contradicts the infinite sites model?

Next, the Methods section.
First, define an augmented ancestor matrix A(T):
row: mutation index
column: mutation index + the empty root node index
A_{i,k} = 1 if i=k, or i is an ancestor of k; A_{i,k}=0 otherwise.
vector \sigma: \sigma_{i} is the index of the node in the mutation tree to which the i-th cell attaches;
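
The definition of A(T) above can be sketched in a few lines. This is a minimal illustration, not code from the paper: the parent-vector encoding of the tree and the choice to store the empty root as index n in the last column are assumptions made for the example, and the input is assumed to be a valid tree with no cycles.

```python
def ancestor_matrix(parent):
    """Build the augmented ancestor matrix A(T) from a parent vector.

    parent[k] is the parent of mutation k; the value n = len(parent) stands
    for the empty root node (an encoding chosen for this sketch).
    Rows are mutations 0..n-1; columns are mutations 0..n-1 plus the root.
    """
    n = len(parent)
    A = [[0] * (n + 1) for _ in range(n)]
    for k in range(n):
        node = k
        # Walk from k up to the root, marking k itself and each ancestor.
        while node != n:
            A[node][k] = 1
            node = parent[node]
    return A


# Mutation 0 hangs off the root; mutations 1 and 2 are children of 0.
for row in ancestor_matrix([3, 0, 0]):
    print(row)
```

Column k of the result lists the mutations a cell would carry if it attached at node k; the root column is all zeros because a cell attached at the root carries no mutations.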
Based on this, the likelihood of the data P( D|T, \sigma, \theta ) can be calculated.
As equation (12) shows,
the likelihood of the data depends mainly on the sequencing error.
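
To show why the likelihood is driven by the error rates, here is a hedged sketch for the simplified binary case. It assumes a per-entry error model with a false-positive rate and a false-negative rate, here called alpha and beta; those names, the 0/1 data layout, and the function name are choices made for this example and are not taken from equation (12). The three-state data described above (normal, heterozygous, homozygous) would need an extended error model.

```python
import math


def log_likelihood(D, A, sigma, alpha, beta):
    """log P(D | T, sigma, theta) under an assumed binary error model.

    D[i][j]  : observed state (0 or 1) of mutation i in cell j
    A        : augmented ancestor matrix A(T), n rows and n + 1 columns
    sigma[j] : column of A (a mutation or the root) that cell j attaches to
    alpha    : assumed false-positive rate, P(observe 1 | expect 0)
    beta     : assumed false-negative rate, P(observe 0 | expect 1)
    """
    total = 0.0
    for i, row in enumerate(D):
        for j, observed in enumerate(row):
            # The tree and the attachment fix the expected state of each entry.
            expected = A[i][sigma[j]]
            if expected == 1:
                p = 1 - beta if observed == 1 else beta
            else:
                p = alpha if observed == 1 else 1 - alpha
            total += math.log(p)
    return total


A = [[1, 1, 1, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
D = [[1, 0], [1, 0], [0, 0]]  # 3 mutations, 2 cells
print(log_likelihood(D, A, [1, 3], alpha=0.01, beta=0.2))
```

Once T and \sigma are fixed, the expected states are fully determined, so every factor in the product is one of the four error-model probabilities.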

MCMC sampling
Three components determine what the sampler outputs:
the mutation tree T, the attachment vector \sigma, and the sequencing error rates \theta.
There are two ways to handle \sigma: marginalize it out, or keep it as part of the sampled state.
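
For the first option, marginalizing means summing the likelihood of each cell over all of its possible attachment points. The sketch below assumes a uniform prior over attachment points and the same binary error model with rates alpha and beta as a simplification; both are assumptions of this example rather than details taken from the source.

```python
import math


def marginal_log_likelihood(D, A, alpha, beta):
    """log P(D | T, theta) with the attachment of every cell summed out.

    Assumes a uniform prior over the n + 1 attachment points (columns of A)
    and a binary error model: alpha = false-positive rate, beta =
    false-negative rate. D[i][j] is mutation i in cell j.
    """
    n_attach = len(A[0])
    total = 0.0
    for j in range(len(D[0])):
        per_attach = []
        for s in range(n_attach):
            ll = 0.0
            for i in range(len(D)):
                if A[i][s] == 1:
                    p = 1 - beta if D[i][j] == 1 else beta
                else:
                    p = alpha if D[i][j] == 1 else 1 - alpha
                ll += math.log(p)
            per_attach.append(ll)
        # Log-sum-exp over attachment points, for numerical stability.
        m = max(per_attach)
        log_sum = m + math.log(sum(math.exp(x - m) for x in per_attach))
        total += log_sum - math.log(n_attach)
    return total


# One mutation, one cell observed as mutated.
print(marginal_log_likelihood([[1]], [[1, 0]], alpha=0.1, beta=0.2))
```

Because cells attach independently given the tree, the sum factorizes per cell, so the cost grows linearly in the number of cells instead of exponentially.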

1: marginalize out the sample attachment
Pick a sample and choose a new attachment point for it uniformly at random. The question is how this move satisfies the properties required for the MCMC chain on \sigma to converge.
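
One way to see why such a move is well behaved is to write it as a Metropolis-Hastings step. The sketch below is an illustration under assumptions: the function names are made up for this example, and the likelihood is passed in as a generic callable. The proposal is symmetric (going from \sigma to \sigma' is as likely as going back), so the acceptance ratio reduces to a likelihood ratio. Any attachment vector can be reached one cell at a time, and the chain can stay in place, which gives irreducibility and aperiodicity.

```python
import math
import random


def propose_attachment(sigma, n_attach, rng):
    """Pick one cell uniformly and reattach it to a uniformly chosen node."""
    new_sigma = list(sigma)
    j = rng.randrange(len(sigma))
    new_sigma[j] = rng.randrange(n_attach)
    return new_sigma


def mh_step(sigma, n_attach, log_lik, rng):
    """One Metropolis-Hastings step on sigma with a symmetric proposal.

    log_lik is any callable returning the log-likelihood of an attachment
    vector (for example, a likelihood with T and theta held fixed).
    """
    proposal = propose_attachment(sigma, n_attach, rng)
    log_ratio = log_lik(proposal) - log_lik(sigma)
    # Accept with probability min(1, likelihood ratio).
    if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
        return proposal
    return list(sigma)


# Toy target that prefers every cell attached at node 2 (hypothetical).
rng = random.Random(0)
toy_log_lik = lambda s: -5.0 * sum(x != 2 for x in s)
state = [0, 0, 0]
for _ in range(1000):
    state = mh_step(state, 4, toy_log_lik, rng)
print(state)
```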
The rest will be written up later.
