LD regression

tmjdone

Category: bioinformatics. Tags: distance, pair, parameters, output, c, object

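This post presents a method for estimating the decay of linkage disequilibrium (LD) with distance in R, based on the formula proposed by Hill and Weir. The population recombination parameter is estimated by fitting a non-linear model.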

source: http://www.r-bloggers.com/estimate-decay-of-linkage-disequilibrium-with-distance/

It is well known that linkage disequilibrium (LD) decays with distance. Several functions have been proposed to estimate such decay. Among the most widely used are the Hill and Weir (1) formula for describing the decay of r² and a formula proposed by Abecasis (2) for describing the decay of D’.
I wrote R functions to estimate the decay of LD according to both formulas for a paper I recently published (3), but I post here only the one according to Hill and Weir (simply because it is the only one currently in a “publishable” form!). Please refer to the original publications for details. Here I just use a non-linear model to fit the data to the decay function.
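
For reference, the decay function can be written out as follows. This is transcribed from the expression passed to nls in this post rather than from the original paper, and the symbols ρ (the fitted coefficient times distance) and d (distance between a pair of markers) are notation assumed here for readability.

```latex
E(r^2) = \left[\frac{10+\rho}{(2+\rho)(11+\rho)}\right]
         \left[1+\frac{(3+\rho)\,(12+12\rho+\rho^2)}{n\,(2+\rho)(11+\rho)}\right],
\qquad \rho = C \cdot d
```

Here n is the sample size and C is the single parameter estimated by the non-linear fit.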

Input:
n: sample size
LD.data: estimates of LD, as r², between pairs of markers
distance: the distance between each pair of markers
(note that LD.data and distance must be in the same order and of the same length, since they hold, respectively, the LD value and the distance for each pair of markers considered)
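
The requirements on the inputs can be checked before fitting. The following is only a minimal sketch: check_ld_input is a hypothetical helper that is not part of the original post, and the range checks (r² between 0 and 1, non-negative distances) are assumptions about what sensible input looks like.

```r
# Hypothetical helper (not from the original post): basic sanity checks
# on the three inputs before they are passed to the non-linear fit.
check_ld_input <- function(n, LD.data, distance) {
  stopifnot(
    is.numeric(n), length(n) == 1, n > 0,          # one positive sample size
    is.numeric(LD.data), is.numeric(distance),
    length(LD.data) == length(distance),           # one distance per r2 value
    all(LD.data >= 0 & LD.data <= 1),              # assumed: r2 lies in [0, 1]
    all(distance >= 0)                             # assumed: distances are non-negative
  )
  invisible(TRUE)
}

# Usage with the sample data from this post
check_ld_input(
  n = 52,
  LD.data = c(0, 0.07, 0.018, 0.007, 0, 0.09, 0.09, 0.05, 0),
  distance = c(19, 49, 81, 91, 104, 131, 158, 167, 30)
)
```

The function stops with an error on the first violated condition and otherwise returns TRUE invisibly.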

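Output:
HW.nonlinear: object obtained after fitting the non-linear model
new.rho: estimate of the population recombination parameter per unit of distance (Hill and Weir's C divided by distance; in the code it is the fitted coefficient named C)
fpoints: fitted r² values, obtained by evaluating the fitted non-linear model at each observed distance.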

Below you find the commands, including some sample data. Any feedback is appreciated!

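# Sample data: pairwise distances, r2 estimates (same order), and sample size
distance <- c(19, 49, 81, 91, 104, 131, 158, 167, 30)
LD.data <- c(0, 0.07, 0.018, 0.007, 0, 0.09, 0.09, 0.05, 0)
n <- 52

# Starting value for the coefficient C
HW.st <- c(C = 0.1)
# Fit the Hill and Weir expectation of r2 by non-linear least squares
HW.nonlinear <- nls(LD.data ~ ((10 + C * distance) / ((2 + C * distance) * (11 + C * distance))) *
  (1 + ((3 + C * distance) * (12 + 12 * C * distance + (C * distance)^2)) / (n * (2 + C * distance) * (11 + C * distance))),
  start = HW.st, control = nls.control(maxiter = 100))
# Extract the estimate of C (first entry of the parameter table)
tt <- summary(HW.nonlinear)
new.rho <- tt$parameters[1]
# Fitted r2 values at the observed distances, in the same order as the input
fpoints <- ((10 + new.rho * distance) / ((2 + new.rho * distance) * (11 + new.rho * distance))) *
  (1 + ((3 + new.rho * distance) * (12 + 12 * new.rho * distance + (new.rho * distance)^2)) / (n * (2 + new.rho * distance) * (11 + new.rho * distance)))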
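
One way to gain confidence in the fitting step is to simulate r² values from a known coefficient and check that nls recovers it. The sketch below does that under stated assumptions: hw_expected is a hypothetical helper name wrapping the same expression used above, and the true coefficient (0.2), the distance grid, and the noise level are arbitrary values chosen only for this illustration, not taken from the original post or paper.

```r
# Hill and Weir expectation of r2 as used in this post;
# hw_expected is a hypothetical helper name, not part of the original code.
hw_expected <- function(C, distance, n) {
  rho <- C * distance
  ((10 + rho) / ((2 + rho) * (11 + rho))) *
    (1 + ((3 + rho) * (12 + 12 * rho + rho^2)) / (n * (2 + rho) * (11 + rho)))
}

set.seed(1)                           # reproducible noise
n <- 52
true.C <- 0.2                         # assumed value, for illustration only
sim.distance <- seq(5, 300, by = 5)   # assumed distance grid
sim.LD <- hw_expected(true.C, sim.distance, n) +
  rnorm(length(sim.distance), sd = 0.005)   # small Gaussian noise (assumption)

# Fit the same model to the simulated data, starting from C = 0.1
sim.fit <- nls(sim.LD ~ hw_expected(C, sim.distance, n), start = c(C = 0.1))
C.hat <- coef(sim.fit)[["C"]]

# Evaluate the fitted curve on a fine, sorted grid so it can be drawn as a line
grid <- seq(min(sim.distance), max(sim.distance), length.out = 200)
curve.fit <- hw_expected(C.hat, grid, n)
```

To inspect the result visually, plot(sim.distance, sim.LD) followed by lines(grid, curve.fit) overlays the fitted decay curve on the simulated points. Evaluating on a sorted grid matters for line plots because the observed distances need not be in increasing order, as in the sample data above.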

References:
(1) Hill WG, Weir BS (1988) Variances and covariances of squared linkage disequilibria in finite populations. Theor Popul Biol 33:54–78
(2) Abecasis GR et al (2001) Extent and distribution of linkage disequilibrium in three genomic regions. Am J Hum Genet 68:191–197
(3) Marroni et al (2011) Nucleotide diversity and linkage disequilibrium in Populus nigra cinnamyl alcohol dehydrogenase (CAD4) gene. Tree Genetics & Genomes, DOI 10.1007/s11295-011-0391-5