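Preface
G4-iM Grinder is a system for the localization, characterization and selection of potential G4s, i-Motifs and higher-order structures. The package has a clear purpose: it analyzes a sequence to find potential non-canonical nucleic acid structures, also referred to as higher-order secondary structures.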
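1. What is a G-quadruplex?
A stable G-quadruplex can be a complex assembled from four nucleic acid strands, a dimer formed from two strands, or a topology formed by the folding of a single strand. In every case the core that stabilizes the structure is the stacking of G-tetrad planes, each made of four guanines. Because the strand backbones can combine and fold in different ways, however, G-quadruplexes adopt a wide variety of three-dimensional backbone architectures. The i-motif is a four-stranded nucleic acid structure formed by the folding and assembly of cytosine-rich sequences.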
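2. G4-iM Grinder
Overview of the G4-iM Grinder algorithm
G4-iM Grinder consists of three sub-workflows: Method 1 (M1), Method 2 (M2) and Method 3 (M3). Method 1 has two parts. M1A scans the sequence and locates the G-runs (such as GGG). M1B examines the X sequences lying between G-runs to establish how neighbouring runs relate to each other, which gives a first estimate of whether a G4 can form. M2 and M3 are the algorithms that analyze the potential G4 structures. M2 works in an overlapping, size-restricted manner (the fold size is limited), whereas M3 applies no size restriction and can therefore find higher-order structures (the higher-order structure in the middle of the figure above). In each run, M2A and M3A locate the results in the genomic sequence, and M2B and M3B return the frequency of each sequence.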
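To make the two parts of Method 1 more concrete, here is a minimal sketch in base R. It is not the package's implementation: the function names `find_g_runs` and `link_g_runs`, the toy sequence, and the thresholds `min_run = 3` and `max_loop = 7` are all assumptions chosen for illustration. The first function locates G-runs in the spirit of M1A, and the second measures the gap between consecutive runs in the spirit of M1B.

```r
# Simplified illustration only; NOT the G4-iM Grinder source code.
# All names and threshold values below are hypothetical.

# In the spirit of M1A: locate every run of at least `min_run` guanines.
find_g_runs <- function(sequence, min_run = 3) {
  pattern <- paste0("G{", min_run, ",}")
  hits <- gregexpr(pattern, toupper(sequence))[[1]]
  if (hits[1] == -1) {
    # No run found: return an empty table with the same columns.
    return(data.frame(start = integer(0), end = integer(0)))
  }
  starts <- as.integer(hits)
  data.frame(start = starts,
             end = starts + attr(hits, "match.length") - 1L)
}

# In the spirit of M1B: measure the sequence between consecutive runs
# and flag the pairs whose gap is short enough to act as a loop.
link_g_runs <- function(runs, max_loop = 7) {
  n <- nrow(runs)
  if (n < 2) {
    return(data.frame(from = integer(0), to = integer(0),
                      loop = integer(0), linked = logical(0)))
  }
  loop <- runs$start[-1] - runs$end[-n] - 1L
  data.frame(from = seq_len(n - 1),
             to = seq_len(n - 1) + 1L,
             loop = loop,
             linked = loop >= 1 & loop <= max_loop)
}

# Toy sequence: four closely spaced G-runs, a long C stretch, one more G-run.
toy <- paste0("TTGGGTTAGGGTTAGGGTTAGGGAA", strrep("C", 12), "GGGT")
runs <- find_g_runs(toy)
links <- link_g_runs(runs)
print(runs)
print(links)
```

In this toy example the first four runs are separated by three nucleotides each and are flagged as linked, while the last run sits 14 nucleotides away and is not. A real analysis should rely on the package's own parameters rather than these illustrative thresholds.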
Regarding the frequency of the quadruplex results: quadruplexes may be repeated because they form part of repetitive nucleotide sequences, including transposon families. For example, several authors have already located recurrent PQS in such repetitive elements (in both human and non-human species). Depending on location and context, the same recurrent quadruplex may carry different biological significance.
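The idea of counting how often the same candidate sequence recurs can be sketched in a few lines of base R. This is only an illustration of the concept, not the package's frequency step: the function name `count_frequency` and the candidate sequences are hypothetical, standing in for whatever a location step would return.

```r
# Illustration only; the candidates below are made-up example sequences.
candidates <- c("GGGTTAGGGTTAGGGTTAGGG",
                "GGGAGGGTGGGAGGG",
                "GGGTTAGGGTTAGGGTTAGGG",
                "GGGTTAGGGTTAGGGTTAGGG",
                "GGGAGGGTGGGAGGG",
                "GGGCGGGCGGGCGGG")

# Count identical sequences and sort from most to least frequent.
count_frequency <- function(seqs) {
  if (length(seqs) == 0) {
    return(data.frame(sequence = character(0), frequency = integer(0),
                      stringsAsFactors = FALSE))
  }
  counts <- table(seqs)
  result <- data.frame(sequence = names(counts),
                       frequency = as.integer(counts),
                       stringsAsFactors = FALSE)
  result <- result[order(-result$frequency, result$sequence), ]
  rownames(result) <- NULL
  result
}

freq <- count_frequency(candidates)
print(freq)
```

A sequence that appears many times is a hint that it may belong to a repetitive element, which is the situation described above.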
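Workflow
The first step is installation. The package is fairly new; its documentation lists R 4.0.3 and RStudio 1.3.1093 as the runtime environment, so compatibility should not be a problem.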
devtools::install_github("EfresBR/G4iMGrinder")
library(G4iMGrinder)
If it does not run properly, some of the following packages may be missing:
pck <- c("stringr", "stringi", "plyr", "seqinr", "stats", "parallel", "doParallel", "beepr", "stats4", "devtools", "dplyr", "BiocManager", "tibble")
#foo was written by Simon O'Hanlon Nov 8 2013.
#Thanks Simon, thanks StackOverflow and all its amazing community.
foo <- function(x){
for( i in x ){
# require returns TRUE invisibly if it was able to load package
if( ! require( i , character.only = TRUE