1. Basic functions
The basic R functions that accompany regular-expression work are nchar, tolower, toupper, chartr and paste.
nchar gives the number of characters of each element in a character vector.
> temp <- c("Hello", "Kitty", "word")
> nchar(temp)
[1] 5 5 4
tolower and toupper convert each element of a character vector to lower case and upper case, respectively.
> tolower(temp)
[1] "hello" "kitty" "word"
> toupper(temp)
[1] "HELLO" "KITTY" "WORD"
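As a small illustration of combining these helpers, the sketch below capitalises only the first letter of each word. The vector `words` and the use of `substring` and `paste0` are assumptions added for the example; they are not part of the original text.

```r
# Hypothetical input, chosen only for illustration
words <- c("hello", "KITTY", "wOrd")

# Upper-case the first character, lower-case the rest, then glue them together
first_letter <- toupper(substring(words, 1, 1))
remainder    <- tolower(substring(words, 2))
capitalised  <- paste0(first_letter, remainder)

print(capitalised)   # "Hello" "Kitty" "Word"
```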
chartr(old, new, x) translates each character in x that is specified in old to the corresponding character specified in new.
> chartr("HKw", "ABs", temp)
[1] "Aello" "Bitty" "sord"
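The position-by-position mapping of chartr can be seen more clearly with a second example. This is a minimal sketch; the replacement characters and the `tryCatch` wrapper are choices made for illustration only.

```r
temp <- c("Hello", "Kitty", "word")

# Each character of `old` maps to the character at the same position in `new`
swapped <- chartr("lo", "01", temp)
print(swapped)   # "He001" "Kitty" "w1rd"

# Ranges such as "a-e" are expanded before translating
ranged <- chartr("a-e", "A-E", temp)
print(ranged)    # "HEllo" "Kitty" "worD"

# `old` must not be longer than `new`; otherwise chartr signals an error
too_short <- tryCatch(chartr("abc", "x", "abc"),
                      error = function(e) "error")
print(too_short) # "error"
```

Unlike sub and gsub, chartr works on single characters only and does not interpret `old` as a regular expression.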
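paste concatenates vectors after converting them to character. sep is placed between the arguments element by element (shorter vectors are recycled), and collapse, if given, joins the resulting elements into a single string.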
> paste ("A", 1:3, sep = "")
[1] "A1" "A2" "A3"
> paste ("A", 1:3, sep = "*")
[1] "A*1" "A*2" "A*3"
> paste(c("A", "B"), 1:7)
[1] "A 1" "B 2" "A 3" "B 4" "A 5" "B 6" "A 7"
> paste("A", 1:5, sep="", collapse = "-")
[1] "A1-A2-A3-A4-A5"
> paste("A", 1:5, sep="", collapse = "*")
[1] "A1*A2*A3*A4*A5"
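The difference between sep and collapse can be sketched as follows. `paste0` is the base R shorthand for `paste(..., sep = "")`; the variable names are assumptions for the example.

```r
# paste0(...) is shorthand for paste(..., sep = "")
ids <- paste0("A", 1:3)
print(ids)                         # "A1" "A2" "A3"
same <- identical(ids, paste("A", 1:3, sep = ""))
print(same)                        # TRUE

# sep joins the arguments element by element;
# collapse then joins the resulting vector into one string
one_string <- paste("A", 1:3, sep = "_", collapse = "+")
print(one_string)                  # "A_1+A_2+A_3"

# collapse alone turns a vector into a single string
joined <- paste(c("x", "y", "z"), collapse = ", ")
print(joined)                      # "x, y, z"
```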
2. Complex functions
grep searches for matches and returns their subscripts (positions in the vector). grepl searches for matches and returns a logical value for each element of the vector, indicating whether it matches.
> temp1 <- c("abs", "abd", "bss")
> grep("s$", temp1)
[1] 1 3
> grepl("s$", temp1)
[1] TRUE FALSE TRUE
> temp1[grep("s$", temp1)]
[1] "abs" "bss"
> temp1[grepl("s$", temp1)]
[1] "abs" "bss"
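Two practical points follow from the difference in return types. The sketch below shows `value = TRUE` as a shortcut for subsetting, and what may happen when nothing matches; the pattern "z" is chosen only because it matches no element.

```r
temp1 <- c("abs", "abd", "bss")

# value = TRUE returns the matching elements instead of their positions
matched <- grep("s$", temp1, value = TRUE)
print(matched)              # "abs" "bss"

# No match: grep returns integer(0), grepl a vector of FALSE
no_idx  <- grep("z", temp1)
no_flag <- grepl("z", temp1)

# Negative indexing with integer(0) drops everything ...
dropped_all <- temp1[-no_idx]
print(dropped_all)          # character(0)

# ... while logical negation keeps all elements, as intended
kept_all <- temp1[!no_flag]
print(kept_all)             # "abs" "abd" "bss"
```

For excluding matches, the logical vector from grepl is therefore usually the safer choice.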
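regexpr, gregexpr and regexec search for matches and return the position of the matching characters within each element of the vector, but their return values are formatted differently. regexpr returns an integer vector with the first match of each element, gregexpr returns a list with all matches of each element, and regexec returns a list with the first match of each element together with any parenthesised sub-expressions. A value of -1 means no match.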
> regexpr("s$", temp1)
[1] 3 -1 3
attr(,"match.length")
[1] 1 -1 1
attr(,"useBytes")
[1] TRUE
###############################
> gregexpr("s$", temp1)
[[1]]
[1] 3
attr(,"match.length")
[1] 1
attr(,"useBytes")
[1] TRUE
[[2]]
[1] -1
attr(,"match.length")
[1] -1
attr(,"useBytes")
[1] TRUE
[[3]]
[1] 3
attr(,"match.length")
[1] 1
attr(,"useBytes")
[1] TRUE
##########################################
> regexec("s$", temp1)
[[1]]
[1] 3
attr(,"match.length")
[1] 1
[[2]]
[1] -1
attr(,"match.length")
[1] -1
[[3]]
[1] 3
attr(,"match.length")
[1] 1
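The positions returned by these functions are mostly useful for pulling out the matched text. A minimal sketch using base R's `regmatches` is given below; the patterns "s" and "^(a)(b)" are assumptions chosen so that the three functions visibly differ.

```r
temp1 <- c("abs", "abd", "bss")

# regexpr + regmatches: first match per element; non-matching elements are dropped
first <- regmatches(temp1, regexpr("s", temp1))
print(first)              # "s" "s"

# gregexpr + regmatches: a list holding every match of each element
all_s <- regmatches(temp1, gregexpr("s", temp1))
n_matches <- lengths(all_s)
print(n_matches)          # 1 0 2

# regexec + regmatches: the full match followed by the capture groups
parts <- regmatches(temp1, regexec("^(a)(b)", temp1))
print(parts[[1]])         # "ab" "a" "b"
print(length(parts[[3]])) # 0, because "bss" does not match
```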
sub and gsub search for matches and replace them. sub replaces only the first match in each element (every element is processed, but only its first match is replaced). gsub replaces all matches in all elements.
> temp2 <- c("HelloHello", "Kitty", "Hello", "word")
> sub("Hello", "Hi", temp2)
[1] "HiHello" "Kitty" "Hi" "word"
> gsub("Hello", "Hi", temp2)
[1] "HiHi" "Kitty" "Hi" "word"
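Because the pattern is a regular expression, the replacement can refer back to matched groups, and the matching can be made case-insensitive or literal. The following sketch uses only documented arguments of sub and gsub; the patterns themselves are illustrative assumptions.

```r
temp2 <- c("HelloHello", "Kitty", "Hello", "word")

# \\1 in the replacement refers to the first parenthesised group
wrapped <- gsub("(l+)", "<\\1>", temp2)
print(wrapped)    # "He<ll>oHe<ll>o" "Kitty" "He<ll>o" "word"

# ignore.case = TRUE matches regardless of case
greeted <- sub("hello", "Hi", temp2, ignore.case = TRUE)
print(greeted)    # "HiHello" "Kitty" "Hi" "word"

# fixed = TRUE treats the pattern as a literal string, so "." is just a dot
literal <- gsub(".", "-", "a.b.c", fixed = TRUE)
print(literal)    # "a-b-c"

# Without fixed = TRUE, "." matches every character
as_regex <- gsub(".", "-", "a.b.c")
print(as_regex)   # "-----"
```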
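This article describes the basic functions (nchar, tolower, toupper, chartr, paste) and the complex functions (grep, grepl, regexpr, gregexpr, regexec, sub, gsub) used with regular expressions in R, and uses examples to show how to search, match and replace strings.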
