Original link
http://www.endmemo.com/program/R/gsub.php
gsub()
function replaces all matches of a pattern in a string. If the argument x is a string vector, it returns a string vector of the same length and with the same attributes (after possible coercion to character). Elements of string vectors which are not substituted are returned unchanged (including any declared encoding).
<span style="color:red">gsub(pattern, replacement, x, ignore.case = FALSE, perl = FALSE,
fixed = FALSE, useBytes = FALSE)
</span>
• pattern
: string to be matched, treated as a regular expression unless fixed = TRUE
• replacement
: string for replacement
• x
: string or string vector
• ignore.case
: if TRUE, ignore case...
<span style="color:red">> x <- "R Tutorial"
> gsub("ut","ot",x)
</span>
<span style="color:blue">[1] "R Totorial"
</span>
Case-insensitive replacement:
<span style="color:red">> gsub("tut","ot",x,ignore.case=TRUE)
</span>
<span style="color:blue">[1] "R otorial"
</span>
If ignore.case is not set to TRUE, no replacement takes place:
<span style="color:red">> gsub("tut","ot",x)
</span>
<span style="color:blue">[1] "R Tutorial"
</span>
<span style="color:red">> x <- "line 4322: He is now 25 years old, and weights 130lbs"
> y <- gsub("\\d+","---",x)
> y
</span>
<span style="color:blue">[1] "line ---: He is now --- years old, and weights ---lbs"
</span>
<span style="color:red">> x <- "line 4322: He is now 25 years old, and weights 130lbs"
> y <- gsub("[[:lower:]]","-",x)
> y
</span>
<span style="color:blue">[1] "---- 4322: H- -- --- 25 ----- ---, --- ------- 130---"
</span>
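The signature above also lists fixed, which the examples so far do not exercise. As a minimal sketch (the input string is made up for illustration), the following contrasts the default regular-expression matching with literal matching under fixed = TRUE:

```r
# Assumption: illustrative input, not taken from the original page.
x <- "a.b.c"

# Default (fixed = FALSE): "." is a regular expression that matches any
# character, so every character in x is replaced.
regex_result <- gsub(".", "-", x)

# fixed = TRUE: the pattern is taken literally, so only the dots are replaced.
fixed_result <- gsub(".", "-", x, fixed = TRUE)

print(regex_result)  # "-----"
print(fixed_result)  # "a-b-c"
```

Literal matching is usually the safer choice when the pattern comes from data and may contain characters such as ".", "+" or "(".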
Vector replacement:
<span style="color:red">> x <- c("R Tutorial","PHP Tutorial", "HTML Tutorial")
> gsub("Tutorial","Examples",x)
</span>
<span style="color:blue">[1] "R Examples" "PHP Examples" "HTML Examples"
</span>
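Elements without a match are returned unchanged, and the result always has the same length as the input. A small sketch, where "PHP Manual" is a made-up element added to show this:

```r
# Assumption: "PHP Manual" is a hypothetical element that does not contain
# the pattern "Tutorial".
x <- c("R Tutorial", "PHP Manual", "HTML Tutorial")
y <- gsub("Tutorial", "Examples", x)

# Same length as the input; the second element is untouched.
stopifnot(length(y) == length(x))
print(y[2])  # "PHP Manual"
print(y[c(1, 3)])
```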
Regular Expression Syntax:
Syntax | Description |
\\d | Digit, 0,1,2 ... 9 |
\\D | Not Digit |
\\s | Whitespace character |
\\S | Non-whitespace character |
\\w | Word character (letter, digit or underscore) |
\\W | Non-word character |
\\t | Tab |
\\n | New line |
^ | Beginning of the string |
$ | End of the string |
\ | Escape special characters, e.g. \\ is "\", \+ is "+" |
| | Alternation match, e.g. "(e|d)n" matches "en" and "dn" |
. | Any character, except \n or line terminator |
[ab] | a or b |
[^ab] | Any character except a and b |
[0-9] | Any digit |
[A-Z] | Any uppercase letter A to Z |
[a-z] | Any lowercase letter a to z |
[A-Za-z] | Any uppercase or lowercase letter (in ASCII, [A-z] also matches [ \ ] ^ _ and `) |
i+ | i one or more times |
i* | i zero or more times |
i? | i zero or one time |
i{n} | i occurs n times in sequence |
i{n1,n2} | i occurs n1 to n2 times in sequence |
i{n1,n2}? | Non-greedy match: i occurs n1 to n2 times, as few times as possible |
i{n,} | i occurs >= n times |
[:alnum:] | Alphanumeric characters: [:alpha:] and [:digit:] |
[:alpha:] | Alphabetic characters: [:lower:] and [:upper:] |
[:blank:] | Blank characters: e.g. space, tab |
[:cntrl:] | Control characters |
[:digit:] | Digits: 0 1 2 3 4 5 6 7 8 9 |
[:graph:] | Graphical characters: [:alnum:] and [:punct:] |
[:lower:] | Lower-case letters in the current locale |
[:print:] | Printable characters: [:alnum:], [:punct:] and space |
[:punct:] | Punctuation character: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~ |
[:space:] | Space characters: tab, newline, vertical tab, form feed, carriage return, space |
[:upper:] | Upper-case letters in the current locale |
[:xdigit:] | Hexadecimal digits: 0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f |
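The entries above can be combined in a short sketch. The input strings are made up for illustration. In R, a POSIX class such as [:punct:] has to be written inside a bracket expression, i.e. "[[:punct:]]". The greedy and non-greedy calls use perl = TRUE from the signature above.

```r
# Assumption: illustrative inputs, not taken from the original page.
x <- "Hello,   world!\tBye."

# POSIX classes go inside a bracket expression: [[:punct:]], not [:punct:].
no_punct <- gsub("[[:punct:]]", "", x)     # "Hello   world\tBye"

# Collapse each run of whitespace (spaces, tabs) into a single space.
one_space <- gsub("[[:space:]]+", " ", x)  # "Hello, world! Bye."

html <- "<b>bold</b>"

# Greedy: ".+" takes as much as possible, so the whole string matches.
greedy <- gsub("<.+>", "", html, perl = TRUE)  # ""

# Non-greedy: ".+?" takes as little as possible, so each tag matches on its own.
lazy <- gsub("<.+?>", "", html, perl = TRUE)   # "bold"

print(no_punct)
print(one_space)
print(greedy)
print(lazy)
```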