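The repr() function converts an object to a string (str).
The string module is mainly used in NLP preprocessing. It defines a set of string constants covering digits, letters, uppercase letters, lowercase letters, punctuation, and whitespace.
It can be used to remove punctuation from text by replacing each punctuation character with an empty string.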
>>> import string
>>> string.punctuation
'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
>>> string.digits
'0123456789'
>>> string.ascii_letters
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> string.ascii_lowercase
'abcdefghijklmnopqrstuvwxyz'
>>> string.ascii_uppercase
'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> string.hexdigits
'0123456789abcdefABCDEF'
>>> string.printable
'0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c'
>>> string.whitespace
' \t\n\r\x0b\x0c'
6.1.1. String constants
The constants defined in this module are:
string.ascii_letters
    The concatenation of the ascii_lowercase and ascii_uppercase constants described below. This value is not locale-dependent.
string.ascii_lowercase
    The lowercase letters 'abcdefghijklmnopqrstuvwxyz'. This value is not locale-dependent and will not change.
string.ascii_uppercase
    The uppercase letters 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'. This value is not locale-dependent and will not change.
string.digits
    The string '0123456789'.
string.hexdigits
    The string '0123456789abcdefABCDEF'.
string.octdigits
    The string '01234567'.
string.punctuation
    String of ASCII characters which are considered punctuation characters in the C locale.
string.printable
    String of ASCII characters which are considered printable. This is a combination of digits, ascii_letters, punctuation, and whitespace.
string.whitespace
    A string containing all ASCII characters that are considered whitespace. This includes the characters space, tab, linefeed, return, formfeed, and vertical tab.
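The relationships between these constants can be checked directly. The sketch below assumes CPython, where printable is built by concatenating the four constants in the order shown. The documentation only promises a "combination", so the second check also compares them as sets. The helper is_hex_string is a hypothetical name added for illustration.

```python
import string

# ascii_letters is documented as lowercase followed by uppercase.
letters_ok = string.ascii_letters == string.ascii_lowercase + string.ascii_uppercase

# Order-independent check: printable contains exactly these characters.
combined = string.digits + string.ascii_letters + string.punctuation + string.whitespace
printable_ok = set(string.printable) == set(combined)


def is_hex_string(s: str) -> bool:
    """Return True if s is non-empty and every character is a hex digit.

    is_hex_string is a hypothetical helper, not part of the string module.
    """
    return bool(s) and all(ch in string.hexdigits for ch in s)


if __name__ == "__main__":
    print(letters_ok, printable_ok)
    print(is_hex_string("1aF"), is_hex_string("xyz"))
```

Membership tests like `ch in string.hexdigits` are a common way to use these constants for simple character classification.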