Python Features You May Not Know About

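The built-in function print(*objects, sep=' ', end='\n', file=sys.stdout, flush=False)

This function writes the string representation of each object to the stream file. Every positional argument is converted to a string with str() before it is written. The keyword argument sep is the separator inserted between the objects when several are printed; end is the string written after the last object and defaults to the newline \n; file is the stream that receives the output, either the standard sys.stdout or any other object with a write() method, such as an open file; flush, when true, forces the stream to be flushed immediately instead of leaving the text in a buffer.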

print(1, 2, 3, sep=',', end='\r\n')
print(1, 2, 3, sep=' ', end='\r\n')

# Passing file= redirects the output to an open file instead of sys.stdout.
with open(r'c:\abc1.txt', 'w') as demo:
    print(1, 2, 3, 88888888888, sep=',', end='\n', file=demo)


Output:

1,2,3

 

1 2 3

Contents of abc1.txt:

1,2,3,88888888888
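
A minimal sketch of the same keyword arguments, writing into an in-memory io.StringIO buffer instead of a real file so that the result is easy to inspect. The variable names are illustrative and do not come from the example above.

```python
import io

# Any object with a write() method can be passed as file=.
buffer = io.StringIO()
print(1, 2, 3, sep=',', end='\n', file=buffer)
print('a', 'b', sep='-', end='', file=buffer)  # end='' suppresses the newline

captured = buffer.getvalue()
print(repr(captured))
```

Because end is an ordinary string, it can be emptied or replaced just like sep.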

Infinitely nested lists

>>> a = [1, 2, 3, 4]
>>> a.append(a)
>>> a
[1, 2, 3, 4, [...]]
>>> a[4]
[1, 2, 3, 4, [...]]
>>> a[4][4][4][4][4][4][4][4][4][4] == a
True

Infinitely nested dictionaries

>>> a = {}
>>> b = {}
>>> a['b'] = b
>>> b['a'] = a
>>> print(a)
{'b': {'a': {...}}}

Flattening a list

>>> l = [[1, 2, 3], [4, 5], [6], [7, 8, 9]]
>>> sum(l, [])
[1, 2, 3, 4, 5, 6, 7, 8, 9]
Or:

import itertools
data = [[1, 2, 3], [4, 5, 6]]
list(itertools.chain.from_iterable(data))
Or alternatively:
from functools import reduce
from operator import add
data = [[1, 2, 3], [4, 5, 6]]
reduce(add, data)
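
The three approaches above can be compared side by side. This is a small sketch with illustrative variable names; it only checks that the results agree.

```python
from functools import reduce
from itertools import chain
from operator import add

nested = [[1, 2, 3], [4, 5], [6], [7, 8, 9]]

flat_sum = sum(nested, [])                      # repeated list concatenation
flat_chain = list(chain.from_iterable(nested))  # single pass over the sublists
flat_reduce = reduce(add, nested)               # same concatenation as sum()

print(flat_sum == flat_chain == flat_reduce)
print(flat_chain)
```

sum() and reduce() build a new list at every step, so their cost grows quadratically with the number of sublists; chain.from_iterable is usually the better choice for large inputs.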

Numbering the elements of a list

>>> l = ["spam", "ham", "eggs"]
>>> list(enumerate(l))
[(0, 'spam'), (1, 'ham'), (2, 'eggs')]
>>> list(enumerate(l, 1))  # specify the starting count
[(1, 'spam'), (2, 'ham'), (3, 'eggs')]

Manipulating list slices

>>> a = list(range(10))
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> a[:5] = [42]  # Replace the first five elements with a single 42
>>> a
[42, 5, 6, 7, 8, 9]
>>> a[:1] = range(5)
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> del a[::2]  # Delete every second element, starting with the first
>>> a
[1, 3, 5, 7, 9]
>>> a[::2] = a[::-2]  # Reverse the elements at the even positions
>>> a
[9, 3, 5, 7, 1]

Slicing with a step

>>> a = [1, 2, 3, 4, 5]
>>> a[::2]  # every second element
[1, 3, 5]
Or step backwards:
>>> a[::-1]  # reversed copy of the list
[5, 4, 3, 2, 1]

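Copying a list

The wrong way (y is only another name for the same list):
>>> x = [1, 2, 3]
>>> y = x
>>> y[2] = 5
>>> y
[1, 2, 5]
>>> x
[1, 2, 5]

The right way (a full slice creates a new list):
>>> x = [1, 2, 3]
>>> y = x[:]
>>> y.pop()
3
>>> y
[1, 2]
>>> x
[1, 2, 3]

For nested structures, use a deep copy: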
import copy
my_dict = {'a': [1, 2, 3], 'b': [4, 5, 6]}
my_copy_dict = copy.deepcopy(my_dict)
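
A short sketch of why the deep copy matters for nested data. It compares copy.copy with copy.deepcopy on a dictionary of lists; the variable names are illustrative.

```python
import copy

original = {'a': [1, 2, 3], 'b': [4, 5, 6]}

shallow = copy.copy(original)   # new dict, but the inner lists are shared
deep = copy.deepcopy(original)  # new dict and new inner lists

original['a'].append(99)

print(shallow['a'])  # the shallow copy sees the change
print(deep['a'])     # the deep copy does not
```

A slice copy such as x[:] is shallow in the same sense: it is enough for a flat list, but not for a list of lists.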

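Accessing the index in a for loop

This may be common knowledge for many people, but it is also a frequently asked question. Python's built-in enumerate function gives access to both the index and the value:
x = [1, 8, 4, 5, 5, 5, 8, 1, 8]
for index, value in enumerate(x):
    print(index, value)

The start parameter of enumerate changes the starting index:
x = [1, 8, 4, 5, 5, 5, 8, 1, 8]
for index, value in enumerate(x, start=1):
    print(index, value)

The index now runs from 1 to 9 instead of 0 to 8.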

Joining a list with a given separator

theList = ["a","b","c"]
joinedString = ",".join(theList)

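Removing duplicates

A quick one-liner that removes the duplicates from a list (without preserving order):
x = [1, 8, 4, 5, 5, 5, 8, 1, 8]
list(set(x))
This works because a set is a collection of distinct objects. A set does not preserve order, however,
so if the order of the elements matters, use the following technique instead: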
from collections import OrderedDict
x = [1, 8, 4, 5, 5, 5, 8, 1, 8]
list(OrderedDict.fromkeys(x))
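
On Python 3.7 and later a plain dict also keeps insertion order, so the same idea works without OrderedDict. The sketch below assumes such an interpreter and also spells out the logic as an explicit loop; the variable names are illustrative.

```python
x = [1, 8, 4, 5, 5, 5, 8, 1, 8]

# dict keys are unique and, from Python 3.7, keep insertion order.
unique_in_order = list(dict.fromkeys(x))

# An explicit loop makes the same idea visible step by step.
seen = set()
unique_loop = []
for item in x:
    if item not in seen:
        seen.add(item)
        unique_loop.append(item)

print(unique_in_order)
```

Both versions require the elements to be hashable.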

Removing empty strings from a list of strings

targetList = [v for v in targetList if v.strip() != '']
# or, to drop only the strings of length zero
targetList = list(filter(lambda x: len(x) > 0, targetList))

Appending one list to the end of another

anotherList.extend(aList)

Iterating over a dictionary

for k, v in aDict.items():  # use iteritems() on Python 2
    print(k + v)

Dictionary comprehensions

>>> {a:a**2 for a in range(1, 10)}
{1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}

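Sets

>>> a = set([1, 2, 3, 4])
>>> b = set([3, 4, 5, 6])
>>> a | b  # Union
{1, 2, 3, 4, 5, 6}
>>> a & b  # Intersection
{3, 4}
>>> a < b  # Proper subset
False
>>> a - b  # Difference
{1, 2}
>>> a ^ b  # Symmetric difference
{1, 2, 5, 6}
A set can be created with set(), with a set literal, or with a set comprehension. Only the empty set has to be written as set(), because {} creates an empty dictionary.
{x for x in range(10)}  # set comprehension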

set([1, 2, 3]) == {1, 2, 3}
set((i*2 for i in range(10))) == {i*2 for i in range(10)}

Chained comparisons

>>> x = 5
>>> 1 < x < 10
True
>>> 10 < x < 20
False
>>> x < 10 < x*10 < 100
True
>>> 10 > x <= 9
True
>>> 5 == x > 4
True

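Floating-point division

Converting the numerator or the denominator to float guarantees floating-point division. This is only needed in Python 2; in Python 3 the / operator always performs true division:
answer = a/float(b)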
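
A small sketch of the division operators, assuming Python 3. The values 7 and 2 are arbitrary.

```python
a, b = 7, 2

true_division = a / b          # always a float in Python 3
floor_division = a // b        # rounds down to an integer
explicit_float = a / float(b)  # the Python 2 idiom, still valid

print(true_division, floor_division, explicit_float)
```

On Python 2, `from __future__ import division` at the top of a module gives / the same true-division behaviour.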

Conditional assignment

x = 1 if (y == 10) else 2
x = 3 if (y == 1) else 2 if (y == -1) else 1

A mutable default argument leads to the following result

>>> def foo(x=[]):
...     x.append(1)
...     print(x)
...
>>> foo()
[1]
>>> foo()
[1, 1]  # expected [1]
>>> foo()
[1, 1, 1]
The default value should be None instead:
>>> def foo(x=None):
...     if x is None:
...         x = []
...     x.append(1)
...     print(x)
...
>>> foo()
[1]
>>> foo()
[1]
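
The reason behind this behaviour is that a default value is evaluated once, when the function is defined, and the same object is reused on every call. A minimal sketch with hypothetical function names:

```python
def append_shared(value, bucket=[]):
    # The default list is created once, when the function is defined.
    bucket.append(value)
    return bucket


def append_fresh(value, bucket=None):
    # None is a sentinel; a new list is built on every call that needs one.
    if bucket is None:
        bucket = []
    bucket.append(value)
    return bucket


first = append_shared(1)
second = append_shared(2)
print(first is second, second)

print(append_fresh(1), append_fresh(2))
```

The shared list is also visible as append_shared.__defaults__[0], which can help when debugging this kind of problem.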

Combining lists with zip

a = [(1, 2), (3, 4), (5, 6)]
list(zip(*a))
# --> [(1, 3, 5), (2, 4, 6)]

Combining two sequences into a dictionary

>>> t1 = (1, 2, 3)
>>> t2 = (4, 5, 6)
>>> dict(zip(t1, t2))
{1: 4, 2: 5, 3: 6}

Checking whether any string in a list occurs in a given string

if any(x in targetString for x in aList):
    print("true")

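Looping over a range of numbers

for i in [0, 1, 2, 3, 4, 5]:
    print(i**2)

A better way (at least it looks better):

for i in range(6):
    print(i**2)

What happens in this loop?
In Python 2, range builds a list in memory and the for loop then iterates over that list.
Both versions create six integers in memory, iterate over them, square each one, and print it. So the two loops above do exactly the same thing in exactly the same way.

The Pythonic way: use xrange()
# Python 2.x
for i in xrange(6):
    print(i**2)

# Python 3.x
for i in range(6):
    print(i**2)

xrange is a lazily evaluated sequence object.
Instead of building a list, xrange produces one number at a time as the loop asks for it, so it uses less memory than the versions above. In Python 3, range behaves this way and xrange no longer exists.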
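
The memory difference can be observed directly. This sketch assumes Python 3, where range is the lazy object; the exact byte counts depend on the interpreter, so only their relation is meaningful.

```python
import sys

n = 10**6
lazy = range(n)         # stores only start, stop and step
eager = list(range(n))  # stores a reference to every element

lazy_size = sys.getsizeof(lazy)
eager_size = sys.getsizeof(eager)
print(lazy_size, eager_size)

# A range still supports len(), indexing and membership tests.
print(len(lazy), lazy[-1], (n - 1) in lazy)
```

sys.getsizeof reports only the container itself, not the integers a list refers to, so the real gap is larger than the printed numbers suggest.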

Looping over a collection

colours = ['red', 'green', 'blue', 'yellow']

for i in range(len(colours)):
    print(colours[i])

# Pythonic way:
for colour in colours:
    print(colour)

Looping over a collection and its indices

for i in range(len(colours)):
    print(i, '-->', colours[i])

# Pythonic way: use enumerate()
for i, colour in enumerate(colours):
    print(i, '-->', colour)

Looping backwards

for i in range(len(colours) - 1, -1, -1):
    print(colours[i])

# Pythonic way: use reversed()
for colour in reversed(colours):
    print(colour)

Looping in sorted order

# Pythonic way: use sorted()
for colour in sorted(colours):
    print(colour)

Looping in reverse sorted order

Just pass reverse=True to sorted().

# Pythonic way
for colour in sorted(colours, reverse=True):
    print(colour)

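Looping over two collections in parallel

names = ['a', 'b', 'c']
colours = ['red', 'green', 'blue', 'yellow']

n = min(len(colours), len(names))

for i in range(n):
    print(names[i], '-->', colours[i])

A better way:
for name, colour in zip(names, colours):
    print(name, '-->', colour)

In Python 2, zip builds a third list in memory, made of tuples, each of which is a separate object holding pointers back to the original data. In other words, it needs more memory than the two original lists combined.
Above all, this approach does not scale.
The Pythonic way in Python 2: use izip()
from itertools import izip
for name, colour in izip(names, colours):
    print(name + ' --> ' + colour)

For small lists zip is faster, but with millions of items it is better to use izip, because izip returns an iterator that builds each tuple only when it is needed. In Python 3, zip is already lazy in this way and itertools.izip has been removed.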
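
A brief sketch of the lazy behaviour, assuming Python 3, where the built-in zip returns an iterator. It also shows itertools.zip_longest for the case where the longer input should not be cut off.

```python
from itertools import zip_longest

names = ['a', 'b', 'c']
colours = ['red', 'green', 'blue', 'yellow']

pairs = zip(names, colours)  # a lazy iterator in Python 3, not a list
first_pair = next(pairs)     # tuples are built one at a time
remaining = list(pairs)      # stops at the end of the shorter input

# zip_longest keeps going and pads the shorter input with fillvalue.
padded = list(zip_longest(names, colours, fillvalue=None))

print(first_pair, remaining)
print(padded[-1])
```

Because zip stops silently at the shorter input, 'yellow' never appears in the plain zip result.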

The __missing__ method of dict subclasses

A __missing__ method removes the KeyError for missing keys: it defines what a lookup with [] returns when the key is not found.

class MyDict(dict):  # a dict subclass with a fallback for missing keys
    def __missing__(self, key):
        return key

>>> m = MyDict(a=1, b=2, c=3)
>>> m
{'a': 1, 'b': 2, 'c': 3}
>>> m['a']  # The key exists, so its value 1 is returned
1
>>> m['z']  # The key does not exist, so the requested key itself is returned
'z'
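
As a further sketch, __missing__ can also store a default value, which is roughly what collections.defaultdict does. The class name CountingDict is hypothetical and used only for illustration.

```python
from collections import defaultdict


class CountingDict(dict):
    """Hypothetical dict subclass: missing keys start at zero and are stored."""

    def __missing__(self, key):
        self[key] = 0
        return 0


counts = CountingDict()
for word in ['spam', 'ham', 'spam']:
    counts[word] += 1

# __missing__ is only used by the [] lookup; get() and `in` bypass it.
fallback = counts.get('eggs', 'absent')
has_eggs = 'eggs' in counts

same_with_defaultdict = defaultdict(int)
for word in ['spam', 'ham', 'spam']:
    same_with_defaultdict[word] += 1

print(dict(counts), fallback, has_eggs)
```

For plain default values defaultdict is usually simpler; a custom __missing__ is mainly useful when the fallback depends on the key itself, as in the MyDict example.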

Functions as values

>>> def jim(phrase):
...     return 'Jim says, "%s".' % phrase
...
>>> def say_something(person, phrase):
...     print(person(phrase))
...
>>> say_something(jim, 'hey guys')
Jim says, "hey guys".
A higher-order example:
def f(x):
    return x + 3
def g(function, x):
    return function(x) * function(x)
print(g(f, 7))  # prints 100

round with a negative number of digits

>>> round(1234.5678, -2)
1200.0
>>> round(1234.5678, 2)
1234.57

If you want to use {} instead of indentation, as in C

from __future__ import braces  # an easter egg: raises "SyntaxError: not a chance"

Extended unpacking

>>> first, second, *rest = (1, 2, 3, 4, 5, 6, 7, 8)
>>> first  # The first value
1
>>> second  # The second value
2
>>> rest  # All the other values
[3, 4, 5, 6, 7, 8]

>>> first,*rest,last = (1,2,3,4,5,6,7,8)
>>> first
1
>>> rest
[2, 3, 4, 5, 6, 7]
>>> last
8

Replacing a method at runtime

class foo:
    def normal_call(self):
        print("normal_call")
    def call(self):
        print("first_call")
        self.call = self.normal_call

>>> y = foo()

>>> y.call()
first_call
>>> y.call()
normal_call
>>> y.call()
normal_call

Intercepting attribute access

class GetAttr(object):
    def __getattribute__(self, name):
        f = lambda: "Hello {}".format(name)
        return f

>>> g = GetAttr()
>>> g.Mark()
'Hello Mark'

Creating a new class dynamically

>>> NewType = type("NewType", (object,), {"x": "hello"})
>>> n = NewType()
>>> n.x
'hello'
The equivalent ordinary definition:
>>> class NewType(object):
...     x = "hello"
...
>>> n = NewType()
>>> n.x
'hello'

Using else with exception handling

try:
    function()
except Error:
    pass  # Runs only if the try block raised Error
else:
    pass  # Runs only if the try block raised no exception
finally:
    pass  # Runs in every case
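
The skeleton above uses placeholder names. As a runnable sketch, the hypothetical helper below records which blocks execute when int() succeeds and when it fails.

```python
def parse_int(text):
    """Return the order in which the blocks ran, plus the parsed value."""
    steps = []
    value = None
    try:
        steps.append('try')
        value = int(text)
    except ValueError:
        steps.append('except')   # only when int() failed
    else:
        steps.append('else')     # only when int() succeeded
    finally:
        steps.append('finally')  # always
    return steps, value


print(parse_int('42'))
print(parse_int('not a number'))
```

Putting the follow-up work in else rather than at the end of try keeps the except clause from catching errors raised by code it was never meant to guard.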

Opening a page in the browser

import webbrowser
webbrowser.open_new_tab('http://facebook.com/')
# Opens the page in a new tab; returns True if a browser could be launched

Reading a file line by line

with open("/path/to/file") as f:
    for line in f:
        print(line, end='')  # each line already ends with a newline

Writing a file line by line

with open("/path/tofile", 'w') as f:
    for e in aList:
        f.write(e + "\n")

Finding regular expression matches

import re

sentence = "this is a test, not testing."
it = re.finditer(r'\btest\b', sentence)
for match in it:
    print("match position: " + str(match.start()) + "-" + str(match.end()))

Searching with a regular expression

m = re.search(r'\d+-\d+', line)  # search for strings like 123-123
if m:
    current = m.group(0)
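
Both calls can be tried together in one self-contained sketch. The sample value of line is an assumption made for illustration, since the snippet above does not define it.

```python
import re

sentence = "this is a test, not testing."

# \b marks a word boundary, so the "test" inside "testing" is not matched.
spans = [(m.start(), m.end()) for m in re.finditer(r'\btest\b', sentence)]

line = "order 123-456 shipped"  # illustrative input
m = re.search(r'\d+-\d+', line)
current = m.group(0) if m else None

print(spans, current)
```

Raw strings (the r prefix) keep backslashes such as \b and \d from being interpreted as string escapes before the pattern reaches the re module.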

Querying a database

import MySQLdb

db = MySQLdb.connect("localhost", "username", "password", "dbname")
cursor = db.cursor()
sql = "select Column1,Column2 from Table1"
cursor.execute(sql)
results = cursor.fetchall()
for row in results:
    print(row[0] + row[1])

db.close()

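Calling an external command

Sometimes you need to call an external command through the shell or command prompt. The subprocess module makes this easy in Python.
To simply run a command:
import subprocess
subprocess.call(['mkdir', 'empty_folder'])

To run a command and capture its output:
import subprocess
output = subprocess.check_output(['ls', '-l'])
Note that both calls above block until the command has finished.

To run a command that is built into the shell, such as cd or dir, pass shell=True and give the command as a single string:
import subprocess
return_code = subprocess.call('cd /', shell=True)
For more advanced use cases, use the Popen constructor.

Python 3.5 introduced a new run function whose behaviour is very similar to call and check_output. If you are on 3.5 or later, have a look at the documentation for run, which contains some useful examples. Otherwise, if you are on a version older than 3.5 or want to stay backward compatible, the call and check_output snippets above are the safest and simplest choice.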
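
A minimal sketch of run, assuming Python 3.5 or later. It launches the current Python interpreter as the child process so that it does not depend on commands such as ls or mkdir being available; the printed message is arbitrary.

```python
import subprocess
import sys

# sys.executable is used as the command so the sketch does not depend on
# which shell tools are installed.
result = subprocess.run(
    [sys.executable, '-c', 'print("hello from a child process")'],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,  # decode bytes to str (spelled text=True from 3.7)
    check=True,               # raise CalledProcessError on a non-zero exit
)

print(result.returncode)
print(result.stdout.strip())
```

run returns a CompletedProcess object, so the exit status and the captured output are available from a single call.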

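Pretty printing

During development, using the pprint module instead of the standard print function makes shell output more readable, especially for dictionaries and nested objects.
import pprint as pp
animals = [{'animal': 'dog', 'legs': 4, 'breeds': ['Border Collie', 'Pit Bull', 'Huskie']}, {'animal': 'cat', 'legs': 4, 'breeds': ['Siamese', 'Persian', 'Sphynx']}]
pp.pprint(animals, width=1)
The width parameter sets the maximum number of characters per line. Setting width to 1 forces every dictionary entry onto its own line.

Grouping data by an attribute

Suppose you query a database and get the following data: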
data = [
{'animal': 'dog', 'name': 'Roxie', 'age': 5},
{'animal': 'dog', 'name': 'Zeus', 'age': 6},
{'animal': 'dog', 'name': 'Spike', 'age': 9},
{'animal': 'dog', 'name': 'Scooby', 'age': 7},
{'animal': 'cat', 'name': 'Fluffy', 'age': 3},
{'animal': 'cat', 'name': 'Oreo', 'age': 5},
{'animal': 'cat', 'name': 'Bella', 'age': 4}
]

The goal is to group the rows by animal type, giving a list of dogs and a list of cats. Fortunately, Python's itertools has a groupby function that makes this easy.

from itertools import groupby

data = [
{'animal': 'dog', 'name': 'Roxie', 'age': 5},
{'animal': 'dog', 'name': 'Zeus', 'age': 6},
{'animal': 'dog', 'name': 'Spike', 'age': 9},
{'animal': 'dog', 'name': 'Scooby', 'age': 7},
{'animal': 'cat', 'name': 'Fluffy', 'age': 3},
{'animal': 'cat', 'name': 'Oreo', 'age': 5},
{'animal': 'cat', 'name': 'Bella', 'age': 4}
]

for key, group in groupby(data, lambda x: x['animal']):
    for thing in group:
        print(thing['name'] + " is a " + key)

The output is:
Roxie is a dog
Zeus is a dog
Spike is a dog
Scooby is a dog
Fluffy is a cat
Oreo is a cat
Bella is a cat

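groupby() takes two arguments: (1) the data to group, which in this example is a list of dictionaries, and (2) the grouping function: lambda x: x['animal'] tells groupby to group the dictionaries by animal type. groupby only merges consecutive items that share a key, so it works here because the data is already ordered by animal type.
A list of dog names and a list of cat names can now be built easily with a list comprehension: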

from itertools import groupby
import pprint as pp

grouped_data = {}

for key, group in groupby(data, lambda x: x['animal']):
    grouped_data[key] = [thing['name'] for thing in group]

pp.pprint(grouped_data)

The final output is grouped by animal type:

{
'cat': [
'Fluffy',
'Oreo',
'Bella'
],
'dog': [
'Roxie',
'Zeus',
'Spike',
'Scooby'
]
}
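
One caveat: groupby only merges consecutive items with the same key. The sketch below uses a shortened, deliberately unordered variant of the data (an assumption made for illustration) to show why such input should be sorted by the grouping key first.

```python
from itertools import groupby
from operator import itemgetter

# Same shape as the data above, but deliberately not ordered by animal.
records = [
    {'animal': 'dog', 'name': 'Roxie'},
    {'animal': 'cat', 'name': 'Fluffy'},
    {'animal': 'dog', 'name': 'Zeus'},
    {'animal': 'cat', 'name': 'Oreo'},
]

by_animal = itemgetter('animal')

# Without sorting, each change of key starts a new group.
unsorted_keys = [key for key, _ in groupby(records, by_animal)]

# sorted() is stable, so names keep their original order inside each group.
grouped = {
    key: [r['name'] for r in group]
    for key, group in groupby(sorted(records, key=by_animal), by_animal)
}

print(unsorted_keys)
print(grouped)
```

If sorting is not desirable, collecting the names into a collections.defaultdict(list) in a single pass is a common alternative.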

The answers to this question on StackOverflow were very helpful; they saved me a lot of time when I was trying to work out the most Pythonic way to group data.

Converting between strings and dates
A common task is converting a string to a datetime object. The strptime function makes this easy:
from datetime import datetime
date_obj = datetime.strptime('May 29 2015 2:45PM', '%B %d %Y %I:%M%p')

The reverse operation, converting a datetime object to a formatted string, uses the strftime method of the datetime object:
from datetime import datetime
date_obj = datetime.now()
date_string = date_obj.strftime('%B %d %Y %I:%M%p')

For the list of format codes and their meanings, see the official documentation.
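
A round trip through both functions with the same format string. This sketch assumes the default C/English locale, because %B and %p are locale-dependent.

```python
from datetime import datetime

fmt = '%B %d %Y %I:%M%p'
date_obj = datetime.strptime('May 29 2015 2:45PM', fmt)
date_string = date_obj.strftime(fmt)

print(date_obj)     # the 12-hour input becomes a 24-hour time
print(date_string)  # %I zero-pads the hour on output
```

Parsing is more lenient than formatting: strptime accepts the hour 2 without a leading zero, while strftime always writes 02.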

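Parsing a JSON file and writing an object to a JSON file

The load function parses a JSON file and builds a Python object from it. Suppose a file named data.json contains the following data: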
{
"dog": {
"lives": 1,
"breeds": [
"Border Collie",
"Pit Bull",
"Huskie"
]
},
"cat": {
"lives": 9,
"breeds": [
"Siamese",
"Persian",
"Sphynx"
]
}
}

import json
with open('data.json') as input_file:
    data = json.load(input_file)
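
For the opposite direction, json.dump writes a Python object to a file. The sketch below is a round trip under stated assumptions: it uses a temporary directory and a hypothetical file name, pets.json, so that it does not touch any existing data.json.

```python
import json
import os
import tempfile

pets = {'dog': {'lives': 1}, 'cat': {'lives': 9}}

# A temporary directory stands in for wherever data.json would live.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'pets.json')  # hypothetical file name

    with open(path, 'w') as output_file:
        json.dump(pets, output_file, indent=2)  # write the object as JSON

    with open(path) as input_file:
        loaded = json.load(input_file)          # read it back

print(loaded == pets)
print(loaded['cat']['lives'])
```

The indent argument only affects how the file looks; the parsed result is the same with or without it.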