*args
Passes in any number of positional arguments (collected into a tuple).
**kwargs
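Passes in any number of keyword arguments (collected into a dict).
def func(**kwargs):
    # Print each keyword argument as key:value
    for key, value in kwargs.items():
        # str() keeps the concatenation from failing on non-string values
        print(key + ':' + str(value))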
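A minimal sketch of both forms side by side, since the notes describe *args without showing it. The function names `total` and `describe` are made up for illustration.

```python
def total(*args):
    """Sum any number of positional arguments; args arrives as a tuple."""
    return sum(args)


def describe(**kwargs):
    """Format keyword arguments as 'key: value'; kwargs arrives as a dict."""
    return [f"{key}: {value}" for key, value in kwargs.items()]


print(total(1, 2, 3))                  # 6
print(describe(name="barry", age=28))  # ['name: barry', 'age: 28']

# The same symbols unpack a tuple or dict at the call site.
numbers = (4, 5)
options = {"name": "wally"}
print(total(*numbers))      # 9
print(describe(**options))  # ['name: wally']
```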
lambda
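map(func, seq) applies func to every item in seq.
⚠️: map() returns a lazy iterator, so wrap it in list(...) to read the values.
filter(func, seq)
func acts as a test: filter keeps the original items for which func returns a truthy value.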
list(map(lambda x: x % 2, range(10)))
# => [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
list(filter(lambda x: x % 2, range(10)))
# => [1, 3, 5, 7, 9]
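The warning about list(...) is easier to see in a small sketch: a map object can be read only once. The variable names here are illustrative, and the list comprehensions at the end are just an equivalent spelling, not something the notes require.

```python
squares = map(lambda x: x * x, range(4))

first_pass = list(squares)   # reads every value from the iterator
second_pass = list(squares)  # nothing is left to read

print(first_pass)   # [0, 1, 4, 9]
print(second_pass)  # []

# The same map and filter results written as list comprehensions.
remainders = [x % 2 for x in range(10)]
odds = [x for x in range(10) if x % 2]
print(remainders)  # [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
print(odds)        # [1, 3, 5, 7, 9]
```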
iter() + next()
e.g. each call to next() reads one more item from the iterator
# Create a list of strings: flash
flash = ['jay garrick', 'barry allen', 'wally west', 'bart allen']
# Create an iterator for flash: superhero
superhero = iter(flash)
# Print each item from the iterator
print(next(superhero)) # jay garrick
print(next(superhero)) # barry allen
print(next(superhero)) # wally west
print(next(superhero)) # bart allen
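What happens on a fifth call is worth a quick sketch: the iterator is empty, so next() raises StopIteration unless a default is supplied. The variable `leftover` is an illustrative name.

```python
flash = ['jay garrick', 'barry allen', 'wally west', 'bart allen']
superhero = iter(flash)

# Read all four items.
for _ in range(4):
    next(superhero)

# A fifth call has nothing to return and raises StopIteration.
try:
    next(superhero)
except StopIteration:
    print('iterator exhausted')

# Passing a default value avoids the exception.
leftover = next(superhero, None)
print(leftover)  # None
```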
Q: How do I build a list of tuples like these?
[(0, 'a'), (1, 'b'), (2, 'c'), (3, 'dd')]
[(100, 'a'), (101, 'b'), (102, 'c'), (103, 'dd')]
a = ['a', 'b', 'c', 'dd']
enu_a = enumerate(a)
print(type(enu_a))
# <class 'enumerate'>
# list() reads every pair, which leaves enu_a exhausted afterwards
list_enu_a = list(enu_a)
print(list_enu_a)
# [(0, 'a'), (1, 'b'), (2, 'c'), (3, 'dd')]
list(enumerate(a, start=100))
# [(100, 'a'), (101, 'b'), (102, 'c'), (103, 'dd')]
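Q: How do I loop over the values that enumerate produces?
# Unpack and print the (index, value) pairs.
# Enumerate the list a itself: enu_a is already an enumerate object and was used up by list(enu_a).
for index1, value1 in enumerate(a):
    print(index1, value1)
# Change the start index
for index2, value2 in enumerate(a, start=1):
    print(index2, value2)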
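A short sketch of why the loop should run over the list rather than over a stored enumerate object: like any iterator, an enumerate object can be read only once, and enu_a was already passed to list() earlier in these notes. The names `first`, `second`, and `pairs` are illustrative.

```python
a = ['a', 'b', 'c', 'dd']
enu_a = enumerate(a)

first = list(enu_a)   # reads every pair
second = list(enu_a)  # already exhausted

print(first)   # [(0, 'a'), (1, 'b'), (2, 'c'), (3, 'dd')]
print(second)  # []

# Calling enumerate on the list again gives fresh pairs each time.
pairs = [(index, value) for index, value in enumerate(a, start=1)]
print(pairs)   # [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'dd')]
```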
Q: What is zip() for?
a = ['a', 'b', 'c', 'dd']
b = ['q', 'w', 'e', 'rr']
c = ['a', 's', 'd', 'ff']
list(zip(a, b, c))
# [('a', 'q', 'a'), ('b', 'w', 's'), ('c', 'e', 'd'), ('dd', 'rr', 'ff')]
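A few related behaviours, sketched with the same lists: zip(*...) undoes the pairing, zip stops at the shortest input, and dict(zip(...)) builds a lookup from two parallel lists. The names `pairs`, `short`, and `lookup` are illustrative.

```python
a = ['a', 'b', 'c', 'dd']
b = ['q', 'w', 'e', 'rr']

pairs = list(zip(a, b))
print(pairs)  # [('a', 'q'), ('b', 'w'), ('c', 'e'), ('dd', 'rr')]

# zip(*pairs) reverses the pairing; each result comes back as a tuple.
a_again, b_again = zip(*pairs)
print(a_again)  # ('a', 'b', 'c', 'dd')

# zip stops as soon as the shortest input runs out.
short = list(zip(a, ['x', 'y']))
print(short)  # [('a', 'x'), ('b', 'y')]

# A common use: build a dict from two parallel lists.
lookup = dict(zip(a, b))
print(lookup['dd'])  # rr
```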
Q: Scenario: processing large amounts of Twitter data
Sometimes the data we have to process is too large for a computer’s memory to handle. This is a common problem for data scientists. One solution is to process the data source chunk by chunk instead of loading it all at once.
Use chunksize to process the file in batches
import pandas as pd

# Initialize an empty dictionary: counts_dict
counts_dict = {}
# Iterate over the file chunk by chunk (10 rows per chunk)
for chunk in pd.read_csv('tweets.csv', chunksize=10):
    # Iterate over the 'lang' column of the chunk
    for entry in chunk['lang']:
        if entry in counts_dict:
            counts_dict[entry] += 1
        else:
            counts_dict[entry] = 1
# Print the populated dictionary
print(counts_dict)
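The same counting loop can be sketched with collections.Counter, which removes the if/else. This is an alternative, not what the notes use. The helper name `count_column` and the stand-in data are made up; the helper only assumes each chunk supports chunk[column], which holds for a pandas DataFrame and for a plain dict of lists.

```python
from collections import Counter


def count_column(chunks, column):
    """Count the values of one column across an iterable of chunks."""
    counts = Counter()
    for chunk in chunks:
        for entry in chunk[column]:
            counts[entry] += 1  # missing keys start at 0
    return counts


# Stand-in chunks so the sketch runs without a CSV file (made-up data).
fake_chunks = [
    {'lang': ['en', 'en', 'et']},
    {'lang': ['en', 'und']},
]
result = count_column(fake_chunks, 'lang')
print(result)  # Counter({'en': 3, 'et': 1, 'und': 1})
```

With pandas available, the call would presumably look like count_column(pd.read_csv('tweets.csv', chunksize=10), 'lang').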
Q: How can this matrix be written in one line with […]?
matrix = [[col for col in range(5)] for row in range(5)]
or (caution: here all five rows are the same list object, so changing one row changes every row)
matrix = [[col for col in range(5)]] * 5
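The two one-liners print the same grid but behave differently once a cell is changed. A small sketch, with `independent` and `aliased` as illustrative names:

```python
independent = [[col for col in range(5)] for row in range(5)]
aliased = [[col for col in range(5)]] * 5

# Both start out as the same 5x5 grid.
print(independent == aliased)  # True

# Change one cell in the first row of each.
independent[0][0] = 99
aliased[0][0] = 99

print(independent[1][0])         # 0, the other rows are untouched
print(aliased[1][0])             # 99, every row is the same list object
print(aliased[0] is aliased[1])  # True
```

The nested comprehension is the safer default whenever the matrix will be modified.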
generator function
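Use case: a list that holds a lot of data takes a lot of memory. A generator is another way to handle this: it computes values one at a time, on demand, instead of all at once (which may not fit in memory). yield works like return, except that the function pauses at that point and resumes from it when the next value is requested; see this article for details. Use next() with a generator to read values from it.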
e.g.
# Create a list of strings: lannister
lannister = ['cersei', 'jaime', 'tywin', 'tyrion', 'joffrey']
# Create a generator object: lengths
lengths = (len(person) for person in lannister)
# Iterate over and print the values in lengths
for value in lengths:
    print(value)
Equivalent to this generator function:
# Create a list of strings
lannister = ['cersei', 'jaime', 'tywin', 'tyrion', 'joffrey']
# Define generator function get_lengths
def get_lengths(input_list):
    """Generator function that yields the
    length of the strings in input_list."""
    # Yield the length of a string
    for person in input_list:
        yield len(person)
# Print the values generated by get_lengths()
for value in get_lengths(lannister):
    print(value)
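The notes mention reading a generator with next() and the memory savings; both can be sketched briefly. The sizes compared are those reported by sys.getsizeof, which measures only the container object itself, and the 100,000-item range is an arbitrary choice for illustration.

```python
import sys


def get_lengths(input_list):
    """Yield the length of each string in input_list, one at a time."""
    for person in input_list:
        yield len(person)


lannister = ['cersei', 'jaime', 'tywin', 'tyrion', 'joffrey']
lengths = get_lengths(lannister)  # no code in the function body has run yet

first = next(lengths)
second = next(lengths)
rest = list(lengths)  # only the values that have not been read yet
print(first, second, rest)  # 6 5 [5, 6, 7]

# The list stores every value; the generator object stays small.
as_list = [n * 2 for n in range(100_000)]
as_gen = (n * 2 for n in range(100_000))
print(sys.getsizeof(as_list) > sys.getsizeof(as_gen))  # True
```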