Data Types

Latest recommended article published on 2022-08-10 19:16:57

weixin_43693639

License: CC 4.0 BY-SA

Category: Python basics

Article link: https://blog.youkuaiyun.com/weixin_43693639/article/details/97140285

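This article introduces Python's basic data structures: lists, tuples, dictionaries, and sets. Lists are mutable and support operations such as append, insert, and sort; tuples are immutable and provide a count method for counting how often an element occurs; dictionaries are mutable collections of key-value pairs whose get method can return a default value; sets are unordered collections of unique elements and support operations such as union, intersection, and difference.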

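Python Basics

Part 1: Data Types

A data structure is simply a structure that groups related data together.

Python has four built-in data structures: list, tuple, dictionary (dict), and set.

1. List (list): written [ , ], mutable

1.1 Creating a list

A list can be defined with square brackets or with the list type function.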

bicycles = ['trek','cannondale','redline','specialized']
print(bicycles)
print(bicycles[0])
print(bicycles[-1])
print(bicycles[3])
print(bicycles[0].title())

['trek', 'cannondale', 'redline', 'specialized']
trek
specialized
specialized
Trek

gen = range(10)
list(gen)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

1.2 List methods

append() method: adds an element to the end of the list

bicycles = ['trek','cannondale','redline','specialized']
print(bicycles)
bicycles.append('honda')
print(bicycles)

['trek', 'cannondale', 'redline', 'specialized']
['trek', 'cannondale', 'redline', 'specialized', 'honda']

insert() method: adds a new element at any position in the list

bicycles = ['trek','cannondale','redline','specialized']
bicycles.insert(0,'honda')
print(bicycles)

['honda', 'trek', 'cannondale', 'redline', 'specialized']

pop() method: removes one element from the list (the last one by default) and returns its value

bicycles = ['trek','cannondale','redline','specialized']
print(bicycles)

popped_bicycles = bicycles.pop()
print(bicycles)
print(popped_bicycles)

first_bicycles = bicycles.pop(0)
print(bicycles)
print(first_bicycles)

['trek', 'cannondale', 'redline', 'specialized']
['trek', 'cannondale', 'redline']
specialized
['cannondale', 'redline']
trek

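remove() method: removes the first element equal to a given value; if the value is stored in a variable, it can still be used afterwards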

bicycles = ['trek','cannondale','redline','specialized']
print(bicycles)

first_bicycles = 'trek'
bicycles.remove(first_bicycles)
print(bicycles)
print(first_bicycles)

['trek', 'cannondale', 'redline', 'specialized']
['cannondale', 'redline', 'specialized']
trek
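Two details of remove() are easy to miss: it deletes only the first matching element, and it raises ValueError when the value is absent. The sketch below illustrates both; the list contents are made up for the illustration and deliberately contain 'trek' twice.

```python
# Hypothetical data: 'trek' appears twice on purpose.
bicycles = ['trek', 'cannondale', 'trek', 'redline']

bicycles.remove('trek')          # removes only the first 'trek'
print(bicycles)                  # ['cannondale', 'trek', 'redline']

missing_raised = False
try:
    bicycles.remove('honda')     # 'honda' is not in the list
except ValueError:
    missing_raised = True        # remove() raises ValueError for a missing value

print(missing_raised)            # True
```

To delete every occurrence, a loop or a list comprehension that filters the value out is the usual approach.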

extend() method: appends multiple elements to the end of an existing list

x = [4, None, 'foo']

x.extend([7, 8, (2, 3)])
print(x)

[4, None, 'foo', 7, 8, (2, 3)]
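The difference between append() and extend() is easiest to see side by side. This is a small sketch using two copies of the same list; the variable name y is introduced only for the comparison.

```python
x = [4, None, 'foo']
y = [4, None, 'foo']

x.append([7, 8])    # adds the list itself as ONE new element
y.extend([7, 8])    # adds each element of the argument separately

print(x)            # [4, None, 'foo', [7, 8]]
print(y)            # [4, None, 'foo', 7, 8]
print(len(x), len(y))   # 4 5
```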

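sort() method: sorts the list in place, permanently (strings are compared alphabetically, with uppercase letters ordered before lowercase ones)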

a = ['saw', 'small', 'He', 'foxes', 'six']

a.sort()
print(a)

a.sort(key = len)
print(a)

a.sort(reverse = True)
print(a)

['He', 'foxes', 'saw', 'six', 'small']
['He', 'saw', 'six', 'foxes', 'small']
['small', 'six', 'saw', 'foxes', 'He']
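In the output above, 'He' sorts before 'foxes' because uppercase letters compare lower than lowercase ones. A case-insensitive order can be sketched by passing str.lower as the key; the example also shows that sort() returns None, since it changes the list in place. The variable name result is only illustrative.

```python
a = ['saw', 'small', 'He', 'foxes', 'six']

a.sort(key=str.lower)     # compare lowercased copies; the elements themselves are unchanged
print(a)                  # ['foxes', 'He', 'saw', 'six', 'small']

result = a.sort()         # sort() works in place and returns None
print(result)             # None
print(a)                  # ['He', 'foxes', 'saw', 'six', 'small']
```

Because of the None return value, writing `a = a.sort()` loses the list; use sorted(a) when a new list is wanted.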

reverse() method: reverses the order of the list elements in place (it does not print anything and does not sort)

a = ['saw', 'small', 'He', 'foxes', 'six']
print(a)

a.reverse()
print(a)

['saw', 'small', 'He', 'foxes', 'six']
['six', 'foxes', 'He', 'small', 'saw']

count() method: counts how many times an element occurs in the list

a = [1, 2, 2, 2, 5, 3, 4]

a.count(2)    # returns 3

1.3 List functions

len function: returns the number of elements in the list (a list with four elements has length 4)

cars = ['bmw', 'audi', 'toyota', 'subaru']
len(cars)    # returns 4

sorted function: returns a new sorted list and leaves the original list unchanged

cars = ['bmw', 'audi', 'toyota', 'subaru']

print("Here is the original list:")
print(cars)

print("\nHere is the sorted list:")
print(sorted(cars))

print("\nHere is the original list again:")
print(cars)

Here is the original list:
['bmw', 'audi', 'toyota', 'subaru']

Here is the sorted list:
['audi', 'bmw', 'subaru', 'toyota']

Here is the original list again:
['bmw', 'audi', 'toyota', 'subaru']

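zip function: pairs up the elements of lists, tuples, or other sequences; in Python 3 it returns an iterator of tuples, which list() turns into a list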

seq1 = ['foo', 'bar', 'baz']
seq2 = ['one', 'two', 'three']

zipped = zip(seq1, seq2)
list(zipped)

[('foo', 'one'), ('bar', 'two'), ('baz', 'three')]
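Two further behaviors of zip are worth a quick sketch: it stops at the shortest input, and `zip(*pairs)` reverses the pairing. The list seq3 is made up for this illustration.

```python
seq1 = ['foo', 'bar', 'baz']
seq2 = ['one', 'two', 'three']
seq3 = [False, True]                  # hypothetical shorter sequence

triples = list(zip(seq1, seq2, seq3)) # stops when the shortest input runs out
print(triples)                        # [('foo', 'one', False), ('bar', 'two', True)]

pairs = list(zip(seq1, seq2))
first, second = zip(*pairs)           # "unzip": turn a list of rows into columns
print(first)                          # ('foo', 'bar', 'baz')
print(second)                         # ('one', 'two', 'three')
```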

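reversed function: iterates over the elements of a sequence in reverse order (it returns an iterator, hence the list() call)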

list(reversed(range(10)))

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

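enumerate function: yields (index, value) pairs while iterating over a sequence; a common use is building a dictionary that maps the sequence values (assumed to be unique) to their index positions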

some_list = ['foo', 'bar', 'baz']
mapping = {}

for i, v in enumerate(some_list):
    mapping[v] = i
    
print(mapping)

{'foo': 0, 'bar': 1, 'baz': 2}
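To make explicit what enumerate produces, here is a short sketch that first lists the raw (index, value) pairs and then uses the optional start argument to count from 1. The name numbered is introduced only for this example.

```python
some_list = ['foo', 'bar', 'baz']

pairs = list(enumerate(some_list))    # the raw (index, value) tuples
print(pairs)                          # [(0, 'foo'), (1, 'bar'), (2, 'baz')]

# start=1 makes the counter begin at 1 instead of 0
numbered = {i: v for i, v in enumerate(some_list, start=1)}
print(numbered)                       # {1: 'foo', 2: 'bar', 3: 'baz'}
```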

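range function: generates a sequence of numbers that starts at the first value and stops before the second value, so the second value is not included in the output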

for value in range(1,5):
    print(value)

numbers = list(range(1, 6))
print(numbers)

[1, 2, 3, 4, 5]

even_numbers = list(range(2,11,2))
print(even_numbers)

[2, 4, 6, 8, 10]

squares = []
for value in range(1, 11):
    squares.append(value**2)
    
print(squares)

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

Simple statistics on a list

digits = [1, 2, 3, 4, 5, 6]

min(digits)    # returns 1
max(digits)    # returns 6
sum(digits)    # returns 21

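1.4 List modules

bisect module: implements binary search and insertion for lists that are already sorted

import bisect
c = [1, 2, 2, 2, 3, 4, 7]

bisect.bisect(c, 6)    # find the position where the element should be inserted (returns 6 here)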

import bisect
c = [1, 2, 2, 2, 3, 4, 7]

bisect.insort(c, 6)    # insert the element at the position that keeps the list sorted
print(c)

[1, 2, 2, 2, 3, 4, 6, 7]
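When the value being searched for already occurs in the list, the choice between bisect_left and bisect_right matters (bisect.bisect is an alias for bisect_right). The following sketch shows the two insertion points for the repeated value 2; the variable names left and right are illustrative.

```python
import bisect

c = [1, 2, 2, 2, 3, 4, 7]

left = bisect.bisect_left(c, 2)     # position before the existing 2s
right = bisect.bisect_right(c, 2)   # position after the existing 2s
print(left, right)                  # 1 4

# The difference of the two positions is the number of occurrences.
print(right - left == c.count(2))   # True
```

Note that the bisect functions do not check that the list is sorted; calling them on an unsorted list gives meaningless positions rather than an error.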

1.5 List comprehension: a first example

squares = [value**2 for value in range(1,11)]
print(squares)

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

1.6 Slicing: a slice stops just before the second index you specify

seq = [7, 5, 8, 9, 0, 1, 3, 2]

seq[1:5]

[5, 8, 9, 0]

seq[3:4] = [6, 3]
seq

[7, 5, 8, 6, 3, 0, 1, 3, 2]

seq[:5]

[7, 5, 8, 6, 3]

seq[3:]

[6, 3, 0, 1, 3, 2]

seq[-6:-2]

[6, 3, 0, 1]

seq[::2]    # the step value goes after the second colon and means "take every step-th element"

[7, 8, 3, 1, 2]

seq[::-1]    # passing -1 as the step reverses a list or tuple

[2, 3, 1, 0, 3, 6, 8, 5, 7]

my_seq = seq[:]    # copy the list

print(seq)
print(my_seq)

[7, 5, 8, 6, 3, 0, 1, 3, 2]
[7, 5, 8, 6, 3, 0, 1, 3, 2]
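The slice copy above is a new list object, but it is a shallow copy. The sketch below contrasts plain assignment (an alias) with a slice copy, and then shows that nested lists are still shared; all variable names are chosen just for this illustration.

```python
seq = [7, 5, 8]
alias = seq            # a second name for the SAME list
copied = seq[:]        # a new list with the same elements

seq.append(9)
print(alias)           # [7, 5, 8, 9]  (follows the change)
print(copied)          # [7, 5, 8]     (independent)

# The copy is shallow: inner lists are shared, not duplicated.
nested = [[1, 2], [3, 4]]
shallow = nested[:]
nested[0].append(99)
print(shallow)         # [[1, 2, 99], [3, 4]]
```

When fully independent nested data is needed, copy.deepcopy from the standard library is the usual tool.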

1.7 List comprehensions in general

General form:

[expr for val in collection if condition]

This is equivalent to the following for loop:

result = []
for val in collection:
    if condition:
        result.append(expr)

Example: given a list of strings, keep those longer than 2 characters and convert them to uppercase

strings = ['a', 'as', 'bat', 'car', 'dove', 'python']

[x.upper() for x in strings if len(x) > 2]

['BAT', 'CAR', 'DOVE', 'PYTHON']

1.8 Nested list comprehensions

Example: build a list of all names that contain the letter e at least twice

all_data = [['John', 'Emily', 'Michael', 'Mary', 'Steven'],
           ['Maria', 'Juan', 'Javier', 'Natalia', 'Pilar']]

result = [name for names in all_data for name in names
         if name.count('e') >= 2]
print(result)

['Steven']

# The same result with a for loop:
all_data = [['John', 'Emily', 'Michael', 'Mary', 'Steven'],
           ['Maria', 'Juan', 'Javier', 'Natalia', 'Pilar']]

names_of_interest = []
for names in all_data:
    enough_es = [name for name in names if name.count('e') >= 2]
    names_of_interest.extend(enough_es)
    
print(names_of_interest)

['Steven']
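The order of the for clauses in a nested comprehension matches the order of the equivalent nested for loops. A minimal sketch with made-up data: flattening a list of tuples, and, for contrast, a comprehension inside a comprehension that keeps the nesting.

```python
some_tuples = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]   # hypothetical data

# Outer loop first, inner loop second, just like nested for statements.
flattened = [x for tup in some_tuples for x in tup]
print(flattened)      # [1, 2, 3, 4, 5, 6, 7, 8, 9]

# A comprehension inside a comprehension builds a list of lists instead.
as_lists = [[x for x in tup] for tup in some_tuples]
print(as_lists)       # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```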

2. Tuple (tuple): written with parentheses ( , ), fixed length, immutable

2.1 Creating a tuple

tup = 4 , 5, 6
tup

(4, 5, 6)

nested_tup = (4, 5, 6),(7, 8)

print(nested_tup)

((4, 5, 6), (7, 8))

one_tup = (1,)    # a tuple with a single value needs a trailing comma

print(one_tup )

(1,)

2.2 Tuple methods

count() method: counts how many times a value occurs in the tuple

a = (1, 2, 2, 3, 4, 2, 5)

a.count(2)    # returns 3

2.3 Tuple functions

tuple function: converts any sequence or iterator into a tuple

tuple([4, 0, 2])

(4, 0, 2)

tup = tuple('sting')
print(tup)

('s', 't', 'i', 'n', 'g')

tup[0]

's'

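2.4 Reassigning a tuple variable: a tuple itself cannot be modified, but the variable that holds it can be assigned a new tuple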

dimensions = (200, 50)
print("Original dimensions:")
for dimension in dimensions:
    print(dimension)

    
dimensions = (400, 100)
print("\nModified dimensions:")
for dimension in dimensions:
    print(dimension)

Original dimensions:
200
50

Modified dimensions:
400
100
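For contrast with reassignment, the sketch below shows what immutability means in practice: item assignment on a tuple raises TypeError, yet a mutable object stored inside a tuple can still be changed in place. The tuple tup and the flag name are made up for the illustration.

```python
dimensions = (200, 50)

assignment_failed = False
try:
    dimensions[0] = 250          # tuples do not support item assignment
except TypeError:
    assignment_failed = True

print(assignment_failed)         # True
print(dimensions)                # (200, 50)

# The tuple's slots are fixed, but a mutable element can still change.
tup = ('foo', [1, 2], True)
tup[1].append(3)
print(tup)                       # ('foo', [1, 2, 3], True)
```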

2.5 Tuple unpacking

seq = [(1, 2, 3),(4, 5, 6),(7, 8, 9)]

for a,b,c in seq:
    print('a = {0}, b = {1}, c = {2}'.format(a,b,c))

a = 1, b = 2, c = 3
a = 4, b = 5, c = 6
a = 7, b = 8, c = 9

values = 1, 2, 3, 4, 5
a, b, *rest = values    # for values you do not need, the convention is: a, b, *_ = values

a, b
rest

[3, 4, 5]
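Unpacking also gives a compact way to swap variables, and the starred name may appear in the middle of the targets. A short sketch, with names chosen only for the example:

```python
a, b = 1, 2
b, a = a, b                       # swap without a temporary variable
print(a, b)                       # 2 1

values = 1, 2, 3, 4, 5
first, *middle, last = values     # the starred name collects the rest as a list
print(first, middle, last)        # 1 [2, 3, 4] 5
```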

3. Dictionary (dict), also known as a hash table or associative array: written {key1: value1, key2: value2}, mutable

3.1 Creating a dictionary

alien_0 = {'color': 'green','points': 5}

print(alien_0['color'])
print(alien_0['points'])

green
5

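A dictionary is a collection of key-value pairs.

Keys can be numbers, strings, or tuples; a key must be an immutable (hashable) object, and keys must be unique.

Values can be numbers, strings, lists, or even other dictionaries.

Since Python 3.7, a dictionary preserves the order in which key-value pairs were added (earlier versions made no such guarantee). What a dictionary is really about, though, is the association between each key and its value.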
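On Python 3.7 or later, which this sketch assumes, a dictionary iterates over its keys in insertion order, while equality between dictionaries ignores order. The extra key 'x_position' is made up for the illustration.

```python
alien_0 = {}
alien_0['color'] = 'green'
alien_0['points'] = 5
alien_0['x_position'] = 0        # hypothetical extra key

key_order = list(alien_0)        # iterating a dict yields its keys
print(key_order)                 # ['color', 'points', 'x_position']

# Order plays no role when two dictionaries are compared.
same = {'a': 1, 'b': 2} == {'b': 2, 'a': 1}
print(same)                      # True
```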

3.2 Using dictionaries

Creating a dictionary

empty_dict = {}
d1 = {'a' : 'some value', 'b' : [1, 2, 3, 4]}

print(d1)

{'a': 'some value', 'b': [1, 2, 3, 4]}

Adding a key-value pair

d1['c'] = 'an integer'    

print(d1)

{'a': 'some value', 'b': [1, 2, 3, 4], 'c': 'an integer'}

Accessing a value

d1['b']

[1, 2, 3, 4]

Checking whether a dictionary contains a key

'b' in d1

True

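Deleting an entry with the del keyword

You must give the dictionary name and the key to delete; the key-value pair is removed permanently.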

d1 = {'a' : 'some value', 'b' : [1, 2, 3, 4], 'c': 'an integer'}

del d1['c'] 
print(d1)

{'a': 'some value', 'b': [1, 2, 3, 4]}

Modifying a value in a dictionary

d1 = {'a' : 'some value', 'b' : [1, 2, 3, 4], 'c': 'an integer'}

d1['c'] = 'green'
print(d1)

{'a': 'some value', 'b': [1, 2, 3, 4], 'c': 'green'}

Defining a dictionary over multiple lines

favorite_languages = {
    'jen' : 'python',
    'sarah' : 'c',
    'edward' : 'ruby',
    'phil' : 'python',
    }

print("Sarah's favourite language is " +
     favorite_languages['sarah'].title() +
     ".")

Sarah's favourite language is C.

3.3 Dictionary methods

update method: merges another dictionary into this one (values for keys that already exist are overwritten)

d1 = {'a' : 'some value', 'b' : [1, 2, 3, 4]}

d1.update({'b' : 'foo', 'c' : 12})
print(d1)

{'a': 'some value', 'b': 'foo', 'c': 12}

pop() method: removes a key and returns its value

d1 = {'a' : 'some value', 'b' : [1, 2, 3, 4], 'c': 'an integer'}

ret = d1.pop('c')
print(ret)
print(d1)

an integer
{'a': 'some value', 'b': [1, 2, 3, 4]}

Building a dictionary from sequences with zip(), reversed(), and range()

mapping = dict(zip(range(5), reversed(range(5))))

print(mapping)

{0: 4, 1: 3, 2: 2, 3: 1, 4: 0}

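get() method: returns the value for a key, or a default value if the key is missing

value = some_dict.get(key, default_value)

If the key is not in the dictionary, get returns default_value, or None when no default is given. pop, by contrast, raises a KeyError for a missing key unless it is also given a default.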
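The behavior of get and pop for a missing key can be sketched as follows. The key 'z' and the fallback string 'n/a' are made up for the illustration.

```python
d1 = {'a': 'some value', 'b': [1, 2, 3, 4]}

print(d1.get('z'))            # None  (no default given)
print(d1.get('z', 'n/a'))     # n/a   (explicit default)
print(d1.pop('z', 'n/a'))     # n/a   (pop accepts a default too)

pop_raised = False
try:
    d1.pop('z')               # no default: a missing key raises KeyError
except KeyError:
    pop_raised = True

print(pop_raised)             # True
print(d1)                     # {'a': 'some value', 'b': [1, 2, 3, 4]}
```

Note that get never modifies the dictionary, so the missing key is not added by any of these calls.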

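setdefault() method: returns the value for a key, first inserting the given default if the key is missing; a common use is grouping items, for example collecting words into lists by their first letter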

words =['apple', 'bat', 'bar', 'atom', 'book']
by_letter = {}

for word in words:
    letter = word[0]
    by_letter.setdefault(letter, []).append(word)
    
print(by_letter)

{'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book']}

# The same grouping written without setdefault:
words = ['apple', 'bat', 'bar', 'atom', 'book']
by_letter = {}

for word in words:
    letter = word[0]
    if letter not in by_letter:
        by_letter[letter] = [word]
    else:
        by_letter[letter].append(word)
        
print(by_letter)

{'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book']}

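3.4 A dictionary class from the collections module

defaultdict

from collections import defaultdict

words = ['apple', 'bat', 'bar', 'atom', 'book']
by_letter = defaultdict(list)    # missing keys start out as an empty list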
for word in words:
    by_letter[word[0]].append(word)
    
print(by_letter)

defaultdict(<class 'list'>, {'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book']})
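The factory passed to defaultdict can be any callable that takes no arguments. As a sketch, using int (which returns 0) turns the same word list into a counter of first letters. The example also shows a caveat: merely reading a missing key inserts it. The names counts and tally are illustrative.

```python
from collections import defaultdict

words = ['apple', 'bat', 'bar', 'atom', 'book']

counts = defaultdict(int)         # missing keys start at int() == 0
for word in words:
    counts[word[0]] += 1

tally = dict(counts)              # plain dict snapshot for printing
print(tally)                      # {'a': 2, 'b': 3}

# Caveat: looking up a missing key creates it with the default value.
_ = counts['z']
print('z' in counts)              # True
```

Use `'z' in counts` or `counts.get('z')` to test for a key without creating it.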

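3.5 Valid dictionary key types

hash function (checks whether an object is hashable)

hash('string')

-7822124444377836189

(The exact value differs between runs, because Python randomizes string hashes per process by default.)

hash((1, 2, [2,3]))    # fails because lists are mutable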

---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

<ipython-input-5-e399656be800> in <module>
----> 1 hash((1, 2, [2,3]))


TypeError: unhashable type: 'list'

To use a list as a key, convert it to a tuple first

d = {}
d[tuple([1, 2, 3])] = 5

print(d)

{(1, 2, 3): 5}

3.6 Dictionary comprehensions

General form:

dict_comp = {key-expr : value-expr for value in collection
            if condition}

Example: build a dictionary that maps each string to its position in the list

strings = ['a', 'as', 'bat', 'car', 'dove', 'python']

loc_mapping = {val : index for index, val in enumerate(strings)}
print(loc_mapping)

{'a': 0, 'as': 1, 'bat': 2, 'car': 3, 'dove': 4, 'python': 5}
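The general form also allows a condition, which the example above does not use. A minimal sketch that keeps only the strings longer than two characters and maps each to its length; the name long_lengths is introduced for the illustration.

```python
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']

# key-expr is s, value-expr is len(s), and the condition filters short strings
long_lengths = {s: len(s) for s in strings if len(s) > 2}
print(long_lengths)    # {'bat': 3, 'car': 3, 'dove': 4, 'python': 6}
```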

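4. Set (set): unordered, unique elements, mutable (the elements themselves must be hashable; frozenset is the immutable variant)

4.1 Creating a set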

set([2, 2, 2, 1, 3, 3])

{1, 2, 3}

{2, 2, 2, 1, 3, 3}

{1, 2, 3}
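One pitfall when creating sets: empty braces create a dictionary, not a set, so an empty set has to be written set(). The sketch below also shows that a set can be modified after creation, and that frozenset gives an immutable, hashable version. All names are chosen for the illustration.

```python
empty_braces = {}                 # this is an empty dict
empty_set = set()                 # this is an empty set
print(type(empty_braces).__name__)   # dict
print(type(empty_set).__name__)      # set

s = {1, 2}
s.add(3)                          # sets are mutable
s.add(3)                          # adding a duplicate has no effect
print(len(s))                     # 3

frozen = frozenset(s)             # immutable, so usable as a dictionary key
lookup = {frozen: 'ok'}
print(lookup[frozenset({3, 2, 1})])  # ok
```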

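4.2 Set operations

Each entry below gives the method, its operator form if there is one, and a description.

a.add(x) (no operator form): adds element x to set a

a.clear() (no operator form): resets the set to empty, removing all elements

a.remove(x) (no operator form): removes element x from set a; raises KeyError if x is not present

a.pop() (no operator form): removes and returns an arbitrary element; raises KeyError if the set is empty

a.union(b), or a | b: all distinct elements in a and b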

a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7, 8}

a.union(b)

{1, 2, 3, 4, 5, 6, 7, 8}

a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7, 8}

a | b

{1, 2, 3, 4, 5, 6, 7, 8}

a.update(b), or a |= b: sets the contents of a to the union of a and b

a = {1, 2, 3, 4, 5}
c = a.copy()

c |= b
print(c)

{1, 2, 3, 4, 5, 6, 7, 8}

a.intersection(b), or a & b: the elements contained in both a and b

a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7, 8}

a.intersection(b)

{3, 4, 5}

a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7, 8}

a & b

{3, 4, 5}

a.intersection_update(b), or a &= b: sets the contents of a to the intersection of a and b

d = a.copy()
d &= b

d

{3, 4, 5}

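a.difference(b), or a - b: the elements in a that are not in b

a.difference_update(b), or a -= b: sets the contents of a to the elements in a that are not in b

a.symmetric_difference(b), or a ^ b: all elements that are in a or b, but not in both

a.symmetric_difference_update(b), or a ^= b: sets the contents of a to all elements that are in a or b, but not in both

a.issubset(b) (no operator form): returns True if every element of a is in b

a.issuperset(b) (no operator form): returns True if every element of b is in a

a.isdisjoint(b) (no operator form): returns True if a and b have no elements in common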
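The difference, symmetric difference, and comparison methods have no worked example above, so here is a short sketch using the same sets a and b. Results are printed through sorted() because the display order of a set is not guaranteed; the names only_in_a, in_exactly_one, and e are illustrative.

```python
a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7, 8}

only_in_a = a - b                 # same as a.difference(b)
in_exactly_one = a ^ b            # same as a.symmetric_difference(b)
print(sorted(only_in_a))          # [1, 2]
print(sorted(in_exactly_one))     # [1, 2, 6, 7, 8]

e = a.copy()
e -= b                            # in place, same as e.difference_update(b)
print(sorted(e))                  # [1, 2]

print({1, 2}.issubset(a))         # True
print(a.issuperset({1, 2}))       # True
print(a.isdisjoint({9, 10}))      # True
print(a.isdisjoint(b))            # False
```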

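4.3 Set element requirements and set comparisons

Set elements must be hashable, so a list has to be converted to a tuple before it can be stored in a set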

my_data = [1, 2, 3, 4]
my_set = {tuple(my_data)}

print(my_set)

{(1, 2, 3, 4)}

Checking whether one set is a subset of (contained in) or a superset of (contains) another set

a_set = {1, 2, 3, 4, 5}
{1, 2, 3}.issubset(a_set)

True

a_set.issuperset({1, 2, 3})

True

{1, 2, 3} == {3, 2, 1}    # sets are equal when their contents are equal, regardless of order

True

4.4 Set comprehensions

General form:

set_comp = {expr for value in collection if condition}

Example: build a set containing the lengths of the strings in a list

strings = ['a', 'as', 'bat', 'car', 'dove', 'python']

unique_lengths = {len(x) for x in strings}
print(unique_lengths)

{1, 2, 3, 4, 6}

set(map(len, strings))    # the same result using the map function

{1, 2, 3, 4, 6}

Sources:
Python Crash Course (《Python编程从入门到实践》)
Python for Data Analysis (《利用Python进行数据分析》)
