Python Hashmap/Dictionary Usage Guide

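This article explores the basic usage and advanced features of dictionaries in Python, including how to use the get method to retrieve values safely, how to check whether a key exists with the in keyword, and how to get the number of key-value pairs with len. It also covers the performance characteristics of dictionaries and common use cases.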


Python

Source: http://www.dotnetperls.com/dictionary-python



Dictionary.  A dictionary optimizes element lookups. It associates keys to values. Each key must have a value. Dictionaries are used in many programs.
With square brackets,  we assign and access a value at a key. With get() we can specify a default result. Dictionaries are fast. We create, mutate and test them.
Get example.  There are many ways to get values. We can use the "[" and "]" characters. We access a value directly this way. But this syntax causes a KeyError if the key is not found.

Instead:We can use the get() method with one or two arguments. This does not cause a KeyError. If the key is not found and no default is given, it returns None.

Argument 1:The first argument to get() is the key you are testing. This argument is required.

Argument 2:The second, optional argument to get() is the default value. This is returned if the key is not found.

Based on: Python 3

Python program that gets values

plants = {}

# Add three key-value tuples to the dictionary.
plants["radish"] = 2
plants["squash"] = 4
plants["carrot"] = 7

# Get syntax 1.
print(plants["radish"])

# Get syntax 2.
print(plants.get("tuna"))
print(plants.get("tuna", "no tuna found"))

Output

2
None
no tuna found
Get, none.  In Python "None" is a special value like null or nil. I like None. It is my friend. It means no value. Get() returns None if no value is found in a dictionary.

Note:It is valid to assign a key to None. So get() can return None even when the key exists, because the value stored in the dictionary is None.
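
Here:A minimal sketch of how a stored None can be told apart from a missing key. The settings dictionary and the MISSING sentinel are hypothetical names, used only for illustration.

```python
# Hypothetical data: one key maps to None, another key is absent.
settings = {"theme": None}

# get() returns None in both cases, so they look the same.
print(settings.get("theme"))
print(settings.get("font"))

# Option 1: test membership with the in-keyword.
print("theme" in settings)
print("font" in settings)

# Option 2: pass a unique sentinel object as the default.
MISSING = object()
value = settings.get("font", MISSING)
if value is MISSING:
    print("font is not set")
```

The sentinel approach needs only one lookup. The in-keyword is easier to read.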


Key error.  Errors in programs are not there just to torment you. They indicate problems with a program and help it work better. A KeyError occurs on an invalid access.
Python program that causes KeyError

lookup = {"cat": 1, "dog": 2}

# The dictionary has no fish key!
print(lookup["fish"])

Output

Traceback (most recent call last):
  File "C:\programs\file.py", line 5, in <module>
    print(lookup["fish"])
KeyError: 'fish'
In-keyword.  A dictionary may (or may not) contain a specific key. Often we need to test for existence. One way to do so is with the in-keyword.

True:This keyword returns True if the key exists as part of a key-value tuple in the dictionary.

False:If the key does not exist, the in-keyword returns False. This is helpful in if-statements.

Python program that uses in

animals = {}
animals["monkey"] = 1
animals["tuna"] = 2
animals["giraffe"] = 4

# Use in.
if "tuna" in animals:
    print("Has tuna")
else:
    print("No tuna")

# Use in on nonexistent key.
if "elephant" in animals:
    print("Has elephant")
else:
    print("No elephant")

Output

Has tuna
No elephant
Len built-in.  This returns the number of key-value tuples in a dictionary. The data types of the keys and values do not matter. Len also works on lists and strings.

Caution:The length returned for a dictionary does not separately consider keys and values. Each pair adds one to the length.

Python program that uses len on dictionary

animals = {"parrot": 2, "fish": 6}

# Use len built-in on animals.
print("Length:", len(animals))

Output

Length: 2
Len notes.  Let us review. Len() can be used on other data types, not just dictionaries. It acts upon a list, returning the number of elements within. It also handles tuples.
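
Here:A short sketch of len() on several built-in types. The variable names are made up for this example.

```python
# Hypothetical sample values of different types.
animals_list = ["parrot", "fish", "cat"]
animals_tuple = ("parrot", "fish")
animal_name = "parrot"
animals_dict = {"parrot": 2, "fish": 6}

# len() counts elements, characters or key-value pairs.
print(len(animals_list))    # three list elements
print(len(animals_tuple))   # two tuple elements
print(len(animal_name))     # six characters
print(len(animals_dict))    # two key-value pairs
```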
Keys, values.  A dictionary contains keys. It contains values. And with the keys() and values() methods, we can access these elements as view objects.

Next:A dictionary of three key-value pairs is created. This dictionary could be used to store hit counts on a website's pages.

Views:We introduce two variables, named keys and values. These are not lists—but we can convert them to lists.

Python program that uses keys

hits = {"home": 125, "sitemap": 27, "about": 43}
keys = hits.keys()
values = hits.values()

print("Keys:")
print(keys)
print(len(keys))

print("Values:")
print(values)
print(len(values))

Output

Keys:
dict_keys(['home', 'sitemap', 'about'])
3
Values:
dict_values([125, 27, 43])
3
Keys, values ordering.  On Python 3.7 and later, keys() and values() return elements in insertion order (older versions gave no ordering guarantee). In the above output, the keys-view is not alphabetically sorted. Consider a sorted list (keep reading).
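
Here:A sketch of how a view differs from a list, assuming Python 3.7 or later for the ordering. The "contact" key is added only for illustration.

```python
hits = {"home": 125, "sitemap": 27, "about": 43}

# keys() returns a view, not a list.
keys = hits.keys()

# Convert the view to a list to index into it or keep a snapshot.
key_list = list(keys)
print(key_list[0])

# The view tracks later changes to the dictionary. The list does not.
hits["contact"] = 8
print(len(keys))
print(len(key_list))
```

The view reports four keys after the insert. The list still holds three.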
Sorted keys.  In a dictionary keys are not sorted in any way. On Python 3.7 and later they keep insertion order. In older versions their order reflected the internals of the hashing algorithm's buckets.

But:Sometimes we need to sort keys. We call the built-in function sorted() on the keys. This creates a new sorted list.

Python program that sorts keys in dictionary

# Same keys as the previous program.
hits = {"home": 124, "sitemap": 26, "about": 32}

# Sort the keys from the dictionary.
keys = sorted(hits.keys())

print(keys)

Output

['about', 'home', 'sitemap']
Items.  With this method we receive a view of two-element tuples. Each tuple contains, as its first element, the key. Its second element is the value.

Tip:With tuples, we can address the first element with an index of 0. The second element has an index of 1.

Program:The code uses a for-loop on the items() view. It uses the print() function with two arguments.

Python that uses items method

rents = {"apartment": 1000, "house": 1300}

# Get a view of key-value tuples.
rentItems = rents.items()

# Loop and display tuple items.
for rentItem in rentItems:
    print("Place:", rentItem[0])
    print("Cost:", rentItem[1])
    print("")

Output

Place: apartment
Cost: 1000

Place: house
Cost: 1300
Items, assign.  We cannot assign elements in the tuples. If you try to assign rentItem[0] or rentItem[1], you will get an error. This is the error message.
Python error:

TypeError: 'tuple' object does not support item assignment
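
Here:A sketch that triggers this error and then changes the values through the dictionary instead. The rent increase of 100 is an arbitrary number for the example.

```python
rents = {"apartment": 1000, "house": 1300}

for rent_item in rents.items():
    try:
        # Tuples are immutable, so this raises TypeError.
        rent_item[1] = 0
    except TypeError as error:
        print("Error:", error)

# To change a value, assign through the dictionary.
# Replacing values of existing keys while looping is safe
# because the set of keys does not change.
for place in rents:
    rents[place] = rents[place] + 100

print(rents)
```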
Items, unpack.  The items() view can be used in another for-loop syntax. We can unpack the two parts of each tuple in items() directly in the for-loop.

Here:In this example, we use the identifier "k" for the key, and "v" for the value.

Python that unpacks items

# Create a dictionary.
data = {"a": 1, "b": 2, "c": 3}

# Loop over items and unpack each item.
for k, v in data.items():
    # Display key and value.
    print(k, v)

Output

a 1
b 2
c 3
For-loop.  A dictionary can be directly enumerated with a for-loop. This accesses only the keys in the dictionary. To get a value, we will need to look up the value.

Items:We can call the items() method to get a view of key-value tuples. No extra hash lookups will be needed to access values.

Here:The plant variable, in the for-loop, is the key. The value is not available—we would need plants.get(plant) to access it.

Python that loops over dictionary

plants = {"radish": 2, "squash": 4, "carrot": 7}

# Loop over dictionary directly.
# ... This only accesses keys.
for plant in plants:
    print(plant)

Output

radish
squash
carrot
Del built-in.  How can we remove data? We apply the del statement to a dictionary entry. In this program, we initialize a dictionary with three key-value tuples.

Then:We remove the tuple with key "windows". When we display the dictionary, it now contains only two key-value pairs.

Python that uses del

systems = {"mac": 1, "windows": 5, "linux": 1}

# Remove key-value at "windows" key.
del systems["windows"]

# Display dictionary.
print(systems)

Output

{'mac': 1, 'linux': 1}
Del, alternative.  An alternative to using del on a dictionary is to change the key's value to a special value. This is a null object refactoring strategy.
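
Here:A sketch of that alternative. None is used as the special value, which is an assumption. Any agreed-upon marker would work.

```python
systems = {"mac": 1, "windows": 5, "linux": 1}

# Instead of deleting, mark the entry with a special value.
systems["windows"] = None

# The key still exists, so readers must check for the marker.
for name, count in systems.items():
    if count is None:
        continue
    print(name, count)

# The length is unchanged because no key was removed.
print(len(systems))
```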
Update.  With this method we change one dictionary to have new values from a second dictionary. Update() also modifies existing values. Here we create two dictionaries.

Pets1, pets2:The pets2 dictionary has a different value for the dog key—it has the value "animal", not "canine".

Also:The pets2 dictionary contains a new key-value pair. In this pair the key is "parakeet" and the value is "bird".

Result:Existing values are replaced with new values that match. New values are added if no matches exist.

Python that uses update

# First dictionary.
pets1 = {"cat": "feline", "dog": "canine"}

# Second dictionary.
pets2 = {"dog": "animal", "parakeet": "bird"}

# Update first dictionary with second.
pets1.update(pets2)

# Display both dictionaries.
print(pets1)
print(pets2)

Output

{'cat': 'feline', 'dog': 'animal', 'parakeet': 'bird'}
{'dog': 'animal', 'parakeet': 'bird'}
Copy.  This method performs a shallow copy of an entire dictionary. Every key-value tuple in the dictionary is copied. This is not just a new variable reference.

Here:We create a copy of the original dictionary. We then modify values within the copy. The original is not affected.

Python that uses copy

original = {"box": 1, "cat": 2, "apple": 5}

# Create copy of dictionary.
modified = original.copy()

# Change copy only.
modified["cat"] = 200
modified["apple"] = 9

# Original is still the same.
print(original)
print(modified)

Output

{'box': 1, 'cat': 2, 'apple': 5}
{'box': 1, 'cat': 200, 'apple': 9}
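
Caution:A shallow copy shares any mutable values with the original. This sketch stores a list under the "box" key (a change made for illustration) and compares copy() with copy.deepcopy() from the standard library.

```python
import copy

original = {"box": [1, 2], "cat": 2}

shallow = original.copy()
deep = copy.deepcopy(original)

# The shallow copy holds the same inner list as the original.
shallow["box"].append(3)

# The original sees the change. The deep copy does not.
print(original["box"])
print(deep["box"])
```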
Fromkeys.  This method receives a sequence of keys, such as a list. It creates a dictionary with each of those keys. We can specify a value as the second argument.

Values:If you specify the second argument to fromkeys(), each key has that value in the newly-created dictionary.

Python that uses fromkeys

# A list of keys.
keys = ["bird", "plant", "fish"]

# Create dictionary from keys.
d = dict.fromkeys(keys, 5)

# Display.
print(d)

Output

{'bird': 5, 'plant': 5, 'fish': 5}
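
Here:A sketch of two fromkeys() details. With no second argument each value is None. With a mutable second argument, every key refers to the same object. The "wren" string is just sample data.

```python
keys = ["bird", "plant", "fish"]

# With no second argument, every key maps to None.
d = dict.fromkeys(keys)
print(d)

# A mutable default is shared by all keys, not copied per key.
shared = dict.fromkeys(keys, [])
shared["bird"].append("wren")
print(shared["fish"])
```

To give each key its own list, a dictionary comprehension such as {k: [] for k in keys} can be used instead.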
Dict.  With this built-in function, we can construct a dictionary from a list of tuples. The tuples are pairs. They each have two elements, a key and a value.

Tip:This is a possible way to load a dictionary from disk. We can store (serialize) it as a list of pairs.

Python that uses dict built-in

# Create list of tuple pairs.
# ... These are key-value pairs.
pairs = [("cat", "meow"), ("dog", "bark"), ("bird", "chirp")]

# Convert list to dictionary.
lookup = dict(pairs)

# Test the dictionary.
print(lookup.get("dog"))
print(len(lookup))

Output

bark
3
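
Here:A sketch of the serialization idea from the tip above. JSON is an assumed format (the original does not name one), and the file step is left out.

```python
import json

lookup = {"cat": "meow", "dog": "bark", "bird": "chirp"}

# Serialize the dictionary as a list of pairs.
text = json.dumps(list(lookup.items()))

# ... The text could be written to a file and read back later.

# JSON has no tuple type, so each pair comes back as a two-element list.
pairs = json.loads(text)

# dict() accepts any iterable of two-element sequences.
restored = dict(pairs)
print(restored == lookup)
```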
Memoize.  One classic optimization is called memoization. And this can be implemented easily with a dictionary. In memoization, a function (def) computes its result.

And:Once the computation is done, it stores its result in a cache. In the cache, the argument is the key. And the result is the value.


Memoization, continued.  When a memoized function is called, it first checks this cache to see if it has been, with this argument, run before.

And:If it has, it returns its cached—memoized—return value. No further computations need be done.

Note:If a function is only called once with the argument, memoization has no benefit. And with many arguments, it usually works poorly.
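
Here:A minimal sketch of memoization with a dictionary. The slow_square function is a hypothetical stand-in for an expensive computation, and the stats dictionary exists only to count how often it runs.

```python
# Cache that maps an argument to its computed result.
cache = {}

# Counts how many real computations happen.
stats = {"calls": 0}

def slow_square(n):
    """Stand-in for an expensive computation."""
    stats["calls"] += 1
    return n * n

def memoized_square(n):
    # Check the cache first.
    if n in cache:
        return cache[n]
    # Not cached: compute, store, then return.
    result = slow_square(n)
    cache[n] = result
    return result

print(memoized_square(12))
print(memoized_square(12))
print("Computations:", stats["calls"])
```

The second call returns the cached value, so only one computation is counted. For real programs, functools.lru_cache in the standard library offers a ready-made version of the same idea.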


Get performance.  I compared a loop that uses get() with one that uses both the in-keyword and a second look up. Version 2, with the "in" operator, was faster.

Version 1:This version uses a second argument to get(). It tests that against the result and then proceeds if the value was found.

Version 2:This version uses "in" and then a lookup. Twice as many lookups occur. But fewer statements are executed.

Python that benchmarks get

import time

# Input dictionary.
systems = {"mac": 1, "windows": 5, "linux": 1}

# Time 1.
print(time.time())

# Get version.
i = 0
v = 0
x = 0
while i < 10000000:
    x = systems.get("windows", -1)
    if x != -1:
        v = x
    i += 1

# Time 2.
print(time.time())

# In version.
i = 0
v = 0
while i < 10000000:
    if "windows" in systems:
        v = systems["windows"]
    i += 1

# Time 3.
print(time.time())

Output

1345819697.257
1345819701.155 (get = 3.90 s)
1345819703.453 (in  = 2.30 s)
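
Tip:Timings like these depend on the Python version and the machine. A sketch with the timeit module from the standard library makes the test easy to repeat. The function names and the repetition count are arbitrary choices, and no particular result is implied.

```python
import timeit

systems = {"mac": 1, "windows": 5, "linux": 1}

def use_get():
    # Version 1: get() with a default, then a test.
    x = systems.get("windows", -1)
    if x != -1:
        return x
    return 0

def use_in():
    # Version 2: in-keyword, then a second lookup.
    if "windows" in systems:
        return systems["windows"]
    return 0

# Raise the count for steadier numbers.
count = 100000
print("get:", timeit.timeit(use_get, number=count))
print("in: ", timeit.timeit(use_in, number=count))
```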
String key performance.  In another test, I compared string keys. I found that long string keys take longer to look up than short ones. Shorter keys are faster.
Performance, loop.  A dictionary can be looped over in different ways. In this benchmark we test two approaches. We access the key and value in each iteration.

Version 1:This version loops over the keys of the dictionary with a while-loop. It then does an extra lookup to get the value.

Version 2:This version instead loops over the items() view, which yields tuples containing the keys and values. No extra lookup by key is needed.

But:Version 2 has the same effect. We access the keys and values. The cost of calling items() initially is not counted here.

Python that benchmarks loops

import time

data = {"michael": 1, "james": 1, "mary": 2, "dale": 5}
items = data.items()

print(time.time())

# Version 1: get.
i = 0
while i < 10000000:
    v = 0
    for key in data:
        v = data[key]
    i += 1

print(time.time())

# Version 2: items.
i = 0
while i < 10000000:
    v = 0
    for pair in items:
        v = pair[1]
    i += 1

print(time.time())

Output

1345602749.41
1345602764.29 (version 1 = 14.88 s)
1345602777.68 (version 2 = 13.39 s)
Benchmark, loop results.  We see above that looping over items() is faster than looping over the dictionary and looking up each value. This makes sense. With items(), no lookups by key are done.
Frequencies.  A dictionary can be used to count frequencies. Here we introduce a string that has some repeated letters. We use get() on a dictionary to start at 0 for nonexistent values.

So:The first time a letter is found, its frequency is set to 0 + 1, then 1 + 1. Get() has a default return.

Python that counts letter frequencies

# The first three letters are repeated.
letters = "abcabcdefghi"

frequencies = {}
for c in letters:
    # If no key exists, get returns the value 0.
    # ... We then add one to increase the frequency.
    # ... So we start at 1 and progress to 2 and then 3.
    frequencies[c] = frequencies.get(c, 0) + 1

for f in frequencies.items():
    # Print the tuple pair.
    print(f)

Output

('a', 2)
('b', 2)
('c', 2)
('d', 1)
('e', 1)
('f', 1)
('g', 1)
('h', 1)
('i', 1)
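
Tip:The standard library also has collections.Counter, a dict subclass built for counting. This sketch is an alternative to the get() approach above, not the method the program uses.

```python
from collections import Counter

# The first three letters are repeated.
letters = "abcabcdefghi"

# Counter counts each hashable item in the sequence.
frequencies = Counter(letters)

print(frequencies["a"])
print(frequencies["d"])

# A missing key counts as zero and does not cause a KeyError.
print(frequencies["z"])
```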
A summary.  A dictionary is usually implemented as a hash table. Here a special hashing algorithm translates a key (often a string) into an integer.
For a speedup,  this integer is used to locate the data. This reduces search time. For programs with performance trouble, using a dictionary is often the initial path to optimization.
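
Here:A simplified sketch of that idea. The bucket_count value is hypothetical, and the modulo step is only an illustration. CPython's real dictionary uses a more elaborate probing scheme and resizes its table as it grows.

```python
# Hypothetical table size for the illustration.
bucket_count = 8

for key in ["radish", "squash", "carrot"]:
    # hash() translates the key into an integer.
    h = hash(key)
    # The integer is reduced to a slot index in the table.
    index = h % bucket_count
    print(key, index)
```

String hashes are randomized per process by default, so the printed indexes can change from run to run.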