sorting a python list by two criteria

本文介绍如何使用Python对列表进行基于两个排序标准的排序操作,包括直接使用itemgetter方法和通过稳定排序特性分步实现的方法。

I have the following list created from a sorted csv

list1 = sorted(csv1, key=operator.itemgetter(1))

I would actually like to sort the list by two criteria: first by the value in field 1 and then by the value in field 2. How do I do this?

share improve this question
 

4 Answers

up vote 66 down vote accepted

like this:

import operator
list1 = sorted(csv1, key=operator.itemgetter(1, 2))
share improve this answer
 
1 
+1: More elegant than mine. I forgot that itemgetter can take multiple indices. –  dappawit  Mar 6 '11 at 19:43
 
@half full: glad it help :) –  mouad  Mar 6 '11 at 19:53
6 
operator is a module that needs to be imported. –  trapicki  Aug 28 '13 at 14:45 
 
how will i proceed if i want to sort ascending on one element and descending on other, using itemgetter??. –  ashish  Oct 12 '13 at 10:13 
1 
@ashish, see my answer below with lambda functions this is clear, sort by "-x[1]" or even "x[0]+x[1]" if you wish –  jaap  Feb 27 '14 at 15:15

Replying to this dead thread for archive.

No need to import anything when using lambda functions.
The following sorts list by the first element, then by the second element.

sorted(list, key=lambda x: (x[0], -x[1]))
share improve this answer
 
 
I like this solution because you can convert strings to int for sorting like: lambda x: (x[0],int(x[1])). +1 –  pbible  Sep 10 '14 at 13:41 
4 
Nice. As you noted in comment to the main answer above, this is the best (only?) way to do multiple sorts with different sort orders. Perhaps highlight that. Also, your text does not indicate that you sorted descending on second element. –  PeterVermont  Jun 12 '15 at 14:25
 
Also what if  x[1] is date? Should I convert it to integer also? @pbible does conversion of string to int preserve the alphabetical ordering of the string? –  user1700890  Nov 20 '15 at 23:32 
1 
@user1700890 I was assuming the field was already string. It should sort strings in alphabetical order by default. You should post your own question separately on SO if it is not specifically related to the answer here or the OP's original question. –  pbible  Nov 23 '15 at 16:59
2 
@jan it's reverse sort –  jaap  Feb 28 at 23:39

Python has a stable sort, so provided that performance isn't an issue the simplest way is to sort it by field 2 and then sort it again by field 1.

That will give you the result you want, the only catch is that if it is a big list (or you want to sort it often) calling sort twice might be an unacceptable overhead.

list1 = sorted(csv1, key=operator.itemgetter(2))
list1 = sorted(list1, key=operator.itemgetter(1))

Doing it this way also makes it easy to handle the situation where you want some of the columns reverse sorted, just include the 'reverse=True' parameter when necessary.

Otherwise you can pass multiple parameters to itemgetter or manually build a tuple. That is probably going to be faster, but has the problem that it doesn't generalise well if some of the columns want to be reverse sorted (numeric columns can still be reversed by negating them but that stops the sort being stable).

So if you don't need any columns reverse sorted, go for multiple arguments to itemgetter, if you might, and the columns aren't numeric or you want to keep the sort stable go for multiple consecutive sorts.

Edit: For the commenters who have problems understanding how this answers the original question, here is an example that shows exactly how the stable nature of the sorting ensures we can do separate sorts on each key and end up with data sorted on multiple criteria:

DATA = [
    ('Jones', 'Jane', 58),
    ('Smith', 'Anne', 30),
    ('Jones', 'Fred', 30),
    ('Smith', 'John', 60),
    ('Smith', 'Fred', 30),
    ('Jones', 'Anne', 30),
    ('Smith', 'Jane', 58),
    ('Smith', 'Twin2', 3),
    ('Jones', 'John', 60),
    ('Smith', 'Twin1', 3),
    ('Jones', 'Twin1', 3),
    ('Jones', 'Twin2', 3)
]

# Sort by Surname, Age DESCENDING, Firstname
print("Initial data in random order")
for d in DATA:
    print("{:10s} {:10s} {}".format(*d))

print('''
First we sort by first name, after this pass all
Twin1 come before Twin2 and Anne comes before Fred''')
DATA.sort(key=lambda row: row[1])

for d in DATA:
    print("{:10s} {:10s} {}".format(*d))

print('''
Second pass: sort by age in descending order.
Note that after this pass rows are sorted by age but
Twin1/Twin2 and Anne/Fred pairs are still in correct
firstname order.''')
DATA.sort(key=lambda row: row[2], reverse=True)
for d in DATA:
    print("{:10s} {:10s} {}".format(*d))

print('''
Final pass sorts the Jones from the Smiths.
Within each family members are sorted by age but equal
age members are sorted by first name.
''')
DATA.sort(key=lambda row: row[0])
for d in DATA:
    print("{:10s} {:10s} {}".format(*d))

This is a runnable example, but to save people running it the output is:

Initial data in random order
Jones      Jane       58
Smith      Anne       30
Jones      Fred       30
Smith      John       60
Smith      Fred       30
Jones      Anne       30
Smith      Jane       58
Smith      Twin2      3
Jones      John       60
Smith      Twin1      3
Jones      Twin1      3
Jones      Twin2      3

First we sort by first name, after this pass all
Twin1 come before Twin2 and Anne comes before Fred
Smith      Anne       30
Jones      Anne       30
Jones      Fred       30
Smith      Fred       30
Jones      Jane       58
Smith      Jane       58
Smith      John       60
Jones      John       60
Smith      Twin1      3
Jones      Twin1      3
Smith      Twin2      3
Jones      Twin2      3

Second pass: sort by age in descending order.
Note that after this pass rows are sorted by age but
Twin1/Twin2 and Anne/Fred pairs are still in correct
firstname order.
Smith      John       60
Jones      John       60
Jones      Jane       58
Smith      Jane       58
Smith      Anne       30
Jones      Anne       30
Jones      Fred       30
Smith      Fred       30
Smith      Twin1      3
Jones      Twin1      3
Smith      Twin2      3
Jones      Twin2      3

Final pass sorts the Jones from the Smiths.
Within each family members are sorted by age but equal
age members are sorted by first name.

Jones      John       60
Jones      Jane       58
Jones      Anne       30
Jones      Fred       30
Jones      Twin1      3
Jones      Twin2      3
Smith      John       60
Smith      Jane       58
Smith      Anne       30
Smith      Fred       30
Smith      Twin1      3
Smith      Twin2      3

Note in particular how in the second step the reverse=True parameter keeps the firstnames in order whereas simply sorting then reversing the list would lose the desired order for the third sort key.

share improve this answer
 
 
Thanks that's very helpful. –  half full  Mar 6 '11 at 19:53
 
Stable sorting doesn't mean that it won't forget what your previous sorting was. This answer is wrong. –  Mike Axiak  Mar 6 '11 at 21:10
1 
Stable sorting means that you can sort by columns a, b, c simply by sorting by column c then b then a. Unless you care to expand on your comment I think it is you that is mistaken. –  Duncan  Mar 6 '11 at 21:23 
2 
This answer is definitely correct, although for larger lists it's unideal: if the list was already partially sorted, then you'll lose most of the optimization of Python's sorting by shuffling the list around a lot more. @Mike, you're incorrect; I suggest actually testing answers before declaring them wrong. –  Glenn Maynard  Mar 6 '11 at 21:39
2 
@MikeAxiak: docs.python.org/2/library/stdtypes.html#index-29 states in comment 9: Starting with Python 2.3, the sort() method is guaranteed to be stable. A sort is stable if it guarantees not to change the relative order of elements that compare equal — this is helpful for sorting in multiple passes (for example, sort by department, then by salary grade). –  trapicki  Aug 28 '13 at 14:40 
def keyfunc(x):
    return tuple(x[1],x[2])

list1 = sorted(csv1, key=keyfunc)
share improve this answer
 
1 
I don't think tuple() can receive two arguments (or rather, three, if you count with self) –  Filipe Correia Dec 12 '12 at 23:15
1 
tuple takes only can take one argument –  therealprashant  Jun 6 '15 at 11:02
【轴承故障诊断】基于融合鱼鹰和柯西变异的麻雀优化算法OCSSA-VMD-CNN-BILSTM轴承诊断研究【西储大学数据】(Matlab代码实现)内容概要:本文提出了一种基于融合鱼鹰和柯西变异的麻雀优化算法(OCSSA)优化变分模态分解(VMD)参数,并结合卷积神经网络(CNN)与双向长短期记忆网络(BiLSTM)的轴承故障诊断模型。该方法利用西储大学公开的轴承数据集进行验证,通过OCSSA算法优化VMD的分解层数K和惩罚因子α,有效提升信号分解精度,抑制模态混叠;随后利用CNN提取故障特征的空间信息,BiLSTM捕捉时间序列的动态特征,最终实现高精度的轴承故障分类。整个诊断流程充分结合了信号预处理、智能优化与深度学习的优势,显著提升了复杂工况下轴承故障诊断的准确性与鲁棒性。; 适合人群:具备一定信号处理、机器学习及MATLAB编程基础的研究生、科研人员及从事工业设备故障诊断的工程技术人员。; 使用场景及目标:①应用于旋转机械设备的智能运维与故障预警系统;②为轴承等关键部件的早期故障识别提供高精度诊断方案;③推动智能优化算法与深度学习在工业信号处理领域的融合研究。; 阅读建议:建议读者结合MATLAB代码实现,深入理解OCSSA优化机制、VMD参数选择策略以及CNN-BiLSTM网络结构的设计逻辑,通过复现实验掌握完整诊断流程,并可进一步尝试迁移至其他设备的故障诊断任务中进行验证与优化。
评论
成就一亿技术人!
拼手气红包6.0元
还能输入1000个字符
 
红包 添加红包
表情包 插入表情
 条评论被折叠 查看
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值