I have 98,000 US home street addresses that I need to sort in 'walking' order, i.e. in the order you would walk them: down one side of the street, then across the street and back along the other side.

import pandas as pd
df = pd.read_excel('c:pdsort.xlsx')

# add boolean column for even or odd on number column
is_even = df.loc[:,'number'] % 2 == 0
df.loc[:, 'even'] = is_even

# group and then sort by number
df.groupby(['town','street','even']).apply(lambda x: x.sort_values('number'))

# sort odd numbers ascending and even numbers descending

Desired result: sort the odd street numbers ascending, then switch to a descending sort for the even ones. [Sorry, this is my first Stack Overflow question, so I don't yet qualify to post an image of the Jupyter notebook.]

The df has 4 columns: number, street, town, even

desired outcome for column 'number': 1231 1233 1235 1237 1239 1238 1236 1234 1232 1230

2 Answers

With numpy.lexsort you can define the sequence of series to sort by. Data from @smj.

Setup

import pandas as pd
import numpy as np

number_list = list(range(1, 11))

df = pd.DataFrame({'town': sorted(['Springfield', 'Shelbyville'] * 10),
                   'street': sorted(['Evergreen Terrace', 'Main Street'] * 10),
                   'number': number_list + number_list})

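Solution

Be careful with the ordering. np.lexsort works from the last element of the sequence backwards; i.e. s1 has the highest sort priority and s4 the lowest.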

s1 = df['town']
s2 = df['street']
s3 = ~df['number'] % 2                          # 1 if the number is even, 0 if odd
s4 = np.where(s3, -df['number'], df['number'])  # negate even numbers so they sort descending

res = df.iloc[np.lexsort((s4, s3, s2, s1))]
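
To make the key priority concrete, here is a minimal sketch on two made-up arrays (the names `primary` and `secondary` are only for illustration and are not part of the answer above). It shows that the last key passed to np.lexsort decides the order first, and earlier keys only break ties.

```python
import numpy as np

# Hypothetical toy keys, chosen only to show the priority rule.
primary = np.array([1, 0, 1, 0])
secondary = np.array([3, 2, 1, 0])

# The LAST key in the tuple is the primary sort key;
# the earlier key is only consulted when the primary values tie.
order = np.lexsort((secondary, primary))

print(order)
print(list(zip(primary[order], secondary[order])))
```

The returned array holds positions, which is why the solution feeds it to df.iloc rather than df.loc.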

Result

print(res)

           town             street  number
0   Shelbyville  Evergreen Terrace       1
2   Shelbyville  Evergreen Terrace       3
4   Shelbyville  Evergreen Terrace       5
6   Shelbyville  Evergreen Terrace       7
8   Shelbyville  Evergreen Terrace       9
9   Shelbyville  Evergreen Terrace      10
7   Shelbyville  Evergreen Terrace       8
5   Shelbyville  Evergreen Terrace       6
3   Shelbyville  Evergreen Terrace       4
1   Shelbyville  Evergreen Terrace       2
10  Springfield        Main Street       1
12  Springfield        Main Street       3
14  Springfield        Main Street       5
16  Springfield        Main Street       7
18  Springfield        Main Street       9
19  Springfield        Main Street      10
17  Springfield        Main Street       8
15  Springfield        Main Street       6
13  Springfield        Main Street       4
11  Springfield        Main Street       2
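
If you would rather stay inside pandas, the same idea can probably be written with temporary key columns and sort_values. This is a sketch under assumptions, not the answer's method: the function name `walking_order` and the helper columns `_is_even` and `_signed` are made up here, and it assumes the frame has 'town', 'street' and an integer 'number' column as in the setup above.

```python
import pandas as pd

def walking_order(df):
    """Sort by town and street, odd numbers ascending, then even numbers descending."""
    is_even = df['number'] % 2 == 0
    # Keep odd numbers as they are and negate even ones, so that an
    # ascending sort lists the even numbers in descending order.
    signed = df['number'].where(~is_even, -df['number'])
    keyed = df.assign(_is_even=is_even, _signed=signed)
    ordered = keyed.sort_values(['town', 'street', '_is_even', '_signed'])
    # Drop the temporary helper columns again.
    return ordered.drop(columns=['_is_even', '_signed'])

# Same sample data as in the setup.
number_list = list(range(1, 11))
df = pd.DataFrame({'town': sorted(['Springfield', 'Shelbyville'] * 10),
                   'street': sorted(['Evergreen Terrace', 'Main Street'] * 10),
                   'number': number_list + number_list})

res = walking_order(df)
print(res)
```

Here sort_values lists the highest-priority column first, the opposite of the np.lexsort convention.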
answered 2018-06-20T22:39:32.843

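If I've understood correctly, here is my attempt. I'm sure this could be done in a lambda function, but it helps to lay the logic out in a verbose way :)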

import pandas as pd
import numpy as np

number_list = list(range(1, 11))

data = pd.DataFrame(
    {
        'town': sorted(['Springfield', 'Shelbyville'] * 10),
        'street': sorted(['Evergreen Terrace', 'Main Street'] * 10),
        'number': number_list + number_list
    }
)

data['is_even'] = data['number'] % 2 == 0

# Collect each sorted group, then concatenate once
# (DataFrame.append was removed in pandas 2.0).
pieces = []

for key, data_group in data.groupby(['town', 'street', 'is_even']):
    # key[2] is the is_even flag: even numbers descending, odd numbers ascending
    pieces.append(data_group.sort_values('number', ascending=not key[2]))

final = pd.concat(pieces).drop('is_even', axis=1)

final

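This gives, for each town and street, the odd numbers in ascending order followed by the even numbers in descending order.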
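
With 98,000 rows it may be worth checking the result rather than eyeballing it. Below is a small sketch of such a check in plain Python; the helper name `is_walking_order` is hypothetical and it assumes the house numbers of one street are passed in as a list of integers.

```python
def is_walking_order(numbers):
    """Return True if numbers are odd-ascending followed by even-descending."""
    numbers = list(numbers)
    odds = [n for n in numbers if n % 2 == 1]
    evens = [n for n in numbers if n % 2 == 0]
    # The expected walk: up the odd side, then back down the even side.
    return numbers == sorted(odds) + sorted(evens, reverse=True)

# The desired outcome from the question passes the check.
desired = [1231, 1233, 1235, 1237, 1239, 1238, 1236, 1234, 1232, 1230]
print(is_walking_order(desired))

# A plain ascending sort does not.
print(is_walking_order(sorted(desired)))
```

To use it on the sorted frame, you could call it once per street, for example on the 'number' column of each (town, street) group.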

answered 2018-06-20T22:26:49.350