I'm trying to write a function to cut down on the large amount of repetitive code I use to assign variables.

At the moment, this works:

from pyquery import PyQuery as pq
import pandas as pd

d = pq(filename='20160319RHIL0_edit.xml')

# from nominations
res = d('nomination')
nomID = [res.eq(i).attr('id') for i in range(len(res))]
horseName = [res.eq(i).attr('horse') for i in range(len(res))]

zipped = list(zip(nomID, horseName))

frames = pd.DataFrame(zipped)
print(frames)

It produces this output:

         0                  1
0   171115            Vergara
1   187674      Heavens Above
2   184732         Sweet Fire
3   181928            Alegria
4   158914            Piamimi
5   171408          Blendwell
6   166836     Adorabeel (NZ)
7   172933           Mary Lou
8   182533      Skyline Blush
9   171801         All Cerise
10  181079  Gust of Wind (NZ)

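However, to keep adding to this I would need to create more variables like the one below. The only parts that change are the variable name and the attribute passed to attr('horse').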

horseName = [res.eq(i).attr('horse') for i in range(len(res))]

So the logical step is to stay DRY and write a function that takes a list of attributes as its argument. This is my attempt:

from pyquery import PyQuery as pq
import pandas as pd

d = pq(filename='20160319RHIL0_edit.xml')

# from nominations
res = d('nomination')

aList = []


def inputs(args):
    '''function to get elements matching criteria'''
    optsList = ['id', 'horse']
    for item in res:
        for attrs in optsList:
            if res.attr(attrs) in item:
                aList.append([res.eq(i).attr(attrs) for i in range(len(res))])

zipped = list(zip(aList))

frames = pd.DataFrame(zipped)
print(frames)

1 Answer

attrs = ('id', 'horse', ...)

 ...

data = [[res.eq(i).attr(x) for x in attrs] for i in range(len(res))]
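
The comprehension above builds one row per nomination, with one entry per requested attribute. A minimal sketch of the same idea wrapped in a reusable function follows. So that it runs without the original XML file, it swaps pyquery for the standard library's xml.etree.ElementTree and uses a small inline sample; the helper name collect_attrs, the SAMPLE_XML string, and the enclosing meeting element are assumptions made for illustration, not part of the original code.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline sample standing in for 20160319RHIL0_edit.xml.
# The <meeting> wrapper is an assumption; only the nomination
# attributes come from the question's output.
SAMPLE_XML = """
<meeting>
  <nomination id="171115" horse="Vergara"/>
  <nomination id="187674" horse="Heavens Above"/>
  <nomination id="184732" horse="Sweet Fire"/>
</meeting>
""".strip()


def collect_attrs(elements, attrs):
    """Return one row per element, holding the requested attributes in order.

    An attribute that an element lacks comes back as None.
    """
    return [[el.get(name) for name in attrs] for el in elements]


root = ET.fromstring(SAMPLE_XML)
nominations = root.findall(".//nomination")

# Adding another column only means adding another name to this tuple.
attrs = ("id", "horse")
rows = collect_attrs(nominations, attrs)
print(rows)
```

Because the data is already organised row by row, no zip step is needed. With pandas installed, pd.DataFrame(rows, columns=list(attrs)) should give a frame whose columns are labelled with the attribute names instead of 0 and 1. With pyquery, the function body would be the comprehension from the answer, using res.eq(i).attr(x) in place of el.get(name).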
Answered 2016-04-18T07:48:04.383