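I have a looping script that returns different filtered results, and I can get this data returned as an array for each filter class. However, I am not sure of the best way to join all of these arrays together.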

import mechanize
import urllib
import json
import re
import random
import datetime
from sched import scheduler
from time import time, sleep

##### Code to loop the script and set up scheduling time
s = scheduler(time, sleep)
random.seed()

##### Code to stop duplicates part 1 
userset = set()

def run_periodically(start, end, interval, func):
    event_time = start
    while event_time < end:
        s.enterabs(event_time, 0, func, ())
        event_time += interval + random.randrange(-5, 10)
    s.run()

##### Code to get the data required from the URL desired
def getData():  
    post_url = "URL OF INTEREST"
    browser = mechanize.Browser()
    browser.set_handle_robots(False)
    browser.addheaders = [('User-agent', 'Firefox')]

##### These are the parameters you've got from checking with the aforementioned tools
    parameters = {'page' : '1',
                  'rp' : '250',
                  'sortname' : 'race_time',
                  'sortorder' : 'asc'
                 }
##### Encode the parameters
    data = urllib.urlencode(parameters)
    trans_array = browser.open(post_url,data).read().decode('UTF-8')

    xmlload1 = json.loads(trans_array)
    pattern2 = re.compile('/control/profile/view/(.*)\' title=')
    pattern4 = re.compile('title=\'posted: (.*) strikes:')
    pattern5 = re.compile('strikes: (.*)\'><img src=')

    for row in xmlload1['rows']:
        cell = row["cell"]

##### defining the Keys (key is the area from which data is pulled in the XML) for use in the pattern finding/regex

        user_delimiter = cell['username']
        selection_delimiter = cell['race_horse']

        user_numberofselections = float(re.findall(pattern4, user_delimiter)[0])
        user_numberofstrikes = float(re.findall(pattern5, user_delimiter)[0])
        strikeratecalc1 = user_numberofstrikes/user_numberofselections
        strikeratecalc2 = strikeratecalc1*100
        userid_delimiter_results = (re.findall(pattern2, user_delimiter)[0])


##### Code to stop duplicates throughout the day part 2 (skips if the id is already in the userset)

        if userid_delimiter_results in userset:
            continue
        userset.add(userid_delimiter_results)

        arraym = ""
        arrayna = ""

        if strikeratecalc2 > 50 and strikeratecalc2 < 100:

            arraym0 = "System M" 
            arraym1 = "user id = ",userid_delimiter_results
            arraym2 = "percentage = ",strikeratecalc2,"%"
            arraym3 = ""
            arraym = [arraym0, arraym1, arraym2, arraym3]

        if strikeratecalc2 > 0 and strikeratecalc2 < 50:

            arrayna0 = "System NA" 
            arrayna1 = "user id = ",userid_delimiter_results
            arrayna2 = "percentage = ",strikeratecalc2,"%"
            arrayna3 = ""
            arrayna = [arrayna0, arrayna1, arrayna2, arrayna3]


getData()

run_periodically(time()+5, time()+1000000, 10, getData)

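What I want to do is return 'arraym' and 'arrayna' as one final array. However, because the script loops, the old 'arraym'/'arrayna' is overwritten on each iteration, so my attempts to produce an array containing all of the data have so far yielded only the last user ID for "System M" and the last user ID for "System NA". This is obviously because each pass through the loop overwrites the old 'arraym' and 'arrayna', but I do not know of a way around this so that all of my data accumulates in one array. Please note that I have only been coding for about two weeks in total, so there may well be some simple function that solves this.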

Kind regards, AEA

1 Answer

Without looking through that huge code snippet, in general you can do something like this:

my_array = [] # Create an empty list
for some_value in range(3):  # stands in for whatever loop you have
    my_array.append(some_value)


# At this point, my_array is a list containing some_value for each loop iteration
print(my_array)

See Python's list.append().
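
To make the difference concrete, here is a minimal sketch contrasting reassignment inside a loop with appending to a list created before the loop. The `rates` values and the names `last_only` and `accumulated` are made up for illustration and are not from the question's code.

```python
# Hypothetical sample values standing in for per-row strike rates.
rates = [62.5, 12.0, 75.0]

# Reassigning inside the loop: each pass replaces the previous value,
# so only the result of the final iteration survives.
last_only = ""
for rate in rates:
    last_only = ["System M", rate]

# Appending inside the loop: the list is created once, before the loop,
# and gains one entry per iteration.
accumulated = []
for rate in rates:
    accumulated.append(["System M", rate])

print(last_only)
print(accumulated)
```

The first pattern is what happens to 'arraym' and 'arrayna' in the question; the second is the pattern suggested here.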

So your code might look like:

#...
arraym = []
arrayna = []

for row in xmlload1['rows']:
    #...
    if strikeratecalc2 > 50 and strikeratecalc2 < 100:
        arraym.append("System M")
        arraym.append("user id = %s" % userid_delimiter_results)
        arraym.append("percentage = %s%%" % strikeratecalc2)
        arraym.append("")
    if strikeratecalc2 > 0 and strikeratecalc2 < 50:
        arrayna.append("System NA")
        arrayna.append("user id = %s" % userid_delimiter_results)
        arrayna.append("percentage = %s%%" % strikeratecalc2)
        arrayna.append("")
#...
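
If you want a single combined list rather than two, and you want it to survive across scheduled runs, one possible approach is to keep the list outside the function, the same way `userset` is kept in the question. The scheduler does not use the return value of the function it calls, so a list created inside `getData` would be lost after each run. The sketch below assumes that approach; `classify`, `collect`, `all_results` and the sample rows are hypothetical names and data, not part of the original script.

```python
def classify(user_id, strike_rate):
    """Return one record for the matching system, or None if no band matches."""
    # Same bands as the question: exactly 0, 50 and 100 match neither system.
    if 50 < strike_rate < 100:
        system = "System M"
    elif 0 < strike_rate < 50:
        system = "System NA"
    else:
        return None
    return [system,
            "user id = %s" % user_id,
            "percentage = %s%%" % strike_rate]


def collect(rows, results):
    """Append one record per matching row to the shared results list."""
    for user_id, strike_rate in rows:
        record = classify(user_id, strike_rate)
        if record is not None:
            results.append(record)
    return results


# Created once, outside the function, so it persists between runs.
all_results = []

# Two calls stand in for two scheduled runs (sample data is made up).
collect([("101", 62.5), ("102", 12.0)], all_results)
collect([("103", 50.0), ("104", 75.0)], all_results)

for record in all_results:
    print(record)
```

Storing each user as its own small list keeps the fields for one user together, which tends to be easier to work with later than one flat list of strings.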
answered 2013-06-13T01:53:07.500