
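I have a dataset of drivers' travel diaries. For each trip, the csv file contains the start time, the end time, and the day of the week. There are no dates associated with the trips.

I have now read the data into Python, with the weekday attached to each start and end time, as follows: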

time.struct_time(tm_year=1900, tm_mon=1, tm_mday=1, tm_hour=23, 
                 tm_min=45, tm_sec=0, tm_wday=0, tm_yday=1, tm_isdst=-1)

print(journey['BeginTime'][2].tm_wday, journey['BeginTime'][2].tm_hour)

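This returns 0 for Monday and 23 for the hour.

There are 11,000 of these trips, and what I want to obtain is a weekly profile of the number of cars being driven by time of day.

This can be derived by counting, for each time slot, the trips whose ['BeginTime'] to ['EndTime'] interval covers that slot. Five-minute slots are sufficient, since the data is recorded to the nearest five minutes.

Is there an elegant Pythonic way to do this? Something like: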

for fiveMinutes in Week:
    count = 0
    for trip in range(len(journey['BeginTime'])):
        if (journey['BeginTime'][trip] == fiveMinutes
                or (journey['BeginTime'][trip] < fiveMinutes
                    and journey['EndTime'][trip] > fiveMinutes)):
            count = count + 1
    carCount[fiveMinutes] = count

1 Answer


If it helps, here is one idea...

from datetime import datetime, timedelta

# This does not check for crossing from Sunday to Monday
def convert_dt(start_dt, journey):
    begin_weekday, begin_hour, begin_minute = journey[0]
    end_weekday, end_hour, end_minute = journey[1]

    begin_dt = start_dt + timedelta(days=begin_weekday)
    begin_dt += timedelta(hours=begin_hour, minutes=begin_minute)

    end_dt = start_dt + timedelta(days=end_weekday)
    end_dt += timedelta(hours=end_hour,minutes=end_minute)
    return (begin_dt, end_dt)

def get_slot_journeys(start_dt, journeys):           
    next_dt = start_dt
    slot_count = 60 // 5 * 24 * 7  # five-minute slots in a week; integer division keeps range() happy
    slot_dict = {}

    journey_dts = []
    #convert journey begin and end to datetimes
    for index in range(len(journeys['begin_weekday'])):
        next_journey = [(journeys['begin_weekday'][index],
                         journeys['begin_hour'][index],
                         journeys['begin_minute'][index],),
                        (journeys['end_weekday'][index],
                         journeys['end_hour'][index],
                         journeys['end_minute'][index],)
                       ]
        journey_dts.append(convert_dt(start_dt, next_journey))

    for slot in range(slot_count):
        slot_dict[next_dt] = 0
        for journey_start, journey_end in journey_dts:
            if next_dt >= journey_start and next_dt <= journey_end:
                slot_dict[next_dt] = slot_dict[next_dt] + 1                    

        next_dt += timedelta(minutes=(5))

    return slot_dict

if __name__ == "__main__":
    start_dt = datetime(2012, 1, 2, 0, 0)    

    journeys = {'begin_weekday': [0, 0],
                'begin_hour': [14, 18],
                'begin_minute': [20, 30],
                'end_weekday': [0, 1],
                'end_hour': [19, 12],
                'end_minute': [15, 55],
               }
    slot_dict = get_slot_journeys(start_dt, journeys)       
    # Print only the slots with at least one journey in progress, in time order.
    for key in sorted(slot_dict):
        if slot_dict[key]:
            print(key, slot_dict[key])
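
The loop above compares every slot against every journey, and as noted it does not handle a trip that crosses from Sunday into Monday. A possible alternative, offered only as a rough sketch, is to map each time straight to an integer slot index and use a running total over a difference array. It assumes the inputs are `time.struct_time` values like the one in the question, that no trip lasts longer than a week, and that a trip covers the half-open range from its begin slot up to (but not including) its end slot, which matches the question's pseudocode rather than the inclusive test above. The names `make_time`, `slot_index` and `cars_per_slot` are made up for illustration.

```python
import time

SLOT_MINUTES = 5
SLOTS_PER_WEEK = 7 * 24 * 60 // SLOT_MINUTES  # 2016 five-minute slots


def make_time(wday, hour, minute):
    """Hypothetical helper: build a struct_time shaped like the question's data."""
    return time.struct_time((1900, 1, 1, hour, minute, 0, wday, 1, -1))


def slot_index(t):
    """Map a struct_time to its slot within the week (Monday 00:00 is slot 0)."""
    return (t.tm_wday * 24 * 60 + t.tm_hour * 60 + t.tm_min) // SLOT_MINUTES


def cars_per_slot(begin_times, end_times):
    """Return a list with the number of journeys in progress in each slot."""
    # One extra entry so that "ends at the end of the week" is a valid index.
    diff = [0] * (SLOTS_PER_WEEK + 1)
    for begin, end in zip(begin_times, end_times):
        b, e = slot_index(begin), slot_index(end)
        if e < b:
            # Assumed to wrap past Sunday midnight: split into two ranges.
            diff[b] += 1
            diff[SLOTS_PER_WEEK] -= 1
            diff[0] += 1
            diff[e] -= 1
        else:
            diff[b] += 1
            diff[e] -= 1
    # The running sum of the differences gives the count for each slot.
    counts, running = [], 0
    for delta in diff[:SLOTS_PER_WEEK]:
        running += delta
        counts.append(running)
    return counts
```

With this layout, `cars_per_slot(journey['BeginTime'], journey['EndTime'])` would give one count per five-minute slot, and each trip costs a constant amount of work regardless of its length. Note that a trip whose begin and end fall in the same slot contributes nothing under the half-open convention, so that case may need special handling.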
answered 2012-05-29T19:48:23.490