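I have a csv file whose data looks like the sample below. I need help parsing the datetimes, filling in the missing datetimes, and marking the missing data as "M" (missing):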

Datetime, Data

19920101 00:00,2
19920101 01:00,3
19920101 23:00,5
19920505 12:00,5

1 Answer

Not a complete answer, just an attempt at parsing the datetime string:

>>> import datetime
>>> s = "19920101 00:00"
>>> fmt = "%Y%m%d %H:%M"
>>> d = datetime.datetime.strptime(s, fmt)
>>> print(d)
1992-01-01 00:00:00

Does this help you work out the missing dates and times?
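
One way to get from parsed values to the gaps, as a rough sketch: subtracting two consecutive datetimes gives a timedelta, and any difference larger than the expected spacing marks a hole in the data. The one-hour spacing is an assumption read off the sample in the question, and `find_gaps` is a hypothetical helper name, not something from the original code.

```python
import datetime

FMT = "%Y%m%d %H:%M"
STEP = datetime.timedelta(hours=1)  # assumed spacing between readings


def find_gaps(stamps, step=STEP):
    """Return (previous, current) datetime pairs that are more than step apart.

    Hypothetical helper; expects the timestamp strings in chronological order.
    """
    parsed = [datetime.datetime.strptime(s, FMT) for s in stamps]
    return [(a, b) for a, b in zip(parsed, parsed[1:]) if b - a > step]


# Timestamps copied from the sample in the question.
sample = ["19920101 00:00", "19920101 01:00", "19920101 23:00", "19920505 12:00"]
for prev, cur in find_gaps(sample):
    print(prev, "->", cur, "gap of", cur - prev)
```

Each reported pair brackets a run of missing readings, so filling in becomes a matter of stepping from the first value to the second.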

I can't tell what the 3 is in the string "3 19920101 23:00".

[Edit: based on your comment]

>>> expected = d + datetime.timedelta(days=1)
>>> print(expected)
1992-01-02 00:00:00

So in your code you could try something like this (you will need to work on it and refine it):

[Edit: code replaced]

import csv
import datetime
import pprint

fmt = "%Y%m%d %H:%M"
# Expected spacing between consecutive rows. This keeps the one-day step
# used above; change it to hours=1 if the data is hourly.
step = datetime.timedelta(days=1)

all_data_points = {}   # datetime string -> value, or 'M' when missing
all_dates = []         # datetime strings in chronological order
expected = None        # datetime the next row should have

with open('datafile', 'rt') as f:
    reader = csv.reader(f)
    for row in reader:
        # Skip blank lines and the header row.
        if not row or 'Datetime' in row:
            continue
        day_str = row[0].strip()
        rain_str = row[1].strip()
        d = datetime.datetime.strptime(day_str, fmt)
        # Mark every expected datetime before this row as missing.
        while expected is not None and expected < d:
            missing_str = expected.strftime(fmt)
            all_data_points[missing_str] = 'M'
            all_dates.append(missing_str)
            expected += step
        # Record the real row, then move the expectation one step on.
        all_data_points[day_str] = rain_str
        all_dates.append(day_str)
        expected = d + step

pprint.pprint(all_data_points)
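
Since the sample in the question looks hourly, here is a minimal sketch of the same fill-in idea with a one-hour step, written as a generator so the rows come out in order. It assumes the rows are already sorted by time and that one reading per hour is expected. `fill_missing` is a hypothetical name, and the in-memory text is a shortened stand-in for the real file.

```python
import csv
import datetime
import io

FMT = "%Y%m%d %H:%M"


def fill_missing(rows, step=datetime.timedelta(hours=1), marker="M"):
    """Yield (datetime_string, value) pairs, inserting marker for absent steps.

    Hypothetical helper; rows must be (datetime_string, value) pairs sorted by time.
    """
    expected = None
    for stamp, value in rows:
        current = datetime.datetime.strptime(stamp.strip(), FMT)
        # Emit a placeholder for every step between the previous row and this one.
        while expected is not None and expected < current:
            yield expected.strftime(FMT), marker
            expected += step
        yield current.strftime(FMT), value.strip()
        expected = current + step


# Shortened stand-in for the CSV file (assumption: same layout as the question).
text = "Datetime, Data\n\n19920101 00:00,2\n19920101 01:00,3\n19920101 04:00,5\n"
reader = csv.reader(io.StringIO(text))
rows = [r for r in reader if r and r[0] != "Datetime"]

filled = list(fill_missing(rows))
for stamp, value in filled:
    print(f"{stamp},{value}")
```

To run it on a real file, replace `io.StringIO(text)` with the opened file object; the pairs can then be passed to `csv.writer(...).writerows(...)` if a filled-in file is wanted.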
answered 2012-06-22T17:08:42.547