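Goal: I need to parse a file with the input shown below and get the location of the latest build, that is, the record whose Status is Approved and whose BuildDate timestamp is the most recent. I have provided sample input and output. I have working code in Perl, but I am new to Python; please suggest how to implement this in Python.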

Build:          M1234BAAAANAAW9321.1
Location:       \\dreyers\builds468\INTEGRATION\M1234BAAAANAA9321.1
Comments:       Build completed, labeled, and marked for retention.
Status:         Approved
BuildDate:      10/25/2012 12:51:25

Build:          M1234BAAAANAAW9321.2
Location:       \\crmbld01\Builds\FAILED\M1234BAAAANAA9321.2
Comments:       The build is currently in a failed status.
Status:         Failed
BuildDate:      10/25/2012 19:37:17

Build:          M1234BAAAANAAW9321.3
Location:       \\freeze\builds427\INTEGRATION\M1234BAAAANAA9321.3
Comments:       Build completed, labeled, and marked for retention.
Status:         Approved
BuildDate:      10/25/2012 19:43:28



$ echo 'Build:          M1234BAAAANAAW9321.1
Location:       \\dreyers\builds468\INTEGRATION\M1234BAAAANAAW9321.1
Comments:       Build completed, labeled, and marked for retention.
Status:         Approved
BuildDate:      10/25/2012 12:51:25

Build:          M1234BAAAANAAW9321.2
Location:       \\crmbld01\Builds\FAILED\M1234BAAAANAAW9321.2
Comments:       The build is currently in a failed status.
Status:         Failed
BuildDate:      10/25/2012 19:37:17

Build:          M1234BAAAANAAW9321.3
Location:       \\freeze\builds427\INTEGRATION\M1234BAAAANAAW9321.3
Comments:       Build completed, labeled, and marked for retention.
Status:         Approved
BuildDate:      10/25/2012 19:43:28
' | perl -e'

local $/ = "";
my ( $build_date, $location );
while ( <> ) {
next unless /status:\s+approved/i;
my $date = sprintf "%04d%02d%02d%02d%02d%02d", ( /builddate:\s+(\d+)\D+(\d+)\D+(\d+)\D+(\d+)\D+(\d+)\D+(\d+)/i )[ 2, 0, 1, 3, 4, 5 ];
if ( !defined $build_date || $build_date lt $date ) {
    ( $build_date, $location ) = ( $date, /location:\s+(.+)/i );
    }
}

print "$location\n";
'
\\freeze\builds427\INTEGRATION\M1234BAAAANAAW9321.3


for line_info in lines:


    if line_info.find('Location') == 0:
        # Build Location
        print "  Found Build location"
        logString += "  Found Build location\n"
        location = line_info.split(" ")
        location1 = location[len(location)-1]
    elif line_info.find('Status') == 0:
        # Status
        print "  Found Status"
        logString += "  Found Status\n"
        status = line_info.split(" ")
        status1 = status[1].strip()
        if status1 != "Approved"
        goto .start
    elif line_info.find('BuildDate') == 0:
        # Main Make
        print "  Found BuildDate"
        logString += "  Found BuildDate\n"
        builddate1 = line_info.split(" ")
        builddate1 = builddate1[1]
        #if builddate1 > 

2 Answers


You could split the problem into a series of simpler steps:

  1. read input line by line
  2. collect 'name: value' pairs
  3. group pairs into records (each record starts with 'Build' pair)
  4. select 'Approved' records that have non-blank 'BuildDate', 'Location' values
  5. find the latest record using given date/time format


#!/usr/bin/env python
import sys
from datetime  import datetime
from itertools import groupby

# find all 'name: value' pairs
file = sys.stdin
pairs = ([s.strip() for s in line.partition(':')[::2]]
         for line in file if ':' in line)

# group records
def record_start(pair, count=[False]):
    """Mark start of a record."""
    if pair[0] == 'Build':
        count[0] = not count[0]
    return count[0]
records = (dict(record) for _, record in groupby(pairs, record_start))

approved = (r for r in records if r.get('Status') == 'Approved' and
            all(r.get(name) for name in "BuildDate Location".split()))

# find latest record
def get_date(record):
    """Parse a record's BuildDate; invalid dates sort before all valid ones."""
    try:
        return datetime.strptime(record['BuildDate'], '%m/%d/%Y %H:%M:%S')
    except ValueError:
        return datetime.min # handle invalid date strings

latest = max(approved, key=get_date)
assert get_date(latest) != datetime.min
print(latest['Location'])
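
The same steps can also be sketched without the groupby trick, by splitting the input on blank lines first, much like the paragraph mode (`$/ = ""`) used in the Perl one-liner. This is only a minimal sketch under some assumptions: records are separated by at least one blank line, and the names `parse_records`, `latest_approved_location` and `DATE_FORMAT` are made up for illustration.

```python
import re
from datetime import datetime

# Assumed timestamp layout, taken from the sample input (month/day/year)
DATE_FORMAT = "%m/%d/%Y %H:%M:%S"


def parse_records(text):
    """Yield one dict per blank-line-separated block of 'name: value' lines."""
    for block in re.split(r"\n\s*\n", text.strip()):
        record = {}
        for line in block.splitlines():
            # Split on the first colon only, so the time in BuildDate survives
            name, sep, value = line.partition(":")
            if sep:
                record[name.strip()] = value.strip()
        if record:
            yield record


def latest_approved_location(text):
    """Return the Location of the newest Approved record, or None if none."""
    best_date, best_location = None, None
    for record in parse_records(text):
        if record.get("Status") != "Approved":
            continue
        try:
            date = datetime.strptime(record.get("BuildDate", ""), DATE_FORMAT)
        except ValueError:
            continue  # skip records with a missing or malformed date
        if best_date is None or date > best_date:
            best_date, best_location = date, record.get("Location")
    return best_location
```

Calling `latest_approved_location(open(path).read())` on a file with the question's layout would then return the location string, where `path` stands for wherever the build log lives.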


answered 2012-11-03T07:47:00.683

Try this code:

import datetime

# `text` is assumed to hold the whole contents of the input file as one string
lastLocation = None
lastTime = None
skip = False

bestLocation = None
bestTime = None

for line in text.split('\n'):
    if line.find('Location') == 0:
        # Build Location: start of a new record's data, so reset the skip flag
        skip = False
        print("  Found Build location")
        lastLocation = line.split(":", 1)[1].strip()
    elif line.find('Status') == 0:
        # Status: ignore the BuildDate that follows unless the build is Approved
        print("  Found Status")
        status1 = line.split(":", 1)[1].strip()
        if status1 != "Approved":
            skip = True
    elif line.find('BuildDate') == 0 and not skip:
        # BuildDate: keep this record if it is newer than the best seen so far
        print("  Found BuildDate")
        timeStr = line.split(":", 1)[1].strip()
        lastTime = datetime.datetime.strptime(timeStr, "%m/%d/%Y %H:%M:%S")
        if bestTime is None or bestTime < lastTime:
            bestTime = lastTime
            bestLocation = lastLocation
# Print the location of the newest approved build, not merely the last one read
print(bestLocation)
answered 2012-11-03T07:48:54.850