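I have the following input, and I want to parse it into comma-separated (CSV) rows. I can already extract the SKUs with a regular expression, but I am new to regex parsing and don't know how to write more complex patterns. Any help would be appreciated.
Thanks!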
charset="iso-8859-1"
BODY {
}
TD {
}
TH {
}
H1 {
}
TABLE,IMG,A {
}
**PO Number:** 35102
**Ship To:**
Georgie Clements
6902 Stonegate Drive
Odessa, TX 79765
432-363-8459
SKU
Product
Qty
JJ-Rug-Zebra-PK
Zebra Pink Rug
1
JJ-Zebra-PK-Twin-4
Zebra Pink 4 Piece Twin Comforter Set
1
JJ-TwinSheets-Zebra-PK
Zebra Pink 3 Piece Twin Sheet Set
1
JJ-Memo-Zebra-PK
Zebra Pink Memory Board
1
I would like the output formatted like this:
PONumber, Shipping info, SKU, Product, Qty
'35102', '[ShipToAddress]', 'JJ-Rug-Zebra-PK', 'Zebra Pink Rug', '1'
'35102', '[ShipToAddress]', 'JJ-Zebra-PK-Twin-4', 'Zebra Pink 4 Piece Twin Comforter Set', '1'
'35102', '[ShipToAddress]', 'JJ-TwinSheets-Zebra-PK', 'Zebra Pink 3 Piece Twin Sheet Set', '1'
'35102', '[ShipToAddress]', 'JJ-Memo-Zebra-PK', 'Zebra Pink Memory Board', '1'
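For reference, one way a row in this layout might be produced once the fields are known. This is only a sketch: `format_row` is a hypothetical helper name, and doubling embedded single quotes is an assumed escaping convention that the sample data does not actually exercise.

```python
def format_row(fields):
    """Quote each field with single quotes and join with ', ' (hypothetical helper)."""
    # Assumption: an embedded single quote is escaped by doubling it.
    quoted = ["'" + str(f).replace("'", "''") + "'" for f in fields]
    return ", ".join(quoted)


row = ["35102", "[ShipToAddress]", "JJ-Rug-Zebra-PK", "Zebra Pink Rug", "1"]
print(format_row(row))
```

The standard `csv` module only accepts a one-character delimiter, so the `', '` separator (comma plus space) shown above is joined by hand here.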
My current code is:
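import re

# msgStr is assumed to already hold the message text shown above.
pattern = re.compile(r'(\b\w*JJ-\S*)')  # any token containing "JJ-"
pos = 0
while True:
    match = pattern.search(msgStr, pos)
    if not match:
        break
    a = match.start()
    e = match.end()
    # Print the start offset, inclusive end offset, and the matched SKU.
    print(' %2d : %2d = %s' % (a, e - 1, msgStr[a:e]))
    pos = e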
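A possible direction, sketched under assumptions taken only from the sample above: every SKU contains `JJ-`, each item occupies three consecutive non-empty lines (SKU, product, quantity), and the shipping block is whatever lies between the `Ship To:` line and the `SKU` header. The names `parse_order`, `PO_RE`, `SKU_RE`, and `SAMPLE` are hypothetical and introduced only for illustration.

```python
import re

# Assumption: the PO number follows "PO Number:" with optional "**" markup.
PO_RE = re.compile(r'PO Number:\*{0,2}\s*(\d+)')
# Same idea as the pattern above, anchored to a whole line.
SKU_RE = re.compile(r'^\w*JJ-\S*$')


def parse_order(msg_str):
    """Return a list of (po, ship_to, sku, product, qty) tuples."""
    lines = [ln.strip() for ln in msg_str.splitlines() if ln.strip()]

    po_match = PO_RE.search(msg_str)
    po_number = po_match.group(1) if po_match else ''

    # Shipping info: the lines between "Ship To:" and the "SKU" header.
    ship_idx = next((i for i, ln in enumerate(lines) if 'Ship To:' in ln), None)
    sku_idx = next((i for i, ln in enumerate(lines) if ln == 'SKU'), None)
    ship_to = ''
    if ship_idx is not None and sku_idx is not None and ship_idx < sku_idx:
        ship_to = ', '.join(lines[ship_idx + 1:sku_idx])

    rows = []
    i = 0
    while i + 2 < len(lines):
        # An item is a SKU line, then a product line, then a numeric quantity.
        if SKU_RE.match(lines[i]) and lines[i + 2].isdigit():
            rows.append((po_number, ship_to, lines[i], lines[i + 1], lines[i + 2]))
            i += 3
        else:
            i += 1
    return rows


SAMPLE = """**PO Number:** 35102
**Ship To:**
Georgie Clements
6902 Stonegate Drive
Odessa, TX 79765
432-363-8459
SKU
Product
Qty
JJ-Rug-Zebra-PK
Zebra Pink Rug
1
JJ-Zebra-PK-Twin-4
Zebra Pink 4 Piece Twin Comforter Set
1
"""

for row in parse_order(SAMPLE):
    print(row)
```

Working line by line rather than with one large regular expression keeps each pattern small, which may be easier to adjust if the real messages differ from this sample.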