0

I have a file in the following format:

/* No comment provided by engineer. */
"Logout Successful!" = "Logout Successful!";

/* No comment provided by engineer. */
"London" = "London";

/* No comment provided by engineer. */
"Low Balance" = "Low Balance";

/* No comment provided by engineer. */
"Low-Cost Call" = "Low-Cost Call";

/* No comment provided by engineer. */
"Making A Low Cost Call" = "Making A Low Cost Call";

/* No comment provided by engineer. */
"Making FREE Calls" = "Making FREE Calls";

/* No comment provided by engineer. */
"MNO" = "MNO";

/* No comment provided by engineer. */
"more free credit" = "more free credit";

/* No comment provided by engineer. */
"My Phone Number" = "My Phone Number";

/* No comment provided by engineer. */
"My Purchase is Missing" = "My Purchase is Missing";

/* No comment provided by engineer. */
"Next" = "Next";

/* No comment provided by engineer. */
"NO" = "NO";

/* No comment provided by engineer. */
"No" = "No";

/* No comment provided by engineer. */
"No Balance" = "No Balance";

/* No comment provided by engineer. */
"Post Successful" = "Post Successful";

/* No comment provided by engineer. */
"Post to %d %@ Facebook Wall" = "Post to %1$d %2$@ Facebook Wall";

/* No comment provided by engineer. */
"Post to Facebook Wall" = "Post to Facebook Wall";

/* No comment provided by engineer. */
"Post To My Facebook Wall" = "Post To My Facebook Wall";

/* No comment provided by engineer. */
"Post to My Wall" = "Post to My Wall";

/* No comment provided by engineer. */
"Posted" = "Posted";

/* No comment provided by engineer. */
"Posting" = "Posting";

/* No comment provided by engineer. */
"Posting to Your Facebook Wall..." = "Posting to Your Facebook Wall...";

/* No comment provided by engineer. */
"PQRS" = "PQRS";

/* No comment provided by engineer. */
"Proceed" = "Proceed";

/* No comment provided by engineer. */
"Proceed, Don't Show Again" = "Proceed, Don't Show Again";

/* No comment provided by engineer. */
"Processing..." = "Processing...";

/* No comment provided by engineer. */
"Purchase History" = "Purchase History";

/* No comment provided by engineer. */
"Rates" = "Rates";

/* No comment provided by engineer. */
"Remind me later" = "Remind me later";

/* No comment provided by engineer. */
"Restart" = "Restart";

/* No comment provided by engineer. */
"Retry Failed" = "Retry Failed";

/* No comment provided by engineer. */
"Return to %@ after each call ends" = "Return to %@ after each call ends";

/* No comment provided by engineer. */
"Return To App After Call" = "Return To App After Call";

/* No comment provided by engineer. */
"Roaming Support" = "Roaming Support";

/* No comment provided by engineer. */
"Roaming Warning!" = "Roaming Warning!";

/* No comment provided by engineer. */
"Searching..." = "Searching...";

/* No comment provided by engineer. */
"See The Time In Any Country" = "See The Time In Any Country";

/* No comment provided by engineer. */
"Select All" = "Select All";

/* No comment provided by engineer. */
"Select the number for an iPhone with %@" = "Select the number for an iPhone with %@";

/* No comment provided by engineer. */
"Send" = "Send";

/* No comment provided by engineer. */
"Send a Text Message" = "Send a Text Message";

/* No comment provided by engineer. */
"Sending..." = "Sending...";

/* No comment provided by engineer. */
"Settings" = "Settings";

/* No comment provided by engineer. */
"Show All" = "Show All";

/* No comment provided by engineer. */
"Show Me How" = "Show Me How";

/* No comment provided by engineer. */
"Show Selected" = "Show Selected";

/* No comment provided by engineer. */
"Sign In" = "Sign In";

/* No comment provided by engineer. */
"Signing in..." = "Signing in...";

/* No comment provided by engineer. */
"Skip" = "Skip";

/* No comment provided by engineer. */
"SMS" = "SMS";

/* No comment provided by engineer. */
"Speed Dial & Favorites" = "Speed Dial & Favorites";

/* No comment provided by engineer. */
"Store" = "Store";

/* No comment provided by engineer. */
"Success" = "Success";

/* No comment provided by engineer. */
"Success!" = "Success!";

/* No comment provided by engineer. */
"Support" = "Support";

/* No comment provided by engineer. */
"System Status" = "System Status";

/* No comment provided by engineer. */
"Tapjoy Offers" = "Tapjoy Offers";

/* No comment provided by engineer. */
"Tell %d Friend%@" = "Tell %1$d Friend%2$@";

/* No comment provided by engineer. */
"Tell Facebook Friends" = "Tell Facebook Friends";

/* No comment provided by engineer. */
"Tell Friends" = "Tell Friends";

/* No comment provided by engineer. */
"Tell Friends About %@" = "Tell Friends About %@";

/* No comment provided by engineer. */
"Tell via E-Mail" = "Tell via E-Mail";

/* No comment provided by engineer. */
"Tell via SMS" = "Tell via SMS";

/* No comment provided by engineer. */
"Test Call" = "Test Call";

/* No comment provided by engineer. */
"Text Message" = "Text Message";

/* No comment provided by engineer. */
"Try Again" = "Try Again";

/* No comment provided by engineer. */
"Turning Caller ID ON/OFF" = "Turning Caller ID ON/OFF";

/* No comment provided by engineer. */
"TUV" = "TUV";

/* No comment provided by engineer. */
"Tweet to Friends" = "Tweet to Friends";

/* No comment provided by engineer. */
"Unable to Call" = "Unable to Call";

/* No comment provided by engineer. */
"Unable to Check Talk Time" = "Unable to Check Talk Time";

/* No comment provided by engineer. */
"Unable to connect." = "Unable to connect.";

/* No comment provided by engineer. */
"Unable to Create Account" = "Unable to Create Account";

/* No comment provided by engineer. */
"Unable to Purchase" = "Unable to Purchase";

/* No comment provided by engineer. */
"Unable to Sign In" = "Unable to Sign In";

/* No comment provided by engineer. */
"Unknown" = "Unknown";

/* No comment provided by engineer. */
"unknown caller" = "unknown caller";

/* No comment provided by engineer. */
"Unselect All" = "Unselect All";

/* No comment provided by engineer. */
"Updating Your Phone Number" = "Updating Your Phone Number";

/* No comment provided by engineer. */
"VoIP %@" = "VoIP %@";

/* No comment provided by engineer. */
"WARNING!" = "WARNING!";

I want to parse it with a regular expression so that only the keys and values, without the surrounding quotes, end up in a dictionary:

import re

def load_replacement_dict(file_name):
    with open(file_name, 'r') as f:
        content = f.read()
        resultDict = {}

        # group 1 is the key, group 2 is the value, both without quotes
        dictionary_regex = re.compile(r'"([^"]*)" = "([^"]*)"')

        for result in dictionary_regex.finditer(content):
            resultDict[result.group(1)] = result.group(2)

        for key, value in resultDict.items():
            print (key+" = "+value).decode('utf-8')

        return resultDict

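The first subgroup matches, but as soon as I add anything after it, the pattern no longer matches. I tried a literal space and \s, and nothing seems to match the whitespace around the equals sign. What am I missing here?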

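Edit: I found that if I remove the Unicode byte order mark from the beginning of the file, the regex works. Obviously that is not a solution, but maybe it is a clue as to how to modify the regex?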

5 Answers

5

It seems to me that what you are trying to do is easier with string methods than with a regex:

>>> s = '"A Key With \"quotes\" in it" = " Another Value "'
>>> l,r = [v.strip().strip('"').strip() for v in s.split('=')]
>>> l,r
 ('A Key With "quotes" in it', 'Another Value')

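The escapes would be preserved; they are lost above only because of the way I created the string. When I read the text from a file, this is what happens: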

In [1]: lines = open('x.txt').read().splitlines()

In [2]: for s in lines: print [v.strip().strip('"').strip() for v in s.split('=')]
   ...: 
['Some Key', 'Some Value']
['Another Key', 'Another Value']
['A Key With \\"quotes\\" in it', 'Another Value']
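
One caveat with splitting on every `=`: a value that itself contains an equals sign produces more than two pieces, so the two-name unpacking fails. A minimal sketch of one way around that, assuming every entry follows the exact `"key" = "value";` layout shown in the question. The helper name `parse_line` is hypothetical, and the code is written for Python 3:

```python
def parse_line(line):
    """Split one '"key" = "value";' line into (key, value), or return None."""
    line = line.strip()
    # Blank lines and /* ... */ comment lines do not start with a quote.
    if not line.startswith('"'):
        return None
    # partition() splits on the first '" = "' only, so an '=' inside
    # the value is left alone.
    left, sep, right = line.partition('" = "')
    if not sep:
        return None
    key = left[1:]  # drop the opening quote of the key
    value = right
    # Drop the closing quote, with or without the trailing semicolon.
    for suffix in ('";', '"'):
        if value.endswith(suffix):
            value = value[:-len(suffix)]
            break
    return key, value


print(parse_line('"Low Balance" = "Low Balance";'))
print(parse_line('"Formula" = "x = y";'))
```

Escaped quotes inside a key or value are kept as they appear in the file, just as in the session above.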
answered 2013-06-17T16:49:37.673
3

To avoid problems with escaped quotes, you can use this:

"((?:[^"]+|(?<=\\)")*)" = "((?:[^"]+|(?<=\\)")*)"
answered 2013-06-17T16:47:13.437
1

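You are not checking for the quotes around the value in your regex, so it cannot match. Also, to handle escaped quotes in the key or the value, I believe this should cover it: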

dictionary_regex = re.compile(r'"((?:(?:\\")|[^"])*)" = "((?:(?:\\")|[^"])*)"')
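
Here is a quick way to try that pattern, as a sketch only. The entry with escaped quotes is made up for illustration (nothing like it appears in the posted file), and the code is written for Python 3:

```python
import re

# Same pattern as above: each character inside the quotes is either an
# escaped quote (\") or any character that is not a quote.
PAIR_RE = re.compile(r'"((?:(?:\\")|[^"])*)" = "((?:(?:\\")|[^"])*)"')

sample = r'''/* No comment provided by engineer. */
"Low Balance" = "Low Balance";

/* hypothetical entry with escaped quotes */
"Say \"Hi\"" = "Say \"Hi\"";
'''

# The captured text still contains the backslashes from the file.
pairs = dict(m.groups() for m in PAIR_RE.finditer(sample))

# Optional extra step: turn \" back into a plain quote.
unescaped = {k.replace('\\"', '"'): v.replace('\\"', '"')
             for k, v in pairs.items()}

for key, value in unescaped.items():
    print(key, '=', value)
```

Note that the regex only recognises the escapes; removing the backslashes is a separate step, shown here with `str.replace`.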
answered 2013-06-17T17:01:59.023
1

It turned out to be an encoding problem. The file is UTF-16. Once I added:

with codecs.open(file_name, 'r', 'utf-16') as f:

the regex worked fine.
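
This probably also explains why the first group matched on its own but nothing after it did. When a UTF-16 file is read as raw bytes, every ASCII character comes with a NUL byte next to it. `[^"]*` happily consumes those NULs, but the bytes of `" = "` never sit next to each other. That is my reading of the symptom, not something I verified against the original file. A minimal sketch of the whole loader, written for Python 3 (where the built-in `open` takes an `encoding` argument, so `codecs` is not needed) and assuming the file starts with a byte order mark:

```python
import re

# Same pattern as in the question: group 1 is the key, group 2 the value.
PAIR_RE = re.compile(r'"([^"]*)" = "([^"]*)"')


def load_replacement_dict(file_name, encoding='utf-16'):
    """Read a '"key" = "value";' file and return {key: value}."""
    # The 'utf-16' codec uses the byte order mark to pick the byte order
    # and strips it from the decoded text.
    with open(file_name, 'r', encoding=encoding) as f:
        content = f.read()
    return {m.group(1): m.group(2) for m in PAIR_RE.finditer(content)}
```

Usage would look like `load_replacement_dict('Localizable.strings')`, where the file name is only an example. A file saved in a different encoding needs a different `encoding` argument.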

answered 2013-06-17T18:44:20.540
0

For the example key-value pairs posted, the following regex seems to work:

re.compile('"(.*)" = "(.*)"')
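
A small sketch of when the greedy `.*` is safe, using two lines taken from the posted file. It works here because each pair sits on its own line and `.` does not match a newline by default. The `re.DOTALL` variant is included only to show what happens once `.` is allowed to cross lines. Written for Python 3:

```python
import re

sample = '"London" = "London";\n"Low Balance" = "Low Balance";\n'

# Default flags: '.' stops at the newline, so every line gives its own match.
per_line = [m.groups() for m in re.finditer('"(.*)" = "(.*)"', sample)]

# With DOTALL the greedy '.*' runs across lines and merges both pairs
# into a single match.
across_lines = [m.groups()
                for m in re.finditer('"(.*)" = "(.*)"', sample, re.DOTALL)]

# The negated class from the question stops at the next quote either way.
strict = [m.groups()
          for m in re.finditer('"([^"]*)" = "([^"]*)"', sample, re.DOTALL)]

print(len(per_line), len(across_lines), len(strict))
```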

Am I missing something?

answered 2013-06-17T17:06:40.033