
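I have strings like '$200,000,000' and 'Yan300,000,000'.

I want to split the currency from the number and output a tuple such as ('$', '200000000'), with no ',' left in the number string.

Currently I am using the following script, which works: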

def splitCurrency(cur_str):
    cuttingIdx = 0
    for char in cur_str:
        try:
            int(char)
            break
        except ValueError:
            cuttingIdx = cuttingIdx + 1
    return (cur_str[0:cuttingIdx].strip(),
            cur_str[cuttingIdx:len(cur_str)].replace(',',''))

I would like to avoid the for-loop and the try-except, for both performance and readability. Any suggestions?


6 Answers

3
>>> import re
>>> string = 'YAN300,000,000'
>>> match = re.search(r'([\D]+)([\d,]+)', string)
>>> output = (match.group(1), match.group(2).replace(',',''))
>>> output
('YAN', '300000000')

(Thanks to zhangyangyu for pointing out that I had not fully answered the question.)

answered 2013-07-23T16:09:50.457
2
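>>> s = '$200,000,000'
>>> # ''.join(...) keeps this working on Python 3, where filter() returns an iterator
>>> ''.join(filter(str.isdigit, s))
'200000000'
>>> ''.join(filter(lambda x: not x.isdigit() and x != ',', s))
'$'
>>> (''.join(filter(lambda x: not x.isdigit() and x != ',', s)), ''.join(filter(str.isdigit, s)))
('$', '200000000')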
answered 2013-07-23T16:13:23.357
1
import locale
import re
locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')

def split_currency(text):
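    # The capture group makes re.split return ['', currency, rest] when text starts with non-digits.
    _, currency, num = re.split(r'^(\D+)', text, maxsplit=1)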
    num = locale.atoi(num)
    return currency, num
print(split_currency('$200,000,000'))
# ('$', 200000000)
print(split_currency('Yan300,000,000'))
# ('Yan', 300000000)

split_currency raises a ValueError if text does not start with a currency symbol (or any other non-digit character). If you prefer, you can handle that case differently with try...except.
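
One way to handle that case is sketched below. It assumes that an input with no leading symbol should yield an empty currency string; both that fallback and the name split_currency_safe are illustrative choices, not part of the answer above. To avoid depending on which locales are installed, the sketch strips the commas by hand instead of calling locale.atoi.

```python
import re


def split_currency_safe(text):
    """Split text into (currency, number); currency is '' if text starts with a digit."""
    try:
        # With the capture group, re.split yields ['', currency, rest] on a match.
        _, currency, num = re.split(r'^(\D+)', text, maxsplit=1)
    except ValueError:
        # No leading non-digit run: re.split returned [text], so unpacking failed.
        currency, num = '', text
    return currency.strip(), int(num.replace(',', ''))


print(split_currency_safe('$200,000,000'))  # ('$', 200000000)
print(split_currency_safe('200,000,000'))   # ('', 200000000)
```

An input with no digits at all (for example '$' alone) still raises ValueError from int(), which seems a reasonable signal that the string is not an amount.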

answered 2013-07-23T16:14:49.547
0
>>> import itertools
>>> myStr = '$200,000,000'
>>> ''.join(itertools.dropwhile(lambda c: not c.isdigit(), myStr))
'200,000,000'
>>> myStr = 'Yan300,000,000'
>>> ''.join(itertools.dropwhile(lambda c: not c.isdigit(), myStr))
'300,000,000'

Similarly, you can use itertools.takewhile with the same lambda function to get the currency symbol. However, this may be simpler:

# index of the first digit in myStr (raises StopIteration if there is none)
idx = next(i for i, c in enumerate(myStr) if c.isdigit())
sign, val = myStr[:idx], myStr[idx:]
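
For completeness, here is a rough sketch of the takewhile/dropwhile pairing mentioned above, wrapped in a function that also removes the commas. The function name split_with_itertools and the helper not_digit are made up for illustration.

```python
import itertools


def split_with_itertools(cur_str):
    """Return (symbol, digits) using takewhile/dropwhile with one shared predicate."""
    def not_digit(c):
        return not c.isdigit()

    # takewhile collects the leading non-digits; dropwhile skips them and keeps the rest.
    symbol = ''.join(itertools.takewhile(not_digit, cur_str))
    rest = ''.join(itertools.dropwhile(not_digit, cur_str))
    return symbol.strip(), rest.replace(',', '')


print(split_with_itertools('$200,000,000'))    # ('$', '200000000')
print(split_with_itertools('Yan300,000,000'))  # ('Yan', '300000000')
```

This walks the leading symbol twice, so it is unlikely to beat the index-based slice; it mainly shows how the two itertools functions mirror each other.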
answered 2013-07-23T16:04:27.980
0

I bet it won't be any faster... but I think it is more readable.

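>>> import re
>>> cur_string = "asd1,23456,123,1233"
>>> # leading run of characters that are not digits, commas, or spaces
>>> cur_sym = re.search(r"([^0-9, ]*)[0-9]", cur_string).group(1)
>>> # drop everything that is not a digit
>>> cur = re.sub(r"[^0-9]", "", cur_string)
>>> print(cur_sym, int(cur))
asd 1234561231233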
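
The guess about speed can be checked rather than bet on. The following is a minimal timeit sketch, not a rigorous benchmark: it compares the loop from the question with a precompiled regex variant. The names split_loop, split_regex, and _PATTERN are invented here, and timings will vary by machine and Python version, so no expected numbers are shown.

```python
import re
import timeit

# Precompiled once so that compilation cost is not part of the measurement.
_PATTERN = re.compile(r'(\D+)([\d,]+)')


def split_loop(cur_str):
    # The loop-based approach from the question.
    idx = 0
    for char in cur_str:
        try:
            int(char)
            break
        except ValueError:
            idx += 1
    return cur_str[:idx].strip(), cur_str[idx:].replace(',', '')


def split_regex(cur_str):
    # Regex variant; assumes the input is a symbol followed by digits and commas.
    match = _PATTERN.search(cur_str)
    return match.group(1).strip(), match.group(2).replace(',', '')


sample = '$200,000,000'
for func in (split_loop, split_regex):
    seconds = timeit.timeit(lambda: func(sample), number=10000)
    print(func.__name__, round(seconds, 4))
```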
answered 2013-07-23T16:13:13.653
0

You can use regular expressions for this.

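import re

p1 = re.compile(r"\d")  # matches digits
p2 = re.compile(r"\D")  # matches non-digits

cur_str = '$200,000,000'  # example input
currency_symbol = p1.split(cur_str)[0]        # text before the first digit
value = int("".join(p2.split(cur_str)))       # digits only, joined and converted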
answered 2013-07-23T16:19:57.593