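I've looked around and haven't found a clear answer on how to convert an alphanumeric string to a number in Python. Here are examples of the conversions I want.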
"1234alpha" --> 1234
"a1234asdf" --> 0
"1234.56yt" --> 1234.56
Any input would be greatly appreciated.
Dan
For a change, with itertools and no regex:
>>> import itertools as it
>>> number = ''.join(it.takewhile(str.isdigit, '123dfd'))
>>> int(number) if number else 0
123
>>> number = ''.join(it.takewhile(str.isdigit, 'a123dfd'))
>>> int(number) if number else 0
0
It works for floats too, though it's a bit ugly:
>>> number = ''.join(it.takewhile(lambda x: x.isdigit() or
...                               x == '.', '123.45dfd'))
>>> float(number) if number else 0
123.45
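One caveat with the takewhile approach above: the predicate keeps any run of digits and dots, so an input like '1.2.3abc' produces '1.2.3', which float() rejects with a ValueError. Below is a minimal sketch that stops at the second dot. The helper name leading_number and the default parameter are made up for illustration; they are not part of the answer above.

```python
import itertools as it

DIGITS = '0123456789'

def leading_number(text, default=0):
    """Return the leading int/float of text, or default if there is none."""
    seen_dot = False

    def accept(ch):
        # Keep ASCII digits and at most one decimal point.
        nonlocal seen_dot
        if ch in DIGITS:
            return True
        if ch == '.' and not seen_dot:
            seen_dot = True
            return True
        return False

    prefix = ''.join(it.takewhile(accept, text))
    if not any(ch in DIGITS for ch in prefix):
        return default  # empty prefix, or just a lone '.'
    return float(prefix) if '.' in prefix else int(prefix)
```

With this sketch, leading_number('1.2.3abc') stops before the second dot and converts '1.2'.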
Floats, negatives:
import itertools as it

def make_number(alphanum):
    sign = 1
    if alphanum and alphanum[0] in '+-':
        sign = int(alphanum[0] + '1')  # '+1' -> 1, '-1' -> -1
        alphanum = alphanum[1:]
    try:
        return float(''.join(it.takewhile(lambda x: x.isdigit()
                                          or x == '.', alphanum))) * sign
    except ValueError:
        # no leading number, or something float() rejects such as '1.2.3'
        return 0
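Conclusion: changing requirements can turn a simple solution into a complicated one.
To support positive/negative integers and floats, you can use a slightly modified regex from Extract float/double value: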
import re
import sys

# Raw string, so that \d, \. and "\ " reach the regex engine unchanged.
re_float = re.compile(r"""(?x)
   ^
      [+-]?\ *      # first, match an optional sign *and space*
      (             # then match integers or f.p. mantissas:
          \d+       # start out with a ...
          (
              \.\d* # mantissa of the form a.b or a.
          )?        # ? takes care of integers of the form a
         |\.\d+     # mantissa of the form .b
      )
      ([eE][+-]?\d+)?  # finally, optionally match an exponent
""")

def extract_number(s, default=None):
    m = re_float.match(s)
    if not m:
        return default  # no number found
    # Drop the spaces allowed after the sign: float("- 5") raises ValueError.
    f = float(m.group(0).replace(' ', ''))  # XXX to support huge numbers, try/except int() first
    return int(f) if f.is_integer() else f

for s in sys.stdin:
    print(extract_number(s, default=0))
Input:
1234alpha
a1234asdf
1234.56yt
-1e20.
Output:
1234
0
1234.56
-100000000000000000000
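The XXX comment in extract_number hints at trying int() before float() so that very long integers keep all their digits. Here is one possible sketch of that idea. The names RE_NUMBER and extract_number_exact are hypothetical, the pattern is a compacted copy of the one above, and stripping spaces before conversion is an assumption made here because int() and float() reject a string like "- 5".

```python
import re

# Compacted copy of the pattern above (verbose mode, raw string).
RE_NUMBER = re.compile(r"""(?x)
    ^
    [+-]?\ *            # optional sign, then optional spaces
    (
        \d+ (\.\d*)?    # a, a. or a.b
        | \.\d+         # .b
    )
    ([eE][+-]?\d+)?     # optional exponent
""")

def extract_number_exact(s, default=None):
    m = RE_NUMBER.match(s)
    if not m:
        return default  # no number found
    text = m.group(0).replace(' ', '')  # "- 5" -> "-5"
    try:
        return int(text)  # exact for arbitrarily long digit strings
    except ValueError:
        # has a fraction or an exponent: fall back to float
        f = float(text)
        return int(f) if f.is_integer() else f
```

Plain digit strings never pass through float() in this version, so no precision is lost for integers that have more significant digits than a float can hold.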
import re

def str_to_int(string):
    match = re.match(r"\d+", string)  # leading digits only
    if match:
        try:
            return int(match.group())
        except ValueError:
            return float(match.group())
    else:
        return 0

>>> str_to_int("1234alpha")
1234
>>> str_to_int("a1234asdf")
0
import ast
from itertools import takewhile

string = "1234.56yt"  # example input
ast.literal_eval(''.join(takewhile(lambda x: x <= '9', string)) or '0')
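The one-liner above relies on characters being compared by code point: x <= '9' keeps the digits, but also '.', '+', '-', '/', ',' and the space character, all of which sort before '9'. As a result some prefixes are not valid number literals. A hedged sketch of a wrapper follows; the function name leading_literal and the fallback to 0 on a parse error are assumptions, not part of the original snippet.

```python
import ast
from itertools import takewhile

def leading_literal(string):
    # Keep every leading character whose code point is at most that of '9'.
    prefix = ''.join(takewhile(lambda x: x <= '9', string))
    try:
        return ast.literal_eval(prefix or '0')
    except (ValueError, SyntaxError):
        # e.g. '1/2' or '--' survive the filter but are not literals
        return 0  # assumed fallback
```

Note that a prefix such as '1,2' is a valid tuple literal, so a stricter filter may be preferable if inputs can contain commas.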
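When the rules for what counts as OK become hard to define, you might consider this binary-search approach, which tries to find the boundary of the acceptable prefix.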
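def binsearch_prefix(seq, predicate):
    # Binary search assumes that shortening an accepted prefix keeps it accepted.
    best_upper = 0
    lower, upper = 0, len(seq) + 1  # + 1 so the whole string is tried as well
    while lower < upper:
        mid = (lower + upper) // 2  # integer division; "/" gives a float in Python 3
        if predicate(seq[:mid]):
            best_upper = mid
            lower = mid + 1
        else:
            upper = mid
    return seq[:best_upper]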
It returns the part of the string that you consider acceptable. For example, this could be your acceptance function:
def can_float(s):
    try:
        float(s)
        return True
    except ValueError:
        return False
Example:
print(binsearch_prefix("1234alpha", can_float))  # "1234"
print(binsearch_prefix("a1234asdf", can_float))  # ""
print(binsearch_prefix("1234.56yt", can_float))  # "1234.56"
You can then format the prefix any way you like.
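As one way of formatting the prefix, the sketch below turns it into an int when possible and a float otherwise, which matches the conversions asked for in the question. The name prefix_to_number and the default parameter are hypothetical. It assumes the prefix was already accepted by a float-based predicate; keep in mind that float() also accepts strings such as "nan" and "inf".

```python
def prefix_to_number(prefix, default=0):
    """Convert an accepted prefix such as "1234" or "1234.56" to a number."""
    if not prefix:
        return default  # nothing was accepted
    try:
        return int(prefix)
    except ValueError:
        # not a plain integer, e.g. "1234.56"
        return float(prefix)

print(prefix_to_number("1234"))
print(prefix_to_number("1234.56"))
print(prefix_to_number(""))
```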
You can use the re module:
import re

def alp(s):
    m = re.match(r'\d+', s)
    return int(m.group(0)) if m is not None and m.start() == 0 else 0
In [3]: alp('a1234asdf')
Out[3]: 0
In [4]: alp('1234alpha')
Out[4]: 1234
If you want to include negative integers:
def alp_neg(s):
    m = re.match(r'[+-]?\d+', s)
    return int(m.group(0)) if m is not None and m.start() == 0 else 0
If you want floats as well:
def alp_floats(s):
    m = re.match(r'[+-]?\d+(\.\d+)?', s)
    return float(m.group(0)) if m is not None and m.start() == 0 else 0
In [7]: alp_floats('-12.2ss31.232sadas')
Out[7]: -12.2
Maybe use a regular expression?
import re

def str2num(s):
    try:
        num = re.match(r'^([0-9]+)', s).group(1)
    except AttributeError:
        # re.match returned None: the string has no leading digits
        num = 0
    return int(num)

print(str2num('1234alpha'))
print(str2num('a1234asdf'))
Output:
1234
0