I am trying to write a function that splits a string containing a floating-point number and some units. The string may or may not have spaces between the number and the units.

In C, the function strtod has a very handy parameter named endptr, which lets you parse the initial part of a string and get a pointer to the remainder. Since this is exactly what I need here, I was wondering whether similar functionality is buried somewhere in Python.

Since float itself does not currently offer this functionality, I am using a regex solution based on https://stackoverflow.com/a/4703508/2988730:

import re

# digits with an optional fraction, or a bare fraction, then an optional exponent
float_pattern = re.compile(r'[+-]?(?:\d+\.?\d*|\.\d+)(?:[Ee][+-]?\d+)?')
def split_units(string):
    match = float_pattern.match(string)
    if match is None:
        raise ValueError('not a float')
    num = float(match.group())
    units = string[match.end():].strip()
    return num, units

This is not completely adequate for two reasons. The first is that it reinvents the wheel. The second is that it is not properly locale-aware without adding additional complexity (which is why I don't want to reinvent the wheel in the first place).
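To make the locale concern concrete, here is a minimal sketch of what the added complexity might look like. It assumes the only locale-dependent part is the decimal separator, reads it from locale.localeconv(), and converts with locale.atof. The names make_float_pattern and split_units_locale are hypothetical, and thousands separators are deliberately not handled.

```python
import locale
import re

def make_float_pattern():
    # Decimal separator of the active LC_NUMERIC locale ('.' in the default "C" locale).
    point = re.escape(locale.localeconv()['decimal_point'])
    # Assumption: only the decimal separator varies; grouping characters are ignored.
    return re.compile(
        r'[+-]?(?:\d+(?:{p}\d*)?|{p}\d+)(?:[Ee][+-]?\d+)?'.format(p=point)
    )

def split_units_locale(string):
    match = make_float_pattern().match(string)
    if match is None:
        raise ValueError('not a float: {!r}'.format(string))
    # locale.atof converts using the active locale's numeric conventions.
    num = locale.atof(match.group())
    units = string[match.end():].strip()
    return num, units
```

The pattern is rebuilt on every call so that a later locale.setlocale call is picked up; caching it would be a reasonable optimization if the locale never changes.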

For the record, the tail of the string cannot contain any characters that a number would contain. The only real issue is that I do not require units to be separated from the number by a space, so a simple string.split(maxsplit=1) won't work.
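A short illustration of why splitting on whitespace is not enough, using two made-up inputs (the variable names are only for this example):

```python
spaced = '2.5 meters'
unspaced = '2.5meters'

# With a space, split() separates the number from the units.
print(spaced.split(maxsplit=1))    # ['2.5', 'meters']

# Without a space there is nothing to split on, so the string comes back whole.
print(unspaced.split(maxsplit=1))  # ['2.5meters']
```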

Is there a better way to get a floating-point number out of the beginning of the string, so I can process the rest as something else?

1 Answer

I know this is a silly solution, but how about:

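def float_and_more(something):
    orig = something
    rest = ''
    while something:
        try:
            # float() ignores surrounding whitespace, so '2.5 ' parses fine
            return float(something), rest
        except ValueError:
            # move the last character from the candidate to the remainder
            rest = something[-1] + rest
            something = something[:-1]
    raise ValueError('Invalid value: {}'.format(orig))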

You can use it like this:

>>> float_and_more('2.5 meters')
(2.5, 'meters')

If you wanted to use this for real, you would probably use io.StringIO rather than constantly recreating the strings.
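As a rough sketch of one way to cut down on the string rebuilding, the variant below swaps io.StringIO for a plain end index: it tries the longest prefix first and shrinks it by one character per attempt. The name float_prefix is hypothetical. It still slices one prefix per attempt, but the remainder is sliced only once instead of being built character by character.

```python
def float_prefix(text):
    # Try the longest prefix first, shrinking by one character each time.
    for end in range(len(text), 0, -1):
        try:
            value = float(text[:end])
        except ValueError:
            continue
        # Everything after the accepted prefix is the remainder (units).
        return value, text[end:].strip()
    raise ValueError('Invalid value: {!r}'.format(text))
```

Like the original, this accepts anything float() accepts, including 'inf', 'nan', and digit groups with underscores such as '1_000', so it relies on the units never looking like part of a number.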

answered 2018-05-21T19:26:01.843