I am trying to write a function that splits a string containing a floating-point number and some units. The string may or may not have spaces between the number and the units.
In C, the function strtod has a very handy parameter, named endptr, that allows you to parse out the initial part of a string and get a pointer to the remainder. Since this is exactly what I need for this scenario, I was wondering if there is similar functionality buried somewhere in Python.
Since float itself does not currently offer this functionality, I am using a regex solution based on https://stackoverflow.com/a/4703508/2988730:
    import re

    # Optional sign, mantissa (digits with an optional fraction, or a bare
    # fraction such as ".5"), then an optional exponent.
    float_pattern = re.compile(r'[+-]?(?:\d*\.\d+|\d+\.?)(?:[Ee][+-]?\d+)?')

    def split_units(string):
        match = float_pattern.match(string)
        if match is None:
            raise ValueError('not a float')
        num = float(match.group())
        units = string[match.end():].strip()
        return num, units
This is not completely adequate for two reasons. The first is that it reinvents the wheel. The second is that it is not properly locale-aware without adding additional complexity (which is why I don't want to reinvent the wheel in the first place).
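To make the "additional complexity" concrete, here is a minimal sketch of what a locale-aware variant might look like. The names build_float_pattern and split_units_locale are hypothetical, and the sketch assumes that only the decimal separator varies between locales (thousands grouping and non-ASCII digits are ignored). It relies on locale.localeconv() to look up the separator of the active locale:

```python
import locale
import re

def build_float_pattern(decimal_point):
    # Hypothetical helper: compile a float-prefix pattern for one separator.
    dp = re.escape(decimal_point)
    return re.compile(rf'[+-]?(?:\d+(?:{dp}\d*)?|{dp}\d+)(?:[Ee][+-]?\d+)?')

def split_units_locale(string, decimal_point=None):
    # Hypothetical helper: split a leading float from the trailing units.
    if decimal_point is None:
        # Assumption: the active LC_NUMERIC locale describes the input.
        decimal_point = locale.localeconv()['decimal_point']
    match = build_float_pattern(decimal_point).match(string)
    if match is None:
        raise ValueError('not a float')
    # float() only understands '.', so normalize the separator first.
    num = float(match.group().replace(decimal_point, '.'))
    units = string[match.end():].strip()
    return num, units

print(split_units_locale('1,5kg', decimal_point=','))
print(split_units_locale('2.5e3 m', decimal_point='.'))
```

Even this small version has to handle the separator in two places (matching and conversion), which is the kind of duplication I would rather leave to a library.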
For the record, the tail of the string cannot contain any characters that a number would contain. The only real issue is that I am not requiring units to be separated from numbers by a space, so doing a simple string.split(maxsplit=1) won't work.
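A short illustration of the problem, using made-up sample inputs: splitting on whitespace only helps when a space is actually present, and float() rejects the unsplit string outright.

```python
spaced = '1.5 kg'
compact = '1.5kg'

# With a space, split() separates the number from the units.
print(spaced.split(maxsplit=1))   # ['1.5', 'kg']

# Without a space, there is nothing to split on.
print(compact.split(maxsplit=1))  # ['1.5kg']

# float() does not accept trailing non-numeric text.
try:
    float(compact)
    parsed = True
except ValueError:
    parsed = False
print(parsed)                     # False
```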
Is there a better way to get a floating-point number out of the beginning of a string, so I can process the rest as something else?