I have a CSV file with data readings that I want to read into Python. I get lists that contain strings like "2,5". Now doing float("2,5") does not work, because it has the wrong decimal mark.
How do I read this into Python as 2.5?
You can do it in a locale-aware way:
import locale
# Set to the user's preferred locale:
locale.setlocale(locale.LC_ALL, '')
# Or a specific locale:
locale.setlocale(locale.LC_NUMERIC, "en_DK.UTF-8")
print(locale.atof("3,14"))
Read this section before using this method.
float("2,5".replace(',', '.'))
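will do in most cases.
If value is a large number and . has been used as the thousands separator, you can:
Replace all commas with dots: value = value.replace(",", ".")
Remove all but the last dot (on the result of the previous step): value.replace(".", "", value.count(".") - 1)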
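The two steps above can be combined into one small helper. This is only a sketch: the name to_float_keep_last_dot is made up for illustration, and it assumes every input has a decimal part, so a string like "1.234" (thousands separator only) would be misread as 1.234.

```python
def to_float_keep_last_dot(value):
    # Step 1: turn every comma into a dot, e.g. "1.234.567,89" -> "1.234.567.89"
    value = value.replace(",", ".")
    # Step 2: drop every dot except the last one, which is the decimal mark.
    # max(..., 0) keeps the count non-negative when the string has no dot at all.
    extra_dots = max(value.count(".") - 1, 0)
    value = value.replace(".", "", extra_dots)
    return float(value)


print(to_float_keep_last_dot("1.234.567,89"))
print(to_float_keep_last_dot("2,5"))
```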
Pandas supports this out of the box:
df = pd.read_csv(r'data.csv', decimal=',')
See http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html
Using a regex will be more reliable:
import re
# Match a comma only when it sits between two digits
decmark_reg = re.compile(r'(?<=\d),(?=\d)')
ss = 'abc , 2,5 def ,5,88 or (2,5, 8,12, 8945,3 )'
print(ss)
print(decmark_reg.sub('.', ss))
Result:
abc , 2,5 def ,5,88 or (2,5, 8,12, 8945,3 )
abc , 2.5 def ,5.88 or (2.5, 8.12, 8945.3 )
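If the goal is to end up with floats rather than a rewritten string, the same substitution can be followed by a second pattern that picks out the numbers. This is a rough sketch, not part of the answer above: extract_floats and NUMBER are names invented here, and the number pattern only covers plain unsigned digits with an optional fractional part.

```python
import re

# Same idea as above: a comma counts as a decimal mark only between two digits
DECIMAL_COMMA = re.compile(r'(?<=\d),(?=\d)')
# Hypothetical helper pattern: unsigned digits with an optional ".digits" part
NUMBER = re.compile(r'\d+(?:\.\d+)?')


def extract_floats(text):
    # Normalise decimal commas to dots, then collect every number found
    normalized = DECIMAL_COMMA.sub('.', text)
    return [float(match) for match in NUMBER.findall(normalized)]


print(extract_floats('abc , 2,5 def ,5,88 or (2,5, 8,12, 8945,3 )'))
```

Note that a comma-separated list of integers without spaces, such as "1,2,3", is ambiguous for this approach, because every comma sits between two digits.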
If you want to handle more complex cases (numbers with no digit before the decimal mark, for example), the regex I crafted to detect all types of numbers in the following thread may be of interest to you:
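First, you must make sure you know which locale was used to provide the number. Failing to do so will certainly lead to random problems.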
import locale
loc = locale.getlocale() # get and save current locale
# use locale that provided the number;
# example if German locale was used:
locale.setlocale(locale.LC_ALL, 'de_DE')
pythonnumber = locale.atof(value)
locale.setlocale(locale.LC_ALL, loc) # restore saved locale
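A variation on the save/restore pattern above, sketched as a context manager so the previous setting is restored even if parsing raises. The names numeric_locale and parse_number are made up for this example, and it assumes the requested locale is installed on the machine (otherwise setlocale raises locale.Error). It only touches LC_NUMERIC and queries the current setting with locale.setlocale(category), which returns a string that can be passed straight back.

```python
import locale
from contextlib import contextmanager


@contextmanager
def numeric_locale(name):
    # Calling setlocale with only a category returns the current setting
    saved = locale.setlocale(locale.LC_NUMERIC)
    try:
        locale.setlocale(locale.LC_NUMERIC, name)
        yield
    finally:
        # Always restore the saved setting, even if an error occurred
        locale.setlocale(locale.LC_NUMERIC, saved)


def parse_number(text, locale_name):
    # Parse text using the number format of the given locale
    with numeric_locale(locale_name):
        return locale.atof(text)


# The "C" locale always exists and uses a dot as the decimal mark
print(parse_number("2.5", "C"))
```

With a German locale installed, something like parse_number("2,5", "de_DE.UTF-8") would be the call for the data in the question. Keep in mind that setlocale affects the whole process, so this is not safe to use from several threads at once.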
Try replacing all decimal commas with decimal points:
floatAsStr = "2,5"
floatAsStr = floatAsStr.replace(",", ".")
myFloat = float(floatAsStr)
Of course, the replace function works on any substring, since Python does not distinguish between char and string.
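If dots are used as the thousands separator, you can swap commas and dots by using a third symbol as a temporary placeholder, like this:
value.replace('.', '#').replace(',', '.').replace('#', ',')
But since you want to convert from string to float, you can simply remove any dots and then replace any commas with dots:
float(value.replace('.', '').replace(',', '.'))
IMO this is the most readable solution.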
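To tie this back to the CSV in the question, here is a minimal sketch that applies the conversion to every cell while reading. The sample data, the semicolon delimiter and the to_float name are assumptions for illustration (files that use a decimal comma often use ; between fields, but yours may differ).

```python
import csv
import io


def to_float(value):
    # Drop thousands dots, then turn the decimal comma into a dot
    return float(value.replace('.', '').replace(',', '.'))


# Stand-in for an open file; replace with open('data.csv', newline='')
sample = "time;reading\n0,5;1.234,56\n1,0;2,5\n"

reader = csv.reader(io.StringIO(sample), delimiter=';')
header = next(reader)  # skip the header row
rows = [[to_float(cell) for cell in row] for row in reader]
print(rows)
```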