python - Conversion problem - Python?

Question

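I have been debugging my script and have narrowed my problem down to a few lines of code that I believe are causing it. I am reading data from 3 CSV files, pulling data from a stored procedure in SQL Server, and exporting the data from both to an Excel file to graph comparisons. The problem I am having is that my source files are producing duplicates (one row per source file). I put print statements into the following code to see what was happening.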

#convert district codes to strings
if dfyearfound:
    df2['district_code']=df2['district_code'].apply(lambda x: str(x))
    print df2['district_code'][df2.index[0]]
    df2['district_type_code']=df2['district_type_code'].apply(lambda x: str(x))
    print df2['district_type_code'][df2.index[0]]
if teacheryearfound:
    teacherframe['district_code']=teacherframe['district_code'].apply(lambda x: str(x))
    print teacherframe['district_code'][teacherframe.index[0]]
    teacherframe['district_type_code']=teacherframe['district_type_code'].apply(lambda x: str(x))
    print teacherframe['district_type_code'][teacherframe.index[0]]
if financialyearfound:
    financialframe['district_code']=financialframe['district_code'].apply(lambda x: str(x))
    print financialframe['district_code'][financialframe.index[0]]
    financialframe['district_type_code']=financialframe['district_type_code'].apply(lambda x: str(x))
    print financialframe['district_type_code'][financialframe.index[0]]

The print statements gave me the following output: 1, 1, 1, 3.0, 0012, 1

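All district codes should be 4 characters long, but in the source files they vary from 1 to 4 digits. In the database they are all 4 digits (e.g. 0001, 0012). District type codes are 1 or 2 digits in the source files and always 2 digits in the database (e.g. 01, 03). I am not sure why the string conversion above is not working. I was going to write a function to format district_code and district_type_code, but I don't want to hard-code the length, and I can't get the function I wrote to work: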

#function for formating district codes
def formatDistrictCodes(code):

    dist=code
    dist.zfill(4)

    return dist


formatDistrictCodes(districtformat)

score 2 · Accepted Answer

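I think the core of your problem is:

All district codes should be 4 characters long, but in the source files they vary from 1 to 4 digits. In the database they are all 4 digits (e.g. 0001, 0012). District type codes are 1 or 2 digits in the source files and always 2 digits in the database (e.g. 01, 03).

In Python 2, any integer literal that starts with 0 is interpreted as octal (Python 3 rejects this form and requires the 0o prefix instead):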

>>> 016
14

So what you really want is to take a number, convert it to a string, and pad it on the left with zeros to a fixed width of 4.

>>> str(1).zfill(4)
'0001'
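The function in the question fails because str.zfill does not modify the string in place; it returns a new string, which has to be returned or assigned. A minimal sketch of a corrected helper follows. The name format_code and the width parameter are illustrative and not from the original post, and the float handling is an assumption based on the 3.0 that appears in the printed output.

```python
def format_code(code, width):
    """Return `code` as a zero-padded string of at least `width` characters.

    Hypothetical helper; `width` is passed in so the length is not
    hard-coded inside the function.
    """
    # Values read from a CSV may arrive as floats (e.g. 3.0). When the
    # fractional part is zero, convert to int so 3.0 becomes "3", not "3.0".
    if isinstance(code, float) and code.is_integer():
        code = int(code)
    # str.zfill returns a NEW string; the original is left unchanged,
    # so the padded result must be returned.
    return str(code).strip().zfill(width)


district_code = format_code(1, 4)         # '0001'
district_type_code = format_code(3.0, 2)  # '03'
```

Because the width is an argument, the same helper could serve both columns, for example with apply(lambda x: format_code(x, 4)) for district_code and a width of 2 for district_type_code. Missing values (NaN) are not handled here and would need separate treatment.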

In your code, this would be:

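# pad each value individually; calling str() on the whole column would stringify the Series itself
df2['district_code'] = df2['district_code'].apply(lambda x: str(x).zfill(4))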

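Note that this does not enforce the length. It only guarantees a minimum length of 4; any value that is already longer than 4 characters is returned unchanged.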
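If values longer than the expected width should be treated as errors rather than passed through, one option is to check the length after padding. This is only a sketch: the name pad_exact and the choice to raise ValueError are assumptions, not part of the original answer.

```python
def pad_exact(code, width):
    """Zero-pad `code` to exactly `width` characters, or raise ValueError.

    Hypothetical helper illustrating one way to enforce the length.
    """
    padded = str(code).zfill(width)
    # zfill only guarantees a MINIMUM length, so over-long values
    # have to be detected explicitly.
    if len(padded) > width:
        raise ValueError("code %r is longer than %d characters" % (code, width))
    return padded


ok = pad_exact(12, 4)  # '0012'

try:
    pad_exact(12345, 4)
    too_long_rejected = False
except ValueError:
    too_long_rejected = True
```

Whether to raise, truncate, or log such values depends on how the source files are expected to behave; raising simply makes unexpected codes visible early.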
