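You need to create a custom function and map it over the birth_date column.
Pick a cutoff for the two-digit year (for example, 40): values at or above the cutoff are placed in the 20th century (19xx), and values below it in the 21st century (20xx). For example, 62 becomes 1962 and 32 becomes 2032.
The code below creates a custom function that converts the date strings.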
import pandas as pd
import datetime as dt

def custom_date_function(date_string: str) -> dt.date:
    """
    Convert an 'MM/DD/YY' date string to a date object.
    """
    # The first 8 characters hold the date without the time,
    # so select them and split on '/' into month, day and two-digit year
    date_components = date_string[0:8].split('/')
    # Two-digit years of 40 or more are mapped to the 1900s,
    # all others to the 2000s
    # You may change the cutoff from 40
    if int(date_components[2]) >= 40:
        year = 1900 + int(date_components[2])
    else:
        year = 2000 + int(date_components[2])
    return dt.date(year=year, month=int(date_components[0]), day=int(date_components[1]))
Once the custom function is defined, you can apply it to the birth_date column.
# Example Code of applying the custom function on birth_date DataFrame column
# Creating an example DataFrame with birth_date column
df_dict = {'birth_date': ['11/22/67', '03/23/69', '11/22/27']}
dataCopy = pd.DataFrame(df_dict)
# Applying the function on birth_date DataFrame column
out = dataCopy['birth_date'].apply(custom_date_function)
print(out)
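For larger frames, the same cutoff rule can also be written without a row-by-row apply. The following is only a sketch of that idea, not part of the original answer: the helper name parse_two_digit_year_dates and its cutoff parameter are hypothetical, and it assumes every value starts with an MM/DD/YY date.

```python
import pandas as pd

def parse_two_digit_year_dates(dates: pd.Series, cutoff: int = 40) -> pd.Series:
    # Hypothetical helper: keep the first 8 characters (the date part)
    # and split them on '/' into month, day and two-digit year
    parts = dates.str.slice(0, 8).str.split('/', expand=True).astype(int)
    parts.columns = ['month', 'day', 'yy']
    # Years at or above the cutoff go to the 1900s, the rest to the 2000s
    century = 1900 + parts['yy'].lt(cutoff).astype(int) * 100
    parts['year'] = parts['yy'] + century
    # pd.to_datetime assembles dates from year/month/day columns
    return pd.to_datetime(parts[['year', 'month', 'day']])

birth_dates = pd.Series(['11/22/67', '03/23/69', '11/22/27'], name='birth_date')
parsed = parse_two_digit_year_dates(birth_dates)
print(parsed)
```

Unlike custom_date_function, this returns a datetime64 column rather than Python date objects, so the .dt accessor is available on the result.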
The birth_date column may already hold date objects. In that case, you need to convert it to strings before applying custom_date_function.
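If the column already holds datetime values, one way to reuse the function is to format the values back to MM/DD/YY strings first. The sketch below assumes the column was parsed with the wrong century (67 read as 2067); the example frame and the variable names as_strings and fixed are made up for illustration, and the function is a condensed copy so the snippet runs on its own.

```python
import datetime as dt
import pandas as pd

def custom_date_function(date_string: str) -> dt.date:
    # Condensed copy of the function above so this sketch is self-contained
    month, day, yy = (int(part) for part in date_string[0:8].split('/'))
    year = 1900 + yy if yy >= 40 else 2000 + yy
    return dt.date(year=year, month=month, day=day)

# Hypothetical column that was already parsed, with 67 read as 2067
data = pd.DataFrame({'birth_date': pd.to_datetime(['2067-11-22', '2027-11-22'])})

# Only format back to strings when the column is a datetime column
if pd.api.types.is_datetime64_any_dtype(data['birth_date']):
    as_strings = data['birth_date'].dt.strftime('%m/%d/%y')
else:
    as_strings = data['birth_date']

fixed = as_strings.apply(custom_date_function)
print(fixed)
```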