python - Converting and inserting a timestamp in pandas

Question

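I'm having trouble converting a time. Column [0] is a timestamp, and I want to insert a new column at [1], currently called timestamp2. I then try to use a for loop to convert column [0] to a readable time and write it into column [1]. The new column gets inserted, but I get this error: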

    raise TypeError(f"cannot convert the series to {converter}")
TypeError: cannot convert the series to <class 'int'>

I added .astype(int) to the timestamp variable, but that didn't help. Code:

import requests
import json
import pandas as pd
from datetime import datetime


url = 'https://us.market-api.kaiko.io/v2/data/trades.v1/exchanges/cbse/spot/btc-usd/aggregations/count_ohlcv_vwap?interval=1h&page_size=1000'

KEY = 'xxx'

headers = {
   "X-Api-Key": KEY,
   "Accept": "application/json",
   "Accept-Encoding": "gzip"
}

res = requests.get(url, headers=headers)
j_data = res.json()
parse_data = j_data['data']

# create dataframe
df = pd.DataFrame.from_dict(pd.json_normalize(parse_data), orient='columns')
df.insert(1, 'timestamp2', ' ')    
 
for index, row in df.iterrows(): 
    timestamp = df['timestamp'].astype(int)
    dt = datetime.fromtimestamp(timestamp)
    df.at[index, "timestamp2"] = dt
       
                 
             
print(df)

df.to_csv('test.csv', index=False, encoding='utf-8')

Parsed data:

timestamp,timestamp2,open,high,low,close,volume,price,count
1611169200000,5,35260,35260.6,35202.43,35237.93,7.1160681299999995,35231.58133242965,132
1611165600000,5,34861.78,35260,34780.26,35260,1011.0965832999998,34968.5318431902,11313
1611162000000,5,34730.11,35039.98,34544.33,34855.43,1091.5246025199979,34794.45207484006,12877

In this example I set 'df.at[index, "timestamp2"] = dt' to 5 to confirm that a value is written to every row, so all I still need is to convert column[0] to a readable time for column[1].

score 1 · Accepted Answer

Once the timestamp is converted to an integer, the magnitude of the values suggests it is the number of milliseconds since the epoch.

If you are interested, there are more details on Unix time here: https://en.wikipedia.org/wiki/Unix_time
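As a quick sanity check of the milliseconds guess, here is a minimal sketch using only the standard library and the first timestamp from the question's sample data. The variable names are mine, and the exact exception raised for the out-of-range case can differ by platform, so several are caught.

```python
from datetime import datetime, timezone

raw = 1611169200000  # first timestamp value from the question's sample data

# Read as milliseconds: divide by 1000 to get seconds since the epoch.
as_ms = datetime.fromtimestamp(raw / 1000, tz=timezone.utc)
print(as_ms.isoformat())  # 2021-01-20T19:00:00+00:00

# Read as seconds, the value lands tens of thousands of years in the future,
# which is outside the range datetime supports, so the call fails.
try:
    datetime.fromtimestamp(raw, tz=timezone.utc)
    seconds_ok = True
except (ValueError, OverflowError, OSError):
    seconds_ok = False
print(seconds_ok)  # False
```

A date in January 2021 matches hourly BTC-USD data, which is why milliseconds is the plausible unit.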

You can convert it to a datetime with pd.to_datetime.

This is a vectorized operation, so you don't need to loop over the dataframe. Both pd.to_numeric and pd.to_datetime can be applied to a whole Series.
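To illustrate the vectorized approach, here is a small sketch built from the three sample rows in the question (only two columns are reproduced, and the frame is constructed by hand rather than from the API). As far as I can tell, the loop in the question fails because df['timestamp'] inside the loop is still the entire Series, and datetime.fromtimestamp expects a single number.

```python
import pandas as pd

# Hand-built stand-in for the API response, using values from the question.
df = pd.DataFrame({
    "timestamp": [1611169200000, 1611165600000, 1611162000000],
    "close": [35237.93, 35260.0, 34855.43],
})

# One call converts the whole column; no iterrows() loop is needed.
converted = pd.to_datetime(df["timestamp"], unit="ms")
print(converted)
```

The result is a datetime64 Series with one entry per row, in the same order as the input.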

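It is hard to debug without all the data, but the following should work. .astype(int) is an alternative to pd.to_numeric; the only difference is that pd.to_numeric gives you more flexibility in error handling, letting you coerce invalid values to NaN (not sure whether you need that).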

import pandas as pd
df = pd.DataFrame({'timestamp':['1611169200000']})
# convert to integer. If there are invalid entries this will set to nan. Depends on your case how you want to treat these.
timestamp_num = pd.to_numeric(df['timestamp'], errors='coerce')
# interpret the numbers as milliseconds since the epoch and store them in a new column
df['timestamp2'] = pd.to_datetime(timestamp_num, unit='ms')
print(df.to_dict())
#{'timestamp': {0: '1611169200000'}, 'timestamp2': {0: Timestamp('2021-01-20 19:00:00')}}
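Since the question asks for the new column at position 1, here is a sketch that uses df.insert instead of plain assignment. The data is made up for illustration: two timestamps come from the question and the "not-a-number" entry is a hypothetical bad value, included to show what errors='coerce' does.

```python
import pandas as pd

# Illustrative data: two real timestamps from the question, one invalid entry.
df = pd.DataFrame({
    "timestamp": ["1611169200000", "not-a-number", "1611162000000"],
    "open": [35260.0, 34861.78, 34730.11],
})

# Invalid entries become NaN instead of raising an error.
ts_num = pd.to_numeric(df["timestamp"], errors="coerce")

# Insert the converted values as the second column (position 1).
# NaN inputs come out as NaT (pandas' missing datetime value).
df.insert(1, "timestamp2", pd.to_datetime(ts_num, unit="ms"))
print(df)
```

Whether coercing bad rows to NaT is the right call depends on your data; dropping them or raising an error may be more appropriate.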
