This works fine for me: remove header=0, then drop only the rows in which every value is NaN:
import pandas as pd

url = 'https://finance.naver.com/sise/investorDealTrendDay.nhn?bizdate=215600&sosok=&page=2'
# read_html returns a list of tables; take the first and drop all-NaN rows
df = pd.read_html(url)[0].dropna(how='all')
print(df)
날짜 개인 외국인 기관계 기관 \
날짜 개인 외국인 기관계 금융투자 보험 투신(사모) 은행 기타금융기관
0 20.08.06 -850.0 1638.0 -801.0 2247.0 -517.0 -993.0 46.0 -138.0
1 20.08.05 4315.0 -516.0 -3666.0 -1277.0 -441.0 -871.0 -18.0 -30.0
2 20.08.04 1844.0 -583.0 -1488.0 392.0 -493.0 -205.0 14.0 -54.0
3 20.08.03 6237.0 -2687.0 -3795.0 -2841.0 -108.0 -411.0 0.0 -5.0
4 20.07.31 4716.0 -556.0 -3861.0 -2659.0 -129.0 -709.0 -7.0 -4.0
8 20.07.30 64.0 2247.0 -2342.0 423.0 -171.0 -428.0 -3.0 -13.0
9 20.07.29 476.0 2936.0 -3368.0 -1346.0 -296.0 -698.0 -8.0 -92.0
10 20.07.28 -10495.0 13060.0 -2220.0 -1440.0 -526.0 318.0 12.0 -76.0
11 20.07.27 -2996.0 1584.0 1395.0 1968.0 -20.0 161.0 -179.0 -58.0
12 20.07.24 2881.0 876.0 -3678.0 -1173.0 -545.0 -843.0 -43.0 -8.0
기타법인
연기금등 기타법인
0 -1446.0 13.0
1 -1029.0 -133.0
2 -1142.0 227.0
3 -429.0 246.0
4 -352.0 -299.0
8 -2151.0 30.0
9 -929.0 -44.0
10 -507.0 -345.0
11 -476.0 16.0
12 -1066.0 -79.0
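To see what dropna(how='all') does without requesting the page, here is a minimal sketch on a small hand-made frame. The column names and values are hypothetical stand-ins, not the site's data. Only rows where every cell is NaN are removed, a row with some NaN values is kept, and the original index labels are preserved, which is presumably why the index in the output above jumps from 4 to 8.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the scraped table: rows 1 and 2 mimic
# blank separator rows that come back as all-NaN.
demo = pd.DataFrame({
    "date": ["20.08.06", np.nan, np.nan, "20.08.05"],
    "individual": [-850.0, np.nan, np.nan, np.nan],
})

# how="all" drops a row only when every value in it is NaN
cleaned = demo.dropna(how="all")
print(cleaned)
print(cleaned.index.tolist())
```

With the default how='any', the last row would be dropped as well because one of its cells is NaN. If a gap-free index is needed afterwards, reset_index(drop=True) can be chained on.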
If you need the first column as the index, and then converted to a DatetimeIndex:
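url = 'https://finance.naver.com/sise/investorDealTrendDay.nhn?bizdate=215600&sosok=&page=2'
# index_col=0 uses the date column as the index
df = pd.read_html(url, index_col=0)[0].dropna(how='all')
# parse strings such as '20.08.06' (yy.mm.dd) into datetimes
df.index = pd.to_datetime(df.index, format='%y.%m.%d')
print(df)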
날짜 개인 외국인 기관계 기관 \
날짜 개인 외국인 기관계 금융투자 보험 투신(사모) 은행 기타금융기관
2020-08-06 -850.0 1638.0 -801.0 2247.0 -517.0 -993.0 46.0 -138.0
2020-08-05 4315.0 -516.0 -3666.0 -1277.0 -441.0 -871.0 -18.0 -30.0
2020-08-04 1844.0 -583.0 -1488.0 392.0 -493.0 -205.0 14.0 -54.0
2020-08-03 6237.0 -2687.0 -3795.0 -2841.0 -108.0 -411.0 0.0 -5.0
2020-07-31 4716.0 -556.0 -3861.0 -2659.0 -129.0 -709.0 -7.0 -4.0
2020-07-30 64.0 2247.0 -2342.0 423.0 -171.0 -428.0 -3.0 -13.0
2020-07-29 476.0 2936.0 -3368.0 -1346.0 -296.0 -698.0 -8.0 -92.0
2020-07-28 -10495.0 13060.0 -2220.0 -1440.0 -526.0 318.0 12.0 -76.0
2020-07-27 -2996.0 1584.0 1395.0 1968.0 -20.0 161.0 -179.0 -58.0
2020-07-24 2881.0 876.0 -3678.0 -1173.0 -545.0 -843.0 -43.0 -8.0
날짜 기타법인
날짜 연기금등 기타법인
2020-08-06 -1446.0 13.0
2020-08-05 -1029.0 -133.0
2020-08-04 -1142.0 227.0
2020-08-03 -429.0 246.0
2020-07-31 -352.0 -299.0
2020-07-30 -2151.0 30.0
2020-07-29 -929.0 -44.0
2020-07-28 -507.0 -345.0
2020-07-27 -476.0 16.0
2020-07-24 -1066.0 -79.0
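The date conversion can also be checked offline. The following is a minimal sketch, assuming a tiny hand-made frame whose index uses the same yy.mm.dd strings as the site; the column name and the sorting step are illustrative additions, not part of the solution above.

```python
import pandas as pd

# Hypothetical sample with the same 'yy.mm.dd' date strings as the index
demo = pd.DataFrame(
    {"individual": [-850.0, 4315.0, 1844.0]},
    index=["20.08.06", "20.08.05", "20.08.04"],
)

# An explicit format avoids ambiguous parsing of two-digit years
demo.index = pd.to_datetime(demo.index, format="%y.%m.%d")

# Sort ascending so that label-based date slicing behaves predictably
demo = demo.sort_index()
subset = demo.loc["2020-08-05":"2020-08-06"]
print(subset)
```

Once the index is a DatetimeIndex, date-based selection such as the slice above becomes available.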