Writing a DataFrame to an Excel file leaves the worksheets with no data.

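I'm building a robotics "scouting application". It receives multiple .csv files over two days. Each csv file is named with a four-digit team number, a hyphen, and a match number, for example "2073-18.csv". Multiple files will arrive for each team. I need one sheet per team, with the contents of every csv file for that team on that team's sheet. Creating the worksheets works; writing data to those worksheets does not.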

import os
import glob
import csv
from xlsxwriter.workbook import Workbook
import pandas as pd
import numpy as np
#from sqlalchemy import create_engine
from openpyxl import load_workbook

os.chdir ("/EagleScout")
path = '.'
extension = 'csv'
engine = 'xlsxwriter'

files_in_dir = [ f for f in glob.glob('*.csv')]

workbook = Workbook('Tournament.xlsx')

with pd.ExcelWriter('Tournament.xlsx') as writer:
    for csvfile in files_in_dir:
        df = pd.read_csv(csvfile)
        fName, fExt = (os.path.splitext(csvfile))
        sName = fName.split('-')
        worksheet = workbook.get_worksheet_by_name(sName [0])

        if worksheet is None:
            worksheet = workbook.add_worksheet(sName [0]) # worksheet named after the team number

        df.to_excel(writer, sheet_name = (sName[0]))

    writer.save()

workbook.close()

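What I need is a single workbook with one sheet per team, for up to 70 teams. Each sheet will have multiple rows, one for each csv file that arrives for that team. The question is: how do I get Pandas or another library to write the contents of each csv file to the appropriate sheet in the workbook?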

1 Answer

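OK, with input from @ivan_pozdeev, I finally solved my problem. Keep in mind that my original goal was a script that can be run periodically and produces a spreadsheet with multiple worksheets. Each worksheet contains all the data for every match from the .csv files, grouped by team number. I also added a spreadsheet containing the raw data. Here is what I came up with: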

import os
import glob
import csv
import xlsxwriter
from xlsxwriter.workbook import Workbook
import pandas as pd
import numpy as np
#from sqlalchemy import create_engine
#import openpyxl
#from openpyxl import load_workbook

os.chdir ("/EagleScout")
path = '.'
extension = 'csv'


# Remove the combined .csv file from previous runs.
# This provides clean data without corruption from earlier runs.
if os.path.exists('./Spreadsheets/combined.csv'): 
    os.remove ('./Spreadsheets/combined.csv')

#Remove previous Excel spreadsheet
if os.path.exists('./Spreadsheets/Tournament.xlsx'): 
    os.remove ('./Spreadsheets/Tournament.xlsx')


# Remove the combined raw-data Excel workbook from previous runs
if os.path.exists('./Spreadsheets/Combined.xlsx'): 
    os.remove ('./Spreadsheets/Combined.xlsx')


# Collect the names of all .csv files in the working directory
files_in_dir = glob.glob('*.csv')


#Create a single combined .csv file with all data
#from all matches completed so far.
d1 = pd.read_csv('Header.txt')
d1.to_csv('./Spreadsheets/combined.csv', header = True, index = False)

for filenames in files_in_dir: 
    df = pd.read_csv(filenames)
    fName, fExt = (os.path.splitext(filenames))
    sName = fName.split('-')
    N=(sName[1])
    df.insert(0,N,N,True)
    df.to_csv('./Spreadsheets/combined.csv', index_label = (sName[0]), mode = 'a')


#Combine all csv files into one master Raw Excel Data file
#and add column headers as labels
with pd.ExcelWriter('./Spreadsheets/Combined.xlsx') as writer:
    dt = pd.read_csv('./Spreadsheets/combined.csv')
    dt.to_excel(writer, sheet_name = 'All data')
    # No writer.save() call: leaving the with block saves and closes the file.



# Split the combined data by team and write each team's rows to its own worksheet.
with pd.ExcelWriter('./Spreadsheets/Tournament.xlsx') as writer:

    df2 = pd.read_excel('./Spreadsheets/Combined.xlsx')
    group = df2.groupby('Team')
    for Team, Team_df in group:

        Team_df.to_excel(writer, sheet_name = str(Team))

    # No writer.save() call: leaving the with block saves and closes the file.

I'm sure there is a cleaner way to write this code; I'm still new at this, but for now it does what I expect.
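As far as I can tell, the empty sheets in my first attempt came from having two objects write the same file: the pandas `ExcelWriter` wrote the data, and then the separate xlsxwriter `Workbook` (which only held empty worksheets) overwrote `Tournament.xlsx` when it was closed. Below is a minimal sketch of a shorter version that uses a single writer and skips the intermediate combined files. The function names `load_matches` and `write_team_sheets` are made up for illustration, and the sketch assumes every file is named `<team>-<match>.csv` and that all files share the same columns.

```python
from pathlib import Path

import pandas as pd


def load_matches(folder):
    """Read every '<team>-<match>.csv' file in folder into one DataFrame."""
    frames = []
    for csv_path in sorted(Path(folder).glob("*-*.csv")):
        # "2073-18.csv" -> team "2073", match "18" (assumed naming scheme)
        team, match = csv_path.stem.split("-", 1)
        df = pd.read_csv(csv_path)
        # Tag each row with its origin; assign() overwrites any existing
        # columns that are already called "Team" or "Match".
        frames.append(df.assign(Team=team, Match=match))
    # pd.concat raises ValueError if no matching files were found.
    return pd.concat(frames, ignore_index=True)


def write_team_sheets(matches, xlsx_path):
    """Write an 'All data' sheet plus one sheet per team to a single workbook."""
    # Only one object writes the file; the with block saves it on exit.
    with pd.ExcelWriter(xlsx_path) as writer:
        matches.to_excel(writer, sheet_name="All data", index=False)
        for team, team_df in matches.groupby("Team"):
            team_df.to_excel(writer, sheet_name=str(team), index=False)
```

With the layout from my script this would be called as `write_team_sheets(load_matches("/EagleScout"), "./Spreadsheets/Tournament.xlsx")`. Because the workbook is rebuilt from the .csv files on every run, there should be no need to delete the previous output first.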

Answered 2019-06-07T18:41:25.083