python - What is the alternative to chained assignment in pandas?

Question

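I am trying to write a new Excel file that contains the existing columns plus some new ones, using pandas chained assignment. But it is very slow, memory consumption is high, and it shows the warning below: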

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self.data_frame.loc[self.data_frame.index[self.row_index], config.col4] = self.col4
extract_comparsion.py:71: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

For a better explanation, here is the code:

        src_file = 'source.xlsx'
        data = pd.read_excel(src_file, header=1)
        df = pd.DataFrame(data, columns=['col1', 'col2'])
        result = df.loc[(df['col1'] != 'con') & (df['col2'] != 'con2')]
        self.data_frame = result.head(10)

        for index, row in self.data_frame.iterrows():
            self.row_index = index


        self.data_frame.loc[self.data_frame.index[self.row_index], config.status_label] = response
        self.data_frame.loc[self.data_frame.index[self.row_index], config.col4] = self.col4
        self.data_frame.loc[self.data_frame.index[self.row_index], config.col5l] = self.col5
        self.data_frame.loc[self.data_frame.index[self.row_index], config.ipact_col6] = self.col6
       

        writer = pd.ExcelWriter(config.output_file, engine='xlsxwriter')
        print(writer)

        self.data_frame.to_excel(writer, sheet_name='Sheet1', index=False)
        writer.save()

I would like to know whether there is any alternative to this implementation. If so, please share your approach. Thanks in advance.
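For context, this warning is usually raised when the frame being written to was derived from another frame by filtering or slicing, as `self.data_frame` is here (a `.loc[...]` mask followed by `.head(10)`). A minimal sketch of one common way around it, taking an explicit `.copy()` before writing, is shown below. The sample data and the `status` column name are made up for illustration and are not part of the original code.

```python
import pandas as pd

# Hypothetical stand-in for the data read from the Excel file.
df = pd.DataFrame({
    "col1": ["con", "a", "b", "c"],
    "col2": ["x", "con2", "y", "z"],
})

# Filter as in the question, then take an explicit copy so that later
# writes target an independent frame rather than a slice of df.
subset = df.loc[(df["col1"] != "con") & (df["col2"] != "con2")].head(10).copy()

# Whole-column assignment on the copy.
subset["status"] = "pending"

# Single-cell write by row label and column name.
subset.loc[subset.index[0], "status"] = "done"
```

Because `subset` is its own object, `df` is left unchanged by these writes.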

#UPDATE: Assume this is the source Excel data:

COL1    COL2    COL3
AAA     BBB     CCC
DDD     EEE     FFF
GGG     HHH     III
JJJ     KKK     LLL
MMM     NNN     OOO

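Based on a SQL query, I will get some result for the row with AAA, say 1; similarly, the row with DDD will get 2. So I need to produce a final Excel file like the one below: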

COL1    COL2    COL3    COL4
AAA     BBB     CCC     1
DDD     EEE     FFF     2
GGG     HHH     III     3
JJJ     KKK     LLL     4
MMM     NNN     OOO     5

I need to pass the values of COL1 and COL2 to get the query result. That is why I am using this loop. I hope this makes the idea clear.
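One possible alternative to writing cell by cell inside the loop is to collect one query result per row and attach them all as a new column in a single step. The sketch below assumes a hypothetical `run_query(col1, col2)` function standing in for the real SQL lookup. The column names follow the sample table above, and the returned values are invented for the example.

```python
import pandas as pd


def run_query(col1: str, col2: str) -> int:
    """Hypothetical stand-in for the SQL lookup described above."""
    fake_results = {("AAA", "BBB"): 1, ("DDD", "EEE"): 2, ("GGG", "HHH"): 3}
    return fake_results.get((col1, col2), 0)


# Hypothetical stand-in for the source Excel data.
source = pd.DataFrame({
    "COL1": ["AAA", "DDD", "GGG"],
    "COL2": ["BBB", "EEE", "HHH"],
    "COL3": ["CCC", "FFF", "III"],
})

# One query per row, gathered into a plain list (no per-cell .loc writes).
results = [run_query(c1, c2) for c1, c2 in zip(source["COL1"], source["COL2"])]

# assign() returns a new frame with the extra column; source is not modified.
final = source.assign(COL4=results)
```

The resulting frame could then be written once at the end, for example with `final.to_excel(config.output_file, index=False)`. The per-row query itself still runs once per row, so this mainly removes the repeated indexed writes and the warning, not the query cost.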
