python - Disable the index in a pandas DataFrame

Question

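How can I drop or disable the indices in a pandas DataFrame?

I am learning pandas from the book "Python for Data Analysis", and I already know I can use dataframe.drop to drop one column or one row. But I did not find anything about disabling all the indices.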

score 19 · Accepted Answer

df.values gives you the raw NumPy ndarray without the indexes.

>>> df
   x   y
0  4  GE
1  1  RE
2  1  AE
3  4  CD
>>> df.values
array([[4, 'GE'],
       [1, 'RE'],
       [1, 'AE'],
       [4, 'CD']], dtype=object)

You cannot have a DataFrame without indexes; they are the whole point of a DataFrame :)
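If the goal is only to hide the index when displaying the data, rather than to remove it from the object, a minimal sketch could look like the following. It rebuilds the sample frame from the output above (the construction itself is an assumption), uses `to_numpy()` as the currently recommended spelling of `.values`, and uses `to_string(index=False)` to render the rows without the index column.

```python
import pandas as pd

# Sample frame matching the output shown above (construction is assumed)
df = pd.DataFrame({"x": [4, 1, 1, 4], "y": ["GE", "RE", "AE", "CD"]})

# Raw values without any index, same idea as df.values
raw = df.to_numpy()

# Text rendering without the index column; the frame itself keeps its index
text = df.to_string(index=False)
print(text)
```

The DataFrame still carries its index afterwards; only the rendered text omits it.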

But to be clear, df.values is not an in-place operation; each access builds a new array:

>>> df.values is df.values
False

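DataFrames keep their data in two-dimensional arrays grouped by type, so when you want the whole DataFrame it has to find the lowest common denominator (LCD) of all the dtypes and construct a 2D array of that type.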
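To make the "lowest common denominator" point concrete, here is a small sketch with two made-up frames (the names `numeric` and `mixed` are illustrative only). Integer and float columns can share a float array, while integer and string columns have no common numeric type and fall back to `object`.

```python
import pandas as pd

# int64 and float64 columns
numeric = pd.DataFrame({"a": [1, 2], "b": [1.5, 2.5]})
# int64 and string columns
mixed = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

# Integers are upcast so that both columns fit in one float array
numeric_dtype = numeric.to_numpy().dtype
# No common numeric type exists, so the array holds Python objects
mixed_dtype = mixed.to_numpy().dtype

print(numeric_dtype, mixed_dtype)
```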

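To instantiate a new DataFrame with the values of an old one, just pass the old DataFrame to the new one's constructor. No data is copied and the same data structures are reused (with Copy-on-Write enabled, which newer pandas versions use by default, the write below would leave df1 unchanged):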

>>> df1 = pd.DataFrame([[1, 2], [3, 4]])
>>> df2 = pd.DataFrame(df1)
>>> df2.iloc[0,0] = 42
>>> df1
    0  1
0  42  2
1   3  4

But you can explicitly specify the copy parameter:

>>> df1 = pd.DataFrame([[1, 2], [3, 4]])
>>> df2 = pd.DataFrame(df1, copy=True)
>>> df2.iloc[0,0] = 42
>>> df1
   0  1
0  1  2
1  3  4

score 4 · Accepted Answer

d.index = range(len(d))

This does a simple in-place index reset: it removes all existing indexes and adds a basic integer one, which is the most basic index type a pandas DataFrame can have.
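For comparison, pandas also offers `reset_index(drop=True)`, which returns a new frame instead of modifying the existing one. The sketch below uses a made-up frame `d` with string labels and shows both variants side by side.

```python
import pandas as pd

# Made-up frame with non-default index labels
d = pd.DataFrame({"x": [4, 1, 1]}, index=["a", "b", "c"])

# Returns a new frame with a default 0..n-1 index;
# drop=True discards the old labels instead of keeping them as a column
fresh = d.reset_index(drop=True)

# In-place assignment, as in the answer above
d.index = range(len(d))

print(list(fresh.index), list(d.index))
```

Without `drop=True`, the old labels would be kept as an extra column named `index`.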

score 2 · Accepted Answer

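Additionally, if you are using the df.to_excel function with a pd.ExcelWriter, which is where the DataFrame gets written to an Excel worksheet, you can specify index=False in its parameters.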

Create the Excel writer:

writer = pd.ExcelWriter(type_box + '-rules_output-' + date_string + '.xlsx',engine='xlsxwriter')

We have a list called lines:

# create a dataframe called 'df'
df = pd.DataFrame([sub.split(",") for sub in lines], columns=["Rule", "Device", "Status"])

# convert df to an Excel worksheet, leaving out the index column
df.to_excel(writer, sheet_name='all_status', index=False)
# writer.save() was removed in pandas 2.0; close() writes the file to disk
writer.close()

score 1 · Accepted Answer

I had a similar problem when trying to take a DataFrame from an index-less CSV and write it back to another file.

I came up with the following:

import pandas as pd
import os

def csv_to_df(csv_filepath):
    # read_table allows you to set index_col to False; the old from_csv did not
    dataframe_conversion = pd.read_table(csv_filepath, sep='\t', header=0, index_col=False)
    return dataframe_conversion

def df_to_excel(df):
    from pandas import ExcelWriter
    # Name of the output file, including the .xlsx extension
    file_name = 'foo.xlsx'
    # Join the file name with the target directory
    file_path = os.path.join('some/directory/', file_name)
    # Write the file out
    writer = ExcelWriter(file_path)
    # index_label + index are set to `False` so that all the data starts on row
    # index 1 and column labels (called headers by pandas) are all on row index 0.
    df.to_excel(writer, sheet_name='Attributions Detail', index_label=False, index=False, header=True)
    # writer.save() was removed in pandas 2.0; close() writes the file to disk
    writer.close()
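The effect of `index_col=False` on reading and `index=False` on writing can also be tried without files or an Excel engine. The sketch below is an illustration under assumptions: the sample data and variable names are made up, and it swaps Excel for CSV text held in `io.StringIO` buffers.

```python
import io

import pandas as pd

# Made-up CSV content without an index column
source = "Rule,Device,Status\nr1,sw1,ok\nr2,sw2,fail\n"

# index_col=False tells pandas not to use any column as the index
df = pd.read_csv(io.StringIO(source), index_col=False)

# index=False keeps the automatically generated index out of the output
buffer = io.StringIO()
df.to_csv(buffer, index=False)
round_tripped = buffer.getvalue()

print(round_tripped)
```

With `index=False` on the way out, the written text has the same three columns as the input and no extra leading column.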

score 0 · Accepted Answer

I have a function that may help some people. I combine CSV files that have headers in Python in the following way:

    def combine_csvs(filedict, combined_file):
        files = filedict['files']
        df = pd.read_csv(files[0])
        for file in files[1:]:
            df = pd.concat([df, pd.read_csv(file)])
        df.to_csv(combined_file, index=False)
        return df

It can take as many files as you need. Call it like this:

    combine_csvs(dict(files=["file1.csv","file2.csv", "file3.csv"]), 'output.csv')

Or, if you also want the combined DataFrame in Python:

    df = combine_csvs(dict(files=["file1.csv","file2.csv"]), 'output.csv')

The combine_csvs function does not save the index. If you need the index, use 'index=True' instead.
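One detail worth knowing about the returned DataFrame: `pd.concat` keeps each input's own index labels by default, so the combined frame can contain repeated labels. A small sketch with made-up frames standing in for the CSV files shows the difference that `ignore_index=True` makes.

```python
import pandas as pd

# Two small frames standing in for the CSV files (made-up data)
first = pd.DataFrame({"Rule": ["r1", "r2"]})
second = pd.DataFrame({"Rule": ["r3"]})

# Default behaviour: each frame keeps its own labels, so 0 appears twice
stacked = pd.concat([first, second])

# ignore_index=True renumbers the rows from 0 to n-1
renumbered = pd.concat([first, second], ignore_index=True)

print(list(stacked.index), list(renumbered.index))
```

This does not matter for the file written with `index=False`, but it can matter if the returned frame is later accessed by label.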
