47

听起来有些奇怪,但我需要将 Pandas 控制台输出字符串保存为 png pics。例如:

>>> df
                   sales  net_pft     ROE    ROIC
STK_ID RPT_Date                                  
600809 20120331  22.1401   4.9253  0.1651  0.6656
       20120630  38.1565   7.8684  0.2567  1.0385
       20120930  52.5098  12.4338  0.3587  1.2867
       20121231  64.7876  13.2731  0.3736  1.2205
       20130331  27.9517   7.5182  0.1745  0.3723
       20130630  40.6460   9.8572  0.2560  0.4290
       20130930  53.0501  11.8605  0.2927  0.4369 

有没有办法df.output_as_png(filename='df_data.png')生成一个只显示上面内容的图片文件?

4

7 回答 7

53

选项 1:使用 matplotlib 表格功能,以及一些额外的样式:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.DataFrame()
df['date'] = ['2016-04-01', '2016-04-02', '2016-04-03']
df['calories'] = [2200, 2100, 1500]
df['sleep hours'] = [8, 7.5, 8.2]
df['gym'] = [True, False, False]

def render_mpl_table(data, col_width=3.0, row_height=0.625, font_size=14,
                     header_color='#40466e', row_colors=['#f1f1f2', 'w'], edge_color='w',
                     bbox=[0, 0, 1, 1], header_columns=0,
                     ax=None, **kwargs):
    if ax is None:
        size = (np.array(data.shape[::-1]) + np.array([0, 1])) * np.array([col_width, row_height])
        fig, ax = plt.subplots(figsize=size)
        ax.axis('off')
    mpl_table = ax.table(cellText=data.values, bbox=bbox, colLabels=data.columns, **kwargs)
    mpl_table.auto_set_font_size(False)
    mpl_table.set_fontsize(font_size)

    for k, cell in mpl_table._cells.items():
        cell.set_edgecolor(edge_color)
        if k[0] == 0 or k[1] < header_columns:
            cell.set_text_props(weight='bold', color='w')
            cell.set_facecolor(header_color)
        else:
            cell.set_facecolor(row_colors[k[0]%len(row_colors) ])
    return ax.get_figure(), ax

fig,ax = render_mpl_table(df, header_columns=0, col_width=2.0)
fig.savefig("table_mpl.png")

在此处输入图像描述

Options-2使用 Plotly + kaleido

import plotly.figure_factory as ff
import pandas as pd

df = pd.DataFrame()
df['date'] = ['2016-04-01', '2016-04-02', '2016-04-03']
df['calories'] = [2200, 2100, 1500]
df['sleep hours'] = [8, 7.5, 8.2]
df['gym'] = [True, False, False]

fig =  ff.create_table(df)
fig.update_layout(
    autosize=False,
    width=500,
    height=200,
)
fig.write_image("table_plotly.png", scale=2)
fig.show()

在此处输入图像描述

对于上述情况,可以使用以下属性更改字体大小:font

fig.update_layout(
    autosize=False,
    width=500,
    height=200,
    font={'size':8}
)
于 2016-09-06T22:32:56.433 回答
42

您必须使用DataFrame.plot()命令返回的图形:

ax = df.plot()
fig = ax.get_figure()
fig.savefig('asdf.png')
于 2014-08-31T02:31:58.650 回答
7

我有兴趣将我的数据框保存为报表附录的表格。我发现这是最简单的解决方案:

import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages
import matplotlib.pyplot as plt

# Assuming that you have a dataframe, df
pp = PdfPages('Appendix_A.pdf')
total_rows, total_cols = df.shape; #There were 3 columns in my df

rows_per_page = 40; # Assign a page cut off length
rows_printed = 0
page_number = 1;

while (total_rows >0): 
    #put the table on a correctly sized figure    
    fig=plt.figure(figsize=(8.5, 11))
    plt.gca().axis('off')
    matplotlib_tab = pd.tools.plotting.table(plt.gca(),df.iloc[rows_printed:rows_printed+rows_per_page], 
        loc='upper center', colWidths=[0.2, 0.2, 0.2])    

    # Give you cells some styling 
    table_props=matplotlib_tab.properties()
    table_cells=table_props['child_artists'] # I have no clue why child_artists works
    for cell in table_cells:
        cell.set_height(0.024)
        cell.set_fontsize(12)

    # Add a header and footer with page number 
    fig.text(4.25/8.5, 10.5/11., "Appendix A", ha='center', fontsize=12)
    fig.text(4.25/8.5, 0.5/11., 'A'+str(page_number), ha='center', fontsize=12)

    pp.savefig()
    plt.close()

    #Update variables
    rows_printed += rows_per_page;
    total_rows -= rows_per_page;
    page_number+=1;

pp.close()
于 2015-10-10T00:19:39.077 回答
3

我对我正在做的项目有同样的要求。但是根据我的要求,没有一个答案是优雅的。这是最终帮助我的东西,并且可能对这种情况有用,使用 Bokeh:

from bokeh.io import export_png, export_svgs
from bokeh.models import ColumnDataSource, DataTable, TableColumn

def save_df_as_image(df, path):
    source = ColumnDataSource(df)
    df_columns = [df.index.name]
    df_columns.extend(df.columns.values)
    columns_for_table=[]
    for column in df_columns:
        columns_for_table.append(TableColumn(field=column, title=column))

    data_table = DataTable(source=source, columns=columns_for_table,height_policy="auto",width_policy="auto",index_position=None)
    export_png(data_table, filename = path)

样本输出:

在此处输入图像描述

于 2020-06-18T20:28:20.177 回答
2

这是一个有点骇人听闻的解决方案,但它可以完成工作。

import numpy as np
import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages
import matplotlib.pyplot as plt

from PySide.QtGui import QImage
from PySide.QtGui import QPainter
from PySide.QtCore import QSize
from PySide.QtWebKit import QWebPage

arrays = [np.hstack([ ['one']*3, ['two']*3]), ['Dog', 'Bird', 'Cat']*2]
columns = pd.MultiIndex.from_arrays(arrays, names=['foo', 'bar'])
df =pd.DataFrame(np.zeros((3,6)),columns=columns,index=pd.date_range('20000103',periods=3))

h = "<!DOCTYPE html> <html> <body> <p> " + df.to_html() + " </p> </body> </html>";
page = QWebPage()
page.setViewportSize(QSize(5000,5000))

frame = page.mainFrame()
frame.setHtml(h, "text/html")

img = QImage(1000,700, QImage.Format(5))
painter = QPainter(img)
frame.render(painter)
painter.end()
a = img.save("html.png")
于 2014-07-17T22:31:15.503 回答
0

您可能希望将 df 保存为 pdf,在这种情况下,reportlab Table 将完成这项工作。

于 2020-04-13T21:44:56.690 回答
0

你也可以只使用Dask从 RAM 中卸载工作负载,它也适用于 Pandas 数据帧、Numpy 和 Sklearn 以及 ML。

于 2020-10-23T10:18:26.317 回答