I have to plot some data using histograms. My data lie in [0,1], with no large concentration at any particular point.
What's a good ratio between the number of samples and the number of bins (of equal width)?
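I usually use the square root of the number of samples as the number of bins. It is the simplest choice listed in the Wikipedia histogram article where it discusses choosing an appropriate number of bins. From that article:
There is no "best" number of bins, and different bin sizes can reveal different features of the data. Some theoreticians have attempted to determine an optimal number of bins, but these methods generally make strong assumptions about the shape of the distribution.
If you don't want to make assumptions about the distribution of your data, the square root of the number of samples is usually a good starting point.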
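For illustration, here is a minimal sketch of the square-root rule in Python with NumPy. The helper name `sqrt_bin_count`, the sample size of 400, and the uniform test data are my own assumptions for the example, not something taken from the question.

```python
import math

import numpy as np


def sqrt_bin_count(n_samples: int) -> int:
    """Square-root rule: number of bins = ceil(sqrt(number of samples))."""
    # Hypothetical helper for this example; always returns at least one bin.
    return max(1, math.ceil(math.sqrt(n_samples)))


# Made-up data on [0, 1) standing in for the real samples.
rng = np.random.default_rng(0)
data = rng.uniform(0.0, 1.0, size=400)

n_bins = sqrt_bin_count(data.size)

# Equal-width bins spanning the whole interval [0, 1].
counts, edges = np.histogram(data, bins=n_bins, range=(0.0, 1.0))

print(n_bins)      # number of bins chosen by the rule
print(edges[:3])   # first few bin edges
```

With 400 samples this gives 20 bins of width 0.05. NumPy also accepts `bins="sqrt"`, which applies a square-root rule internally, though the bin count it picks may differ slightly from the one computed here. Either way, treat the result as a starting point and try a few nearby bin counts to see which features of the data persist.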