pandas - Is there a limit to the amount of rows Pandas read_csv can load?

Question

I am trying to load a .csv file using Pandas' read_csv method. The file has 29872046 rows and its total size is 2.2G. I notice that most of the loaded rows are missing their values for a large number of columns. The csv file, when browsed from the shell, contains those values... Are there any limitations on the files that can be loaded? If not, how could this be debugged? Thanks

score 4 · Accepted Answer

@d1337,

I wonder whether you are running into a memory issue. Here is a hint.

Possibly this is related, or this.
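One way to check whether memory is involved is to read the file in pieces and count the missing values per column, so the whole file never has to sit in RAM at once. The following is only a sketch: the helper name `missing_per_column_chunked` and the chunk size are made up for illustration, and it assumes the file parses with the default read_csv settings.

```python
import pandas as pd


def missing_per_column_chunked(path, chunksize=100_000):
    """Count rows and missing cells per column, reading `chunksize` rows at a time.

    Hypothetical helper for illustration; not part of pandas.
    """
    total_rows = 0
    missing = None
    # With chunksize set, read_csv returns an iterator of DataFrames.
    for chunk in pd.read_csv(path, chunksize=chunksize):
        total_rows += len(chunk)
        counts = chunk.isna().sum()
        missing = counts if missing is None else missing.add(counts, fill_value=0)
    return total_rows, missing
```

If the per-column counts from this chunked pass are much lower than what a single full read_csv call shows, that would point towards a size or memory problem rather than a problem in the data itself.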

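If I were trying to debug it, I would do the simple thing. Cut the file in half and see what happens. If it loads fine, go up 50%; if not, go down 50%, until you can pinpoint the size at which it starts to happen. You might even want to start with just 20 lines, to make sure the problem really is related to size.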
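The halving approach can be sketched roughly as below. This is an illustration only: `missing_cells` and `smallest_failing_size` are hypothetical helper names, and the search assumes that loads work below some row count and fail above it, which may not hold for your file.

```python
import pandas as pd


def missing_cells(path, nrows):
    """Load only the first `nrows` data rows and count the missing cells."""
    df = pd.read_csv(path, nrows=nrows)
    return int(df.isna().sum().sum())


def smallest_failing_size(loads_ok, total_rows, start=20):
    """Bisect on the row count.

    `loads_ok(n)` is a caller-supplied check that returns True when the first
    n rows load correctly. Returns the smallest n that fails, or None if even
    the full file passes. Assumes failures are monotonic in n.
    """
    start = min(start, total_rows)
    if not loads_ok(start):
        return start  # fails on a tiny sample, so probably not size-related
    if loads_ok(total_rows):
        return None
    lo, hi = start, total_rows  # invariant: lo passes, hi fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if loads_ok(mid):
            lo = mid
        else:
            hi = mid
    return hi
```

For the file in the question, a call might look like `smallest_failing_size(lambda n: missing_cells("data.csv", n) == 0, 29872046)`, where "data.csv" is a placeholder path and "no missing cells" is an assumed definition of a correct load. Each probe re-reads the file from the start, so this can take a while on 2.2G.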

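I would also add your OS and memory information to the post, along with the version of Pandas you are using, just in case (I am running Pandas 11.0, Python 3.2, Linux Mint x64 and 16G RAM, so I would expect no issues, say). Also, you might post a link to your data so that others can test it.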
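A quick way to collect that information for the post is sketched below. The function names are made up for this example, and the RAM lookup relies on `os.sysconf`, which is only available on Unix-like systems; elsewhere it simply reports None.

```python
import os
import platform

import pandas as pd


def total_ram_bytes():
    """Best-effort total physical memory; returns None where sysconf is unavailable."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        return None


def environment_report():
    """Gather the version and machine details worth including in a bug report."""
    return {
        "pandas": pd.__version__,
        "python": platform.python_version(),
        "os": platform.platform(),
        "machine": platform.machine(),
        "ram_bytes": total_ram_bytes(),
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(key, value)
```

Recent pandas versions also provide `pd.show_versions()`, which prints a fuller list of dependency versions.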

Hope that helps.
