3

Recently I've had to increase the volume of an data extraction script and I've found the output to be missing some data that is present in the source tables. Can anyone suggest some ideas for what might be causing this issue?

Background

The extraction procedure is an sql script which I execute from an sqlplus client (running in Windows Server 2008) and which spools the query results to output files.

The data that is extracted includes customer records and their transaction records. Each customer may have several hundred transaction records each month. Before the increased load, we were extracting about 100K customers with the last 3 months of transactions. This has now increased to 1 million. I expect that the output file would be about 30GB.

The transactions table is partitioned into monthly partitions. The extraction script includes a separate query spooling each partition. It seemed that the first of the 3 month's transactions was output, and the 2nd and 3rd month of transactions was missing (though the process log showed that it still spent the time querying all partitions).

Some interesting progress

This sql script is invoked within a perl script. Originally the sql script was invoked in one go. I decided to split up the sql and to invoke each query in turns and the output seems to have tripled in size (to about 30GB). Why might this have happened? (This may help with the real problem below.)

The current issue

There is a second database with 5x more customers and 2 types of transaction tables, each partitioned into daily partitions. So to extract 3 months of transactions, the script queries on 90 partitions on 2 separate tables. Again, the 1st month of data seems to have output successfully, but the 2nd and 3rd months are missing. The above solution did not work for this database.

Something else that is unusual is that the 2 files output from the 2 transaction tables are both almost exactly 4GB in size (4,294,967,362 bytes). Could it be that Oracle or something else is limiting the extraction size?

4

1 回答 1

2

什么版本的 Windows 以及磁盘上的文件系统是什么?从内存中,FAT32 文件系统具有文件大小限制。认为您需要超过 2G(或者可能是 4G)的 NTFS。我怀疑您在 Windows 或 sqlplus 本身中达到了该限制。

您可以尝试通过尝试缩小范围

sqlplus > file.txt

而不是使用spool命令。如果可行,它看起来像一个 sqlplus 问题。如果不是,则更有可能是文件系统问题。

于 2011-07-14T09:20:29.087 回答