I have a very large table in PostgreSQL with a column named something like "replicated". Every new row starts out unreplicated and is later copied to another system by a background program. There is a partial index on that table, "btree(ID) WHERE replicated=0". The background program selects at most 2000 rows (LIMIT 2000), works on them, and then commits the changes in one transaction using 2000 prepared SQL statements.
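To make the current behaviour concrete, here is a rough sketch of what the background program does per batch. It is written in Python against an in-memory SQLite database purely so that it runs on its own; the real program talks to PostgreSQL 7.4. The table name `items`, the `payload` column, the `send` callback, the helper `make_demo_db` and the value 1 for "already replicated" are all made up for illustration.

```python
import sqlite3

BATCH_SIZE = 2000  # matches the LIMIT 2000 described above


def make_demo_db(n_rows):
    """Build a small stand-in table with a partial index on unreplicated rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items ("
        " id INTEGER PRIMARY KEY,"
        " payload TEXT,"
        " replicated INTEGER NOT NULL DEFAULT 0)"
    )
    # Stand-in for: btree(ID) WHERE replicated=0
    conn.execute("CREATE INDEX items_unreplicated ON items (id) WHERE replicated = 0")
    conn.executemany(
        "INSERT INTO items (id, payload) VALUES (?, ?)",
        [(i, "row %d" % i) for i in range(1, n_rows + 1)],
    )
    conn.commit()
    return conn


def replicate_batch(conn, send):
    """Copy up to BATCH_SIZE unreplicated rows, then mark them in one transaction."""
    rows = conn.execute(
        "SELECT id, payload FROM items WHERE replicated = 0 ORDER BY id LIMIT ?",
        (BATCH_SIZE,),
    ).fetchall()
    for row_id, payload in rows:
        send(row_id, payload)  # hypothetical: push the row to the other system
    # One parameterised UPDATE per row; 1 = "replicated" is an assumption.
    conn.executemany(
        "UPDATE items SET replicated = 1 WHERE id = ?",
        [(row_id,) for row_id, _ in rows],
    )
    conn.commit()  # all updates of the batch are committed together
    return len(rows)
```

Calling `replicate_batch(conn, send)` repeatedly until it returns 0 drains the unreplicated rows batch by batch.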
Now the problem is that I want to give the user an option to reset this replicated value, i.e. set it back to zero for all rows.
An update table set replicated=0;
is not possible:
- It takes a very long time.
- It doubles the size of the table because of MVCC.
- It runs in one transaction: it either fails or goes through completely.
I actually don't need transactional guarantees for this case: if the system goes down partway, it is fine if only part of the rows have been reset.
There are other problems, too. Doing an
update set replicated=0 where id >10000 and id<20000
is also bad: it does a sequential scan over the whole table, which is too slow. Even if it didn't, it would still be slow because it would need too many seeks.
What I really need is a way of going through all rows, changing them and not being bound to a giant transaction.
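Roughly, the behaviour I am after looks like the sketch below: walk the id space in fixed-size ranges and commit after each range, so an interruption loses at most one batch. This is only an illustration of the intent, written in Python with SQLite as a stand-in so it can be run as-is; the table name `items`, the function name and the batch size are made up. Whether each range update can be made fast on PostgreSQL 7.4 is exactly what I am asking.

```python
import sqlite3


def reset_in_batches(conn, batch_size=10000):
    """Set replicated = 0 for every row, committing after each id range.

    Returns the number of id ranges that were processed.
    """
    lo, hi = conn.execute("SELECT min(id), max(id) FROM items").fetchone()
    if lo is None:
        return 0  # empty table, nothing to reset
    batches = 0
    start = lo
    while start <= hi:
        end = start + batch_size
        # Skip rows that are already 0 so they are not rewritten needlessly.
        conn.execute(
            "UPDATE items SET replicated = 0 "
            "WHERE id >= ? AND id < ? AND replicated <> 0",
            (start, end),
        )
        conn.commit()  # each range is its own small transaction
        batches += 1
        start = end
    return batches
```

If the process dies in the middle, the ranges already committed stay reset, and the loop can simply be started again.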
Strangely, an
UPDATE table
SET replicated=0
WHERE ID in (SELECT id from table WHERE replicated= LIMIT 10000)
is also slow, although it ought to work well: it should go through the table in disk order...
(Note that in that case there was also an index that covered this)
(An UPDATE ... LIMIT like in MySQL is not available in PostgreSQL.)
BTW: The real problem is more complicated, and we're talking about an embedded system here that is already deployed, so remote schema changes are difficult, but possible. Unfortunately, it's PostgreSQL 7.4.
The number of rows I'm talking about is e.g. 90000000. The size of the database can be several dozen gigabytes.
The database itself contains only 5 tables, one of which is very large. That is not bad design, though: these embedded boxes only operate on one kind of entity; it's not an ERP system or anything like that!
Any ideas?