I have a question regarding the WSClock algorithm used for page replacement in operating systems.
As far as I understand, WSClock combines features of the Working Set algorithm (a page is in the working set if it was referenced within the last T time units, so the time of last reference is stored for each page along with its R and M bits) and the Clock algorithm (the page frame table is kept as a circular list; when a page fault occurs the list is traversed looking for pages that are not in the working set).
On each page reference the hardware sets the page's R bit to 1. Periodically, the OS resets the R bits of all pages to 0.
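In other words, I picture the R-bit bookkeeping roughly like this (a toy Python model; `Frame`, `reference`, and `os_tick` are my own names, not any real kernel's API):

```python
class Frame:
    """Toy model of one page frame: an R bit plus an approximate last-use time."""
    def __init__(self):
        self.r = False      # referenced bit
        self.last_use = 0   # timestamp the OS believes, refreshed only during scans

def reference(frame):
    # What the MMU does on every memory access: just set R, nothing else.
    frame.r = True

def os_tick(frames):
    # Periodic OS sweep: clear every R bit so the next interval starts fresh.
    for f in frames:
        f.r = False
```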
When a page fault occurs, the algorithm traverses the list and does the following for each page:
1) If R == 1: set R = 0, set t = process_t (record the current virtual time as the last-use time), and advance.
2) Else if R == 0 and (process_t - t) <= T: the page is still in the working set, so advance.
3) Else if R == 0 and (process_t - t) > T:
a) If M == 1: schedule the page for writing to disk and advance.
b) If M == 0: the page is clean and outside the working set -- this is our victim, evict it.
If no page can be evicted immediately, keep traversing until the disk write for one of the scheduled pages finishes, then evict that page.
If no pages were scheduled for a disk write at all, evict the one that was referenced longest ago.
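To check that I understand the traversal, here is a minimal Python sketch of one scan as I described it (the `Page` class, `wsclock_scan`, and parameter names are my own; the real algorithm would block on a pending write rather than just falling back as I do at the end):

```python
from dataclasses import dataclass

@dataclass
class Page:
    r: bool = False     # referenced bit, set by "hardware" on access
    m: bool = False     # modified (dirty) bit
    last_use: int = 0   # approximate time of last reference

def wsclock_scan(pages, hand, process_t, tau):
    """One pass of the WSClock scan; returns (victim_index, new_hand)."""
    n = len(pages)
    writes_scheduled = []
    oldest_idx = None
    for _ in range(n):
        p = pages[hand]
        if p.r:
            # Step 1: recently referenced -> keep, refresh the timestamp
            p.r = False
            p.last_use = process_t
        elif process_t - p.last_use <= tau:
            # Step 2: still inside the working-set window -> skip
            pass
        elif p.m:
            # Step 3a: old but dirty -> schedule a write, keep scanning
            writes_scheduled.append(hand)
        else:
            # Step 3b: old and clean -> victim found immediately
            return hand, (hand + 1) % n
        # Track the oldest page as the last-resort fallback
        if oldest_idx is None or p.last_use < pages[oldest_idx].last_use:
            oldest_idx = hand
        hand = (hand + 1) % n
    # Full circle, no clean old page: a real kernel would now wait for one
    # of the scheduled writes to complete; this sketch just picks the first
    # scheduled page, or the page referenced longest ago if none were scheduled.
    victim = writes_scheduled[0] if writes_scheduled else oldest_idx
    return victim, (victim + 1) % n
```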
The algorithm looks fine to me except for one thing:
The last-use time of a page is updated on only one occasion: when a page fault occurs after the page was referenced (R == 1) but before the OS has reset the R bit back to 0. So, as I understand it, the last-use time is only approximated by this approach, which trades accuracy for performance.
So for really frequently used pages, which are no doubt in the process's working set, everything is fine: they are referenced more often than the OS reset event fires.
But there must be some unlucky pages which are referenced fairly frequently, yet no page fault happens right after those references. They are also unlucky in that when a page fault does occur, they look as if they had not been referenced at all, although that is not true.
So from what I see, the OS effectively takes snapshots of the working set only at page-fault events. Each snapshot covers the pages referenced in the interval between the last R-bit reset and the page fault.
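To make the unlucky case concrete, here is a toy timeline in Python (the variables are my own model, just mirroring the t / process_t notation above):

```python
# Toy timeline for one "unlucky" page.
last_use = 0      # the timestamp the OS stores for this page
r = False         # the page's R bit

# t = 1: the process references the page -> hardware sets R
r = True          # the true last use is now t = 1

# t = 2: the periodic OS sweep clears R; no page fault happened in between
r = False

# t = 3: a page fault occurs; the scan sees R == 0, so last_use is NOT
# refreshed and the page's age is computed from the stale timestamp
process_t = 3
observed_age = process_t - last_use   # 3, even though the page was used at t = 1
```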
Why is that a good approximation of the real working set?
Is the quality of the approximation (and thus the effectiveness of the algorithm) determined by the length of the interval between R-bit resets?
If so, this interval should not be too short, so that most of the unlucky pages I mentioned above are still caught, but also not too long, so that pages which have genuinely left the working set are not kept.