I have read that boolean columns are not much use as search indexes. But my question is: since a clustered index determines the physical arrangement of the records, can't it be used to put one type of record all together (in the same pages), so that the pages holding the other type have less chance of being loaded into memory? I will try to explain better. For the table
[BookPages]
ID(int)
Deleted(Boolean)
Text(Varchar)
if the clustered index is on the ID column, sample data would be
1, true, 'the quick..'
2, false, 'hello w..'
3, true, 'stack m..'
4, false, 'just thin...'
This means that the deleted and active records are interleaved, so if we search for record 2
SELECT [Text] FROM [BookPages] WHERE [Deleted] = false AND [ID] = 2
the "leaf" data page may end up holding rows (1, 2). This means we are loading into memory records with the Deleted flag set, which we will never be interested in.
But if the clustered index were on the columns (Deleted, ID),
the data would now be
2, false, 'hello w..'
4, false, 'just thin...'
1, true, 'the quick..'
3, true, 'stack m..'
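The reordering I have in mind is just a sort on (Deleted, ID) instead of on ID. A small Python sketch of it, using the sample rows above (this only illustrates the ordering, assuming false sorts before true; it is not how the engine actually stores anything):

```python
# Sample rows from the question: (ID, Deleted, Text).
rows = [
    (1, True, "the quick.."),
    (2, False, "hello w.."),
    (3, True, "stack m.."),
    (4, False, "just thin..."),
]

# Clustered on ID: deleted and active rows are interleaved.
by_id = sorted(rows, key=lambda r: r[0])

# Clustered on (Deleted, ID): False sorts before True,
# so all active rows come first, then all deleted rows.
by_deleted_id = sorted(rows, key=lambda r: (r[1], r[0]))

print([r[0] for r in by_id])          # IDs in ID order
print([r[0] for r in by_deleted_id])  # IDs in (Deleted, ID) order
```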
Now, when we target only the active records, the pages that get loaded will be full of active records only.
So on a database with a long history and a lot of deleted records, we can have better locality for the records that we want, and reduce I/O.
And across thousands of pages, we can make sure that a large chunk of them will never be loaded into memory, so that data will simply stay on disk.
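To put rough numbers on what I mean, here is a toy model in Python. Everything in it is an assumption for illustration: the fixed 4 rows per page, the 1000 rows, and the 75% deleted ratio are made up, and a real storage engine does far more than this. It only counts how many leaf pages contain at least one active row under each clustering key.

```python
ROWS_PER_PAGE = 4  # hypothetical; real pages hold a variable number of rows


def pages_with_active_rows(rows, key):
    """Count leaf pages holding at least one active row for a given clustering key."""
    ordered = sorted(rows, key=key)
    pages = [
        ordered[i:i + ROWS_PER_PAGE]
        for i in range(0, len(ordered), ROWS_PER_PAGE)
    ]
    return sum(1 for page in pages if any(not deleted for _id, deleted in page))


# Hypothetical data: 1000 rows as (ID, Deleted); every 4th ID is active.
rows = [(i, i % 4 != 0) for i in range(1, 1001)]

pages_by_id = pages_with_active_rows(rows, key=lambda r: r[0])
pages_by_deleted_id = pages_with_active_rows(rows, key=lambda r: (r[1], r[0]))

print(pages_by_id)          # 250: every page has one active row
print(pages_by_deleted_id)  # 63: active rows are packed into the first pages
```

In this made-up case, the active rows touch all 250 pages when clustered on ID, but only 63 pages when clustered on (Deleted, ID).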
Is this reasoning correct? Could this improve overall performance on large databases?