For SQL Server
Q1 Extra space is only needed for the clustered index if it is not unique. SQL Server adds a 4-byte uniquifier internally to a non-unique clustered index. This is because it uses the cluster key as the row locator in non-clustered indexes, so every row must be uniquely identifiable by that key.
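As a rough sketch of the difference, the script below builds a non-unique clustered index and shows, commented out, the unique alternative. The table name employee_demo, its columns, and the index names are assumptions made for illustration only.

```sql
-- Hypothetical table; names and types are assumptions, not taken from the question.
CREATE TABLE dbo.employee_demo (
    id     int          NOT NULL,
    name   varchar(100) NOT NULL,
    age    int          NOT NULL,
    salary int          NOT NULL
);

-- Non-unique clustered index: several rows can share an age, so SQL Server
-- needs a hidden uniquifier to tell those rows apart.
CREATE CLUSTERED INDEX CIX_employee_demo_age ON dbo.employee_demo (age);

-- Alternative: cluster on a unique key instead, so no uniquifier is needed.
-- DROP INDEX CIX_employee_demo_age ON dbo.employee_demo;
-- CREATE UNIQUE CLUSTERED INDEX CIX_employee_demo_id ON dbo.employee_demo (id);
```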
Q2 A non-clustered index can be read in order. That may aid queries where you specify an order. It may also make merge joins attractive. It will also help with range queries (x < col and y > col).
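For instance, assuming the employee table used later in this answer has an age column, a query along these lines could be served by reading a non-clustered index on age in order. The index name IX_age_only is hypothetical, and whether the optimizer actually picks it depends on the data and statistics.

```sql
-- Hypothetical index on the assumed employee (age) column.
create index IX_age_only on employee (age);

-- Range predicate plus ORDER BY on the indexed column: the index can be
-- read in key order for the matching range, so no separate sort is needed.
select age
from employee
where age > 30 and age < 40
order by age;
```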
Q3 SQL Server does an extra "bookmark lookup" when using a non-clustered index, but only if it needs a column that isn't in the index. Note also that you can include extra columns in the leaf level of indexes. If an index can be used without the additional lookup, it is called a covering index.
If a bookmark lookup is required, it doesn't take a very high percentage of rows before it becomes quicker just to scan the whole clustered index. The exact level depends on row size, key size, etc., but 5% of rows is a typical cutoff.
Q4 If the most important thing in your application was making both of these queries as fast as possible, you could create a covering index for each of them:
create index IX_1 on employee (age) include (name, salary);
create index IX_2 on employee (salary) include (name, age);
Note you don't have to specifically include the cluster key, as the non-clustered index has it as the row pointer.
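The queries from the question are not reproduced in this answer, so the shapes below are an assumption: one filtering on age, the other on salary. With IX_1 and IX_2 in place, queries like these only touch columns present in the respective index, so no bookmark lookup should be needed.

```sql
-- Assumed query shapes, for illustration only.

-- Every referenced column (age, name, salary) is in IX_1.
select name, salary from employee where age = 30;

-- Every referenced column (salary, name, age) is in IX_2.
select name, age from employee where salary = 1000;
```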
Q5 This is more important for cluster keys than non-cluster keys due to the uniquifier. The real issue, though, is whether an index is selective or not for your queries. Imagine an index on a bit column. Unless the distribution of data is very skewed, such an index is unlikely to be used for anything.
More info about the uniquifier. Imagine you had a non-unique clustered index on age, and a non-clustered index on salary. Say you had the following rows:
age | salary | uniquifier
20 | 1000 | 1
20 | 2000 | 2
Then the salary index would locate rows like so
1000 -> 20, 1
2000 -> 20, 2
Say you ran the query select * from employee where salary = 1000, and the optimizer chose to use the salary index. It would then find the pair (20, 1) from the index lookup, then look up this value in the main data (the clustered index).
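The walk-through above could be reproduced with a script like the following. This is only a sketch: the table name employee_lookup_demo, the name column, the sample names, and the index names are all assumptions. The index hint forces the salary index so that the lookup step shows up in the execution plan instead of being left to the optimizer's choice.

```sql
-- Hypothetical demo table; names and types are assumptions.
CREATE TABLE dbo.employee_lookup_demo (
    name   varchar(100) NOT NULL,
    age    int          NOT NULL,
    salary int          NOT NULL
);

-- Non-unique clustered index on age, non-clustered index on salary.
CREATE CLUSTERED INDEX CIX_lookup_demo_age ON dbo.employee_lookup_demo (age);
CREATE NONCLUSTERED INDEX IX_lookup_demo_salary ON dbo.employee_lookup_demo (salary);

-- Two rows with the same age, matching the example rows above.
INSERT INTO dbo.employee_lookup_demo (name, age, salary)
VALUES ('Ann', 20, 1000),
       ('Bob', 20, 2000);

-- Force the salary index. Because SELECT * also needs name, which is not in
-- that index, the row is then fetched from the clustered index by its key.
SELECT *
FROM dbo.employee_lookup_demo WITH (INDEX(IX_lookup_demo_salary))
WHERE salary = 1000;
```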