One of the most important limitations of Clustered Columnstore indexes is their locking. You can find some details here: http://www.nikoport.com/2013/07/07/clustered-columnstore-indexes-part-8-locking/
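To see the locking behaviour for yourself, something along these lines may help. It is only a sketch: dbo.FactSales and its columns are hypothetical names, and it assumes the table already has a clustered columnstore index. The DMV sys.dm_tran_locks and its columns are standard SQL Server.

```sql
-- Hypothetical table and columns; assumes dbo.FactSales has a clustered columnstore index.
BEGIN TRANSACTION;

INSERT INTO dbo.FactSales (SaleId, Amount)
VALUES (1, 10.00);

-- Show the locks held by the current session while the transaction is still open.
SELECT resource_type,
       resource_description,
       request_mode,
       request_status
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID;

-- Undo the test insert.
ROLLBACK TRANSACTION;
```

Running the same test against a rowstore copy of the table should make the difference in lock granularity visible.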
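Regarding your questions:
1) Does the statement above say that a Clustered Columnstore index is always better than a B-Tree index for extracting data when there are a lot of duplicate values?
- Not only are duplicates scanned faster in Batch Mode, but the Columnstore index mechanisms are also more efficient at reading data when all of the data in a Segment is read.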
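If you want to look at the Segments themselves, the catalog view sys.column_store_segments exposes one row per column per Segment. A minimal sketch, assuming a hypothetical table dbo.FactSales that already has a columnstore index:

```sql
-- dbo.FactSales is a hypothetical table name; replace it with your own.
SELECT object_name(p.object_id)   AS table_name,
       s.column_id,
       s.segment_id,
       s.row_count,
       s.on_disk_size / 1024.0    AS size_kb
FROM sys.column_store_segments AS s
JOIN sys.partitions AS p
  ON s.hobt_id = p.hobt_id
WHERE p.object_id = object_id('dbo.FactSales')
ORDER BY s.column_id, s.segment_id;
```

Columns with many duplicate values will typically show a much smaller on-disk size per Segment than columns with mostly unique values.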
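2) How does the performance compare between a Clustered Columnstore index and a non-clustered B-Tree covering index, for example when the table has a lot of columns?
- Columnstore indexes have significantly better compression than the Page or Row compression available for rowstore, and Batch Mode makes the biggest difference on the processing side. As already mentioned, even when reading the same number of equally sized pages and extents, a Columnstore index should be faster.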
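3) Can I use a clustered and a non-clustered columnstore index together on one table?
4) ... can anybody tell me how to determine whether a table is a good candidate for a columnstore index?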
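- Any table where you scan and process large amounts of data (over 1 million rows), or possibly even a table of over 100K rows that is always scanned in full, might be a candidate to consider. There are some limitations on the technologies that can be used together with a table on which you want to build a Clustered Columnstore index. Here is the query that I am using: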
select object_schema_name( t.object_id ) as 'Schema'
, object_name (t.object_id) as 'Table'
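-- count rows once per table: only the heap or clustered index (index_id 0 or 1) and its in-row allocation unit (type 1)
, sum(case when p.index_id in (0,1) and a.type = 1 then p.rows else 0 end) as 'Row Count'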
, cast( sum(a.total_pages) * 8.0 / 1024. / 1024
as decimal(16,3)) as 'size in GB'
, (select count(*) from sys.columns as col
where t.object_id = col.object_id ) as 'Cols Count'
, (select count(*)
from sys.columns as col
join sys.types as tp
on col.system_type_id = tp.system_type_id
where t.object_id = col.object_id and
UPPER(tp.name) in ('VARCHAR','NVARCHAR')
) as 'String Columns'
-- no join to sys.types here: several types share one system_type_id (e.g. nvarchar and sysname), which would count some columns twice
, (select sum(col.max_length)
from sys.columns as col
where t.object_id = col.object_id
) as 'Cols Max Length'
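-- count each column once: hierarchyid, geometry and geography share one system_type_id, so the join can return several rows per column
, (select count(distinct col.column_id)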
from sys.columns as col
join sys.types as tp
on col.system_type_id = tp.system_type_id
where t.object_id = col.object_id and
(UPPER(tp.name) in ('TEXT','NTEXT','TIMESTAMP','HIERARCHYID','SQL_VARIANT','XML','GEOGRAPHY','GEOMETRY') OR
(UPPER(tp.name) in ('VARCHAR','NVARCHAR') and (col.max_length = 8000 or col.max_length = -1))
)
) as 'Unsupported Columns'
, (select count(*)
from sys.objects
where type = 'PK' AND parent_object_id = t.object_id ) as 'Primary Key'
, (select count(*)
from sys.objects
where type = 'F' AND parent_object_id = t.object_id ) as 'Foreign Keys'
, (select count(*)
from sys.objects
where type in ('UQ','D','C') AND parent_object_id = t.object_id ) as 'Constraints'
, (select count(*)
from sys.objects
where type in ('TA','TR') AND parent_object_id = t.object_id ) as 'Triggers'
, t.is_tracked_by_cdc as 'CDC'
, t.is_memory_optimized as 'Hekaton'
, t.is_replicated as 'Replication'
, case when t.filestream_data_space_id is null then 0 else 1 end as 'FileStream' -- 1 if the table has FILESTREAM data, 0 otherwise
, t.is_filetable as 'FileTable'
from sys.tables t
inner join sys.partitions as p
ON t.object_id = p.object_id
INNER JOIN sys.allocation_units as a
ON p.partition_id = a.container_id
where p.data_compression in (0,1,2) -- None, Row, Page
group by t.object_id, t.is_tracked_by_cdc, t.is_memory_optimized, t.is_filetable, t.is_replicated, t.filestream_data_space_id
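-- filter and sort on the base table's rows only (heap or clustered index, in-row allocation unit), so that nonclustered indexes and LOB allocation units do not inflate the count
having sum(case when p.index_id in (0,1) and a.type = 1 then p.rows else 0 end) > 1000000
order by sum(case when p.index_id in (0,1) and a.type = 1 then p.rows else 0 end) desc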
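
Once the query has pointed you to a candidate with no blocking features, building the index and checking the result could look roughly like this. It is a sketch under assumptions: dbo.FactSales and CCI_FactSales are hypothetical names, and the table is assumed to be a heap without any of the unsupported columns or features listed above.

```sql
-- Hypothetical table and index names; assumes dbo.FactSales is a heap
-- with no unsupported columns, constraints or features.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactSales
    ON dbo.FactSales;

-- Inspect the resulting Row Groups (available from SQL Server 2014).
SELECT row_group_id,
       state_description,
       total_rows,
       deleted_rows,
       size_in_bytes
FROM sys.column_store_row_groups
WHERE object_id = object_id('dbo.FactSales')
ORDER BY row_group_id;
```

Comparing the sum of size_in_bytes with the 'size in GB' value reported by the candidate query gives a rough idea of the compression gained.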