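This looks like a good candidate for a bitmap index:
Bitmap indexes are primarily designed for data warehousing or environments in which queries reference many columns in an ad hoc fashion. Situations that may call for a bitmap index include:
The indexed columns have low cardinality, that is, the number of distinct values is small compared to the number of table rows.
The indexed table is either read-only or not subject to significant modification by DML statements.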
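A rough way to check the low-cardinality criterion is to compare the number of distinct values with the row count. This is only a sketch, and it assumes the CUSTOMER and PRODUCT tables from the sample data in this answer; substitute your own table and column names.

```sql
--Compare distinct values to total rows for each candidate column.
select count(distinct region) distinct_values, count(*) total_rows
from customer;

select count(distinct category) distinct_values, count(*) total_rows
from product;
```

With the sample data, each column can hold at most 999 distinct values across 30 million rows, which fits the criterion comfortably.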
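Specifically, a bitmap join index may be ideal here. The example in the manual even matches your data model. I tried to recreate your model and data below, and the bitmap join index appears to run orders of magnitude faster than the other solutions.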
Sample data
--Create tables
create table customer
(
customer_id number,
region varchar2(100) not null
) nologging;
create table product
(
product_id number,
customer_id number not null,
category varchar2(100) not null
) nologging;
--Load 30M rows, 1M rows at a time. Takes about 6 minutes.
begin
for i in 1 .. 30 loop
insert /*+ append */ into customer
select (1000000*i)+level, 'Region '||trunc(dbms_random.value(1, 1000))
from dual connect by level <= 1000000;
commit;
insert /*+ append */ into product
select (1000000*i)+level, (1000000*i)+level
,'Category '||trunc(dbms_random.value(1, 1000))
from dual connect by level <= 1000000;
commit;
end loop;
end;
/
--Add primary keys and foreign key constraints.
alter table customer add constraint customer_pk primary key (customer_id);
alter table product add constraint product_pk primary key (product_id);
alter table product add constraint product_customer_fk
foreign key (customer_id) references customer(customer_id);
--Gather stats
begin
dbms_stats.gather_table_stats(user, 'CUSTOMER');
dbms_stats.gather_table_stats(user, 'PRODUCT');
end;
/
No indexes - slow
As expected, performance is poor. This sample query takes about 75 seconds on my machine.
SELECT count(*)
FROM product
JOIN customer ON product.CUSTOMER_ID = customer.customer_id
WHERE (product.CATEGORY = 'Category 1' AND customer.REGION = 'Region 1')
OR (product.CATEGORY = 'Category 2' AND customer.REGION = 'Region 2')
OR (product.CATEGORY = 'Category 888' AND customer.REGION = 'Region 888');
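To compare how the optimizer handles this query at each step, one option is EXPLAIN PLAN together with DBMS_XPLAN.DISPLAY. The sketch below reuses the same query. It shows the estimated plan, which may differ from the plan actually used at run time.

```sql
--Store the estimated plan for the sample query in the plan table.
explain plan for
SELECT count(*)
FROM product
JOIN customer ON product.CUSTOMER_ID = customer.customer_id
WHERE (product.CATEGORY = 'Category 1' AND customer.REGION = 'Region 1')
OR (product.CATEGORY = 'Category 2' AND customer.REGION = 'Region 2')
OR (product.CATEGORY = 'Category 888' AND customer.REGION = 'Region 888');

--Display the most recently explained plan.
select * from table(dbms_xplan.display);
```

Re-running these two statements after each index change makes it easier to see when the plan switches between full scans and index access.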
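B-tree indexes - still slow
The plan changes, but performance stays the same. I think this may be because my example is a worst-case scenario for indexes, where the data is truly random.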
create index customer_idx on customer(region);
create index product_idx on product(category);
begin
dbms_stats.gather_table_stats(user, 'CUSTOMER');
dbms_stats.gather_table_stats(user, 'PRODUCT');
end;
/
Bitmap indexes - a little better
This improves performance slightly, to about 61 seconds.
drop index customer_idx;
drop index product_idx;
create bitmap index customer_bidx on customer(region);
create bitmap index product_bidx on product(category);
begin
dbms_stats.gather_table_stats(user, 'CUSTOMER');
dbms_stats.gather_table_stats(user, 'PRODUCT');
end;
/
Bitmap join index - very fast
The query now returns almost instantly; my IDE reports it as 0 seconds.
drop index customer_bidx;
drop index product_bidx;
create bitmap index customer_product_bjix
on product(product.category, customer.region)
FROM product, customer
where product.CUSTOMER_ID = customer.customer_id;
begin
dbms_stats.gather_table_stats(user, 'CUSTOMER');
dbms_stats.gather_table_stats(user, 'PRODUCT');
end;
/
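To confirm that the new index was created as a join index, the data dictionary can be queried. This is a sketch that assumes the index lives in the current schema, so USER_INDEXES is enough.

```sql
--JOIN_INDEX should report YES for a bitmap join index.
select index_name, index_type, join_index
from user_indexes
where index_name = 'CUSTOMER_PRODUCT_BJIX';
```

If the execution plan for the sample query does not reference CUSTOMER_PRODUCT_BJIX, check that statistics were gathered after the index was built.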
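Index costs
The bitmap join index takes somewhat longer to create than the B-tree or bitmap indexes. The B-tree indexes are very large compared with the bitmap and bitmap join indexes.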
select segment_name, bytes/1024/1024 MB
from dba_segments
where segment_name in ('CUSTOMER_IDX', 'PRODUCT_IDX'
,'CUSTOMER_BIDX', 'PRODUCT_BIDX', 'CUSTOMER_PRODUCT_BJIX');
SEGMENT_NAME           MB
---------------------  ---
CUSTOMER_IDX           726
PRODUCT_IDX            792
CUSTOMER_BIDX           88
PRODUCT_BIDX            96
CUSTOMER_PRODUCT_BJIX  184
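DBA_SEGMENTS usually requires elevated privileges. If the indexes are in your own schema, a variant against USER_SEGMENTS should return the same figures. This is a sketch under that assumption.

```sql
--Same size check, restricted to segments owned by the current user.
select segment_name, bytes/1024/1024 MB
from user_segments
where segment_name in ('CUSTOMER_IDX', 'PRODUCT_IDX'
,'CUSTOMER_BIDX', 'PRODUCT_BIDX', 'CUSTOMER_PRODUCT_BJIX');
```

Because the steps above drop each set of indexes before creating the next, run the size check after each step rather than once at the end.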
Query style
This does not affect performance, but you can shorten the query like this:
SELECT count(*)
FROM product
JOIN customer ON product.CUSTOMER_ID = customer.customer_id
WHERE (product.category, customer.region)
in (('Category 1', 'Region 1'),
('Category 2', 'Region 2'),
('Category 888', 'Region 888'));