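After getting quite a few answers :) I did some more research and now have a final approach.
The answer is to store the tree as a nested set, which makes the queries fast.
What I did:
Create the users table.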
CREATE TABLE [dbo].[actor_users](
[id] [int] NOT NULL,
[manager_id] [int] NULL,
[deputy_id] [int] NULL,
[username] [nvarchar](48) NOT NULL,
[pwd] [nvarchar](40) NULL,
[pwd_url] [char](38) NULL,
[guid] [char](38) NULL,
[deactivated] [smallint] NULL,
[lastname] [nvarchar](48) NULL,
[middlename] [nvarchar](48) NULL,
[firstname] [nvarchar](48) NULL,
[acronym] [nvarchar](16) NULL,
[employee_nr] [nvarchar](16) NULL,
[department] [nvarchar](250) NULL,
[cost_unit] [nvarchar](16) NULL,
[desc] [nvarchar](max) NULL,
[email] [nvarchar](192) NULL,
[sex] [int] NULL,
[group_ids] [nvarchar](4000) NULL,
[picture_id] [int] NULL,
[lcid] [int] NULL,
CONSTRAINT [ct_actor_users] PRIMARY KEY CLUSTERED
( [id] ASC ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY]
Create the groups table.
CREATE TABLE [dbo].[actor_groups](
[id] [int] NOT NULL,
[parent_id] [int] NULL,
[group_name] [nvarchar](max) NULL,
[group_type] [int] NOT NULL,
[group_reference] [int] NOT NULL,
[description] [nvarchar](max) NULL,
[depth] [int] NULL,
[left] [int] NULL,
[right] [int] NULL,
[id_path] [nvarchar](max) NULL,
[name_path] [nvarchar](max) NULL,
CONSTRAINT [ct_actor_groups] PRIMARY KEY CLUSTERED
( [id] ASC ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
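The mapping script and the final query further down also use an actor_mappings table whose definition isn't shown. A minimal sketch, assuming it holds just the three columns that appear in the INSERT statement; the column types and the absence of a key are guesses, not the original definition:

```sql
-- Hypothetical definition: only the column names come from the scripts in this answer.
-- No primary key, because the random mapping script can insert the same pair twice
-- (the final query uses DISTINCT to cope with that).
CREATE TABLE [dbo].[actor_mappings](
    [group_id] [int] NOT NULL,
    [user_id] [int] NOT NULL,
    [imported] [smallint] NOT NULL
) ON [PRIMARY]
```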
I filled the users table with about 10,000 random users.
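The fill script isn't shown here. A minimal sketch of one way to generate such rows, assuming that only the two NOT NULL columns (id and username) matter for the test; the 'user_' naming scheme is made up:

```sql
-- Insert 10,000 placeholder users; all nullable columns are left NULL.
DECLARE @i int;
SET @i = 1;
WHILE @i <= 10000
BEGIN
    INSERT INTO actor_users (id, username)
    VALUES (@i, N'user_' + CAST(@i AS nvarchar(10)));
    SET @i = @i + 1;
END
```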
Then I created a random tree:
-- create some random tree for testing
DECLARE @id int;
DECLARE @name nvarchar(max);
WHILE (SELECT COUNT(*) FROM actor_groups)<1000
BEGIN
SET @id = (SELECT ISNULL(MAX(id),0) + 1 FROM actor_groups);
SET @name = 'random_ou' + CAST(NEWID() AS nvarchar(40));
INSERT INTO actor_groups (id, parent_id, group_name, group_type, group_reference, [description], depth, [left], [right], id_path, name_path)
SELECT @id
, (SELECT TOP 1 id FROM actor_groups ORDER BY NEWID()) AS parent_id
, @name group_name
, 3 group_type
, -1 group_reference
, '' [description]
, 0 depth
, 0 [left]
, 0 [right]
, '' id_path
, '' name_path
END
After that I have to update the nested set relations:
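-- update tree
-- The test rows above were inserted with depth = 0, so reset every non-root row first;
-- otherwise no row has depth IS NULL and the loop below never runs.
UPDATE actor_groups SET depth = NULL WHERE parent_id IS NOT NULL
-- Each pass fills in the children of rows whose depth is already known.
WHILE EXISTS (SELECT * FROM actor_groups WHERE depth IS NULL)
UPDATE tr SET
tr.depth = par.depth + 1 ,
tr.id_path = par.id_path + ',' + CAST(tr.id AS nvarchar(255)) ,
-- NOTE: 40 is a hard-coded id of the top-level node; use the id of your own root node
tr.name_path = (CASE par.id WHEN 40 THEN '' ELSE par.name_path + '/' END) + tr.group_name
FROM actor_groups AS tr
INNER JOIN actor_groups AS par ON (tr.parent_id = par.id)
WHERE par.depth >=0 AND tr.depth IS NULL
GO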
-- left, right nested set
WITH treerows AS
( SELECT actor_groups.*, ROW_NUMBER() OVER (ORDER BY id_path) AS Row FROM actor_groups )
UPDATE actor_groups
SET [left] = tbl.Lft
, [right] = tbl.Rgt
FROM actor_groups
JOIN (SELECT
ER.id,
ER.id_path,
ER.depth,
ER.Row,
(ER.Row * 2) - ER.depth AS Lft,
((ER.Row * 2) - ER.depth) +
(
SELECT COUNT(*) * 2
FROM treerows ER2
WHERE ER2.id_path LIKE ER.id_path + ',%'
) + 1 AS Rgt
FROM treerows ER
) tbl ON tbl.id = actor_groups.id
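After an update like this it is worth checking that the numbering really forms a nested set. The two queries below are a sketch of such a check and are not part of the original test; both should return no rows if every child interval lies strictly inside its parent's interval and no boundary value is used twice:

```sql
-- Children whose [left]/[right] interval is not strictly inside the parent's interval.
SELECT c.id, c.parent_id
FROM actor_groups AS c
JOIN actor_groups AS p ON c.parent_id = p.id
WHERE NOT (c.[left] > p.[left] AND c.[right] < p.[right]);

-- Boundary values that occur more than once across [left] and [right].
SELECT v.n, COUNT(*) AS occurrences
FROM (
    SELECT [left] AS n FROM actor_groups
    UNION ALL
    SELECT [right] FROM actor_groups
) AS v
GROUP BY v.n
HAVING COUNT(*) > 1;
```

Note that with a root depth of 0 the formula above starts numbering at 2 rather than 1. That is harmless, since the queries only compare the values with each other.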
Then I created some random mappings:
-- do some random mappings
DECLARE @map int;
DECLARE @mapuser int;
DECLARE @counter int;
SET @counter = 1;
WHILE @counter<1000
BEGIN
SET @map = (SELECT TOP 1 id FROM actor_groups ORDER BY NEWID())
SET @mapuser = (SELECT TOP 1 id FROM actor_users ORDER BY NEWID())
INSERT INTO actor_mappings ([group_id], [user_id], imported) VALUES (@map, @mapuser, 0)
SET @counter = @counter + 1;
END
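So now I have a groups table and a users table. The users table holds 10,000 users and my tree has about 1,000 nodes. I ran the random mapping script several times, so I ended up with roughly 100,000 mappings.
My query: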
SELECT DISTINCT
m.[user_id] AS luserid
, org.[id] AS lgroupid
, m.imported AS bimported
FROM [test].[dbo].[actor_groups] org
JOIN [actor_groups] org2 ON org2.[left] BETWEEN org.[left] AND org.[right]
JOIN actor_mappings m ON org2.id = m.group_id
If I don't narrow the query down, it runs in about 700 ms. In my tests, looking up one specific node or user takes about 150-300 ms.
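Narrowing the query to a single node or a single user only needs a WHERE clause on top of the same joins. The following is a sketch rather than the exact statements that were timed; the variable names and the value 1 are placeholders:

```sql
DECLARE @group_id int;
DECLARE @user_id int;
SET @group_id = 1;  -- placeholder: id of the node to inspect
SET @user_id = 1;   -- placeholder: id of the user to inspect

-- All users mapped to the given node or to any node below it.
SELECT DISTINCT m.[user_id] AS luserid, org.[id] AS lgroupid, m.imported AS bimported
FROM actor_groups org
JOIN actor_groups org2 ON org2.[left] BETWEEN org.[left] AND org.[right]
JOIN actor_mappings m ON org2.id = m.group_id
WHERE org.id = @group_id;

-- All nodes the given user belongs to, directly or through a descendant node.
SELECT DISTINCT m.[user_id] AS luserid, org.[id] AS lgroupid, m.imported AS bimported
FROM actor_groups org
JOIN actor_groups org2 ON org2.[left] BETWEEN org.[left] AND org.[right]
JOIN actor_mappings m ON org2.id = m.group_id
WHERE m.[user_id] = @user_id;
```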
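Solution:
This can be done with nested sets.
Updating the tree with my 1,000 nodes takes about 1 second, and querying the data always stays under 1 second, even without any additional indexes on my tables.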
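No indexes were needed for these numbers. If the tables grow, indexes along the following lines would be the first thing to try. This is an untested sketch: the index names are made up, and whether they actually help depends on the data and the query plan.

```sql
-- Supports the range join on the nested set boundaries.
CREATE NONCLUSTERED INDEX ix_actor_groups_left_right
    ON actor_groups ([left], [right]);

-- Supports the join from groups to mappings and covers the selected columns.
CREATE NONCLUSTERED INDEX ix_actor_mappings_group_id
    ON actor_mappings (group_id)
    INCLUDE ([user_id], imported);
```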
Hope this helps others facing the same problem.