在关系数据库中存储树有多种选择。为了获得良好的概述,我推荐 Bill Karwin 的幻灯片。
既然您提到读取速度是最重要的,那么闭包表将是一种合适的、强大的编码。闭包表是一个多对多关系,它为每个路径(例如,/a/b/c)存储所有父/子(传递地)。这样,对树的许多查询可以通过一个 SQL 查询(非递归)完成。
那看起来像
create table nodes (
path varchar primary key
/* your other attributes here, can be null */
);
create table parents_children (
parent_path varchar,
child_path varchar,
primary key(parent_path,child_path),
foreign key (parent_path) references nodes (path),
foreign key (child_path) references nodes (path)
);
要在目录 /a/b/ 下插入新文件 /a/b/c,您可以:
insert into nodes values ('/a/b/c');
insert into parents_children
select parent_path, '/a/b/c' from parents_children where child_path = '/a/b/'
union all select '/a/b/c','/a/b/c';
例如,要以递归方式查询 '/a' 中的所有子项,您将执行以下操作:
select *
from nodes join parents_children on path = child_path
where parent_path = '/a';
一个更详尽的示例,它存储以下文件树:
/
/a/
/a/b/
/a/b/d
/a/c
/b
要插入数据:
insert into nodes values ('/');
insert into parents_children values ('/','/');
insert into nodes values ('/a/');
insert into parents_children
select parent_path, '/a/' from parents_children where child_path = '/'
union all select '/a/','/a/';
insert into nodes values ('/a/b/');
insert into parents_children
select parent_path, '/a/b/' from parents_children where child_path = '/a/'
union all select '/a/b/','/a/b/';
insert into nodes values ('/a/c');
insert into parents_children
select parent_path, '/a/c' from parents_children where child_path = '/a/'
union all select '/a/c','/a/c';
insert into nodes values ('/a/b/d');
insert into parents_children
select parent_path, '/a/b/d' from parents_children where child_path = '/a/b/'
union all select '/a/b/d','/a/b/d';
insert into nodes values ('/b');
insert into parents_children
select parent_path, '/b' from parents_children where child_path = '/'
union all select '/b','/b';
查询 /a/ 的所有子项
select node.*
from nodes join parents_children on path = child_path
where parent_path = '/a/';
path
----------
/a/
/a/b/
/a/b/d
/a/c