I have a schema similar to this in a Datomic database:
; --- tenant
{:db/id #db/id[:db.part/db]
:db/ident :tenant/guid
:db/unique :db.unique/identity
:db/valueType :db.type/string
:db/cardinality :db.cardinality/one
:db.install/_attribute :db.part/db}
{:db/id #db/id[:db.part/db]
:db/ident :tenant/name
:db/valueType :db.type/string
:db/cardinality :db.cardinality/one
:db.install/_attribute :db.part/db}
{:db/id #db/id[:db.part/db]
:db/ident :tenant/tasks
:db/valueType :db.type/ref
:db/cardinality :db.cardinality/many
:db.install/_attribute :db.part/db}
; --- task
{:db/id #db/id[:db.part/db]
:db/ident :task/guid
:db/unique :db.unique/identity
:db/valueType :db.type/string
:db/cardinality :db.cardinality/one
:db.install/_attribute :db.part/db}
{:db/id #db/id[:db.part/db]
:db/ident :task/createdAt
:db/valueType :db.type/instant
:db/cardinality :db.cardinality/one
:db.install/_attribute :db.part/db}
{:db/id #db/id[:db.part/db]
:db/ident :task/name
:db/valueType :db.type/string
:db/cardinality :db.cardinality/one
:db.install/_attribute :db.part/db}
{:db/id #db/id[:db.part/db]
:db/ident :task/subtasks
:db/valueType :db.type/ref
:db/cardinality :db.cardinality/many
:db.install/_attribute :db.part/db}
; --- subtask
{:db/id #db/id[:db.part/db]
:db/ident :subtask/guid
:db/valueType :db.type/string
:db/cardinality :db.cardinality/one
:db/unique :db.unique/identity
:db.install/_attribute :db.part/db}
{:db/id #db/id[:db.part/db]
:db/ident :subtask/type
:db/valueType :db.type/string
:db/cardinality :db.cardinality/one
:db.install/_attribute :db.part/db}
{:db/id #db/id[:db.part/db]
:db/ident :subtask/startedAt
:db/valueType :db.type/instant
:db/cardinality :db.cardinality/one
:db.install/_attribute :db.part/db}
{:db/id #db/id[:db.part/db]
:db/ident :subtask/completedAt
:db/valueType :db.type/instant
:db/cardinality :db.cardinality/one
:db.install/_attribute :db.part/db}
{:db/id #db/id[:db.part/db]
:db/ident :subtask/participants
:db/valueType :db.type/ref
:db/cardinality :db.cardinality/many
:db.install/_attribute :db.part/db}
; --- participant
{:db/id #db/id[:db.part/db]
:db/ident :participant/guid
:db/valueType :db.type/string
:db/cardinality :db.cardinality/one
:db/unique :db.unique/identity
:db.install/_attribute :db.part/db}
{:db/id #db/id[:db.part/db]
:db/ident :participant/name
:db/valueType :db.type/string
:db/cardinality :db.cardinality/one
:db.install/_attribute :db.part/db}
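The tasks themselves are fairly static over time, but on average each task has subtasks added and removed every 5 minutes. I'd say that at any given time each task has around 40 subtasks on average, each of which has (almost always, with a few exceptions) one participant. The only reason I'm using Datomic is to be able to see how tasks evolve over time, i.e. I want to see what a task looked like at a given point in time. To do that, I'm currently doing something like this: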
(require '[datomic.api :as d])
(import 'java.util.Date)

(defn find-tasks-by-tenant-at-time
  [conn tenant-guid ^long time-epoch]
  (let [;; database value as of the given time (epoch milliseconds)
        db-conn       (-> conn d/db (d/as-of (Date. time-epoch)))
        ;; entity ids of all tasks that belong to the tenant
        task-ids      (->> (d/q '[:find ?taskIds
                                  :in $ ?tenantGuid
                                  :where
                                  [?tenantId :tenant/guid ?tenantGuid]
                                  [?tenantId :tenant/tasks ?taskIds]]
                                db-conn tenant-guid)
                           vec flatten)
        task-entities (map #(d/entity db-conn %) task-ids)
        ;; walk task -> subtasks -> participants and build plain maps
        dtos          (map (fn [task]
                             (letfn [(participant-dto [participant]
                                       {:id   (:participant/guid participant)
                                        :name (:participant/name participant)})
                                     (subtask-dto [subtask]
                                       {:id           (:subtask/guid subtask)
                                        :type         (:subtask/type subtask)
                                        :participants (map participant-dto
                                                           (:subtask/participants subtask))})]
                               {:id       (:task/guid task)
                                :name     (:task/name task)
                                :subtasks (map subtask-dto (:task/subtasks task))}))
                           task-entities)]
    dtos))
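
A note on how a timing for this function could be taken: the `map` calls above are lazy, so the nested sequences are only walked when something consumes them. Below is a minimal sketch of a measurement that forces the whole result first. It uses only `clojure.walk` from the standard library. `realize-all` and `timed` are hypothetical helper names introduced here for illustration; they are not part of Datomic or of the code above.

```clojure
(require '[clojure.walk :as walk])

(defn realize-all
  "Walks a nested structure of maps and sequences, forcing every lazy seq in it."
  [x]
  (walk/postwalk identity x))

(defn timed
  "Calls the zero-argument function f, fully realizes its result, and returns
  a map with the realized result and the elapsed wall-clock time in milliseconds."
  [f]
  (let [start  (System/nanoTime)
        result (realize-all (f))
        end    (System/nanoTime)]
    {:result     result
     :elapsed-ms (/ (- end start) 1e6)}))

;; Usage sketch; `conn` and the tenant guid are placeholders:
;; (:elapsed-ms
;;  (timed #(find-tasks-by-tenant-at-time conn "some-tenant-guid"
;;                                        (System/currentTimeMillis))))
```

Measured this way, the elapsed time covers the query, the entity lookups, and the traversal of subtasks and participants together, rather than only the part that runs eagerly.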
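Unfortunately, this function is very slow. If a tenant has many tasks (say 20), each with about 40 subtasks, it can take close to 60 seconds to return. Am I doing something obviously wrong here? Is it possible to speed this up?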
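Update: The whole dataset is about 2 GB. The peer has 3.5 GB of memory (though reducing it to 1.5 GB doesn't seem to make any difference) and the transactor has 1 GB of memory. I'm using Datomic Free.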