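Note that I'm using PostgreSQL.
I have an organizations table, a users table, a jobs table, and a documents table. I'd like to get a list of organizations ordered by the total number of documents they have access to.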
organizations
------------
id (pk)
company_name

users
------------
id (pk)
organization_id

jobs
------------
id (pk)
client_id (id of an organization)
server_id (id of an organization)
creator_id (id of a user)

documents
------------
id (pk)
job_id
Desired result:
organizations.id | organizations.company_name | document_count
85               | Big Corporation            | 84
905              | Some other folks           | 65
403              | ACME, Inc                  | 14
As you can see, an organization can be connected to documents through 3 different paths:
organizations.id => jobs.client_id => documents.job_id
organizations.id => jobs.server_id => documents.job_id
organizations.id => users.organization_id => jobs.creator_id => documents.job_id
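Written out as SQL, the three paths can be sketched as a single set of (organization, document) pairs. This is only an illustration: it assumes documents.job_id references jobs.id and jobs.creator_id references users.id, which the schema listing implies but does not state, and the aliases organization_id and document_id are names picked for the example.

```sql
-- Every (organization, document) pair reachable through any of the three paths.
-- UNION (not UNION ALL) removes duplicate rows, so a document reachable
-- through several paths is listed once per organization.

-- Path 1: the organization is the job's client.
SELECT j.client_id AS organization_id, d.id AS document_id
FROM jobs j
JOIN documents d ON d.job_id = j.id
UNION
-- Path 2: the organization is the job's server.
SELECT j.server_id, d.id
FROM jobs j
JOIN documents d ON d.job_id = j.id
UNION
-- Path 3: one of the organization's users created the job.
SELECT u.organization_id, d.id
FROM users u
JOIN jobs j ON j.creator_id = u.id
JOIN documents d ON d.job_id = j.id;
```

Grouping these pairs by organization_id would give one count per organization, with each document counted at most once for it.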
But I'd like a single query that counts all the documents each company has access to...
I've tried a few things... like this:
SELECT COUNT(documents.id) AS document_count, organizations.id, organizations.company_name
FROM organizations
INNER JOIN users ON organizations.id = users.organization_id
INNER JOIN jobs ON (
    jobs.client_id = organizations.id OR
    jobs.server_id = organizations.id OR
    jobs.creator_id = users.id
)
INNER JOIN documents ON documents.job_id = jobs.id
GROUP BY organizations.id, organizations.company_name
ORDER BY document_count DESC
LIMIT 10
The query takes a while to run, which is tolerable since I'm only doing it for a one-time report, but the results cannot possibly be correct.
The first listed organization has a reported count of 129,834 documents, which is impossible since there are only 32,820 records in the documents table. I feel like it must be counting a large number of duplicates (due to an error in one of my joins?) but I'm not sure where I've gone wrong.
The order appears correct, since the highest-volume user of the system is clearly at the top of the list, but the value is inflated somehow.
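One way to check the duplicate theory is to run the same joins against a tiny data set and compare COUNT(documents.id) with COUNT(DISTINCT documents.id). The sketch below is hypothetical throughout: the temporary tables, ids, and the names 'Org A' and 'Org B' are invented for illustration. Note that temporary tables shadow permanent tables of the same name for the rest of the session, so it is best run in a scratch session.

```sql
-- Throwaway tables mirroring the schema above (all values are made up).
CREATE TEMP TABLE organizations (id int PRIMARY KEY, company_name text);
CREATE TEMP TABLE users (id int PRIMARY KEY, organization_id int);
CREATE TEMP TABLE jobs (id int PRIMARY KEY, client_id int, server_id int, creator_id int);
CREATE TEMP TABLE documents (id int PRIMARY KEY, job_id int);

INSERT INTO organizations VALUES (1, 'Org A'), (2, 'Org B');
INSERT INTO users VALUES (10, 1), (11, 1), (20, 2);    -- Org A has two users
INSERT INTO jobs VALUES (100, 1, 2, 10);               -- client = Org A, server = Org B
INSERT INTO documents VALUES (1000, 100), (1001, 100); -- two documents on that job

SELECT organizations.id,
       organizations.company_name,
       COUNT(documents.id)          AS inflated_count,  -- counts joined rows
       COUNT(DISTINCT documents.id) AS document_count   -- counts each document once
FROM organizations
INNER JOIN users ON organizations.id = users.organization_id
INNER JOIN jobs ON (
    jobs.client_id = organizations.id OR
    jobs.server_id = organizations.id OR
    jobs.creator_id = users.id
)
INNER JOIN documents ON documents.job_id = jobs.id
GROUP BY organizations.id, organizations.company_name
ORDER BY document_count DESC, organizations.id;

-- Result:
--  id | company_name | inflated_count | document_count
--   1 | Org A        |              4 |              2
--   2 | Org B        |              2 |              2
```

Org A is the client of the job, so the client_id condition matches once for each of its two users and both documents are counted twice. If that is what happens in the real data, the inflation would grow with the number of users per organization, and COUNT(DISTINCT documents.id) would remove it. The INNER JOIN on users would still drop any organization that has no users at all.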