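I would UNION all five tables together to get them as a single rowset (an inline view), and then run a query against that, starting with something like this: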
SELECT SUM(IF(t.source='MT',t.pagevisits,0)) AS table1
, SUM(IF(t.source='CT1',t.pagevisits,0)) AS table2
, t.commodity
FROM ( SELECT 'MT' as source, table1.* FROM table1
UNION ALL
SELECT 'CT1', table2.* FROM table2
UNION ALL
SELECT 'CT2', table3.* FROM table3
UNION ALL
SELECT 'CT3', table4.* FROM table4
UNION ALL
SELECT 'CT4', table5.* FROM table5
) t
GROUP BY t.commodity
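(But I would specify a column list for each of those tables, rather than using ".*" and having my query depend on no one adding, dropping, renaming, or reordering columns in any of those tables.)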
I include an "extra" literal value (aliased as "source") to identify which table the row came from. I can use a conditional test in an expression in the SELECT list, to figure out whether the row came from a particular table.
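To make the "source" tag concrete, here is a minimal sketch that runs the same shape of query against an in-memory SQLite database from Python. The sample rows and the aliases visits_mt and visits_ct1 are made up for illustration, only two of the five tables are included, and CASE is used in place of MySQL's IF(), so treat it as an illustration of the idea rather than the exact query.

```python
import sqlite3

# Hypothetical data: two of the five tables, reduced to the columns used here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (commodity TEXT, pagevisits INTEGER);
    CREATE TABLE table2 (commodity TEXT, pagevisits INTEGER);
    INSERT INTO table1 VALUES ('corn', 10), ('wheat', 5);
    INSERT INTO table2 VALUES ('corn', 3), ('rice', 7);
""")

# The literal 'MT' / 'CT1' tags each row with the table it came from;
# CASE ... END plays the role of MySQL's IF(cond, a, b).
per_source = conn.execute("""
    SELECT t.commodity
         , SUM(CASE WHEN t.source = 'MT'  THEN t.pagevisits ELSE 0 END) AS visits_mt
         , SUM(CASE WHEN t.source = 'CT1' THEN t.pagevisits ELSE 0 END) AS visits_ct1
      FROM ( SELECT 'MT' AS source, commodity, pagevisits FROM table1
              UNION ALL
             SELECT 'CT1', commodity, pagevisits FROM table2
           ) t
     GROUP BY t.commodity
     ORDER BY t.commodity
""").fetchall()

for row in per_source:
    print(row)
```

Each commodity comes back once, with one column per source table and 0 where a table has no rows for that commodity.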
This approach is particularly flexible, and can be used to get more complicated resultsets. For example, if I also wanted the total number of page visits from table3, table4, and table5 added together, along with the individual counts, I could add this expression to the SELECT list:
SUM(IF(t.source IN ('CT2','CT3','CT4'),t.pagevisits,0)) AS total_345
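A rough sketch of that combined total, again using SQLite from Python as a stand-in for MySQL. The visit counts are invented (powers of two, so each table's contribution to a sum is easy to spot), visits_ct2 is a hypothetical alias, and CASE replaces IF().

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical data: each table holds one 'corn' row with a different count.
for n, visits in enumerate([1, 2, 4, 8, 16], start=1):
    conn.execute(f"CREATE TABLE table{n} (commodity TEXT, pagevisits INTEGER)")
    conn.execute(f"INSERT INTO table{n} VALUES ('corn', ?)", (visits,))

# Build the five tagged SELECTs and glue them together with UNION ALL.
tagged = " UNION ALL ".join(
    f"SELECT '{src}' AS source, commodity, pagevisits FROM table{n}"
    for n, src in enumerate(["MT", "CT1", "CT2", "CT3", "CT4"], start=1)
)

totals = conn.execute(f"""
    SELECT t.commodity
         , SUM(CASE WHEN t.source = 'CT2'
                    THEN t.pagevisits ELSE 0 END) AS visits_ct2
         , SUM(CASE WHEN t.source IN ('CT2','CT3','CT4')
                    THEN t.pagevisits ELSE 0 END) AS total_345
      FROM ( {tagged} ) t
     GROUP BY t.commodity
""").fetchall()

print(totals)
```

With these numbers, total_345 is 4 + 8 + 16 = 28, while the individual table3 count stays at 4.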
To get the equivalent of your COUNT(DISTINCT item) + COUNT(DISTINCT item) + ... expression, I would use an expression that makes a single value from both the "source" and "item" columns, being careful to have some sort of guarantee that any particular "source"+"item" will not create a duplicate of some other "source"+"item". (If we just concatenate strings, for example, we don't have any way to distinguish between 'A'+'11' and 'A1'+'1'.) The most common approach I see here is a carefully chosen delimiter which is guaranteed not to appear in either value. We can distinguish between 'A::11' and 'A1::1', so something like this will work:
COUNT(DISTINCT CONCAT(t.source,'::',t.item))
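The collision is easy to reproduce. This small sketch uses SQLite from Python, with the || operator standing in for MySQL's CONCAT(), and the two rows are contrived specifically so that plain concatenation collides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Contrived rows: 'A' + '11' and 'A1' + '1' both concatenate to 'A11'.
conn.execute("CREATE TABLE t (source TEXT, item TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [("A", "11"), ("A1", "1")])

# || is SQLite's string concatenation operator.
plain, delimited = conn.execute("""
    SELECT COUNT(DISTINCT t.source || t.item)
         , COUNT(DISTINCT t.source || '::' || t.item)
      FROM t
""").fetchone()

print(plain, delimited)
```

Without the delimiter the two rows collapse into one distinct value; with it they are counted separately.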
In your current query, if item is NULL, then the row doesn't get included in the COUNT. To fully replicate that behavior, you would need something like this:
COUNT(DISTINCT IF(t.item IS NOT NULL,CONCAT(t.source,'::',t.item),NULL)) AS Total
Of course, getting a count of distinct item values over the whole set of five tables is much simpler (but then, it does return a different result):
COUNT(DISTINCT t.item)
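To see how the two counts differ, and that NULL items are skipped, here is a sketch with made-up rows in SQLite (CASE and || stand in for MySQL's IF() and CONCAT()). The table t here plays the role of the already tagged inline view, and only two sources are used to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tagged rows: 'x' appears under both sources, plus some NULLs.
conn.execute("CREATE TABLE t (source TEXT, item TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [
    ("MT", "x"), ("MT", "x"), ("MT", None),
    ("CT1", "x"), ("CT1", "y"), ("CT1", None),
])

# Sum of the per-source distinct counts, as in the original multi-COUNT form.
summed = conn.execute("""
    SELECT (SELECT COUNT(DISTINCT item) FROM t WHERE source = 'MT')
         + (SELECT COUNT(DISTINCT item) FROM t WHERE source = 'CT1')
""").fetchone()[0]

# Combined source+item key (NULL items excluded) versus distinct items overall.
total, overall = conn.execute("""
    SELECT COUNT(DISTINCT CASE WHEN t.item IS NOT NULL
                               THEN t.source || '::' || t.item END)
         , COUNT(DISTINCT t.item)
      FROM t
""").fetchone()

print(summed, total, overall)
```

The combined key matches the sum of the per-source counts (3), while the plain distinct count over everything is smaller (2) because 'x' is shared between the two sources.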
But to answer your question about the use of the LEFT JOIN: the left side table is the "driver", so a matching row has to be in that table for a corresponding row to be retrieved from a table on the right. That is, unmatched rows from the tables on the right side will not be returned.
If what you have is basically five "partitions", and you want to process all of the rows whether or not a matching row appears in any of the other "partitions", I would go with the UNION ALL approach to simply concatenate all of the rows from all of those tables together, and process the rows as if they were from a single table.
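The difference between the two can be sketched with two small tables in SQLite. The data is invented, and joining on commodity is my assumption about how the tables relate; your actual join condition may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (commodity TEXT, pagevisits INTEGER);
    CREATE TABLE table2 (commodity TEXT, pagevisits INTEGER);
    INSERT INTO table1 VALUES ('corn', 10);
    INSERT INTO table2 VALUES ('corn', 3), ('rice', 7);
""")

# LEFT JOIN: table1 drives, so 'rice' (present only in table2) never appears.
joined = conn.execute("""
    SELECT table1.commodity, table2.pagevisits
      FROM table1
      LEFT JOIN table2 ON table2.commodity = table1.commodity
""").fetchall()

# UNION ALL: every row from both tables is kept, matched or not.
unioned = conn.execute("""
    SELECT commodity, pagevisits FROM table1
    UNION ALL
    SELECT commodity, pagevisits FROM table2
    ORDER BY commodity, pagevisits
""").fetchall()

print(joined)
print(unioned)
```

The 'rice' row only survives in the UNION ALL result, which is why I would reach for that approach when every row in every table has to be counted.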
NOTE: For very large tables, this may not be a feasible approach, since MySQL is going to have to materialize that inline view. There are other approaches which don't require concatenating all of the rows together.
Specifying a list of only the columns you need, in the SELECT from each table, may help performance, if there are columns in those tables you don't need to reference in your query.