sql - Summarizing three variables using sql

Question

Here is the raw data:

Book | Author | Year
A    | A1     | 1985
A    | B1     | 1985
B    | A1     | 1988
B    | C1     | 1988
D    | A1     | 1990
D    | C1     | 1990
D    | B1     | 1990

Here is the output I am looking for:

Author1 | Author2 | year | count
A1      | B1      | 1985 | 1
A1      | C1      | 1985 | 1
A1      | C1      | 1988 | 1
A1      | B1      | 1990 | 1
A1      | C1      | 1990 | 1
B1      | C1      | 1990 | 1

Any help is deeply appreciated. Thanks

score 0 · Accepted Answer

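SELECT A.author AS author1,
       B.author AS author2,
       A.year,
       COUNT(*) AS "count"
  FROM Author A
       -- Inner join: a LEFT JOIN would also return each book's
       -- alphabetically last author paired with a NULL author2.
       INNER JOIN Author B
          ON B.book = A.book
         AND B.author > A.author  -- alphabetical order, so each pair appears once
 GROUP BY A.author, B.author, A.year
 ORDER BY A.author, B.author, A.year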

This will work okay only as long as there are no more than two rows per book in your Author table. Otherwise, it will produce multiple lines per book. If that is possibly the case, you should indicate what flavor of SQL should be used, as the means to limit the results from Table B differ from implementation to implementation. I have arbitrarily chosen to list the authors in alphabetical order, since there appears to be no indicator of which author is "primary."

I would hope that there are some additional columns in the table that you are not telling us about, most specifically a primary key, and perhaps some attribute indicating the "billing order" of the authors with respect to a given book.

You might want to reconsider your table design, if that's possible: it's in a non-normalized form that makes data integrity hard to enforce.
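As a rough illustration of what a normalized layout could look like, here is a minimal sketch using Python's built-in sqlite3 module. Every table and column name in it (book, author, book_author, billing_order) is hypothetical, and the billing order is simply assumed to follow the row order in the question. The year is stored once per book, and the composite primary key rejects a duplicate (book, author) row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
    -- Hypothetical normalized layout; all names here are assumptions.
    CREATE TABLE book (
        book_id INTEGER PRIMARY KEY,
        title   TEXT NOT NULL UNIQUE,
        year    INTEGER NOT NULL            -- stored once per book
    );
    CREATE TABLE author (
        author_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL UNIQUE
    );
    CREATE TABLE book_author (
        book_id       INTEGER NOT NULL REFERENCES book(book_id),
        author_id     INTEGER NOT NULL REFERENCES author(author_id),
        billing_order INTEGER NOT NULL,     -- 1 = first-listed author (assumed)
        PRIMARY KEY (book_id, author_id),
        UNIQUE (book_id, billing_order)
    );
""")

conn.executemany("INSERT INTO book (book_id, title, year) VALUES (?, ?, ?)",
                 [(1, "A", 1985), (2, "B", 1988), (3, "D", 1990)])
conn.executemany("INSERT INTO author (author_id, name) VALUES (?, ?)",
                 [(1, "A1"), (2, "B1"), (3, "C1")])
conn.executemany("INSERT INTO book_author VALUES (?, ?, ?)",
                 [(1, 1, 1), (1, 2, 2),              # book A: A1, B1
                  (2, 1, 1), (2, 3, 2),              # book B: A1, C1
                  (3, 1, 1), (3, 3, 2), (3, 2, 3)])  # book D: A1, C1, B1

# A second (book A, author A1) row violates the composite primary key.
try:
    conn.execute("INSERT INTO book_author VALUES (1, 1, 3)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Co-author pairs per year, ordered alphabetically within each pair.
pairs = conn.execute("""
    SELECT a1.name, a2.name, b.year, COUNT(*)
      FROM book_author ba1
      JOIN book_author ba2 ON ba2.book_id = ba1.book_id
      JOIN author a1 ON a1.author_id = ba1.author_id
      JOIN author a2 ON a2.author_id = ba2.author_id
      JOIN book b    ON b.book_id = ba1.book_id
     WHERE a1.name < a2.name
     GROUP BY a1.name, a2.name, b.year
     ORDER BY a1.name, a2.name, b.year
""").fetchall()

print(duplicate_rejected)
for row in pairs:
    print(row)
```

The pair query is longer than the single-table version, but the year can no longer disagree between two rows of the same book.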

score 0 · Accepted Answer

The query you are looking for is a self join with an aggregation:

select t1.author as author1, t2.author as author2, t1.year, count(*) as `count`
from t t1 join
     t t2
     on t1.book = t2.book and
        t1.author < t2.author
group by t1.author, t2.author, t1.year
order by t1.author, t2.author, t1.year;
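To sanity-check the self join, here is a small sketch that loads the question's sample rows into an in-memory SQLite database from Python. SQLite and the helper name author_pairs are assumptions made for illustration; the table name t and its columns follow the query above, with "count" quoted in standard double quotes instead of backticks.

```python
import sqlite3

# Sample rows copied from the question: (book, author, year).
rows = [
    ("A", "A1", 1985), ("A", "B1", 1985),
    ("B", "A1", 1988), ("B", "C1", 1988),
    ("D", "A1", 1990), ("D", "C1", 1990), ("D", "B1", 1990),
]

def author_pairs(data):
    """Return (author1, author2, year, count) for every co-author pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (book TEXT, author TEXT, year INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", data)
    result = conn.execute("""
        SELECT t1.author AS author1, t2.author AS author2,
               t1.year, COUNT(*) AS "count"
          FROM t t1
          JOIN t t2
            ON t1.book = t2.book
           AND t1.author < t2.author   -- each pair once, alphabetical order
         GROUP BY t1.author, t2.author, t1.year
         ORDER BY t1.author, t2.author, t1.year
    """).fetchall()
    conn.close()
    return result

for row in author_pairs(rows):
    print(row)
```

On the sample data this returns five pairs. The (A1, C1, 1985) row in the question's expected output does not appear, because no book in the sample data lists both A1 and C1 in 1985.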
