I have data in the following format:

|------------------------|
| Product | Color | Year |
|------------------------|
|  Ball   | Blue  | 1999 |
|  Ball   | Blue  | 2000 |
|  Ball   | Blue  | 2001 |
|  Stick  | Green | 1984 |
|  Stick  | Green | 1985 |
|------------------------|

How can I convert it into the following:

|-----------------------------|
| Product | Color | Year Range|
|-----------------------------|
|  Ball   | Blue  | 1999-2001 |
|  Stick  | Green | 1984-1985 |
|-----------------------------|

The data is in a PostgreSQL table containing more than 187,000 rows that desperately need to be combined this way. How can I solve this using Python 2.7?

1 Answer

The data is in a PostgreSQL table containing more than 187,000 rows that desperately need to be combined this way.

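It might desperately need to be combined that way for reporting, but it almost certainly doesn't need to be combined that way for storage. Tread softly here.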

You can get the data in roughly that format just by using a GROUP BY clause. (I used "product_color_years" as the table name.)

select product, color, min(year), max(year)
from product_color_years
group by product, color

To combine the years into a single column, use the concatenation operator.

select product, color, min(year) || '-' || max(year) year_range
from product_color_years
group by product, color

This works only as long as

  • there aren't any gaps in the year range, or
  • there are gaps, but you don't care.
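If it helps to see the concatenation query run from Python, here is a minimal sketch. It assumes an in-memory SQLite database (the standard-library `sqlite3` module) as a stand-in for PostgreSQL, loaded with the five sample rows from the question; against the real table you would send the same SQL through a PostgreSQL driver instead. The `order by` clause is added only to make the output order predictable.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("Ball", "Blue", 1999),
    ("Ball", "Blue", 2000),
    ("Ball", "Blue", 2001),
    ("Stick", "Green", 1984),
    ("Stick", "Green", 1985),
]

# Assumption: SQLite in memory stands in for the PostgreSQL table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table product_color_years (product text, color text, year integer)"
)
conn.executemany("insert into product_color_years values (?, ?, ?)", ROWS)

# Same query as in the answer, plus an ORDER BY for stable output.
query = (
    "select product, color, min(year) || '-' || max(year) as year_range "
    "from product_color_years "
    "group by product, color "
    "order by product, color"
)
result = conn.execute(query).fetchall()

for row in result:
    print(row)
```

The stored rows are left untouched; the merged year range exists only in the query result.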

If there are gaps that you'd like to see reported like this:

product  color  year_range
--------------------------
Ball     Blue   1999-2001
Ball     Blue   2003-2005
Stick    Mauve  2000, 2010

then you're probably better off using a report writer. (For example, Google "python reports".) The SQL above will report these blue balls as Ball Blue 1999-2005, which might not be what you want.
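If a report writer is more than you need, the gap-aware grouping can also be done in a few lines of Python once the rows have been fetched. The following is only a sketch: the function names (`collapse_years`, `format_run`, `report_rows`) are made up for illustration, the sample rows are expanded from the example report above, and each run of consecutive years is emitted as its own row (so the two Mauve years come out as two rows rather than one comma-separated cell).

```python
from itertools import groupby


def collapse_years(years):
    """Return (start, end) pairs, one per run of consecutive years."""
    ordered = sorted(set(years))
    runs = []
    # Within a run of consecutive years, (year - position) stays constant,
    # so grouping on that difference splits the list at every gap.
    for _, group in groupby(enumerate(ordered), key=lambda pair: pair[1] - pair[0]):
        members = [year for _, year in group]
        runs.append((members[0], members[-1]))
    return runs


def format_run(start, end):
    """Format a run as 'YYYY' for a single year or 'YYYY-YYYY' otherwise."""
    return str(start) if start == end else "%d-%d" % (start, end)


def report_rows(rows):
    """Turn (product, color, year) rows into (product, color, year_range) rows."""
    years_by_key = {}
    for product, color, year in rows:
        years_by_key.setdefault((product, color), []).append(year)
    report = []
    for product, color in sorted(years_by_key):
        for start, end in collapse_years(years_by_key[(product, color)]):
            report.append((product, color, format_run(start, end)))
    return report


# Illustrative data matching the example report with gaps.
SAMPLE = [
    ("Ball", "Blue", 1999), ("Ball", "Blue", 2000), ("Ball", "Blue", 2001),
    ("Ball", "Blue", 2003), ("Ball", "Blue", 2004), ("Ball", "Blue", 2005),
    ("Stick", "Mauve", 2000), ("Stick", "Mauve", 2010),
]

for line in report_rows(SAMPLE):
    print(line)
```

The code avoids anything specific to Python 3, so it should also run under Python 2.7. For 187,000 rows the whole table fits comfortably in memory, which is why the sketch groups everything in a dictionary rather than streaming.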

answered 2012-07-30T21:21:21.907