
I have loan data, and I want to group it by date and get the total amount for each product.

My data looks like this:

disbursementdate | amount | product | cluster
2017-01-01       | 1000   | HL      | West
2018-02-01       | 1000   | PL      | East

After the query, I want the result to look like this:

   Month            | HL   | PL
   January 2018     | 1000 | 0
   February 2018    | 100  | 1000

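Note that there may be more products, and there is no way to know in advance how many distinct ones there are, so a hard-coded SUM(CASE WHEN ...) per product won't work.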

I'm struggling with the query.


2 Answers


For example, you can do this in MySQL by building the query text dynamically and then executing it as a prepared statement:

DROP TABLE IF EXISTS T;
CREATE TABLE T(disbursementdate DATE, amount INT, product VARCHAR(2), cluster VARCHAR(4));
INSERT INTO T VALUES
('2017-01-01'       , 1000   , 'HL'    ,   'West'),
('2017-01-01'       , 1000   , 'OL'    ,   'West'),
('2018-02-01'       , 1000   , 'PL'    ,   'East'),
('2018-02-01'       , 100    , 'HL'    ,   'West'),
('2018-02-01'       , 1000   , 'HL'    ,   'West');


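-- Build one SUM(CASE WHEN ...) column per distinct product.
-- GROUP_CONCAT output is truncated at group_concat_max_len (1024 by default),
-- so raise that setting if there are many products.
SET @SQL = (
  SELECT CONCAT(
    'SELECT DISBURSEMENTDATE,',
    GROUP_CONCAT(CONCAT('SUM(CASE WHEN PRODUCT = ', CHAR(39), S.PRODUCT, CHAR(39),
                        ' THEN AMOUNT ELSE 0 END) AS ', S.PRODUCT)),
    ' FROM T GROUP BY DISBURSEMENTDATE;')
  FROM (SELECT DISTINCT PRODUCT FROM T) S
);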

PREPARE SQLSTMT FROM @SQL;
EXECUTE SQLSTMT;
DEALLOCATE PREPARE SQLSTMT;

+------------------+------+------+------+
| DISBURSEMENTDATE | HL   | OL   | PL   |
+------------------+------+------+------+
| 2017-01-01       | 1000 | 1000 |    0 |
| 2018-02-01       | 1100 |    0 | 1000 |
+------------------+------+------+------+
2 rows in set (0.00 sec)
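
The query above groups by the full date, while the question asks for one row per month. As a rough illustration of the same idea (look up the distinct products first, then generate one SUM(CASE WHEN ...) column per product), here is a minimal Python sketch that groups by month instead. It swaps in SQLite through the standard-library sqlite3 module rather than MySQL, so the month label comes out as YYYY-MM; the function name pivot_by_month and the table name t are assumptions made for this example only, and it assumes at least one product exists.

```python
import sqlite3

# Sample rows copied from the INSERT statement above.
ROWS = [
    ("2017-01-01", 1000, "HL", "West"),
    ("2017-01-01", 1000, "OL", "West"),
    ("2018-02-01", 1000, "PL", "East"),
    ("2018-02-01", 100, "HL", "West"),
    ("2018-02-01", 1000, "HL", "West"),
]


def pivot_by_month(conn):
    """Return (header, rows) with one summed column per distinct product."""
    # Step 1: discover the products that actually occur in the data.
    products = [r[0] for r in conn.execute(
        "SELECT DISTINCT product FROM t ORDER BY product")]
    # Step 2: generate one conditional SUM per product. Product values are
    # bound as parameters; only the (quoted) column alias is interpolated.
    month = "strftime('%Y-%m', disbursementdate)"
    columns = ", ".join(
        'SUM(CASE WHEN product = ? THEN amount ELSE 0 END) AS "{}"'.format(
            p.replace('"', '""'))
        for p in products)
    sql = (f"SELECT {month} AS month, {columns} "
           f"FROM t GROUP BY {month} ORDER BY 1")
    # Step 3: run the generated statement.
    cur = conn.execute(sql, products)
    header = [d[0] for d in cur.description]
    return header, cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (disbursementdate TEXT, amount INTEGER, "
             "product TEXT, cluster TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", ROWS)

header, table = pivot_by_month(conn)
print(header)
for row in table:
    print(row)
```

In MySQL itself, the equivalent change would be to group by a month expression instead of DISBURSEMENTDATE in the generated statement.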
Answered 2018-06-14T10:47:02.877

You can use Pandas and its dedicated method pd.DataFrame.pivot_table:

import pandas as pd

# read data
df = pd.read_csv('file.csv')

# extract month
df['Month'] = pd.to_datetime(df['disbursementdate']).apply(lambda x: x.replace(day=1))

# pivot results
res = df.pivot_table(index='Month', columns='product', values='amount',
                     aggfunc='sum', fill_value=0).reset_index()

# reformat month
res['Month'] = res['Month'].dt.strftime('%B %Y')

print(res)

product          Month    HL    PL
0         January 2017  1000     0
1        February 2018     0  1000
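
To try this without a file.csv on disk, here is a self-contained variant of the same pivot_table approach. It is only a sketch: the five inline rows are borrowed from the MySQL answer's sample data (an assumption about what the file would contain), and the month is derived with dt.to_period("M") rather than the lambda above. The month names produced by %B depend on the active locale.

```python
import pandas as pd

# Inline sample data (assumed; mirrors the rows used in the MySQL answer).
df = pd.DataFrame({
    "disbursementdate": ["2017-01-01", "2017-01-01", "2018-02-01",
                         "2018-02-01", "2018-02-01"],
    "amount": [1000, 1000, 1000, 100, 1000],
    "product": ["HL", "OL", "PL", "HL", "HL"],
    "cluster": ["West", "West", "East", "West", "West"],
})

# Truncate every date to the first day of its month.
df["Month"] = (pd.to_datetime(df["disbursementdate"])
               .dt.to_period("M")
               .dt.to_timestamp())

# One column per product; products missing in a month are filled with 0.
res = df.pivot_table(index="Month", columns="product", values="amount",
                     aggfunc="sum", fill_value=0).reset_index()

# Format the month only after pivoting, so rows stay in chronological order.
res["Month"] = res["Month"].dt.strftime("%B %Y")
res.columns.name = None

print(res)
```

Because the columns come from the data, a new product appearing in the input simply becomes a new column, with no change to the code.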
Answered 2018-06-14T10:26:22.170