11

现实描述:我们确实有一份项目清单。在每个项目中都有很多帐户。您可以对每个帐户执行很多操作。我确实定义了以下维度和事实表(简化):

Dimensions and attributes:
 Project
  project_key
  project_name
  industry
  number_of_accounts
 Distance
  distance_key
  distance_in_months
  distance_in_quarters
 Account
  account_key
  project_key
  account_id
Fact Table and attributes:
 Action_Fact_Table
  project_key
  distance_key
  account_key
  action_id

现在,我想用径流三角形的方法来分析数据(它可能不是真正的径流三角形,但方法是一样的)。最简单的三角形如下所示:

              Distance in Months
Project name|     1    2    3    4    5    6    7    8    9    10
-------------------------------------------------------------------------
 Project1   |     5   10   15   20   25   30   35   40   45    50
 Project2   |     7   14   21   28   35   42   49   56   63
 Project3   |     2    5    8   11   14   20   25   30
 Project4   |     0    2    5   10   18   23   40
 Project5   |     5   12   18   20   21   30

有按行计算的动作总和。以月为单位的距离显示了行动日期和项目开始日期之间的距离。您显然可以使用四分之一的距离(或距离维度中定义的任何其他周期)创建类似的三角形。

您还可以为项目维度中的不同层次结构创建三角形,例如行业(Project1-Project3 = Industry1,Project4-Project5 = Industry2):

              Distance in Months
Project name|     1    2    3    4    5    6    7    8    9    10
-------------------------------------------------------------------------
 Industry1   |    14   29   44   59   74   92  109  126  108   50
 Industry2   |     5   14   23   30   39   53   40   

还有更高级的径流三角形,您可以在其中将操作的运行总和除以帐户数。假设我们的项目有以下数量的帐户:

Project_name number_of_accounts  
-----------------------------
Project1     100
Project2     100
Project3     100
Project4     100
Project5     200

然后我想得到以下三角形:

              Distance in Months
Project    |     1    2    3    4    5    6    7    8    9    10
------------------------------------------------------------------------
 Project1  |   .05  .01  .15  .20  .25  .30  .35  .40  .45   .50
 Project2  |    .7  .14  .21  .28  .35  .42  .49  .56  .63
 Project3  |    .2   .5   .8  .11  .14  .20  .25  .30
 Project4  |    .0   .2   .5  .10  .18  .23  .40
 Project5  |   .05  .06  .09  .10 .105  .15

当您想要比较项目及其操作时,如果项目中的帐户数量对于所有项目都不相同,这尤其有用。

问题是是否可以在 OLAP 中创建这样的计算。我在想我可以在项目表中使用 number_of_accounts ,但我想不通。另一种选择是聚合帐户维度中的数据。我也无法使用谷歌找到任何东西,也许是因为我问错了问题。

这个问题的解决方案广泛适用于许多行业,尤其​​是在保险和银行业中至关重要。它可以在流程具有较长性能窗口的任何地方使用,并且可以通过定义明确的、可比较的单位批次进行跟踪。

(我们使用的是PostgreSQL,Saiku,Cubes是在Schema Workbench中定义的)

测试数据(PostgreSQL 语法,如果您需要其他内容,请告诉我)

--drop table if exists project cascade;
create table project (
  project_key int primary key,
  project_name character varying,
  industry character varying,
  number_of_accounts int
);

--drop table if exists distance cascade;
create table distance (
  distance_key int primary key,
  distance_in_months int,
  distance_in_quarters int);

--drop table if exists account cascade;
create table account (
  account_key int primary key,
  project_key int references project (project_key)
);

--drop table if exists action_fact_table cascade;
create table action_fact_table (
  project_key int references project (project_key),
  distance_key int references distance (distance_key),
  account_key int references account (account_key),
  action_id int
);

-- project data
insert into project values (1,'Project1','Industry1',100);
insert into project values (2,'Project2','Industry1',100);
insert into project values (3,'Project3','Industry1',100);
insert into project values (4,'Project4','Industry2',100);
insert into project values (5,'Project5','Industry2',200);

-- distance data
insert into distance values(1,1,1);
insert into distance values(2,2,1);
insert into distance values(3,3,1);
insert into distance values(4,4,2);
insert into distance values(5,5,2);
insert into distance values(6,6,2);
insert into distance values(7,7,3);
insert into distance values(8,8,3);
insert into distance values(9,9,3);
insert into distance values(10,10,4);
insert into distance values(11,11,4);
insert into distance values(12,12,4);

-- account data
/* let me know if you need insert statement for every row */
insert into account (
select generate_series (1,100), 1 union all
select generate_series (101,200), 2 union all
select generate_series (201,300), 3 union all
select generate_series (301,400), 4 union all
select generate_series (401,600), 5
);

insert into action_fact_table values(1,1,90,10001);
insert into action_fact_table values(1,1,32,10002);
insert into action_fact_table values(1,1,41,10003);
insert into action_fact_table values(1,1,54,10004);
insert into action_fact_table values(1,1,45,10005);
insert into action_fact_table values(1,2,22,10006);
insert into action_fact_table values(1,2,29,10007);
insert into action_fact_table values(1,2,41,10008);
insert into action_fact_table values(1,2,89,10009);
insert into action_fact_table values(1,2,15,10010);
insert into action_fact_table values(1,3,32,10011);
insert into action_fact_table values(1,3,100,10012);
insert into action_fact_table values(1,3,72,10013);
insert into action_fact_table values(1,3,80,10014);
insert into action_fact_table values(1,3,10,10015);
insert into action_fact_table values(1,4,12,10016);
insert into action_fact_table values(1,4,45,10017);
insert into action_fact_table values(1,4,83,10018);
insert into action_fact_table values(1,4,42,10019);
insert into action_fact_table values(1,4,33,10020);
insert into action_fact_table values(1,5,22,10021);
insert into action_fact_table values(1,5,27,10022);
insert into action_fact_table values(1,5,59,10023);
insert into action_fact_table values(1,5,32,10024);
insert into action_fact_table values(1,5,70,10025);
insert into action_fact_table values(1,6,32,10026);
insert into action_fact_table values(1,6,5,10027);
insert into action_fact_table values(1,6,15,10028);
insert into action_fact_table values(1,6,70,10029);
insert into action_fact_table values(1,6,43,10030);
insert into action_fact_table values(1,7,59,10031);
insert into action_fact_table values(1,7,9,10032);
insert into action_fact_table values(1,7,99,10033);
insert into action_fact_table values(1,7,79,10034);
insert into action_fact_table values(1,7,31,10035);
insert into action_fact_table values(1,8,56,10036);
insert into action_fact_table values(1,8,34,10037);
insert into action_fact_table values(1,8,48,10038);
insert into action_fact_table values(1,8,79,10039);
insert into action_fact_table values(1,8,42,10040);
insert into action_fact_table values(1,9,10,10041);
insert into action_fact_table values(1,9,10,10042);
insert into action_fact_table values(1,9,49,10043);
insert into action_fact_table values(1,9,61,10044);
insert into action_fact_table values(1,9,49,10045);
insert into action_fact_table values(1,10,99,10046);
insert into action_fact_table values(1,10,69,10047);
insert into action_fact_table values(1,10,84,10048);
insert into action_fact_table values(1,10,99,10049);
insert into action_fact_table values(1,10,3,10050);
insert into action_fact_table values(2,1,182,10051);
insert into action_fact_table values(2,1,127,10052);
insert into action_fact_table values(2,1,197,10053);
insert into action_fact_table values(2,1,174,10054);
insert into action_fact_table values(2,1,187,10055);
insert into action_fact_table values(2,1,144,10056);
insert into action_fact_table values(2,1,160,10057);
insert into action_fact_table values(2,2,155,10058);
insert into action_fact_table values(2,2,153,10059);
insert into action_fact_table values(2,2,119,10060);
insert into action_fact_table values(2,2,188,10061);
insert into action_fact_table values(2,2,125,10062);
insert into action_fact_table values(2,2,147,10063);
insert into action_fact_table values(2,2,123,10064);
insert into action_fact_table values(2,3,136,10065);
insert into action_fact_table values(2,3,163,10066);
insert into action_fact_table values(2,3,187,10067);
insert into action_fact_table values(2,3,138,10068);
insert into action_fact_table values(2,3,168,10069);
insert into action_fact_table values(2,3,132,10070);
insert into action_fact_table values(2,3,138,10071);
insert into action_fact_table values(2,4,158,10072);
insert into action_fact_table values(2,4,171,10073);
insert into action_fact_table values(2,4,153,10074);
insert into action_fact_table values(2,4,141,10075);
insert into action_fact_table values(2,4,182,10076);
insert into action_fact_table values(2,4,165,10077);
insert into action_fact_table values(2,4,143,10078);
insert into action_fact_table values(2,5,190,10079);
insert into action_fact_table values(2,5,181,10080);
insert into action_fact_table values(2,5,163,10081);
insert into action_fact_table values(2,5,134,10082);
insert into action_fact_table values(2,5,145,10083);
insert into action_fact_table values(2,5,190,10084);
insert into action_fact_table values(2,5,198,10085);
insert into action_fact_table values(2,6,137,10086);
insert into action_fact_table values(2,6,133,10087);
insert into action_fact_table values(2,6,135,10088);
insert into action_fact_table values(2,6,103,10089);
insert into action_fact_table values(2,6,187,10090);
insert into action_fact_table values(2,6,127,10091);
insert into action_fact_table values(2,6,117,10092);
insert into action_fact_table values(2,7,116,10093);
insert into action_fact_table values(2,7,139,10094);
insert into action_fact_table values(2,7,111,10095);
insert into action_fact_table values(2,7,150,10096);
insert into action_fact_table values(2,7,151,10097);
insert into action_fact_table values(2,7,181,10098);
insert into action_fact_table values(2,7,109,10099);
insert into action_fact_table values(2,8,102,10100);
insert into action_fact_table values(2,8,101,10101);
insert into action_fact_table values(2,8,118,10102);
insert into action_fact_table values(2,8,147,10103);
insert into action_fact_table values(2,8,186,10104);
insert into action_fact_table values(2,8,136,10105);
insert into action_fact_table values(2,8,160,10106);
insert into action_fact_table values(2,9,149,10107);
insert into action_fact_table values(2,9,119,10108);
insert into action_fact_table values(2,9,169,10109);
insert into action_fact_table values(2,9,176,10110);
insert into action_fact_table values(2,9,195,10111);
insert into action_fact_table values(2,9,183,10112);
insert into action_fact_table values(2,9,140,10113);
insert into action_fact_table values(3,1,224,10114);
insert into action_fact_table values(3,1,241,10115);
insert into action_fact_table values(3,2,295,10116);
insert into action_fact_table values(3,2,249,10117);
insert into action_fact_table values(3,2,260,10118);
insert into action_fact_table values(3,3,298,10119);
insert into action_fact_table values(3,3,267,10120);
insert into action_fact_table values(3,3,297,10121);
insert into action_fact_table values(3,4,211,10122);
insert into action_fact_table values(3,4,253,10123);
insert into action_fact_table values(3,4,214,10124);
insert into action_fact_table values(3,5,248,10125);
insert into action_fact_table values(3,5,223,10126);
insert into action_fact_table values(3,5,288,10127);
insert into action_fact_table values(3,6,207,10128);
insert into action_fact_table values(3,6,296,10129);
insert into action_fact_table values(3,6,221,10130);
insert into action_fact_table values(3,6,201,10131);
insert into action_fact_table values(3,6,227,10132);
insert into action_fact_table values(3,6,209,10133);
insert into action_fact_table values(3,7,267,10134);
insert into action_fact_table values(3,7,282,10135);
insert into action_fact_table values(3,7,215,10136);
insert into action_fact_table values(3,7,285,10137);
insert into action_fact_table values(3,7,212,10138);
insert into action_fact_table values(3,8,239,10139);
insert into action_fact_table values(3,8,294,10140);
insert into action_fact_table values(3,8,296,10141);
insert into action_fact_table values(3,8,251,10142);
insert into action_fact_table values(3,8,281,10143);
insert into action_fact_table values(4,2,392,10144);
insert into action_fact_table values(4,2,347,10145);
insert into action_fact_table values(4,3,318,10146);
insert into action_fact_table values(4,3,400,10147);
insert into action_fact_table values(4,3,378,10148);
insert into action_fact_table values(4,4,315,10149);
insert into action_fact_table values(4,4,318,10150);
insert into action_fact_table values(4,4,394,10151);
insert into action_fact_table values(4,4,382,10152);
insert into action_fact_table values(4,4,317,10153);
insert into action_fact_table values(4,5,314,10154);
insert into action_fact_table values(4,5,354,10155);
insert into action_fact_table values(4,5,338,10156);
insert into action_fact_table values(4,5,375,10157);
insert into action_fact_table values(4,5,317,10158);
insert into action_fact_table values(4,5,329,10159);
insert into action_fact_table values(4,5,342,10160);
insert into action_fact_table values(4,5,380,10161);
insert into action_fact_table values(4,6,313,10162);
insert into action_fact_table values(4,6,311,10163);
insert into action_fact_table values(4,6,336,10164);
insert into action_fact_table values(4,6,380,10165);
insert into action_fact_table values(4,6,355,10166);
insert into action_fact_table values(4,7,386,10167);
insert into action_fact_table values(4,7,322,10168);
insert into action_fact_table values(4,7,311,10169);
insert into action_fact_table values(4,7,367,10170);
insert into action_fact_table values(4,7,350,10171);
insert into action_fact_table values(4,7,384,10172);
insert into action_fact_table values(4,7,391,10173);
insert into action_fact_table values(4,7,331,10174);
insert into action_fact_table values(4,7,373,10175);
insert into action_fact_table values(4,7,314,10176);
insert into action_fact_table values(4,7,305,10177);
insert into action_fact_table values(4,7,331,10178);
insert into action_fact_table values(4,7,350,10179);
insert into action_fact_table values(4,7,376,10180);
insert into action_fact_table values(4,7,387,10181);
insert into action_fact_table values(4,7,312,10182);
insert into action_fact_table values(4,7,397,10183);
insert into action_fact_table values(5,1,404,10184);
insert into action_fact_table values(5,1,562,10185);
insert into action_fact_table values(5,1,511,10186);
insert into action_fact_table values(5,1,594,10187);
insert into action_fact_table values(5,1,541,10188);
insert into action_fact_table values(5,2,506,10189);
insert into action_fact_table values(5,2,427,10190);
insert into action_fact_table values(5,2,481,10191);
insert into action_fact_table values(5,2,463,10192);
insert into action_fact_table values(5,2,579,10193);
insert into action_fact_table values(5,2,455,10194);
insert into action_fact_table values(5,2,527,10195);
insert into action_fact_table values(5,3,465,10196);
insert into action_fact_table values(5,3,562,10197);
insert into action_fact_table values(5,3,434,10198);
insert into action_fact_table values(5,3,401,10199);
insert into action_fact_table values(5,3,464,10200);
insert into action_fact_table values(5,3,500,10201);
insert into action_fact_table values(5,4,554,10202);
insert into action_fact_table values(5,4,600,10203);
insert into action_fact_table values(5,5,483,10204);
insert into action_fact_table values(5,6,552,10205);
insert into action_fact_table values(5,6,565,10206);
insert into action_fact_table values(5,6,586,10207);
insert into action_fact_table values(5,6,544,10208);
insert into action_fact_table values(5,6,436,10209);
insert into action_fact_table values(5,6,531,10210);
insert into action_fact_table values(5,6,409,10211);
insert into action_fact_table values(5,6,524,10212);
insert into action_fact_table values(5,6,564,10213);

样品立方体(蒙德里安):

<Schema name="RunoffTriangleSchema">
  <Cube name="RunoffTriangleCube" visible="true" cache="true" enabled="true">
    <Table name="action_fact_table" schema="public">
    </Table>
    <Dimension type="StandardDimension" visible="true" foreignKey="project_key" name="Project">
      <Hierarchy name="Project" visible="true" hasAll="true">
        <Table name="project" schema="public" alias="">
        </Table>
        <Level name="Industry" visible="true" column="industry" uniqueMembers="false">
        </Level>
        <Level name="Project Name" visible="true" column="project_name" uniqueMembers="false">
        </Level>
      </Hierarchy>
    </Dimension>
    <Dimension type="StandardDimension" visible="true" foreignKey="distance_key" name="Distance">
      <Hierarchy name="Distance" visible="true" hasAll="true">
        <Table name="distance" schema="public" alias="">
        </Table>
        <Level name="Distance In Quarters" visible="true" column="distance_in_quarters" uniqueMembers="false">
        </Level>
        <Level name="Distance In Months" visible="true" column="distance_in_months" uniqueMembers="false">
        </Level>
      </Hierarchy>
    </Dimension>
    <Dimension type="StandardDimension" visible="true" foreignKey="account_key" name="Account">
      <Hierarchy name="Account" visible="true" hasAll="true">
        <Table name="account" schema="public">
        </Table>
        <Level name="Account Key" visible="true" column="account_key" uniqueMembers="false">
        </Level>
      </Hierarchy>
    </Dimension>
    <Measure name="CountActions" column="action_id" aggregator="count" visible="true">
    </Measure>
  </Cube>
</Schema>
4

2 回答 2

2

两次赏金,没有答案,我很惊讶。我找到了一个变通解决方案 - 使用 SQL 和 BIRT 引擎,我现在接近我正在寻找的东西。我仍然希望有人可以为OLAP解决这个问题。


为了完成这项工作,我有:

  • 自定义函数以返回动态选择的列
  • SQL 根据选定的列计算径流三角形数据
  • BIRT 2.6.1 中的报表显示结果并提供参数选择界面

动态返回列

    CREATE or replace FUNCTION bizdata.getColumns(_column1 text, _column2 text, _column3 text, _column4 text, _table text, _rqdl text)
      RETURNS TABLE(cmf1 text, cmf2 text, cmf3 text, outval numeric, rqdl text) AS $$
    BEGIN
        RETURN QUERY EXECUTE 
            'SELECT ' 
                || case when _column1 = 'None' then quote_literal('None') else quote_ident(_column1) end || '::text as cmf1,' 
                || case when _column2 = 'None' then quote_literal('None') else quote_ident(_column2) end || '::text as cmf2,' 
                || case when _column3 = 'None' then quote_literal('None') else quote_ident(_column3) end || '::text as cmf3,'   
                || quote_ident(_column4) || '::numeric as baseline,'
                || case when _rqdl = 'None' then 0::text else quote_ident(_rqdl)::text end || '::text as rqdl'  
            ' FROM '
                || 'bizdata.' || _table; 
    END;
     $$ LANGUAGE plpgsql;

Thi function takes the following as input variables:

 - _column1 - common mapping field number 1
 - _column2 - common mapping field number 2
 - _column3 - common mapping field number 3
 - _column4 - column used for aggregation (sum)
 - _table - table used for getting data
 - _rqdl - requested distance level

计算数据

Using bizdata.getColumns() function I can calculate triangle data using the following statement:


with 

params as (
    select 'cmf1'::varchar as prm_name, 'project_owner_name_short'::varchar as prm_value union all 
    select 'cmf2'::varchar as prm_name, 'project_source_name_short'::varchar as prm_value union all
    select 'cmf3'::varchar as prm_name, 'None'::varchar as prm_value union all
    select 'fact'::varchar as prm_name, 'amount'::varchar as prm_value union all    
    select 'fact_table'::varchar as prm_name, 'dwv_daily_allocation_fact'::varchar as prm_value union all       
    select 'baseline'::varchar as prm_name, 'tmp_nominal_value'::varchar as prm_value union all 
    select 'baseline_table'::varchar as prm_name, 'dw_project'::varchar as prm_value union all
    select 'rqdl'::varchar as prm_name, 'year_distance'::varchar as prm_value 
)

,baseline_data as (
    select 
        cmf1,
        cmf2,
        cmf3,
        sum(coalesce(outval,0)) as baseline
    from 
        bizdata.getColumns(
            (select prm_value from params where prm_name = 'cmf1'::text),
            (select prm_value from params where prm_name = 'cmf2'::text),
            (select prm_value from params where prm_name = 'cmf3'::text),
            (select prm_value from params where prm_name = 'baseline'::text),
            (select prm_value from params where prm_name = 'baseline_table'::text), 
            'None'
            )
    group by
        cmf1,
        cmf2,
        cmf3

)




,fact_data as (
    select 
        cmf1,
        cmf2,
        cmf3,
        rqdl::int as rqdl,
        sum(coalesce(outval,0)) as fact
    from 
        bizdata.getColumns(
            (select prm_value from params where prm_name = 'cmf1'::text),
            (select prm_value from params where prm_name = 'cmf2'::text),
            (select prm_value from params where prm_name = 'cmf3'::text),
            (select prm_value from params where prm_name = 'fact'::text),
            (select prm_value from params where prm_name = 'fact_table'::text),
            (select prm_value from params where prm_name = 'rqdl'::text)
            )
    group by
        cmf1,
        cmf2,
        cmf3,
        rqdl

)

select 
    case when cmf1 = 'None' then null else cmf1 end as cmf1,
    case when cmf2 = 'None' then null else cmf1 end as cmf,
    case when cmf3 = 'None' then null else cmf1 end as cmf1,
    rqdl,
    fact,
    baseline,
    sum(fact) over (partition by cmf1, cmf2, cmf3 order by rqdl) as cfact,
    sum(fact) over (partition by cmf1, cmf2, cmf3 order by rqdl) / baseline as cfactpct
from 
    fact_data 
    join baseline_data using (cmf1, cmf2, cmf3)

您可以看到我最多可以使用 3 个分组变量(cmf1、cmf2、cmf3)并选择任何距离属性(只要该属性在 dwv_daily_allocation_fact 中可用。分组变量应该在基线表和事实表中都可用(以获得普通组级别)

报告

最后一步是在 BIRT (2.6.1) 中创建报表,其中 SQL 的 params 部分中的参数被数据集参数替换并链接到报表参数。使用 BIRT 的人可能明白,其他人必须另谋出路。

参数选择界面 参数选择界面

输出报告 在此处输入图像描述

我仍然必须弄清楚表格的正确排序(所以历史最长的组是第一位的。

编辑:我已经想出了在 BIRT 交叉表中排序,现在它看起来像真正的三角形:

在此处输入图像描述

如果您需要有关我如何做到这一点的更详细描述,请告诉我。

于 2013-02-04T15:51:27.677 回答
0

我创建了 R 包,称为pgvint可以轻松计算复古曲线(径流三角形)的地方。该软件包位于github 上,目前仅支持 PostgreSQL 作为数据源。

示例输出: 在此处输入图像描述

在此处输入图像描述

此外,还有一个 Shiny 应用程序,可以在其中以不同的布局交互地显示复古数据:

在此处输入图像描述

于 2013-10-22T13:44:14.417 回答