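We have a SQL Server query in which we need to generate ntiles for a growing number of variables, so that the variables are combined with each other in various permutations. Here is an excerpt that shows what I mean: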

Statement 1:

ntile(10) over (partition by  MAUorALL, User_Type, fsi.Month_ID 
                    order by Objects_Created) AS Ntile_Mon_Objects_Created,

Statement 2:

ntile(10) over (partition by  MAUorALL, User_Type, fsi.Month_ID, *Country*
          order by Objects_Created) AS Ntile_Country_Objects_Created

Statement 3:

ntile(10) over (partition by  MAUorALL, User_Type, fsi.Month_ID, *User_Type*
                 order by Objects_Created) AS Ntile_UT_Objects_Created

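As you can see, the statements are identical except for the italicized columns "Country" and "User_Type" added in the second and third. So we are taking ntiles of the same variable, "Objects_Created", at different levels of specificity, and we also have to take ntiles for the various possible permutations of these variables, for example: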

Statement 4:

ntile(10) over (partition by  MAUorALL, User_Type, fsi.Month_ID, *Country, User_Type*
            order by Objects_Created) AS Ntile_Country_UT_Objects_Created

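We could code these permutations by hand, but it might make things easier if we could use sqlalchemy to generate all permutations of these variables. Does anyone have an example I could reuse?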

Thanks for your help!

1 Answer

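I don't know how fsi relates to the other columns, but assume all the data lives in one model (which is easy to extend with a sqlalchemy query) like the one below: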

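from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base  # SQLAlchemy 1.4+; older versions: sqlalchemy.ext.declarative

Base = declarative_base()

class User(Base):
    __tablename__ = 't_users'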
    id = Column(Integer, primary_key=True)
    MAUorALL = Column(String)
    User_Type = Column(String)
    Country = Column(String)
    Month_ID = Column(Integer)
    Objects_Created = Column(Integer)

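The task is accomplished by simply building the query with itertools.permutations (or itertools.combinations, depending on what you want to achieve). The code below generates a query on the User table that includes the various ntiles. I assume that reading the code is enough to understand what is going on: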

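from sqlalchemy import func, over

# NOTE: `session` is assumed to be an existing sqlalchemy.orm.Session
# bound to your database engine.

# configuration: {label: Column}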
column_labels = {
        'Country': User.Country,
        'UT': User.User_Type,
        }

def get_ntile(additional_columns=None):
    """ @return: sqlalchemy expression for selecting a given ntile() using
    predefined as well as *additional* columns.
    """
    partition_by = [
        User.MAUorALL,
        User.User_Type,
        User.Month_ID,
        ]
    label = "Ntile_Objects_Created"
    if additional_columns:
        lbls = []
        for col_name in additional_columns:
            col = column_labels[col_name]
            partition_by.append(col)
            lbls.append(col_name)
        label = "Ntile_{}_Objects_Created".format("_".join(lbls))
    xprs = over(
            func.ntile(10),
            partition_by = partition_by,
            order_by = User.Objects_Created,
            ).label(label)
    return xprs

def get_query(additional_columns=('UT', 'Country')):  # tuple avoids a mutable default argument
    """ @return: a query object which selects a User with additional ntiles
    for predefined columns (fixed) and all possible permutations of
    *additional_columns*
    """
    from itertools import permutations#, combinations
    tiles = [get_ntile(comb)
            for r in range(len(additional_columns) + 1)
            for comb in permutations(additional_columns, r)
            ]
    q = session.query(User, *tiles)
    return q

q = get_query()
print([_c["name"] for _c in q.column_descriptions])
# >>> ['User', 'Ntile_Objects_Created', 'Ntile_UT_Objects_Created', 'Ntile_Country_Objects_Created', 'Ntile_UT_Country_Objects_Created', 'Ntile_Country_UT_Objects_Created']

for tile in q.all():
    print(tile)
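
A note on choosing between the two itertools functions: the order of columns in PARTITION BY does not change which rows end up in the same partition, so Ntile_UT_Country_Objects_Created and Ntile_Country_UT_Objects_Created from the output above hold the same values. If that duplication is not wanted, combinations is probably the better fit. The following is a minimal, database-free sketch that only compares the labels each function would produce; the helper names subsets and make_label are hypothetical and not part of the code above, and the label format is copied from get_ntile().

```python
from itertools import combinations, permutations

# Hypothetical stand-in for the keys of column_labels above.
additional_columns = ["UT", "Country"]

def subsets(columns, chooser):
    """Return every selection of 0..len(columns) items produced by chooser."""
    return [
        combo
        for r in range(len(columns) + 1)
        for combo in chooser(columns, r)
    ]

def make_label(combo):
    """Build the ntile label in the same format get_ntile() uses."""
    if not combo:
        return "Ntile_Objects_Created"
    return "Ntile_{}_Objects_Created".format("_".join(combo))

perm_labels = [make_label(c) for c in subsets(additional_columns, permutations)]
comb_labels = [make_label(c) for c in subsets(additional_columns, combinations)]

print(len(perm_labels), perm_labels)  # includes both UT_Country and Country_UT
print(len(comb_labels), comb_labels)  # each set of columns appears once
```

To apply this to the query above, swapping permutations for combinations in get_query() should be enough, since both functions take the same arguments.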
answered 2014-03-06T08:55:29.897