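I am currently trying to model multi-channel marketing attribution. Every article and package I have come across uses a special "Start" state and computes the removal effects relative to that Start state, using the matrix operations given here ([Markov Chains in Python][1]):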

    import numpy as np
    import pandas as pd


    def removal_effects(df, conversion_rate):
        """Return the removal effect of each channel from the transition matrix df."""
        removal_effects_dict = {}
        channels = [channel for channel in df.columns
                    if channel not in ['Start', 'Null', 'Conversion']]
        for channel in channels:
            # Drop the channel's row and column from the transition matrix.
            removal_df = df.drop(channel, axis=1).drop(channel, axis=0)
            # Redirect the probability that used to flow into the channel to 'Null'.
            for column in removal_df.columns:
                row_sum = np.sum(list(removal_df.loc[column]))
                null_pct = float(1) - row_sum
                if null_pct != 0:
                    # Single .loc call; chained indexing may write to a copy.
                    removal_df.loc[column, 'Null'] = null_pct
            # 'Null' stays absorbing.
            removal_df.loc['Null', 'Null'] = 1.0

            # R: transient -> absorbing block; Q: transient -> transient block.
            removal_to_conv = removal_df[
                ['Null', 'Conversion']].drop(['Null', 'Conversion'], axis=0)
            removal_to_non_conv = removal_df.drop(
                ['Null', 'Conversion'], axis=1).drop(['Null', 'Conversion'], axis=0)

            # Absorption probabilities: (I - Q)^-1 R.
            removal_inv_diff = np.linalg.inv(
                np.identity(len(removal_to_non_conv.columns))
                - np.asarray(removal_to_non_conv))
            removal_dot_prod = np.dot(removal_inv_diff, np.asarray(removal_to_conv))
            # Column 1 is 'Conversion'; read the value for paths beginning at 'Start'.
            removal_cvr = pd.DataFrame(removal_dot_prod,
                                       index=removal_to_conv.index)[[1]].loc['Start'].values[0]
            removal_effect = 1 - removal_cvr / conversion_rate
            removal_effects_dict[channel] = removal_effect
        return removal_effects_dict

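
To check my understanding of what this function does, here is a minimal self-contained sketch on a toy chain. Everything in it is an assumption for illustration only: the two channels `A` and `B`, the transition probabilities, and the helper names `conversion_prob_from_start` and `remove_channel` are made up and do not come from the gist. It computes the probability of being absorbed in `Conversion` when starting from `Start` via (I - Q)^-1 R, then repeats the calculation with one channel removed and its incoming probability redirected to `Null`.

```python
import numpy as np

# Hypothetical toy transition matrix (made-up numbers, not real data).
states = ["Start", "A", "B", "Null", "Conversion"]
P = np.array([
    [0.0, 0.5, 0.5, 0.0, 0.0],    # Start
    [0.0, 0.0, 0.5, 0.25, 0.25],  # A
    [0.0, 0.0, 0.0, 0.5, 0.5],    # B
    [0.0, 0.0, 0.0, 1.0, 0.0],    # Null (absorbing)
    [0.0, 0.0, 0.0, 0.0, 1.0],    # Conversion (absorbing)
])


def conversion_prob_from_start(P, states):
    """P(absorbed in 'Conversion' | chain begins in 'Start') via (I - Q)^-1 R."""
    transient = [i for i, s in enumerate(states) if s not in ("Null", "Conversion")]
    conv = states.index("Conversion")
    Q = P[np.ix_(transient, transient)]   # transient -> transient block
    r = P[transient, conv]                # transient -> Conversion column of R
    # Solve (I - Q) x = r instead of forming the inverse explicitly.
    absorb = np.linalg.solve(np.eye(len(transient)) - Q, r)
    return absorb[transient.index(states.index("Start"))]


def remove_channel(P, states, channel):
    """Drop a channel and send the probability that flowed into it to 'Null'."""
    k = states.index(channel)
    keep = [i for i in range(len(states)) if i != k]
    P_removed = P[np.ix_(keep, keep)].copy()
    kept_states = [states[i] for i in keep]
    null = kept_states.index("Null")
    # Each row must sum to 1 again; the missing mass goes to 'Null'.
    P_removed[:, null] += 1.0 - P_removed.sum(axis=1)
    return P_removed, kept_states


base_cvr = conversion_prob_from_start(P, states)
toy_effects = {}
for channel in ("A", "B"):
    P_removed, kept_states = remove_channel(P, states, channel)
    removed_cvr = conversion_prob_from_start(P_removed, kept_states)
    toy_effects[channel] = 1 - removed_cvr / base_cvr

print(base_cvr, toy_effects)
```

Note that the final lookup reads the absorption probability of the `Start` row specifically, which is why I am unsure how the calculation would work without that state.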
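My question has two main parts:
- Can we treat the first touch as the start state of each path?
- Without a start state, how would we compute the removal effects? (Any documentation or explanation of the formula would be very helpful.)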
My reference:

[1]: https://gist.github.com/MortenHegewald/fb1d8051cd818c25283cbcbc4b587e5c#file-removal_effects-py