I have the following dataset:
import pandas as pd

dataset = [[0, 'Milk', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],
           [0, 'Dill', 'none', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Eggs'],
           [0, 'Dill', 'milk'],
           [1, 'Dill'],
           [1, 'Corn', 'Onion', 'Onion', 'Kidney Beans', 'Ice cream', 'Eggs'],
           ['......', '........', '........', '........', '........', '........', '........'],
           [24, 'Corn', 'Onion', 'Onion', 'Kidney Beans', 'Ice cream', 'Eggs']]
df = pd.DataFrame(dataset)
        0         1         2         3             4          5         6
0       0      Milk     Onion    Nutmeg  Kidney Beans       Eggs    Yogurt
1       0      Dill      none    Nutmeg  Kidney Beans       Eggs      Eggs
2       0      Dill      milk      None          None       None      None
3       1      Dill      None      None          None       None      None
4       1      Corn     Onion     Onion  Kidney Beans  Ice cream      Eggs
5  ......  ........  ........  ........      ........   ........  ........
6      24      Corn     Onion     Onion  Kidney Beans  Ice cream      Eggs
I then fit an association rule learning algorithm to it:
from mlxtend.frequent_patterns import fpgrowth
frequent_itemsets_fp = fpgrowth(df, min_support=0.001, use_colnames=True)
from mlxtend.frequent_patterns import association_rules
rules_fp = association_rules(frequent_itemsets_fp, metric="lift").sort_values("lift", ascending=True).reset_index(drop=True)
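Column 0 specifies the hour in which each transaction occurred. The hours span a single day, i.e. 0-23.
I want to train the algorithm on the rows of one hour at a time.
So first I want to train it on all rows from hour 0 and save the result, then on the rows from hour 1, and so on until the last hour.
Any ideas?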