
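I'd like to learn a more elegant way to write this solution. I need to split a set of rows into smaller parts, control the utilization, and calculate the balance. The current solution does not produce the balance correctly.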

import pandas as pd
import numpy as np

box_list = [['Box0', 0.2],
               ['Box1', 1.0],
               ['Box2', 1.8],
               ['Box4', 2.0],
               ['Box8', 4.01],]
  
sdf = pd.DataFrame(box_list, columns = ['Name', 'Size'])

print(sdf)
   Name  Size
0  Box0  0.20
1  Box1  1.00
2  Box2  1.80
3  Box4  2.00
4  Box8  4.01
df = pd.DataFrame({'Name': np.repeat(sdf['Name'], sdf['Size'].apply(np.ceil)),
                    'Size': np.repeat(sdf['Size'], sdf['Size'].apply(np.ceil)),})

df['Max_Units']=df['Size'].apply(lambda x: np.ceil(x) if x>1.0 else 1.0) 
df = df.reset_index()
df['Utilization'] =df['Size'].apply(lambda x: x-int(x) if x>1.0 else (x if x<1.0 else 1.0))  
df['Balance'] =df['Max_Units'] 

g = df.groupby(['index'], as_index=0, group_keys=0)

df['Utilization'] = g.apply(lambda x: 
                           pd.Series(np.where((x.Balance.shift(1) >= 1.0), 
                           1.0, 
                           x.Utilization))).values
df.loc[(df.Utilization == 0.0), ['Utilization']] = 1.0

df['Balance'] = g.apply(lambda x: 
                           pd.Series(np.where((x.Balance.shift(1) >= 1.0), 
                           x.Max_Units-x.Utilization, 
                           0))).values
print(df)
    index  Name  Size  Max_Units  Utilization  Balance
0       0  Box0  0.20        1.0         0.20      0.0
1       1  Box1  1.00        1.0         1.00      0.0
2       2  Box2  1.80        2.0         0.80      0.0
3       2  Box2  1.80        2.0         1.00      1.0
4       3  Box4  2.00        2.0         1.00      0.0
5       3  Box4  2.00        2.0         1.00      1.0
6       4  Box8  4.01        5.0         0.01      0.0
7       4  Box8  4.01        5.0         1.00      4.0
8       4  Box8  4.01        5.0         1.00      4.0
9       4  Box8  4.01        5.0         1.00      4.0
10      4  Box8  4.01        5.0         1.00      4.0

1 Answer


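I'm not sure I fully understand what all of these values are supposed to represent.

However, I've produced the correct expected output for your sample set in a more straightforward way: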

import pandas as pd
import numpy as np

box_list = [['Box0', 0.2],
            ['Box1', 1.0],
            ['Box2', 1.8],
            ['Box4', 2.0],
            ['Box8', 4.01], ]

df = pd.DataFrame(box_list, columns=['Name', 'Size'])

# Set ceil column to ceil of size since it's used more than once
df['ceil'] = df['Size'].apply(np.ceil)

# Duplicate Rows based on Ceil of Size
df = df.loc[df.index.repeat(df['ceil'])]

# Get Max Units by comparing it to the ceil column
df['Max_Units'] = df.apply(lambda s: max(s['ceil'], 1), axis=1)

# Extract Decimal Portion By Using % 1 (Whole-Number Sizes Have No Remainder, So Use 1)
df['Utilization'] = df['Size'].apply(lambda x: 1 if x % 1 == 0 else x % 1)

# Everywhere the per-index cumcount is not 0 set Utilization to 1
# (group by the original row index so boxes that share a Max_Units value are not mixed together)
df.loc[df.groupby(df.index).cumcount().ne(0), 'Utilization'] = 1

# Set Balance to index cumcount as float
df['Balance'] = df.groupby(df.index).cumcount().astype(float)

# Drop Unnecessary Column and reset index for output
df = df.drop(columns=['ceil']).reset_index()

# For Display
print(df)

Output:

    index  Name  Size  Max_Units  Utilization  Balance
0       0  Box0  0.20        1.0         0.20      0.0
1       1  Box1  1.00        1.0         1.00      0.0
2       2  Box2  1.80        2.0         0.80      0.0
3       2  Box2  1.80        2.0         1.00      1.0
4       3  Box4  2.00        2.0         1.00      0.0
5       3  Box4  2.00        2.0         1.00      1.0
6       4  Box8  4.01        5.0         0.01      0.0
7       4  Box8  4.01        5.0         1.00      1.0
8       4  Box8  4.01        5.0         1.00      2.0
9       4  Box8  4.01        5.0         1.00      3.0
10      4  Box8  4.01        5.0         1.00      4.0
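
If you want to avoid the row-wise apply calls, the same table can probably be built with vectorized operations only. This is just a sketch under my reading of the columns (the first row of each box carries the fractional part, every later row is fully used, and Balance is a running counter within the box). The helper name split_boxes is my own and not part of pandas:

```python
import numpy as np
import pandas as pd


def split_boxes(sdf: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: emit one output row per unit of ceil(Size)."""
    # Number of rows per box, e.g. ceil(0.2) == 1 and ceil(4.01) == 5
    units = np.ceil(sdf['Size']).astype(int)

    # Repeat each source row `units` times; keep the source index as a column
    out = sdf.loc[sdf.index.repeat(units)].rename_axis('index').reset_index()

    # Max_Units is the ceiling of Size, never below 1
    out['Max_Units'] = np.ceil(out['Size']).clip(lower=1.0)

    # Position of each row inside its box: 0, 1, 2, ...
    pos = out.groupby('index').cumcount()

    # Fractional part of Size; whole-number sizes count as fully used (assumption)
    frac = out['Size'] % 1
    frac = frac.where(frac != 0, 1.0)

    # Only the first row of a box carries the fraction; later rows are full
    out['Utilization'] = np.where(pos == 0, frac, 1.0)
    out['Balance'] = pos.astype(float)
    return out


sdf = pd.DataFrame(
    [['Box0', 0.2], ['Box1', 1.0], ['Box2', 1.8], ['Box4', 2.0], ['Box8', 4.01]],
    columns=['Name', 'Size'],
)
result = split_boxes(sdf)
print(result)
```

Grouping on the original row index rather than on a value column keeps two boxes with the same size from sharing a counter.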
Answered 2021-04-25T06:29:10.643