
Imagine a data frame in which many transaction_total, balance_total, and date values are missing:

id,date,transaction_total,balance_total
1,01/01/2019,-1,102
1,01/02/2019,-2,100
1,01/03/2019,-3,
1,01/04/2019,,
1,01/05/2019,-4,
2,01/01/2019,-2,200
2,01/02/2019,-2,100
2,01/04/2019,,
2,01/05/2019,-4,

Here is the script that creates the input:

import pandas as pd
import numpy as np

users=pd.DataFrame(
                [
                {'id':1,'date':'01/01/2019', 'transaction_total':-1, 'balance_total':102},
                {'id':1,'date':'01/02/2019', 'transaction_total':-2, 'balance_total':100},
                {'id':1,'date':'01/03/2019', 'transaction_total':-3, 'balance_total':''},
                {'id':1,'date':'01/04/2019', 'transaction_total':'', 'balance_total':''},
                {'id':1,'date':'01/05/2019', 'transaction_total':-4, 'balance_total':''},
                {'id':2,'date':'01/01/2019', 'transaction_total':-2, 'balance_total':200},
                {'id':2,'date':'01/02/2019', 'transaction_total':-2, 'balance_total':100},
                {'id':2,'date':'01/04/2019', 'transaction_total':'', 'balance_total':''},
                {'id':2,'date':'01/05/2019', 'transaction_total':-4, 'balance_total':''}  
                ]
                )

The goal is to produce the following.

Desired final output:

id,date,balance_total
1,01/01/2019,102
1,01/02/2019,100
1,01/03/2019,97
1,01/04/2019,97
1,01/05/2019,93
2,01/01/2019,200
2,01/02/2019,100
2,01/03/2019,97
2,01/04/2019,97
2,01/05/2019,93

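(1) If a date is missing, add a row for that date and fill it with the previous date's balance (I think the reindexing solution from this link might work: Pandas filling missing dates and values within group).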

(2) If balance_total is missing while 'date' and 'transaction_total' are valid, fill balance_total with the previous date's balance_total plus that day's transaction_total (the case of row 3: 100 + (-3) = 97).

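(3) If the date is valid but transaction_total and balance_total are both NaN, just carry forward the balance_total of the last date (for example row 4: since, by the previous calculation, the balance_total on 01/03/2019 is 97, the balance on 01/04/2019 will also be 97, because there is no transaction_total).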
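For reference, the three rules above could be sketched roughly as follows. This is only one possible approach, not a confirmed solution. It assumes the dates are in month/day/year format with daily frequency, that each id has at most one row per date, and that each id's range runs from its own first date to its own last date. The helper name `fill_balances` is made up for this illustration.

```python
import pandas as pd

# Same input as in the question; blank strings mark missing values.
users = pd.DataFrame(
    [
        {'id': 1, 'date': '01/01/2019', 'transaction_total': -1, 'balance_total': 102},
        {'id': 1, 'date': '01/02/2019', 'transaction_total': -2, 'balance_total': 100},
        {'id': 1, 'date': '01/03/2019', 'transaction_total': -3, 'balance_total': ''},
        {'id': 1, 'date': '01/04/2019', 'transaction_total': '', 'balance_total': ''},
        {'id': 1, 'date': '01/05/2019', 'transaction_total': -4, 'balance_total': ''},
        {'id': 2, 'date': '01/01/2019', 'transaction_total': -2, 'balance_total': 200},
        {'id': 2, 'date': '01/02/2019', 'transaction_total': -2, 'balance_total': 100},
        {'id': 2, 'date': '01/04/2019', 'transaction_total': '', 'balance_total': ''},
        {'id': 2, 'date': '01/05/2019', 'transaction_total': -4, 'balance_total': ''},
    ]
)


def fill_balances(df):
    """Hypothetical helper: apply rules (1) to (3) separately for each id."""
    out = df.copy()
    # Assumption: dates are month/day/year.
    out['date'] = pd.to_datetime(out['date'], format='%m/%d/%Y')
    for col in ['transaction_total', 'balance_total']:
        # Blank strings become NaN so the columns are numeric.
        out[col] = pd.to_numeric(out[col], errors='coerce')

    pieces = []
    for user_id, grp in out.groupby('id'):
        grp = grp.set_index('date')
        # Rule (1): insert a row for every missing calendar day of this id.
        days = pd.date_range(grp.index.min(), grp.index.max(), freq='D')
        grp = grp.reindex(days)
        grp['id'] = user_id

        known = grp['balance_total'].notna()
        # Every known balance starts a new block of rows.
        block = known.cumsum()
        # Transactions only count on days whose balance is unknown;
        # a missing transaction counts as 0, which gives rule (3).
        tx = grp['transaction_total'].fillna(0).where(~known, 0)
        # Rule (2): last known balance plus the transactions since then.
        grp['balance_total'] = (
            grp['balance_total'].ffill() + tx.groupby(block).cumsum()
        )
        pieces.append(grp)

    result = pd.concat(pieces).rename_axis('date').reset_index()
    return result[['id', 'date', 'transaction_total', 'balance_total']]


result = fill_balances(users)
print(result[['id', 'date', 'balance_total']])
```

For id 1 this gives 102, 100, 97, 97, 93, matching the desired output. For id 2, applying the rules literally gives 100, 100, 96 for 01/03/2019 through 01/05/2019 (the inserted date inherits 100 and only the -4 on 01/05/2019 is subtracted), not the 97, 97, 93 listed, which appear to have been copied from id 1. In this sketch transaction_total stays NaN wherever it was missing.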

Desired intermediate (metadata) output:

id,date,transaction_total,balance_total
1,01/01/2019,-1,102
1,01/02/2019,-2,100
1,01/03/2019,-3,97
1,01/04/2019,0,97
1,01/05/2019,-4,93
2,01/01/2019,-2,200
2,01/02/2019,-2,100
2,01/03/2019,-3,97
2,01/04/2019,,97
2,01/05/2019,-4,93