I have a table that looks something like this:
import numpy as np
import pandas as pd

# Each row is one month; each cell lists the event days for one device.
tmp = [["", "5-9", ""], ["", "", ""], ["17-", "", "4- -9 27-"], ["-6", "", ""], ["", "", "-15"]]
dat = pd.DataFrame(tmp, columns=["V0", "V1", "V2"])
dat["Month"] = np.arange(1, 6)
dat["Year"] = np.repeat(2015, 5)
    V0   V1         V2  Month  Year
0       5-9                 1  2015
1                           2  2015
2  17-       4- -9 27-      3  2015
3   -6                      4  2015
4                  -15      5  2015
...
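The numbers in the table are the days of the month on which a certain event happened. Note that a month can contain multiple events and an event can span multiple months. For example, "17-" in month 3 followed by "-6" in month 4 is a single event that starts on day 17 of month 3 and ends on day 6 of month 4.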
V0, V1 and V2 are three different devices, each having its own separate events. So we have three different time series.
I would like to convert this table to a time series data frame, that is, break it down by day for each device. Each row would be one day of one month (of one year), and each device column would hold only 0 or 1: 0 if no event happened on that day, 1 otherwise (a dummy variable). The result would contain three different time series, one for each device. How would I do that?
This is what the output would look like:
    V0  V1  V2  Day  Month  Year
0    0   0   0    1      1  2015
1    0   0   0    2      1  2015
2    0   0   0    3      1  2015
3    0   0   0    4      1  2015
4    0   1   0    5      1  2015
5    0   1   0    6      1  2015
6    0   1   0    7      1  2015
7    0   1   0    8      1  2015
8    0   1   0    9      1  2015
9    0   0   0   10      1  2015
10   0   0   0   11      1  2015
11   0   0   0   12      1  2015
12   0   0   0   13      1  2015
...
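For reference, here is a minimal sketch of one way the conversion could be approached, not a definitive solution. It assumes that "a-b" means days a through b inclusive, that "a-" opens an event which stays open until the next "-b" for the same device (possibly in a later month), and that a bare number such as "7" would be a single-day event (that last case does not occur in the sample). The helper name `expand` and the variables `days`, `delta` and `out` are made up for illustration. The idea is to put +1 on each start date and -1 on the day after each end date, then take a running sum.

```python
import numpy as np
import pandas as pd

# Rebuild the example table from the question.
tmp = [["", "5-9", ""], ["", "", ""], ["17-", "", "4- -9 27-"], ["-6", "", ""], ["", "", "-15"]]
dat = pd.DataFrame(tmp, columns=["V0", "V1", "V2"])
dat["Month"] = np.arange(1, 6)
dat["Year"] = np.repeat(2015, 5)

devices = ["V0", "V1", "V2"]

# One entry per calendar day, from the first day of the first month
# to the last day of the last month (assumes rows are in date order).
first = pd.Timestamp(year=int(dat["Year"].iloc[0]), month=int(dat["Month"].iloc[0]), day=1)
last = pd.Timestamp(year=int(dat["Year"].iloc[-1]), month=int(dat["Month"].iloc[-1]), day=1)
last = last.replace(day=last.days_in_month)
days = pd.date_range(first, last, freq="D")


def expand(device):
    """Hypothetical helper: return a 0/1 daily series for one device column."""
    delta = pd.Series(0, index=days)
    for year, month, cell in zip(dat["Year"], dat["Month"], dat[device]):
        for token in cell.split():
            start, sep, end = token.partition("-")
            if not sep:
                end = start  # assumption: a bare "7" is a single-day event
            if start:
                begin = pd.Timestamp(year=int(year), month=int(month), day=int(start))
                delta.loc[begin] += 1
            if end:
                stop = pd.Timestamp(year=int(year), month=int(month), day=int(end))
                after = stop + pd.Timedelta(days=1)
                if after <= days[-1]:
                    delta.loc[after] -= 1
    # The running sum is positive exactly on days covered by an open event.
    return (delta.cumsum() > 0).astype(int)


out = pd.DataFrame({d: expand(d) for d in devices})
out["Day"] = days.day
out["Month"] = days.month
out["Year"] = days.year
out = out.reset_index(drop=True)
print(out.head(13))
```

One limitation of this sketch: an event that is already open before the first month of the table (a "-b" with no earlier "a-") would not be marked, because the running sum never goes positive for it.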