I have a dataset that contains four event types (A, B, C, D), each of which occurs many times a day. The log covers more than a year. The "EventType" attribute is a factor.
For example, my dataset looks like this:
DateTime,EventType
6/5/2013 9:35,B
6/5/2013 9:35,A
6/5/2013 9:35,B
6/5/2013 9:36,D
6/5/2013 9:39,A
6/5/2013 9:40,B
7/5/2013 9:35,B
7/5/2013 9:35,A
7/5/2013 9:35,B
7/5/2013 9:36,D
7/5/2013 9:39,A
7/5/2013 9:40,B
8/5/2013 9:35,A
8/5/2013 9:35,A
8/5/2013 9:35,B
8/5/2013 9:36,B
8/5/2013 9:39,A
8/5/2013 9:40,B
9/5/2013 9:35,B
9/5/2013 9:35,B
9/5/2013 9:35,B
9/5/2013 9:36,D
9/5/2013 9:39,A
9/5/2013 9:40,A
I want to plot the total count of all the event types on a daily basis. x-axis: date-time, y-axis: count.
I would like to use ddply to accomplish this, but I am not sure how to go about it. This is what I have done so far:
library(plyr)  # provides ddply
data <- read.csv("C:/analytics/mydata.csv", sep=",", header=TRUE)
# count the rows in each group
k <- ddply(data, "data$DateTime", function(x) count = nrow(x))
The above gives the following output:
data$DateTime V1
1 6/5/2013 9:35,A 1
2 6/5/2013 9:35,B 2
3 6/5/2013 9:36,D 1
4 6/5/2013 9:39,A 1
5 6/5/2013 9:40,B 1
6 7/5/2013 9:35,A 1
7 7/5/2013 9:35,B 2
8 7/5/2013 9:36,D 1
9 7/5/2013 9:39,A 1
10 7/5/2013 9:40,B 1
11 8/5/2013 9:35,A 2
12 8/5/2013 9:35,B 1
13 8/5/2013 9:36,B 1
14 8/5/2013 9:39,A 1
15 8/5/2013 9:40,B 1
16 9/5/2013 9:35,B 3
17 9/5/2013 9:36,D 1
18 9/5/2013 9:39,A 1
19 9/5/2013 9:40,A 1
My question: how do I achieve the same behavior if I want the counts by day or by month? I want to use lubridate to extract the day or month, but after that I do not know how to group by it and then get the counts.
Something like this: k <- ddply(data, "day(data$DateTime)", function(x) count = nrow(x))
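A minimal sketch of the grouping described above; it is one possible approach, not a definitive answer. It assumes the dates are day/month/year (the sample does not say which), and the column names Day and Month are introduced here for illustration only. A few rows of the sample are inlined so the snippet runs on its own.

```r
library(plyr)       # ddply, summarise
library(lubridate)  # dmy_hm, floor_date

# A few rows from the sample above, inlined so the sketch is self-contained.
data <- read.csv(text = "DateTime,EventType
6/5/2013 9:35,B
6/5/2013 9:35,A
6/5/2013 9:36,D
7/5/2013 9:35,B
7/5/2013 9:39,A", stringsAsFactors = TRUE)

# ASSUMPTION: dates are day/month/year; use mdy_hm() if they are month/day/year.
data$DateTime <- dmy_hm(as.character(data$DateTime))

# Hypothetical helper columns: the calendar day and the first day of the month.
data$Day   <- as.Date(floor_date(data$DateTime, "day"))
data$Month <- as.Date(floor_date(data$DateTime, "month"))

# Count rows per day and event type, then per month and event type.
daily   <- ddply(data, c("Day", "EventType"), summarise, count = length(EventType))
monthly <- ddply(data, c("Month", "EventType"), summarise, count = length(EventType))
```

floor_date() is used here instead of day() or month() because those return only the day of the month or the month number, so with more than a year of data, rows from different months or years would fall into the same group. Dropping "EventType" from the grouping variables would give one total per day or month instead.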
Once I have the counts, I believe I can plot them nicely. Any input is much appreciated.
Thanks.