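I've run into a problem that I hope you can help me with :)
For example, I have the following data frame covering several stores. Each time a visitor enters a store, the time and date are recorded, so each row represents one visitor entering one of the stores.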
data <- structure(list(storeID = c("1", "1", "1", "1", "1",
"2", "2", "2", "2", "2", "3", "3", "3",
"3", "3", "4", "4", "4", "4", "4"), Time = structure(c(6L,
7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 1L, 2L, 3L, 4L, 5L,
16L, 17L, 18L, 19L, 20L), .Label = c(" 12:09:19", " 12:09:25",
" 13:09:30", " 13:09:35", " 14:09:40", " 12:00:03", " 12:00:09",
" 12:00:14", " 14:00:25", " 16:00:32", " 12:27:19", " 13:27:25",
" 14:27:41", " 14:27:46", " 17:27:59", " 12:46:10", " 12:46:19", " 13:46:29",
" 14:46:39", " 15:46:50"), class = "factor"), Date = structure(c(1351728000,
1351728000, 1351728000, 1351728000, 1351728000, 1351814400, 1351814400,
1351814400, 1351814400, 1351814400, 1351814400, 1351814400, 1351814400,
1351814400, 1351814400, 1351814400, 1351814400, 1351814400, 1351814400,
1351814400), class = c("POSIXct", "POSIXt"), tzone = "UTC")), .Names = c("storeID", "Time", "Date"), class = "data.frame", row.names = c(NA,
-20L))
[EDIT] The stores are open 24/7. What I would like is a way to assign each visit (row) to one of the 24 one-hour periods in a day (i.e., 09.00-10.00 being 1, 10.00-11.00 being 2, etc.). Then I would like to get the number of visitors per hour period over two consecutive days. I would like to be able to split this by certain fixed factors, e.g., storeID and City (not shown in this example). Also, if no visitors enter a store during a given interval, I would like the output to show that there were no visitors in that interval, i.e., a count of 0. [EDIT]
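To make the desired output concrete, here is a minimal sketch in base R of the kind of result I am after. It uses a small toy data frame (`visits`, hypothetical values in the same shape as my data) rather than the real file. The period numbering (1 = 00:00-01:00, ..., 24 = 23:00-24:00) is an assumption made only for illustration, and the column names `period`, `day` and `visitors` are made up for this example. I have not checked how well this approach would hold up on the full data.

```r
# Toy data in the same shape as the example above (hypothetical values).
visits <- data.frame(
  storeID = c("1", "1", "1", "2", "2"),
  Time    = c(" 12:00:03", " 12:00:09", " 14:00:25", " 12:27:19", " 13:27:25"),
  Date    = as.POSIXct(c("2012-11-01", "2012-11-01", "2012-11-01",
                         "2012-11-02", "2012-11-02"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Hour of day (0-23): first two characters of the time, after trimming
# the leading space that the Time values carry.
hour <- as.integer(substr(trimws(as.character(visits$Time)), 1, 2))

# ASSUMPTION: period 1 is 00:00-01:00, ..., period 24 is 23:00-24:00.
# Declaring all 24 levels keeps hours without any visit in the result.
visits$period  <- factor(hour + 1L, levels = 1:24)
visits$day     <- factor(format(visits$Date, "%Y-%m-%d"))
visits$storeID <- factor(visits$storeID)

# Cross-tabulate store x day x period. Combinations with no visits get a
# count of 0 because table() counts over every combination of factor levels.
counts <- as.data.frame(
  table(storeID = visits$storeID, day = visits$day, period = visits$period),
  responseName = "visitors"
)

# Example lookup: store 1 on 2012-11-01, all 24 periods including the zeros.
subset(counts, storeID == "1" & day == "2012-11-01")
```

A further grouping factor such as City would, under the same assumptions, simply be one more argument to `table()`.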
Note that my data file is huge, having over 700k rows.
I hope I made my issue clear.
MvZB