
I'm using ggplot2 and trying to change the order of bins. I'm using the data for NY's Stop and Frisk program found here: http://www.nyclu.org/content/stop-and-frisk-data

The times are given as integers (e.g., 5 = 12:05 AM, 355 = 3:55 AM, 2100 = 9 PM).

I used the following to create a histogram of the times of stops:

myplot <- ggplot(Stop.and.Frisk.2011) + geom_histogram(aes(x=timestop),binwidth=300)

This gave me a fairly good graph of times, with the bins going from Midnight - 3 AM, 3 AM - 6 AM, 6 AM - 9 AM, etc.

However, I'm hoping to move the first two bins (Midnight - 3 AM and 3 AM - 6 AM) to the end, so the axis reads more like a normal work day.

Is there a simple way to change the order of the bins? I've tried using the breaks argument, but can't find a way to get it to loop back around.

Essentially, I want the bins to be in the following order: 600-900, 900-1200, 1200-1500, 1500-1800, 1800-2100, 2100-2400, 0-300, 300-600.

Thanks in advance!


2 Answers


One approach is to re-order the sequence before calling ggplot. Here is an example that uses the cut function to create 3-hour intervals:

# Load ggplot2 for plotting
library(ggplot2)

# Read in the data
df <- read.csv('SQF 2012.csv', header = TRUE)

# Create intervals every 3 hours based on the `timestop` variable.
# right = FALSE makes each bin [start, end), so a stop at exactly 0
# (midnight) is kept instead of becoming NA; dig.lab = 4 keeps the
# labels out of scientific notation (e.g. '[900,1200)').
df$intervals <- cut(df$timestop,
                    breaks = c(0, 300, 600,
                               900, 1200, 1500,
                               1800, 2100, 2400),
                    right = FALSE, dig.lab = 4)

# Re-order the sequence prior to plotting
df$sequence <- ifelse(df$intervals == '[600,900)', 1, NA)
df$sequence <- ifelse(df$intervals == '[900,1200)', 2, df$sequence)
df$sequence <- ifelse(df$intervals == '[1200,1500)', 3, df$sequence)
df$sequence <- ifelse(df$intervals == '[1500,1800)', 4, df$sequence)
df$sequence <- ifelse(df$intervals == '[1800,2100)', 5, df$sequence)
df$sequence <- ifelse(df$intervals == '[2100,2400)', 6, df$sequence)
df$sequence <- ifelse(df$intervals == '[0,300)', 7, df$sequence)
df$sequence <- ifelse(df$intervals == '[300,600)', 8, df$sequence)
df$sequence <- as.numeric(df$sequence)

# Create the plot
ggplot(df, aes(x = sequence)) +
  geom_histogram(binwidth = 0.5) +
  scale_x_continuous(breaks = c(1, 2, 3, 4, 5, 6, 7, 8),
                     labels = c('6AM-9AM', '9AM-12PM', '12PM-3PM', '3PM-6PM',
                                '6PM-9PM', '9PM-12AM', '12AM-3AM', '3AM-6AM')) +
  xlab('Time') +
  ylab('Number\n') + 
  theme(axis.text = element_text(size = rel(1.1))) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  theme(axis.title = element_text(size = rel(1.1), face = 'bold'))
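
As an alternative to matching on the interval labels, the bins can be given readable labels directly and the factor levels rotated so the day starts at 6 AM. This is only a sketch under assumptions: the `timestop` vector below is made-up toy data, and `clock_labels`, `day_order`, and `toy` are names introduced here for illustration. The plotting step runs only if ggplot2 is installed.

```r
# Toy data standing in for the real `timestop` column (HHMM integers)
timestop <- c(5, 355, 615, 959, 1200, 1745, 2100, 2359)

# Labels in clock order, one per 3-hour interval
clock_labels <- c('12AM-3AM', '3AM-6AM', '6AM-9AM', '9AM-12PM',
                  '12PM-3PM', '3PM-6PM', '6PM-9PM', '9PM-12AM')

# right = FALSE makes each bin [start, end), so 0 falls in the first bin
# and 300 falls in the second
intervals <- cut(timestop, breaks = seq(0, 2400, by = 300),
                 labels = clock_labels, right = FALSE)

# Rotate the factor levels so the day starts at 6 AM
day_order <- c(clock_labels[3:8], clock_labels[1:2])
intervals <- factor(intervals, levels = day_order)

toy <- data.frame(timestop = timestop, intervals = intervals)

# Build the plot only if ggplot2 is available; geom_bar() counts rows
# per factor level, and drop = FALSE keeps empty bins on the axis.
# Call print(p) to draw it.
if (requireNamespace('ggplot2', quietly = TRUE)) {
  p <- ggplot2::ggplot(toy, ggplot2::aes(x = intervals)) +
    ggplot2::geom_bar() +
    ggplot2::scale_x_discrete(drop = FALSE)
}
```

Because `intervals` is a factor, ggplot2 draws the bars in level order, so no numeric `sequence` column or manual axis labels are needed.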


answered 2014-10-30T01:11:51.160

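Here is one approach. I added 2400 to all timestop values between 0 and 599. That moves the time range you want to the end (i.e., the right side) of the graph. When drawing the graph, I relabeled the x-axis for you.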

library(data.table)
library(dplyr)
library(ggplot2)

# Read the file
foo <- fread("SQF 2012.csv", header = TRUE, na.strings="NA", colClasses="character")

# Change timestop values
ana <- setDF(foo) %>%
       select(datestop,timestop) %>%
       mutate(timestop = as.numeric(timestop), 
              timestop = ifelse(timestop >= 0 & timestop < 600, 2400 + timestop, timestop))

# Draw the graph with 3-hour bins aligned to start at 6:00
ggplot(data = ana, aes(x = timestop)) +
    geom_histogram(binwidth = 300, boundary = 600) +
    scale_x_continuous(limits = c(600, 3000),
                       breaks = c(600, 900, 1200, 1500,
                                  1800, 2100, 2400, 2700, 3000),
                       labels = c("6:00", "9:00", "12:00", "15:00",
                                  "18:00", "21:00", "24:00", "3:00", "6:00")) +
    xlab("Time")
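
To see what the shift does to individual values, here is a small base-R sketch on made-up toy times (not the real data). `shifted`, `bin_start`, and `bin_label` are names introduced for illustration, and the bin arithmetic assumes 3-hour bins starting at 6:00.

```r
# Toy HHMM values standing in for `timestop`
timestop <- c(0, 5, 355, 559, 600, 1200, 2359)

# Push everything before 6 AM past the end of the day
shifted <- ifelse(timestop < 600, timestop + 2400, timestop)

# Left edge of the 3-hour bin each shifted value falls into
bin_start <- 600 + 300 * ((shifted - 600) %/% 300)

# Map each bin edge back to a clock label (edges of 2400 and above wrap)
bin_label <- sprintf('%02d:00', (bin_start %/% 100) %% 24)

data.frame(timestop, shifted, bin_start, bin_label)
```

In this toy data, 355 (3:55 AM) becomes 2755 and lands in the bin starting at 2700, which is labeled 03:00, while 600 and later values are left unchanged.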


answered 2014-10-30T01:14:57.843