More than 8 years on, this is still a valid and classic question. Here is a complete answer, building on the excellent clue provided by @chthonicdaemon.
library(ggplot2)
library(data.table)
library(magrittr)  # provides the %>% pipe used below
### This uses a preloaded data.table, dt1. Any data.table with a numeric column x will work.
### Extract the counts and breaks of the histogram bins.
### breaks = 40 is used here, but any number works; hist() treats it as a suggestion.
### Keep the number of breaks fairly large so that multiple peaks can show up.
### plot = FALSE computes the bins once without drawing the base-R histogram.
h <- hist(dt1$x, breaks = 40, plot = FALSE)
counts <- h$counts
breaks <- h$breaks
### Now name the counts vector with the left edge of each bin.
### Note: counts has one element fewer than breaks, so the last break is dropped.
names(counts) <- breaks[-length(breaks)]
### Find the indices of the counts that are local peaks
### (see the earlier classic clue about taking a double diff).
### Note: the two diff() calls shorten the vector by 2, so a FALSE is
### added before and after the result to align the TRUE/FALSE vector with counts.
peak_indx <- c(FALSE, diff(sign(diff(counts))) == -2, FALSE) %>% which()
topcounts <- counts[peak_indx]
topbreaks <- names(topcounts) %>% as.numeric()
### Now let's use ggplot to plot the histogram along with visualised peaks
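### Reuse the hist() breaks so the bars match the bins used for peak detection,
### and shift each line by half a bin width so it sits at the centre of its peak bin.
dt1 %>%
  ggplot() +
  geom_histogram(aes(x), breaks = breaks, col = "grey51", na.rm = TRUE) +
  geom_vline(xintercept = topbreaks + diff(breaks)[1] / 2, lty = 2)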
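
To see why the double diff picks out peaks, it can help to run it on a tiny vector by hand. The following is only a minimal sketch: `toy_counts` and the other `toy_`/helper names are made up for illustration and are not part of the code above.

```r
# Toy bin counts (made-up values, not taken from dt1).
toy_counts <- c(1, 3, 2, 5, 7, 4, 6)

# First diff: change between neighbouring bins.
# sign() keeps only the direction: +1 for rising, -1 for falling.
slope_sign <- sign(diff(toy_counts))

# Second diff: a value of -2 marks a switch from rising (+1) to falling (-1),
# which is a local peak. Pad with FALSE on both ends to match toy_counts.
is_peak <- c(FALSE, diff(slope_sign) == -2, FALSE)

toy_peak_indx <- which(is_peak)
toy_peak_indx              # positions 2 and 5
toy_counts[toy_peak_indx]  # peak heights 3 and 7
```

Two limits of this trick are worth keeping in mind. Because of the padding, the first and last bins are never flagged as peaks. A flat-topped peak such as `c(1, 3, 3, 1)` is also missed, since the sign sequence goes `1, 0, -1` and no single step equals -2.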