I am trying to plot the frequencies of different journals in a list of research papers I fetched. Each row in my data frame corresponds to a paper, for which I have the associated journal.

I did the following to plot the levels (bins) in a histogram:

journal <- main$Publication.Journal
tb <- table(journal)
barplot(tb[order(tb, decreasing = TRUE)])
axis(2, at = seq(0, 12, 1), labels = seq(0, 12, 1))

[Figure: journal_bins]

The only problem is that I want to drop from the graph (or from the table itself) the journals with a frequency of 1, since I am only interested in the most frequent journals (hence the ordered barplot). Any insight on how I can do this?

Many thanks! Nathanael

3 Answers

It's hard to answer your specific problem without the dataset from your example, so here's one solution using mock data:

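x <- rpois(100, 100)
xt <- table(x)
xtd <- as.data.frame(xt)
xtd$x <- as.numeric(as.character(xtd$x))  # x comes back as a factor; make it numeric so type="h" draws spikes
xtds <- subset(xtd, Freq > 1)  # use subset, as noted by @baptiste
plot(Freq ~ x, xtd, type = "h", ylim = c(0, 10))
lines(Freq ~ x, xtds, type = "h", col = "red")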


As far as I know, there is no easy way to coerce a data.frame back to a table, so you may want a different solution. Also note that the result of the logical test xt > 1, for example, might be useful to you.
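As a rough sketch of that last point: the logical test can index the table directly, and xtabs() is one possible way to get from the filtered data frame back to a table. This assumes the same kind of mock data; set.seed() and the names keep, xt_filtered and xt_back are only there for illustration.

```r
set.seed(1)                       # assumption: fixed seed so the run is repeatable
x  <- rpois(100, 100)
xt <- table(x)

# Route 1: index the table with the logical test
keep        <- xt > 1             # one TRUE/FALSE per level of x
xt_filtered <- xt[keep]           # only the levels that occur more than once

# Route 2: filter the data frame, then rebuild a table with xtabs()
xtd     <- as.data.frame(xt)
xtds    <- subset(xtd, Freq > 1)
xt_back <- xtabs(Freq ~ x, data = droplevels(xtds))  # droplevels() removes the filtered-out levels
```

Both routes should end up with the same counts; indexing the table directly just skips the round trip through a data frame.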

answered 2013-05-29T00:53:05.727

Or, very simply:

tb <- tb[tb>1]

Table objects can be subset in the same ways as any other array object.
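Put together with the ordered barplot from the question, that could look roughly like the sketch below. The journal vector is a made-up stand-in for main$Publication.Journal, since the real data isn't available.

```r
# Hypothetical stand-in for main$Publication.Journal
journal <- c("JAMA", "MAD", "JAMA", "Lancet", "MAD", "JAMA", "BMJ")

tb <- table(journal)
tb <- tb[tb > 1]                    # drop journals that appear only once
tb <- sort(tb, decreasing = TRUE)   # most frequent journal first

barplot(tb, las = 2)                # las = 2 turns the labels perpendicular to the axis
```

Filtering before sorting means the plot only ever sees the journals you care about, so the y-axis needs no manual trimming.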

answered 2013-05-29T14:25:38.083

You can try something like this:

journal <- read.table(
  header=TRUE, text='Name  Article
JAMA    A
MAD B
Cigar_Afficianado   C
Bowling_Weekly  D
JAMA    E
MAD F
Cigar_Afficianado   G
JAMA    H
MAD I
Cigar_Afficianado   J
')  # create the data set
library(plyr)
table(journal$Name)  # tabulate, as in your example
journal <- ddply(journal, .(Name), transform, Article_count = length(Article))
journal  # shows the new Article_count column added by transform
journal <- journal[journal$Article_count > 1, ]  # remove the journals with low counts
journal  # shows that the low counts are removed
answered 2013-05-29T13:53:54.437