1

I have data over 3 years that I would like to plot.

However, I would like to plot the years side by side, on a shared month/day axis.

In order to do this, I'd like to make the date 03/17/2010 become 03/17, so that it lines up with 03/17/2011.

Any ideas how to do that in R?

Here is an image of what I'd like it to look like: [image]

4 Answers

4

R has its own Date class, which you should use. Once you convert your data to Date, it is easy to change how the dates are displayed using the format function.

http://www.statmethods.net/input/dates.html

As an example:

> d <- as.Date( "2010-03-17" )
> d
[1] "2010-03-17"
> format( d, format="%m/%d")
[1] "03/17"

Or, with your date format:

> format( as.Date("03/17/2010", "%m/%d/%Y"), format="%m/%d")
[1] "03/17"
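To connect this back to the plot in the question, here is a minimal sketch that splits each date into a year label and a month/day label. The data frame `df`, its column names, and its values are made up for illustration.

```r
# Hypothetical sample data: readings on the same calendar days in three years
df <- data.frame(
  date  = c("03/17/2010", "06/01/2010", "03/17/2011", "06/01/2011",
            "03/17/2012", "06/01/2012"),
  value = c(10, 14, 11, 17, 9, 15)
)

d <- as.Date(df$date, format = "%m/%d/%Y")  # parse the strings into Date objects
df$year      <- format(d, "%Y")     # year label: one group (line) per year
df$month_day <- format(d, "%m/%d")  # month/day label shared across years

# Rows that share a month/day now fall into the same group, whatever the year
by_day <- split(df$value, df$month_day)
```

Note that `month_day` is a character vector, so for an actual plot it would serve as axis labels or be converted back into a Date.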
answered 2013-08-16T07:59:42.690
1

You can use R's built-in Date class: convert with as.Date(), then use format() to keep only the month and day:

> dates <- c("02/27/92", "02/27/92", "01/14/92", "02/28/92", "02/01/92")
> format(as.Date(dates, "%m/%d/%y"), "%m/%d")
[1] "02/27" "02/27" "01/14" "02/28" "02/01"

For your example, substitute your own dates.

I found this in R's help pages, which is where the example above comes from:

> ?as.Date
> ?format 
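One detail worth checking against your own data is the year specifier: `%y` expects a two-digit year and `%Y` a four-digit one. A small sketch of the difference, with variable names chosen only for illustration:

```r
two_digit  <- as.Date("02/27/92",   format = "%m/%d/%y")  # %y: two-digit year
four_digit <- as.Date("02/27/1992", format = "%m/%d/%Y")  # %Y: four-digit year
identical(two_digit, four_digit)  # both refer to 27 February 1992

# Using %Y on a two-digit year does not raise an error;
# the year is read literally as year 92
wrong <- as.Date("02/27/92", format = "%m/%d/%Y")
as.integer(format(wrong, "%Y"))
```

Since only the month and day are kept in the end, the mistake would not show in the "%m/%d" output, but it would matter as soon as the year is used for grouping.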
answered 2013-08-16T07:59:52.043
0

Here's my solution:

It involves formatting each date as a string without the year and then parsing that string back into a date, which puts every date in the same year (the current year, by default).
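In isolation, that round trip looks roughly like this. The dates are made up, and the variable names are only for illustration.

```r
# Hypothetical dates from two different years
d <- as.Date(c("2010-03-17", "2011-03-17", "2011-07-04"))

md <- format(d, "%m/%d")                    # drop the year, leaving month/day strings
same_year <- as.Date(md, format = "%m/%d")  # the missing year is filled in with the current one

all(format(same_year, "%Y") == format(Sys.Date(), "%Y"))  # every date is now in the current year
same_year[1] == same_year[2]                               # the two March 17ths now coincide
```

One caveat, as far as I can tell: a 02/29 reading should come back as NA whenever the current year is not a leap year, so leap days may need special handling.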

The code and sample input file are below:

Code

# Clear all
rm(list = ls())

# read.csv() is part of base R, so no extra package needs to be loaded

# Get the data in
data = read.csv('Readings.csv')

# Extract each column
readings = data[,"Reading"]
# The input dates look like 1/1/10, so the format must be given explicitly
dates = as.Date(data[,"Reading.Date"], format = "%m/%d/%y")

# Order the data correctly
readings = readings[order(dates)]
dates = dates[order(dates)]

# Calculate the difference between each date (in days) and readings
diff.readings = diff(readings)
diff.dates = as.numeric(diff(dates)) # Convert from days to an integer

# Calculate the usage per reading period
usage.per.period = diff.readings/diff.dates

# Get Every single day between the very first reading and the very last
# seq will create a sequence: first argument is min, second is max, and 3rd is the step size (which in this case is 1 day)
days = seq(min(dates),max(dates), 1)
# This creates an empty vector to get data from the for loop below
usage.per.day = numeric()

# The length of the diff.dates is the number of periods that exist.
for (period in 1:(length(diff.dates))){
    # To convert usage.per.period to usage.per.day, we need to replicate the
    # value once for each day in that period. The function rep() replicates
    # a value: the first argument is the value to replicate and the second
    # is the number of times to replicate it. The function c() concatenates
    # the current vector and the new period, sort of like
    # value = value + 6, but with vectors.
    usage.per.day = c(usage.per.day, rep(usage.per.period[period], diff.dates[period]))
}
# The for loop above misses out on the last day, so I add that single value manually
usage.per.day[length(usage.per.day)+1] = usage.per.period[period]

# Get the distinct years that appear in the readings
years = names(table(format(dates, "%Y")))

# Now break down the usages and the days by year
# list() creates an empty list
usage.per.day.grouped.by.year = list()
year.day = list()
# This defines some colors for plotting; rainbow(n) returns n distinct colors
colors = rainbow(length(years))
for (year.index in 1:length(years)){
    # This is a vector of trues and falses, to say whether a day is in a particular
    # year or not
    this.year = (days >= as.Date(paste(years[year.index],'/01/01',sep="")) &
                 days <= as.Date(paste(years[year.index],'/12/31',sep="")))
    usage.per.day.grouped.by.year[[year.index]] = usage.per.day[this.year]
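    # We only care about the month and day, so drop the year and parse the
    # result again; the missing year is filled in with the current year.
    # Note: 02/29 would become NA if the current year is not a leap year.
    year.day[[year.index]] = as.Date(format(days[this.year], format="%m/%d"),"%m/%d")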
    # In the first year, we need to set up the whole plot
    if (year.index == 1){
        # create a png file with file name image.png
        png('image.png')
        plot(year.day[[year.index]], # x coords
             usage.per.day.grouped.by.year[[year.index]], # y coords
             "l", # as a line
             col=colors[year.index], # with this color
             ylim = c(min(usage.per.day),max(usage.per.day)), # this y max and y min
             ylab='Usage', # with this label for y axis
             xlab='Date', # with this label for x axis
             main='Usage Over Time') # and this title
    }
    else {
        # After the plot is set up, we just need to add each year
        lines(year.day[[year.index]], # x coords
            usage.per.day.grouped.by.year[[year.index]], # y coords
            col=colors[year.index]) # color
    }
}
# add a legend to the whole thing
legend("topright" , # where to put the legend
    legend = years, # what the legend names are
    lty=1, # solid line for every year
    lwd=2.5, # line width of the legend symbols
    col=colors) # the colors to use for the symbols
dev.off() # save the png to file
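As a side note, the for loop that builds usage.per.day can probably be replaced by a single call to rep(), which accepts a vector of repeat counts. A minimal sketch with made-up values standing in for usage.per.period and diff.dates:

```r
# Hypothetical values standing in for the vectors computed above
usage.per.period <- c(0.5, 1.2, 0.8)  # usage per day within each period
diff.dates       <- c(3, 2, 4)        # length of each period in days

# Loop version, following the code above
looped <- numeric()
for (period in seq_along(diff.dates)) {
  looped <- c(looped, rep(usage.per.period[period], diff.dates[period]))
}

# Vectorized version: each element is repeated by the matching count
vectorized <- rep(usage.per.period, times = diff.dates)

identical(looped, vectorized)  # the two approaches give the same vector
```

The extra value for the final day would still have to be appended afterwards, as in the original code.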

Input file

Reading Date,Reading
1/1/10,10
2/1/10,20
3/6/10,30
4/1/10,40
5/7/10,50
6/1/10,60
7/1/10,70
8/1/10,75
9/22/10,80
10/1/10,85
11/1/10,90
12/1/10,95
1/1/11,100
2/1/11,112.9545455
3/1/11,120.1398601
4/1/11,127.3251748
5/1/11,134.5104895
6/1/11,141.6958042
7/1/11,148.8811189
8/1/11,156.0664336
9/17/11,190
10/1/11,223.9335664
11/1/11,257.8671329
12/1/11,291.8006993
1/1/12,325.7342657
2/1/12,359.6678322
3/5/12,375
4/1/12,380
5/1/12,385
6/1/12,390
7/1/12,400
8/1/12,410
9/1/12,420
answered 2013-08-22T12:28:24.383
0

seasonplot() from the forecast package does this very well!
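A minimal sketch of how that might look, assuming the forecast package is installed and the readings have been turned into a regular monthly series. The values below are random placeholders, not the data from this thread.

```r
# Hypothetical monthly readings for 2010-2012 (36 made-up values)
set.seed(1)
usage <- ts(round(runif(36, 5, 15), 1), start = c(2010, 1), frequency = 12)

# seasonplot() lives in the forecast package, which is installed separately
if (requireNamespace("forecast", quietly = TRUE)) {
  forecast::seasonplot(usage,
                       year.labels = TRUE,  # label each line with its year
                       col = rainbow(3),    # one color per year
                       main = "Usage by year")
}
```

This needs a ts object with a fixed frequency, so irregularly spaced readings like those in the other answer would first have to be aggregated or interpolated to a regular grid.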

answered 2015-07-07T14:26:01.093