I recently started using R and am trying to extract some data from it, but the results I get are confusing. I have daily data for the years 1961 to 1963, with dates in the format 1961-04-25, stored in a vector called date.

I tried to use grep to select only the period between April 10 and May 21 and display those dates, with this command:

date[date >= grep("196.-04-10", date, value = TRUE) & 
       date <= grep("196.-05-21", date, value = TRUE)] 

The result is confusing: it steps through the dates three days at a time instead of giving me every single day (see below).

[1] "1961-04-10" "1961-04-13" "1961-04-16" "1961-04-19" "1961-04-22" "1961-04-25" "1961-04-28" "1961-05-01" "1961-05-04" "1961-05-07" "1961-05-10"
[12] "1961-05-13" "1961-05-16" "1961-05-19" "1962-04-12" "1962-04-15" "1962-04-18" "1962-04-21" "1962-04-24" "1962-04-27" "1962-04-30" "1962-05-03"
[23] "1962-05-06" "1962-05-09" "1962-05-12" "1962-05-15" "1962-05-18" "1962-05-21" "1963-04-11" "1963-04-14" "1963-04-17" "1963-04-20" "1963-04-23"
[34] "1963-04-26" "1963-04-29" "1963-05-02" "1963-05-05" "1963-05-08" "1963-05-11" "1963-05-14" "1963-05-17" "1963-05-20"
2 Answers

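I think the grep strategy is misguided, but maybe something like this would work... basically, I'm computing the day of the year (Julian date, yday()) and using that for the comparison.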

z <- as.Date(c("1961-04-10","1961-04-11","1961-04-12",
               "1961-05-21","1961-05-22","1961-05-23",
               "1963-04-09","1963-04-12","1963-05-21","1963-05-22"))
library(lubridate)
z[yday(z)>=yday(as.Date("1961-04-10")) & yday(z)<=yday(as.Date("1961-05-21"))]
## [1] "1961-04-10" "1961-04-11" "1961-04-12" "1961-05-21" "1963-04-12"
## [6] "1963-05-21"

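Actually, this solution is fragile with respect to leap years, because the day-of-year number for a given calendar date shifts by one after February in a leap year... better (?):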

yz <- year(z)
z[z>=as.Date(paste0(yz,"-04-10")) & z<=as.Date(paste0(yz,"-05-21"))]

(You should definitely test this yourself; I haven't tested it carefully!)
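
One way to run such a test, as a minimal sketch: build a daily sequence that includes a leap year, apply the per-year bounds, and cross-check the result against a plain month-day string comparison. The 1961 to 1964 range is an assumption chosen for illustration (it is not the asker's data), and base R's format() stands in for lubridate's year() so the sketch runs without extra packages.

```r
# Daily dates including one leap year (1964); range chosen for illustration only
z <- seq(as.Date("1961-01-01"), as.Date("1964-12-31"), by = "day")

# Day-of-year of April 10 shifts by one in a leap year, which is why
# comparing yday() values against a 1961 reference is fragile
doy <- as.integer(format(as.Date(c("1961-04-10", "1964-04-10")), "%j"))
doy  # 100 101

# Per-year bounds, as in the answer above (format() stands in for year())
yz <- format(z, "%Y")
by_bounds <- z[z >= as.Date(paste0(yz, "-04-10")) &
               z <= as.Date(paste0(yz, "-05-21"))]

# Independent cross-check: compare zero-padded "mm-dd" strings
md <- format(z, "%m-%d")
by_string <- z[md >= "04-10" & md <= "05-21"]

# Both selections should agree, with 42 days (21 + 21) per year
stopifnot(identical(by_bounds, by_string))
table(format(by_bounds, "%Y"))
```

If the two selections ever disagree, stopifnot() raises an error, which makes the check easy to rerun on real data.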

Answered 2012-11-25T17:55:38.587

Storing the variable in a date format is the best option.

## set up some test data
datevar <- seq.Date(as.Date("1961-01-01"),as.Date("1963-12-31"),by="day")
test <- data.frame(date=datevar,id=1:(length(datevar)))
head(test)

## which looks like:
> head(test)
        date id
1 1961-01-01  1
2 1961-01-02  2
3 1961-01-03  3
4 1961-01-04  4
5 1961-01-05  5
6 1961-01-06  6

## find the date ranges you want
selectdates <-  
    (format(test$date,"%m") == "04" & as.numeric(format(test$date,"%d")) >= 10) |
    (format(test$date,"%m") == "05" & as.numeric(format(test$date,"%d")) <= 21)

## subset the original data
result <- test[selectdates,]

## which looks as expected:    
> result
          date  id
100 1961-04-10 100
101 1961-04-11 101
102 1961-04-12 102
103 1961-04-13 103
104 1961-04-14 104
105 1961-04-15 105
106 1961-04-16 106
107 1961-04-17 107
108 1961-04-18 108
109 1961-04-19 109
110 1961-04-20 110
111 1961-04-21 111
112 1961-04-22 112
113 1961-04-23 113
114 1961-04-24 114
115 1961-04-25 115
116 1961-04-26 116
117 1961-04-27 117
118 1961-04-28 118
119 1961-04-29 119
120 1961-04-30 120
121 1961-05-01 121
122 1961-05-02 122
123 1961-05-03 123
124 1961-05-04 124
125 1961-05-05 125
126 1961-05-06 126
127 1961-05-07 127
128 1961-05-08 128
129 1961-05-09 129
130 1961-05-10 130
131 1961-05-11 131
132 1961-05-12 132
133 1961-05-13 133
134 1961-05-14 134
135 1961-05-15 135
136 1961-05-16 136
137 1961-05-17 137
138 1961-05-18 138
139 1961-05-19 139
140 1961-05-20 140
141 1961-05-21 141
465 1962-04-10 465
...
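
For contrast, the every-third-day pattern in the question appears to come from R's vector recycling rather than from grep itself: grep(..., value = TRUE) returns one match per year (three strings), and comparing the full vector against a length-3 vector pairs the elements up cyclically. A minimal sketch, assuming the question's vector held daily dates as character strings (it is named dates here, a hypothetical name, to avoid masking base::date):

```r
# Assumed stand-in for the question's data: daily dates stored as character
dates <- as.character(seq(as.Date("1961-01-01"), as.Date("1963-12-31"), by = "day"))

lower <- grep("196.-04-10", dates, value = TRUE)
upper <- grep("196.-05-21", dates, value = TRUE)
lower  # "1961-04-10" "1962-04-10" "1963-04-10"

# The length-3 bounds are recycled: element i is compared with bound
# ((i - 1) %% 3) + 1, so each date is tested against a single year's bounds
# and only every third date inside each window passes both tests
skipped <- dates[dates >= lower & dates <= upper]
head(skipped, 4)  # "1961-04-10" "1961-04-13" "1961-04-16" "1961-04-19"
```

Because 1095 days divide evenly by 3, R recycles silently with no warning. Comparing against per-year bounds or month and day fields, as in the answers here, sidesteps the recycling.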
Answered 2012-11-26T00:31:06.770