I am new to coding. I have a dataset of daily mean flow values covering more than 20 years. Here is an example:

          DATE   FLOW
1    10/1/2001   88.2
2    10/2/2001   77.6
3    10/3/2001   68.4
4    10/4/2001   61.5
5    10/5/2001   55.3
6    10/6/2001   52.5
7    10/7/2001   49.7
8    10/8/2001   46.7
9    10/9/2001   43.3
10  10/10/2001   41.3
11  10/11/2001   39.3
12  10/12/2001   37.7
13  10/13/2001   35.8
14  10/14/2001   34.1
15  10/15/2001   39.8

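I need to create a loop that sums the previous 6 days and the current day (a rolling weekly average) and prints the result into an array for a specified water year. I have already created an aggregate function that separates the yearly averages of the daily means into specified water years.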

# Separating dates into specific water years
wtr_yr <- function(dates, start_month = 9) {
  # Convert dates into POSIXlt
  POSIDATE <- as.POSIXlt(dates)
  # Year offset: 1 for months at or after start_month ($mon is 0-based)
  offset <- ifelse(POSIDATE$mon >= start_month - 1, 1, 0)
  # Water year
  POSIDATE$year + 1900 + offset
}

# NEW_DATE is assumed to hold data_set$DATE already parsed as dates
adj.year <- wtr_yr(NEW_DATE)

# Aggregating FLOW by water year to take the mean
mean.FLOW <- aggregate(data_set$FLOW, list(adj.year), mean)

1 Answer

It seems this can be done more easily. But first I need to prepare some more data.

library(tidyverse)
library(lubridate)

df = tibble(
  DATE = seq(mdy("1/1/2010"), mdy("12/31/2022"), 1),
  FLOW = rnorm(length(DATE), 40, 10)
) 

Output

# A tibble: 4,748 x 2
   DATE        FLOW
   <date>     <dbl>
 1 2010-01-01  34.4
 2 2010-01-02  37.7
 3 2010-01-03  55.6
 4 2010-01-04  40.7
 5 2010-01-05  41.3
 6 2010-01-06  57.2
 7 2010-01-07  44.6
 8 2010-01-08  27.3
 9 2010-01-09  33.1
10 2010-01-10  35.5
# ... with 4,738 more rows

Now let's aggregate by year and week number.

df %>% 
  group_by(year(DATE), week(DATE)) %>% 
  summarise(mean = mean(FLOW))

Output

# A tibble: 689 x 3
# Groups:   year(DATE) [13]
   `year(DATE)` `week(DATE)`  mean
          <dbl>        <dbl> <dbl>
 1         2010            1  44.5
 2         2010            2  39.6
 3         2010            3  38.5
 4         2010            4  35.3
 5         2010            5  44.1
 6         2010            6  39.4
 7         2010            7  41.3
 8         2010            8  43.9
 9         2010            9  38.5
10         2010           10  42.4
# ... with 679 more rows

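Note that with the week function, the first week starts on January 1. If you want weeks numbered according to the ISO 8601 standard, use the isoweek function. Alternatively, you can use epiweek, which follows the US CDC convention.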

df %>% 
  group_by(year(DATE), isoweek(DATE)) %>% 
  summarise(mean = mean(FLOW))

Output

# A tibble: 681 x 3
# Groups:   year(DATE) [13]
   `year(DATE)` `isoweek(DATE)`  mean
          <dbl>           <dbl> <dbl>
 1         2010               1  40.0
 2         2010               2  45.5
 3         2010               3  33.2
 4         2010               4  38.9
 5         2010               5  45.0
 6         2010               6  40.7
 7         2010               7  38.5
 8         2010               8  42.5
 9         2010               9  37.1
10         2010              10  42.4
# ... with 671 more rows
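
One caveat when grouping by year(DATE) together with isoweek(DATE): an ISO week can straddle the calendar-year boundary, so the first days of January may end up in the same group as days from late December of that same calendar year. A minimal sketch of one way around this, assuming lubridate's isoyear function is paired with isoweek; the small boundary table and its FLOW values are made up for illustration.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample: 14 days spanning the 2010/2011 boundary.
# 2010-12-27 is a Monday, so the data covers exactly two ISO weeks.
boundary <- tibble(
  DATE = seq(ymd("2010-12-27"), ymd("2011-01-09"), by = "day"),
  FLOW = seq_along(DATE)
)

# Pair the ISO week number with the ISO year it belongs to,
# so that 2011-01-01 and 2011-01-02 stay in ISO week 52 of 2010.
weekly_iso <- boundary %>%
  group_by(iso_year = isoyear(DATE), iso_week = isoweek(DATE)) %>%
  summarise(mean = mean(FLOW), n_days = n(), .groups = "drop")

print(weekly_iso)
```

With this grouping every complete week contains exactly 7 days, whereas year(DATE) combined with isoweek(DATE) would split or merge the weeks around New Year.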

If you want to better understand how these functions work, run the code below.

df %>% 
  mutate(
    w1 = week(DATE),
    w2 = isoweek(DATE),
    w3 = epiweek(DATE)
  )

Output

# A tibble: 4,748 x 5
   DATE        FLOW    w1    w2    w3
   <date>     <dbl> <dbl> <dbl> <dbl>
 1 2010-01-01  34.4     1    53    52
 2 2010-01-02  37.7     1    53    52
 3 2010-01-03  55.6     1    53     1
 4 2010-01-04  40.7     1     1     1
 5 2010-01-05  41.3     1     1     1
 6 2010-01-06  57.2     1     1     1
 7 2010-01-07  44.6     1     1     1
 8 2010-01-08  27.3     2     1     1
 9 2010-01-09  33.1     2     1     1
10 2010-01-10  35.5     2     1     2
# ... with 4,738 more rows
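
The groupings above give calendar-week means, not the trailing 7-day window (the previous 6 days plus the current day) that the question describes. A minimal sketch of that rolling mean, assuming base R's stats::filter with sides = 1 is an acceptable way to compute it and that water years start in month 9, as in the question's default. The names flows, roll7, water_year and by_wy, and the sample values, are hypothetical. Note that the window here runs across water-year boundaries, which is also an assumption.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample: 14 consecutive days spanning the start of a water year
flows <- tibble(
  DATE = seq(ymd("2001-08-25"), ymd("2001-09-07"), by = "day"),
  FLOW = seq_along(DATE) * 10
)

start_month <- 9  # assumed first month of the water year

rolled <- flows %>%
  arrange(DATE) %>%
  mutate(
    # Trailing mean over the current day and the previous 6 days;
    # the first 6 rows are NA because the window is incomplete.
    # stats:: is spelled out because dplyr masks filter().
    roll7 = as.numeric(stats::filter(FLOW, rep(1 / 7, 7), sides = 1)),
    # Months at or after start_month belong to the next water year
    water_year = year(DATE) + as.integer(month(DATE) >= start_month)
  )

# One numeric vector of rolling means per water year
by_wy <- split(rolled$roll7, rolled$water_year)
print(by_wy)
```

If a sum is wanted instead of a mean, replacing rep(1 / 7, 7) with rep(1, 7) gives the 7-day total over the same window.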
Answered 2022-02-13T22:09:39.987