I am trying to build a string in R, using the glue package, that contains both 'single' and "double" quotes.

As a reprex, consider the following kind of SQL query string that I want to build:

CREATE TABLE fact_final_table AS 
(SELECT tab1.id,
    AVG(tab2."MV") FILTER (WHERE tab2.record_dt BETWEEN tab1.start_date::date - integer '7'
                                                      AND tab1.start_date::date - integer '1') AS "mv_avg_1w",
    AVG(tab2."MV") FILTER (WHERE tab2.record_dt BETWEEN tab1.start_date::date - integer '14'
                                                      AND tab1.start_date::date - integer '1') AS "mv_avg_2w"
FROM (SELECT id,
             start_date,
             point
      FROM base_tab
      WHERE mpfb.start_date::date >= '01-01-2000'::date) AS tab1
LEFT JOIN ghcnd_observations AS tab2
    ON (tab2.record_dt BETWEEN (tab1.start_date::date - integer '180')
                           AND (tab1.start_date::date - integer '1')
        AND ST_DWithin(tab1.point, tab2.location, 0.5))
GROUP BY tab1.id);

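As you can see, it mixes single and double quotes, and both must be preserved literally. For example, tab2."MV" has double quotes, and tab1.start_date::date - integer '7' AND tab1.start_date::date - integer '1' has single quotes that need to be kept as they are.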

The string also needs to be built from parameters. I tried the following in R using glue, but it does not work.

var1       <- "MV"
var1_lowcase <- "mv"
lag_days   <- 180
var_date   <- as.Date("2000-01-01")
var_dwithin <- 0.5

glue::glue(
"CREATE TABLE fact_final_table AS 
(SELECT tab1.id,
    AVG(tab2."{var1}") FILTER (WHERE tab2.record_dt BETWEEN tab1.start_date::date - integer '7'
                           AND tab1.start_date::date - integer '1') AS "{var1_lowcase}_avg_1w",
    AVG(tab2."{var1}") FILTER (WHERE tab2.record_dt BETWEEN tab1.start_date::date - integer '14'
                           AND tab1.start_date::date - integer '1') AS "{var1_lowcase}_avg_2w"
    FROM (SELECT id,
          start_date,
          point
          FROM base_tab
          WHERE mpfb.start_date::date >= '{format(var_date, "%d-%m-%Y")}'::date) AS tab1
    LEFT JOIN ghcnd_observations AS tab2
    ON (tab2.record_dt BETWEEN (tab1.start_date::date - integer '{lag_days}')
        AND (tab1.start_date::date - integer '1')
        AND ST_DWithin(tab1.point, tab2.location, {var_dwithin}))
    GROUP BY tab1.id);")

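Unfortunately, this fails because of the single/double quotes inside glue::glue(...): the first unescaped double quote in the SQL ends the R string literal.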

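Can anyone help me get the desired output string with minimal changes? I am not sure whether this is easy to achieve. I would also appreciate any other tidy approach, for example using stringr, since I would like this to be %>% friendly. I have had a brief look at glue::glue_sql but could not see how to apply it directly here. If it is applicable, I would appreciate learning how to use it in this case.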

2 Answers

Try escaping the double quotes:

glue::glue(
   "CREATE TABLE fact_final_table AS 
   (SELECT tab1.id,
   AVG(tab2.\"{var1}\") FILTER (WHERE tab2.record_dt BETWEEN tab1.start_date::date - integer '7'
   AND tab1.start_date::date - integer '1') AS \"{var1_lowcase}_avg_1w\",
   AVG(tab2.\"{var1}\") FILTER (WHERE tab2.record_dt BETWEEN tab1.start_date::date - integer '14'
   AND tab1.start_date::date - integer '1') AS \"{var1_lowcase}_avg_2w\"
   FROM (SELECT id,
         start_date,
          point
   FROM base_tab
    WHERE mpfb.start_date::date >= '{format(var_date, \"%d-%m-%Y\")}'::date) 
  AS tab1
   LEFT JOIN ghcnd_observations AS tab2
   ON (tab2.record_dt BETWEEN (tab1.start_date::date - integer '{lag_days}')
   AND (tab1.start_date::date - integer '1')
   AND ST_DWithin(tab1.point, tab2.location, {var_dwithin}))
   GROUP BY tab1.id);")

This returns:

#CREATE TABLE fact_final_table AS 
#(SELECT tab1.id,
#AVG(tab2."MV") FILTER (WHERE tab2.record_dt BETWEEN tab1.start_date::date - integer '7'
#AND tab1.start_date::date - integer '1') AS "mv_avg_1w",
#AVG(tab2."MV") FILTER (WHERE tab2.record_dt BETWEEN tab1.start_date::date - integer '14'
#AND tab1.start_date::date - integer '1') AS "mv_avg_2w"
#FROM (SELECT id,
#start_date,
#point
#FROM base_tab
#WHERE mpfb.start_date::date >= '01-01-2000'::date) AS tab1
#LEFT JOIN ghcnd_observations AS tab2
#ON (tab2.record_dt BETWEEN (tab1.start_date::date - integer '180')
#AND (tab1.start_date::date - integer '1')
#AND ST_DWithin(tab1.point, tab2.location, 0.5))
#GROUP BY tab1.id);
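
If you are on R 4.0.0 or later, raw strings are another way to avoid the backslashes. Below is a minimal sketch on a shortened version of the query (the shortened SQL is only for illustration, and it assumes the same variables as in the question). The `---` in the delimiter is optional and is there only so that no `)"` sequence inside the SQL can end the string early.

```r
library(glue)

# Same parameters as in the question
var1         <- "MV"
var1_lowcase <- "mv"
var_date     <- as.Date("2000-01-01")

# r"---( ... )---" is an R >= 4.0.0 raw string: quotes inside need no escaping.
# glue still interpolates the {...} expressions as usual.
query <- glue(r"---(
SELECT AVG(tab2."{var1}") AS "{var1_lowcase}_avg_1w"
FROM base_tab
WHERE start_date::date >= '{format(var_date, "%d-%m-%Y")}'::date;
)---")

print(query)
```

The trade-off is that the code no longer runs on R versions before 4.0.0, so the escaped version above is the safer choice if that matters.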
answered 2020-05-07T03:35:11.393

I have looked into this in more detail since yesterday, and glue does have functions for making single and double quotes explicit, namely glue::single_quote() and glue::double_quote().
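
As a quick illustration of what these two helpers return on their own (the input values are just the ones from the question):

```r
library(glue)

# Wrap a value in single quotes, e.g. for a SQL string literal
sq <- single_quote("01-01-2000")

# Wrap a value in double quotes, e.g. for a quoted SQL identifier
dq <- double_quote("MV")

cat(sq, dq, sep = "\n")
```

Note that these helpers only wrap the text in quote characters. They are not SQL-aware, so they should not be relied on to sanitize untrusted input.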

Similar to @ronakshah's (helpful) answer, I came up with the following, which is more explicit (for code readability):

var1       <- "MV"
var1_lowcase <- "mv"
lag_days   <- 180
var_date   <- as.Date("2000-01-01")
var_dwithin <- 0.5

glue::glue(
    "CREATE TABLE fact_final_table AS 
   (SELECT tab1.id,
   AVG(tab2.{glue::double_quote(var1)}) FILTER (WHERE tab2.record_dt BETWEEN tab1.start_date::date - integer '7'
   AND tab1.start_date::date - integer '1') AS {glue::double_quote(glue::glue({var1_lowcase},'_avg_1w'))},
   AVG(tab2.{glue::double_quote(var1)}) FILTER (WHERE tab2.record_dt BETWEEN tab1.start_date::date - integer '14'
   AND tab1.start_date::date - integer '1') AS {glue::double_quote(glue::glue({var1_lowcase},'_avg_2w'))}
   FROM (SELECT id,
         start_date,
          point
   FROM base_tab
    WHERE mpfb.start_date::date >= {glue::single_quote(format(var_date, '%d-%m-%Y'))}::date) 
  AS tab1
   LEFT JOIN ghcnd_observations AS tab2
   ON (tab2.record_dt BETWEEN (tab1.start_date::date - integer '{lag_days}')
   AND (tab1.start_date::date - integer '1')
   AND ST_DWithin(tab1.point, tab2.location, {var_dwithin}))
   GROUP BY tab1.id);")

It returns the string I need. Hopefully this helps others with similar glue requirements.
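
Since the question also asked about glue::glue_sql, here is a rough sketch of how it could be applied to a shortened version of the query. It rests on several assumptions: the DBI package is installed, DBI::ANSI() is used as a stand-in for a real database connection, and the names alias_1w, date_str and lag_str are helper variables introduced only for this example. With glue_sql, backtick-wrapped values are quoted as identifiers and character values are quoted as string literals, so no quotes are written by hand.

```r
library(glue)

# Stand-in connection (assumption): replace with your real DBI connection
con <- DBI::ANSI()

var1        <- "MV"
alias_1w    <- "mv_avg_1w"                       # hypothetical helper: alias built up front
date_str    <- format(as.Date("2000-01-01"), "%d-%m-%Y")
lag_str     <- "180"                             # character, so it is quoted as '180'
var_dwithin <- 0.5                               # numeric, inserted without quotes

query <- glue_sql(
  "SELECT AVG(tab2.{`var1`}) AS {`alias_1w`}
   FROM base_tab AS tab1
   WHERE tab1.start_date::date >= {date_str}::date
     AND tab2.record_dt >= tab1.start_date::date - integer {lag_str}
     AND ST_DWithin(tab1.point, tab2.location, {var_dwithin})",
  .con = con
)

print(query)
```

The exact quoting depends on the connection you pass as .con, so it is worth checking the printed query against your database before running it.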

answered 2020-05-07T17:14:02.337