I have written the following code and want to combine the resulting tables into one big table. How can I do that using SQL in R?

user_lessthan10per  <- sqldf("select count(uid) as count_of_students
                       from adopted_user_point
                        where points_scored between 0 and (1469*0.1)")

followed by

user_lessthan20per  <- sqldf("select count(uid) as count_of_students
                         from adopted_user_point
                         where points_scored >(1469*0.1) and points_scored <= (1469*0.2)")

and

user_lessthan30per  <- sqldf("select count(uid) as count_of_students
                         from adopted_user_point
                         where points_scored >(1469*0.2) and points_scored <= (1469*0.3)")

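Now I want to combine them into a single table that contains the count_of_students column from all three tables.

How do I do this in R? I tried the UNION command, but it gives an error.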

2 Answers

You can use conditional aggregation. This returns one row with three columns:

select sum(case when points_scored between 0 and (1469*0.1) then 1 else 0
           end) as cnt1,
       sum(case when points_scored >(1469*0.1) and points_scored <= (1469*0.2) then 1 else 0 
           end) as cnt2,
       sum(case when points_scored >(1469*0.2) and points_scored <= (1469*0.3) then 1 else 0
           end) as cnt3
from adopted_user_point;
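To try this from R, the query can be passed to sqldf as a string. The following is a minimal sketch, assuming the sqldf package is installed; the toy `adopted_user_point` data frame and its values are invented for illustration, and `band_counts` is a hypothetical name.

```r
library(sqldf)

# Hypothetical sample data: the uid and points_scored values are invented.
adopted_user_point <- data.frame(
  uid = 1:7,
  points_scored = c(0, 100, 150, 200, 300, 400, 1000)
)

# One row, three columns: each SUM counts the rows that fall in one band.
band_counts <- sqldf("
  select sum(case when points_scored between 0 and (1469*0.1)
                  then 1 else 0 end) as cnt1,
         sum(case when points_scored > (1469*0.1) and points_scored <= (1469*0.2)
                  then 1 else 0 end) as cnt2,
         sum(case when points_scored > (1469*0.2) and points_scored <= (1469*0.3)
                  then 1 else 0 end) as cnt3
  from adopted_user_point
")
```

With this sample, the row with 1000 points falls outside all three bands and is not counted in any column.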

If you want three rows instead, you can aggregate with group by:

select (case when points_scored between 0 and (1469*0.1) then 'Group1'
             when points_scored >(1469*0.1) and points_scored <= (1469*0.2) then 'Group2'
             when points_scored >(1469*0.2) and points_scored <= (1469*0.3) then 'Group3'
             else 'Other'
        end) as score_group, count(*) as count_of_students
from adopted_user_point
group by (case when points_scored between 0 and (1469*0.1) then 'Group1'
               when points_scored >(1469*0.1) and points_scored <= (1469*0.2) then 'Group2'
               when points_scored >(1469*0.2) and points_scored <= (1469*0.3) then 'Group3'
               else 'Other'
          end);
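For comparison, the same banding can be done without SQL. This is a base R sketch using `cut()` and `table()`, which is a different technique from the query above, not a translation of it; the sample data and the names `max_points`, `band`, and `band_counts` are invented for illustration.

```r
# Hypothetical sample data: the uid and points_scored values are invented.
adopted_user_point <- data.frame(
  uid = 1:7,
  points_scored = c(0, 100, 150, 200, 300, 400, 1000)
)

max_points <- 1469
breaks <- max_points * c(0, 0.1, 0.2, 0.3)

# right = TRUE gives (lower, upper] intervals; include.lowest = TRUE closes
# the first interval at 0, matching "between 0 and 1469*0.1" in the SQL.
band <- cut(adopted_user_point$points_scored,
            breaks = breaks,
            labels = c("Group1", "Group2", "Group3"),
            right = TRUE, include.lowest = TRUE)

# Rows outside every band become NA here (the SQL version labels them 'Other').
band_counts <- as.data.frame(table(band, useNA = "ifany"),
                             responseName = "count_of_students")
```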
answered 2013-08-17T11:03:16.113

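I would name the original selects differently, perhaps 'u_0_10', 'u_10_20', 'u_20_30', to make it clear that "user_lessthan30per" is really "user_btwn20_30". But since they are now R data frames in the global environment, you don't really need sqldf to put them together: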

user_under30per <- rbind(user_lessthan10per,
                        user_lessthan20per,
                        user_lessthan30per)
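Because each of the three data frames holds a single count_of_students value, the stacked result is three unlabeled rows. A minimal base R sketch that tags each row before binding might look like this; the counts and the `band` column name are invented for illustration.

```r
# Hypothetical stand-ins for the three sqldf results (the counts are invented).
user_lessthan10per <- data.frame(count_of_students = 12L)
user_lessthan20per <- data.frame(count_of_students = 30L)
user_lessthan30per <- data.frame(count_of_students = 7L)

# Tag each one-row data frame with its band, then stack them row-wise.
user_under30per <- rbind(
  data.frame(band = "0-10%",  user_lessthan10per, stringsAsFactors = FALSE),
  data.frame(band = "10-20%", user_lessthan20per, stringsAsFactors = FALSE),
  data.frame(band = "20-30%", user_lessthan30per, stringsAsFactors = FALSE)
)
```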

The sqldf function does support UNION, though:

 one_and_two <- sqldf("select * from user_lessthan10per union all
                       select * from user_lessthan20per")
 all_three <- sqldf("select * from one_and_two union all
                     select * from user_lessthan30per")
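The two-step union can also be written as a single statement, since sqldf looks up data frames by name in the calling environment. This is a sketch assuming sqldf is installed; the counts are invented, and the literal `band` labels are an addition so the rows stay distinguishable.

```r
library(sqldf)

# Hypothetical stand-ins for the three query results (the counts are invented).
user_lessthan10per <- data.frame(count_of_students = 12L)
user_lessthan20per <- data.frame(count_of_students = 30L)
user_lessthan30per <- data.frame(count_of_students = 7L)

# One statement: each SELECT contributes one labeled row to the result.
all_three <- sqldf("
  select '0-10%'  as band, count_of_students from user_lessthan10per
  union all
  select '10-20%' as band, count_of_students from user_lessthan20per
  union all
  select '20-30%' as band, count_of_students from user_lessthan30per
")
```

SQL does not guarantee row order for a union without an `order by`, so add one if the order matters.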
answered 2013-08-17T14:33:31.387