I have the following sample code, which builds one data frame containing information for more than one ID. I want to sort the IDs into defined categories: for each ID, I want to compute the percentage change at a specific, given time (e.g. t=10 here) relative to its baseline value, and return the category that the change falls into. The detailed steps of my calculation are explained below.

a=c(100,105,126,130,150,100,90,76,51,40)
t=c(0,5,10,20,30)
t=rep(t,2)
ID=c(1,1,1,1,1,2,2,2,2,2)
data=data.frame(ID,t,a)   

My desired calculation

 1) For every ID, the "a" value at t=0 is the baseline.
 2) Computation
    e.g. at a given t=10 (which I have to provide), take the corresponding "a" value
    %Change (answer) = (taken a value - baseline) / baseline
 3) Compare the answer against the following defined CATEGORIES:
    #category
    1 - if answer > 0.25
    2 - if -0.30 < answer < 0.25
    3 - if -1.0 < answer < -0.30
    4 - if answer == -1.0
 4) Return the value of the category

Desired Output

 ID My_Answer
 1    1
 2    3

Can anybody help me with this? I understand the flow of my computation but am not aware of an efficient way of doing it, as I have a great many IDs in that data frame. Thank you.

1 Answer

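Doing math across columns is easier than doing it across rows. So the first step is to move the baseline numbers into their own column, and then use cut to define the groups: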

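library(dplyr)

foo <- data %>%
  # Baseline rows (t == 0) form the left side of the join: columns t.x, a.x
  filter(t == 0) %>%
  # Attach every later measurement for the same ID: columns t.y, a.y
  left_join(data %>% 
              filter(t != 0),
             by = "ID") %>%
  mutate(percentchange = (a.y - a.x) / a.x,
         # right = FALSE gives the intervals [-1, -0.3), [-0.3, 0.25), [0.25, Inf]
         My_Answer = cut(percentchange, breaks = c(-1, -0.3, 0.25, Inf),
                         right = FALSE, include.lowest = TRUE, labels = c("g3","g2","g1")),
         My_Answer = as.character(My_Answer),
         # A change of exactly -1 gets its own category
         My_Answer = ifelse(percentchange == -1, "g4", My_Answer)) %>%
  select(ID, t = t.y, My_Answer)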

# foo, printed before the final select() drops the intermediate columns
  ID t.x a.x t.y a.y percentchange My_Answer
1  1   0 100   5 105          0.05        g2
2  1   0 100  10 126          0.26        g1
3  1   0 100  20 130          0.30        g1
4  1   0 100  30 150          0.50        g1
5  2   0 100   5  90         -0.10        g2
6  2   0 100  10  76         -0.24        g2
7  2   0 100  20  51         -0.49        g3
8  2   0 100  30  40         -0.60        g3
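
A note on the interval boundaries, as a minimal sketch and not part of the answer's own code: with right = FALSE, cut() closes each interval on the left, so a change of exactly -0.3 lands in g2 and a change of exactly 0.25 lands in g1. The question's strict inequalities leave those two boundary values unassigned, so treating them this way is an assumption. The names boundary_check and boundary_labels are made up for illustration.

```r
# Values sitting exactly on the breaks, to see which side cut() puts them on
boundary_check <- c(-1, -0.3, 0.25)

boundary_labels <- as.character(
  cut(boundary_check, breaks = c(-1, -0.3, 0.25, Inf),
      right = FALSE, include.lowest = TRUE, labels = c("g3", "g2", "g1"))
)

print(boundary_labels)
```

The value -1 itself is labelled g3 by cut(), which is why a separate ifelse() step is needed to relabel it as g4.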

As the table shows, the join lets us compute My_Answer for every row at once. If you want the values for t == 10, you can pull out those rows:

foo %>%
  filter(t == 10)

  ID  t My_Answer
1  1 10        g1
2  2 10        g2
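
If the time point has to be supplied each time, the same steps could be wrapped in a small function. This is only a sketch in base R, assuming every ID has exactly one t == 0 row. The name category_at and its arguments are hypothetical, and it returns the numeric codes 1 to 4 from the question instead of the g1 to g4 labels.

```r
# Sample data from the question
data <- data.frame(ID = rep(1:2, each = 5),
                   t  = rep(c(0, 5, 10, 20, 30), 2),
                   a  = c(100, 105, 126, 130, 150, 100, 90, 76, 51, 40))

# category_at() is a hypothetical helper, not a dplyr or base R function.
# Assumes each ID has exactly one baseline row (t == 0).
category_at <- function(df, time) {
  baseline <- df[df$t == 0, c("ID", "a")]
  names(baseline)[2] <- "baseline"
  at_time <- df[df$t == time, c("ID", "a")]
  merged <- merge(baseline, at_time, by = "ID")
  change <- (merged$a - merged$baseline) / merged$baseline
  # labels = FALSE returns the bin index: 1 = [-1, -0.3), 2 = [-0.3, 0.25), 3 = [0.25, Inf]
  bin <- cut(change, breaks = c(-1, -0.3, 0.25, Inf),
             right = FALSE, include.lowest = TRUE, labels = FALSE)
  # Bin 1 is category 3, bin 2 is category 2, bin 3 is category 1
  code <- 4L - bin
  # A change of exactly -1 is category 4
  code[which(change == -1)] <- 4L
  data.frame(ID = merged$ID, My_Answer = code)
}

result <- category_at(data, 10)
print(result)
```

For the sample data this gives category 1 for ID 1 (change 0.26) and category 2 for ID 2 (change -0.24), the same as g1 and g2 above.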
answered 2014-07-24T17:11:25.637