r - Data frame arithmetic

Question

Possible duplicate:
Creating a new data frame from functions of other data frames

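I got some help with my first question on SOF, but I couldn't figure out how to reply to the people who answered. So I'm posting again, this time with example code (I should have done that the first time - I'm learning).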

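I have two data frames. For the sake of explanation, let's pretend:

df1: the columns represent types of grain: corn, oats, wheat, etc. The rows represent the months of the year (January, February, etc.), and each element is the price per ton of that grain bought in that particular month.

df2: the columns represent countries: Spain, Chile, Mexico, etc. The rows of this frame represent the additional costs for each country, such as packing cost, shipping cost, import tax, inspection fees, etc.

Now I want to build a third data frame:

df3: the total cost, per country and per month, of a mix of grains (for example 10% corn, 50% oats, ...) plus the associated shipping, tax, etc. costs. Assume there is an equation (using data from df1 and df2) that computes, for a given grain mix, the total monthly cost for each country, including that country's additional costs.

In other words, df3 has 12 rows (months) and as many columns as there are countries. Its elements are the total grain cost plus additional costs, for each country and each month.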

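Two minutes in Excel/Gnumeric, 15 minutes in Fortran or C, and two days of struggling with the R Cookbook and internet searches. And there is nobody down the hall I can yell to: "Hey, Kevin, how do you do this in R ...?"

It's so simple, but as a newbie I'm missing some fundamental point..

Thanks in advance. Here is my pretend code, which illustrates my problem.

Ed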

# build df1 - cost of grains (with goofy data so I can track the arithmetic)
  v1 <- c(1:12)
  v2 <- c(13:24)
  v3 <- c(25:36)
  v4 <- c(37:48)
  grain <- data.frame("wheat"=v1,"oats"=v2,"corn"=v3,"rye"=v4)

  grain


# build df2 - additional costs (again, with goofy data to see what is being used where and when)
  w1 <- c(1.3:4.3)
  w2 <- c(5.3:8.3)
  w3 <- c(9.3:12.3)
  w4 <- c(13.3:16.3)
  cost <- data.frame("Spain"=w1,"Peru"=w2,"Mexico"=w3,"Kenya"=w4)
  row.names(cost) <- c("packing","shipping","tax","inspection")

  cost


# assume 10% wheat, 30% oats and 60% rye with some clown-equation for total cost

# now for my feeble attempt at getting a data frame that has 12 rows (months) and 4 columns (countries)

  total_cost <- data.frame( 0.1*grain[,"wheat"] +
                            0.3*grain[,"oats"] +
                            0.6*grain[,"rye"] +
                            cost["packing","Mexico"] +
                            cost["shipping","Mexico"] +
                            cost["tax","Mexico"]  +
                            cost["inspection","Mexico"] )
  total_cost

# this gives the correct values for the total cost for Mexico, for each month.

# and if I plug in the other countries, I get correct answers for that country
# I guess I can run a loop over the countries, but this is R, not Fortran or C.

# btw, my real equation is considerably more complicated, using functions involving
# multiple columns of df1 and df2 data, so there is no "every column of df1 gets
# multiplied by ..." or any one-to-one column-row match.
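
For reference, here is a minimal sketch of one loop-free way to get the 12 x 4 result, assuming the cost equation can be written as a function of a single country. The helper name `country_total` is hypothetical (it is not from the post), and the equation inside it is just the toy equation from the code above. `sapply` calls the helper once per country name and binds the resulting length-12 vectors into columns. For this purely additive toy equation, `outer` with `colSums` should give the same table, but that shortcut would not carry over to a more complicated equation.

```r
# Same toy data as in the question
grain <- data.frame(wheat = 1:12, oats = 13:24, corn = 25:36, rye = 37:48)
cost  <- data.frame(Spain = c(1.3:4.3), Peru = c(5.3:8.3),
                    Mexico = c(9.3:12.3), Kenya = c(13.3:16.3))
row.names(cost) <- c("packing", "shipping", "tax", "inspection")

# Hypothetical helper: total monthly cost for ONE country.
# The body is the toy equation from the question; a real equation
# would go here and may use any columns of grain and any rows of cost.
country_total <- function(country, grain, cost) {
  0.1 * grain[["wheat"]] +
    0.3 * grain[["oats"]] +
    0.6 * grain[["rye"]] +
    cost["packing", country] +
    cost["shipping", country] +
    cost["tax", country] +
    cost["inspection", country]
}

# One call per country; sapply returns a 12 x 4 matrix whose
# column names are the country names
total_cost <- as.data.frame(
  sapply(names(cost), country_total, grain = grain, cost = cost)
)
total_cost

# Shortcut that only works because the toy equation is additive:
# (monthly grain mix) + (sum of each country's extra costs)
grain_mix <- 0.1 * grain$wheat + 0.3 * grain$oats + 0.6 * grain$rye
extra     <- colSums(cost)
m         <- outer(grain_mix, extra, "+")
colnames(m) <- names(extra)
total_cost_outer <- as.data.frame(m)
```

Selecting a single column, for example `total_cost$Mexico`, should reproduce the values from the single-country expression in the question.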
