I have a wide-format table in which the first 3 rows describe the data shown in the table. For example:

Company:               |  Company A  |  Company B  |  Company C  |       |  Company N
Data source:           |  Budget     |  Actual     |  Budget     |  ...  |    ...
Currency:              |  USD        |  EUR        |  USD        |       |    ...
Indicator:
 Sales                    500            1000         1500        ...       ...
 Gross Income             200            300           400        ...       ...
 ...                      ...            ...           ...        ...       ...
 Indicator J              ...            ...           ...        ...

I would like to reshape it into long format with the following layout:

Indicator | Company   | Currency | Data Source | Value
 Sales    | Company A |   USD    | Budget      | 500
 Sales    | Company B |   EUR    | Actual      | 1000
 ...      |    ...    |    ...   |    ...      |  ...

I tried to melt it with the reshape2 package, but I did not manage to turn rows 2 and 3 into variables.

dput(AAA)
structure(list(V1 = structure(c(1L, 8L, 2L, 5L, 7L, 4L, 3L, 6L
), .Label = c("Company:", "Currency:", "EBITDA", "Gross Income", 
"Indicator:", "Net Income", "Sales", "Source:"), class = "factor"), 
    V2 = structure(c(7L, 6L, 8L, 1L, 2L, 5L, 3L, 4L), .Label = c("", 
    "1000", "150", "25", "300", "Budget", "Company A", "USD"), class = "factor"), 
    V3 = structure(c(7L, 6L, 8L, 1L, 2L, 5L, 3L, 4L), .Label = c("", 
    "1500", "175", "30", "400", "Actual", "Company B", "USD"), class = "factor"), 
    V4 = structure(c(7L, 6L, 8L, 1L, 3L, 5L, 2L, 4L), .Label = c("", 
    "185", "2000", "45", "500", "Budget", "Company C", "EUR"), class = "factor"), 
    V5 = structure(c(7L, 6L, 8L, 1L, 3L, 5L, 2L, 4L), .Label = c("", 
    "195", "2500", "50", "700", "Actual", "Company D", "EUR"), class = "factor")), .Names = c("V1", 
"V2", "V3", "V4", "V5"), class = "data.frame", row.names = c(NA, 
-8L))
1 Answer

Here is a solution that transposes the data and does some cleanup. The rest is done with `melt`:

AAA <- structure(list(V1 = structure(c(1L, 8L, 2L, 5L, 7L, 4L, 3L, 6L
), .Label = c("Company:", "Currency:", "EBITDA", "Gross Income", 
"Indicator:", "Net Income", "Sales", "Source:"), class = "factor"), 
    V2 = structure(c(7L, 6L, 8L, 1L, 2L, 5L, 3L, 4L), .Label = c("", 
    "1000", "150", "25", "300", "Budget", "Company A", "USD"), class = "factor"), 
    V3 = structure(c(7L, 6L, 8L, 1L, 2L, 5L, 3L, 4L), .Label = c("", 
    "1500", "175", "30", "400", "Actual", "Company B", "USD"), class = "factor"), 
    V4 = structure(c(7L, 6L, 8L, 1L, 3L, 5L, 2L, 4L), .Label = c("", 
    "185", "2000", "45", "500", "Budget", "Company C", "EUR"), class = "factor"), 
    V5 = structure(c(7L, 6L, 8L, 1L, 3L, 5L, 2L, 4L), .Label = c("", 
    "195", "2500", "50", "700", "Actual", "Company D", "EUR"), class = "factor")), .Names = c("V1", 
"V2", "V3", "V4", "V5"), class = "data.frame", row.names = c(NA, 
-8L))
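library(reshape2)

# transpose the data; t() turns the factor columns into a character matrix
dft <- data.frame(t(AAA), stringsAsFactors = FALSE)

# use the first row (the labels from V1) as column names, then drop that row
colnames(dft) <- unlist(dft[1, ], use.names = FALSE)
dft <- dft[-1, ]

# remove the empty "Indicator:" column
dft[["Indicator:"]] <- NULL

# melt the data: one row per company/indicator pair
melt(dft, id.vars = c('Company:', 'Source:', 'Currency:'), variable.name = 'Indicator:')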

#    Company: Source: Currency: Indicator: value
# 1 Company A  Budget       USD      Sales  1000
# 2 Company B  Actual       USD      Sales  1500
# 3 Company C  Budget       EUR      Sales  2000
# 4 Company D  Actual       EUR      Sales  2500

You may need some more cleanup (right now every column is character, and you could also set the colnames before transposing ...).
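The extra cleanup mentioned above could look roughly like the sketch below. It is only one possible way to do it, not a definitive version: the column names without the trailing colon, the `Value` name and the final column order are my own choices to match the layout asked for in the question, and the small `AAA` here is a plain character stand-in for the `dput` data (same values, no factors).

```r
library(reshape2)

# Stand-in for AAA: same values as the dput output, stored as character
AAA <- data.frame(
  V1 = c("Company:", "Source:", "Currency:", "Indicator:",
         "Sales", "Gross Income", "EBITDA", "Net Income"),
  V2 = c("Company A", "Budget", "USD", "", "1000", "300", "150", "25"),
  V3 = c("Company B", "Actual", "USD", "", "1500", "400", "175", "30"),
  V4 = c("Company C", "Budget", "EUR", "", "2000", "500", "185", "45"),
  V5 = c("Company D", "Actual", "EUR", "", "2500", "700", "195", "50"),
  stringsAsFactors = FALSE
)

# Take the labels from the first column and drop the trailing colon
labels <- sub(":$", "", as.character(AAA$V1))

# Transpose only the data columns and set the column names right away
m <- t(as.matrix(AAA[, -1]))
colnames(m) <- labels
dft <- data.frame(m, stringsAsFactors = FALSE, check.names = FALSE)

# Drop the empty "Indicator" placeholder column
dft[["Indicator"]] <- NULL

# Melt: the three descriptive columns are ids, everything else is measured
long <- melt(dft, id.vars = c("Company", "Source", "Currency"),
             variable.name = "Indicator", value.name = "Value")

# The values are still character at this point; convert them to numbers
long$Value <- as.numeric(long$Value)

# Reorder the columns to the layout asked for in the question
long <- long[, c("Indicator", "Company", "Currency", "Source", "Value")]
head(long, 4)
```

With the original factor-based `AAA` the same steps should apply, since `as.matrix()` and `as.character()` both convert the factors to their labels first.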

answered 2013-08-02T13:44:53.150