Using cbind on two data frames with the same number of rows, where a few columns share names, normally results in a data.frame in which the shared names are altered (e.g. NameA.1, NameB.1, etc.) to avoid clashes.

I noticed that even though the names had been changed, the data had been substituted. Specifically, in the resulting data.frame every column with a shared name held data from the first data.frame, including the columns that were supposed to hold data from the second data.frame.

This is easy to work around, since one can rename the columns before calling cbind, but it could silently introduce errors into the results.

Edit: I'll try to provide an example.

df1 is:

    row     seqnames    start   end     width   strand  Region  Extra1
    1       chr10       8111    8111    172      *      123      456
    2       chr11       8112    8112    173      *      123b     456b

and df2 is:

    row     seqnames    start   end     width   strand  Whatever1 Whatever2
    1       chr12       9111    9111    174      +      ABC      EFG
    2       chr13       9112    9112    175      +      ABCb     EFGb

I perform cbind and get:

    row     seqnames    start   end     width   strand  Region  Extra1  seqnames.1  start.1 end.1   width.1 strand.1 Whatever1 Whatever2
    1       chr10       8111    8111    172      *      123      456    chr10       8111    8111    172      *        ABC        EFG
    2       chr11       8112    8112    173      *      123b     456b   chr11       8112    8112    173      *        ABCb       EFGb

The values in the second part belong to df1 instead of df2. This only happens in columns that have the same name in df1 and df2. Those columns were renamed automatically, as expected, but their data were repeated from the first data.frame.
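
The comparison can also be scripted. What follows is only a minimal sketch: it rebuilds the two tables above by hand, assumes the `row` column is just the row names, and guesses the column types. It compares columns by position rather than by name, because looking up a duplicated name with `$` or `[[` returns the first matching column.

```r
# Hand-built copies of the two example tables; column types are assumptions.
df1 <- data.frame(
  seqnames = c("chr10", "chr11"),
  start = c(8111, 8112), end = c(8111, 8112), width = c(172, 173),
  strand = c("*", "*"),
  Region = c("123", "123b"), Extra1 = c("456", "456b"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  seqnames = c("chr12", "chr13"),
  start = c(9111, 9112), end = c(9111, 9112), width = c(174, 175),
  strand = c("+", "+"),
  Whatever1 = c("ABC", "ABCb"), Whatever2 = c("EFG", "EFGb"),
  stringsAsFactors = FALSE
)

combined <- cbind(df1, df2)

# The columns of df2 start right after the last column of df1.
offset <- ncol(df1)

# Compare the five shared columns by position, not by name.
for (k in seq_len(5)) {
  same <- identical(combined[[offset + k]], df2[[k]])
  cat(names(df2)[k], "matches df2:", same, "\n")
}
```

If a check like this reports a mismatch, the data really were substituted. If it does not, the wrong values may have come from a later step that looks the columns up by name.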

Question: Is this normal behavior?

I hope this helps

Thank you again

1 Answer

I'm not sure what your question is, but you can give the columns of each combined object their own prefix by passing named arguments to cbind:

    data('cars')
    cars2 <- cbind(DataSet1 = cars, DataSet2 = cars)
    head(cars2)
    #   DataSet1.speed DataSet1.dist DataSet2.speed DataSet2.dist
    # 1              4             2              4             2
    # 2              4            10              4            10
    # 3              7             4              7             4
    # 4              7            22              7            22
    # 5              8            16              8            16
    # 6              9            10              9            10
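
The same idea applied to two different data frames that share column names could look like the sketch below. The prefixes `first` and `second` are arbitrary names chosen for illustration, and the values are a two-column subset of the tables in the question.

```r
# Two small data frames that share the column names `seqnames` and `width`.
df1 <- data.frame(seqnames = c("chr10", "chr11"), width = c(172, 173),
                  stringsAsFactors = FALSE)
df2 <- data.frame(seqnames = c("chr12", "chr13"), width = c(174, 175),
                  stringsAsFactors = FALSE)

# The argument names become column prefixes, so every column name is unique.
both <- cbind(first = df1, second = df2)
print(names(both))

# Name-based access is now unambiguous for the columns of the second data frame.
print(both$second.seqnames)
```

With unique names, selecting a column by name no longer risks picking up the column of the same name from the other data frame.
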
answered 2015-06-03T07:37:57.330