I would like to concatenate all adjacent rows that have a similarity score greater than 0.955. The Abo and Bel columns represent the similarity score with the row above and the row below, respectively. In the following input, df, I have 10 genomic probes (NAME column), which should be concatenated into just 4 genomic segments (dfout).
df <- " NAME Abo Bel Chr GD Position
BovineHD0100009217 NA 1.0000000 1 0 31691781
BovineHD0100009218 1.0000000 0.6185430 1 0 31695808
BovineHD0100019600 0.6185430 0.9973510 1 0 69211537
BovineHD0100019601 0.9973510 1.0000000 1 0 69213650
BovineHD0100019602 1.0000000 1.0000000 1 0 69214650
BovineHD0100019603 1.0000000 0.6600000 1 0 69217942
BovineHD0100047112 0.6600000 1.0000000 1 0 93797691
BovineHD0100026604 1.0000000 1.0000000 1 0 93815774
BovineHD0100026605 1.0000000 0.4649007 1 0 93819471
BovineHD0100029861 0.4649007 NA 1 0 105042452"
df <- read.table(text = df, header = TRUE)
My expected output, dfout:
dfout <- "Chr start end startp endp nprob
1 31691781 31695808 BovineHD0100009217 BovineHD0100009218 2
1 69211537 69217942 BovineHD0100019600 BovineHD0100019603 4
1 93797691 93819471 BovineHD0100047112 BovineHD0100026605 3
1 105042452 105042452 BovineHD0100029861 BovineHD0100029861 1"
dfout <- read.table(text = dfout, header = TRUE)
Any ideas?
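
One possible approach, as a minimal base R sketch rather than a definitive solution: start a new segment at every row whose Abo value is NA or not above the threshold, number the segments with cumsum(), and summarise each one. The function name merge_probes and its threshold argument are hypothetical, introduced only for illustration. The sketch also assumes the rows are already sorted by Chr and Position, and that a segment should never span two chromosomes.

```r
# Hypothetical helper: collapse adjacent probes into segments.
# Assumes `df` has columns NAME, Abo, Chr, Position and is sorted
# by Chr and Position.
merge_probes <- function(df, threshold = 0.955) {
  # A row opens a new segment when its similarity with the row above
  # is missing or not above the threshold, or when Chr changes.
  chr_change <- c(TRUE, df$Chr[-1] != df$Chr[-nrow(df)])
  new_seg <- is.na(df$Abo) | df$Abo <= threshold | chr_change

  # Running count of segment starts gives one id per segment.
  seg <- cumsum(new_seg)

  # Summarise each segment by its first and last probe.
  out <- do.call(rbind, lapply(split(df, seg), function(g) {
    data.frame(
      Chr    = g$Chr[1],
      start  = g$Position[1],
      end    = g$Position[nrow(g)],
      startp = as.character(g$NAME[1]),
      endp   = as.character(g$NAME[nrow(g)]),
      nprob  = nrow(g),
      stringsAsFactors = FALSE
    )
  }))
  rownames(out) <- NULL
  out
}

# Usage, with `df` read as in the question:
# merge_probes(df)
```

Only Abo is used here because, in the example data, each row's Bel equals the next row's Abo; if that does not hold in the real data, the break condition would need to check both columns.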