I have 2 genomic ranges:

g1<-GRanges(c("chr1:0-14","chr1:15-29"), score=c(20.2,10.4));g1

GRanges object with 2 ranges and 1 metadata column:
   seqnames    ranges strand |     score
      <Rle> <IRanges>  <Rle> | <numeric>
[1]     chr1      0-14      * |      20.2
[2]     chr1     15-29      * |      10.4

g2<-GRanges(c("chr1:0-9","chr1:10-19","chr1:20-29"), state=c('E1','E2','E1'));g2

GRanges object with 3 ranges and 1 metadata column:
   seqnames    ranges strand |       state
      <Rle> <IRanges>  <Rle> | <character>
[1]     chr1       0-9      * |          E1
[2]     chr1     10-19      * |          E2
[3]     chr1     20-29      * |          E1

I want to make them comparable. First I combined them, then I used disjoin:

g3<-(c(g1,g2)); g3 

GRanges object with 5 ranges and 2 metadata columns:
    seqnames    ranges strand |     score       state
       <Rle> <IRanges>  <Rle> | <numeric> <character>
 [1]     chr1      0-14      * |      20.2        <NA>
 [2]     chr1     15-29      * |      10.4        <NA>
 [3]     chr1       0-9      * |      <NA>          E1
 [4]     chr1     10-19      * |      <NA>          E2
 [5]     chr1     20-29      * |      <NA>          E1

disjoin(g3)
                                                                                                   
 GRanges object with 4 ranges and 0 metadata columns:
   seqnames    ranges strand
      <Rle> <IRanges>  <Rle>
[1]     chr1       0-9      *
[2]     chr1     10-14      *
[3]     chr1     15-19      *
[4]     chr1     20-29      *

So disjoin does the splitting I want, but unfortunately it does not keep the metadata. Is there a way to keep the metadata and get a GRanges like this?

GRanges object with 4 ranges and 2 metadata columns:
   seqnames    ranges strand |     score       state
      <Rle> <IRanges>  <Rle> | <numeric> <character>
[1]     chr1       0-9      * |      20.2          E1
[2]     chr1     10-14      * |      20.2          E2
[3]     chr1     15-19      * |      10.4          E2
[4]     chr1     20-29      * |      10.4          E1

Thanks

2 Answers

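I think you will find help here: https://support.bioconductor.org/p/82551/ But note that it does not apply exactly in your case, because a range in the output can map to several ranges in the input.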
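To make the caveat concrete, here is a minimal sketch (reusing the g1 and g2 from the question; the name d is only for illustration) that shows which input rows each disjoint piece maps back to. It assumes GenomicRanges is installed.

```r
library(GenomicRanges)

g1 <- GRanges(c("chr1:0-14", "chr1:15-29"), score = c(20.2, 10.4))
g2 <- GRanges(c("chr1:0-9", "chr1:10-19", "chr1:20-29"),
              state = c("E1", "E2", "E1"))
g3 <- c(g1, g2)

# with.revmap = TRUE adds a 'revmap' column: for every disjoint piece,
# the row indices of g3 that the piece was cut from
d <- disjoin(g3, with.revmap = TRUE)

as.list(d$revmap)   # one integer vector of g3 row indices per piece
lengths(d$revmap)   # how many input ranges each piece maps back to
```

In this example every piece maps back to two input rows (one from g1 and one from g2), so the metadata has to be collected per piece rather than copied over one-to-one.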

answered 2021-03-16T15:30:09.160

Yes, with.revmap=TRUE is definitely the solution:

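library(GenomicRanges)  # also attaches IRanges, which provides extractList()

g1 <- GRanges(c("chr1:0-14", "chr1:15-29"), score = c(20.2, 10.4))
g2 <- GRanges(c("chr1:0-9", "chr1:10-19", "chr1:20-29"),
              state = c("E1", "E2", "E1"))
g3 <- c(g1, g2)  # combine both GRanges; metadata missing on one side becomes NA
g4 <- disjoin(g3, with.revmap = TRUE)  # revmap: rows of g3 each disjoint piece came from
l1 <- g4$revmap
score <- extractList(mcols(g3)$score, l1)  # one list element per disjoint piece
state <- extractList(mcols(g3)$state, l1)
# Drop the NAs inside each list element. Named drop_na so it does not mask
# stats::na.omit. Returns a plain vector only when exactly one value is left
# per piece, which holds here because neither g1 nor g2 overlaps itself.
drop_na <- function(l) sapply(l, function(x) x[!is.na(x)])
mcols(g4)$score <- drop_na(score)
mcols(g4)$state <- drop_na(state)
g4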

GRanges object with 4 ranges and 3 metadata columns:
   seqnames    ranges strand |        revmap     score       state
      <Rle> <IRanges>  <Rle> | <IntegerList> <numeric> <character>
[1]     chr1       0-9      * |           1,3      20.2          E1
[2]     chr1     10-14      * |           1,4      20.2          E2
[3]     chr1     15-19      * |           2,4      10.4          E2
[4]     chr1     20-29      * |           2,5      10.4          E1

Now I can easily compare each state with its score, for example in a boxplot. Thanks, Bastian.
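For comparison, here is a sketch of a different route that is not what I used above: instead of revmap, it looks up each disjoint piece in g1 and g2 separately with findOverlaps, so no NA filtering is needed. It assumes that the ranges within g1 do not overlap each other (same for g2), so every piece hits at most one range per input; the names d, h1 and h2 are only for illustration.

```r
library(GenomicRanges)

g1 <- GRanges(c("chr1:0-14", "chr1:15-29"), score = c(20.2, 10.4))
g2 <- GRanges(c("chr1:0-9", "chr1:10-19", "chr1:20-29"),
              state = c("E1", "E2", "E1"))

# granges() drops the metadata, so only the coordinates are combined
d <- disjoin(c(granges(g1), granges(g2)))

# Which range of each input does every disjoint piece fall into?
h1 <- findOverlaps(d, g1)
h2 <- findOverlaps(d, g2)

# Start from NA so pieces without a hit in one input stay NA
score <- rep(NA_real_, length(d))
state <- rep(NA_character_, length(d))
score[queryHits(h1)] <- g1$score[subjectHits(h1)]
state[queryHits(h2)] <- g2$state[subjectHits(h2)]

mcols(d) <- DataFrame(score = score, state = state)
d
```

From there, something like boxplot(score ~ state, data = as.data.frame(mcols(d))) should give the comparison by state, although I have only sketched that step here.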

answered 2021-03-17T07:27:09.317