
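The PyRanges class from the similarly named package has two methods with subtly different functionality: intersect and overlap. The description of intersect is very similar to that of overlap: "Return overlapping subintervals." vs. "Return overlapping intervals." I can't quite see the difference between the two (yes, I did notice the sub prefix).

Is overlap meant to return the complete intervals that overlap in at least one position?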


1 Answer


Setup:

>>> import pyranges as pr
>>> gr = pr.from_dict({"Chromosome": ["chr1"] * 3, "Start": [1, 4, 10],
...                    "End": [3, 9, 11], "ID": ["a", "b", "c"]})
>>> gr
+--------------+-----------+-----------+------------+
|   Chromosome |     Start |       End | ID         |
|   (category) |   (int32) |   (int32) | (object)   |
|--------------+-----------+-----------+------------|
|         chr1 |         1 |         3 | a          |
|         chr1 |         4 |         9 | b          |
|         chr1 |        10 |        11 | c          |
+--------------+-----------+-----------+------------+
Unstranded PyRanges object has 3 rows and 4 columns from 1 chromosomes.
For printing, the PyRanges was sorted on Chromosome.
>>> gr2 = pr.from_dict({"Chromosome": ["chr1"] * 3, "Start": [2, 2, 9], "End": [3, 9, 10]})
>>> gr2
+--------------+-----------+-----------+
| Chromosome   |     Start |       End |
| (category)   |   (int32) |   (int32) |
|--------------+-----------+-----------|
| chr1         |         2 |         3 |
| chr1         |         2 |         9 |
| chr1         |         9 |        10 |
+--------------+-----------+-----------+
Unstranded PyRanges object has 3 rows and 3 columns from 1 chromosomes.
For printing, the PyRanges was sorted on Chromosome.

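With overlap, you get back the intervals in self that overlap at least one interval in other, unchanged. If an interval overlaps more than once, it is still returned only once (by default):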

>>> gr.overlap(gr2)
+--------------+-----------+-----------+------------+
| Chromosome   |     Start |       End | ID         |
| (category)   |   (int32) |   (int32) | (object)   |
|--------------+-----------+-----------+------------|
| chr1         |         1 |         3 | a          |
| chr1         |         4 |         9 | b          |
+--------------+-----------+-----------+------------+
Unstranded PyRanges object has 2 rows and 4 columns from 1 chromosomes.
For printing, the PyRanges was sorted on Chromosome.

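intersect returns the intersection of each pair of overlapping intervals in self and other, i.e. only the shared sub-interval. By default, all overlaps are returned: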

>>> gr.intersect(gr2)
+--------------+-----------+-----------+------------+
| Chromosome   |     Start |       End | ID         |
| (category)   |   (int32) |   (int32) | (object)   |
|--------------+-----------+-----------+------------|
| chr1         |         2 |         3 | a          |
| chr1         |         2 |         3 | a          |
| chr1         |         4 |         9 | b          |
+--------------+-----------+-----------+------------+
Unstranded PyRanges object has 3 rows and 4 columns from 1 chromosomes.
For printing, the PyRanges was sorted on Chromosome.
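
The difference between the two results above can be sketched in plain Python. This is only an illustration of the semantics on half-open (start, end) tuples, not how pyranges is implemented; the helper names `overlaps`, `overlap`, and `intersect` below are hypothetical stand-ins, and chromosome, strand, and metadata columns are ignored.

```python
def overlaps(a, b):
    """True if half-open intervals a and b share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def overlap(self_ivs, other_ivs):
    # Keep each interval of self once, unchanged, if it overlaps anything in other.
    return [a for a in self_ivs if any(overlaps(a, b) for b in other_ivs)]


def intersect(self_ivs, other_ivs):
    # Emit one clipped sub-interval per overlapping (self, other) pair.
    return [(max(a[0], b[0]), min(a[1], b[1]))
            for a in self_ivs
            for b in other_ivs
            if overlaps(a, b)]


# Same coordinates as gr and gr2 in the setup (assumed illustration data).
gr = [(1, 3), (4, 9), (10, 11)]
gr2 = [(2, 3), (2, 9), (9, 10)]

print(overlap(gr, gr2))    # [(1, 3), (4, 9)]
print(intersect(gr, gr2))  # [(2, 3), (2, 3), (4, 9)]
```

Interval a appears twice in the intersect result because it overlaps two intervals in gr2, while c is dropped from both results because (10, 11) and (9, 10) only touch at an endpoint.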

For more information, see the documentation:

Answered 2020-04-29T21:26:17.727