0

我试图解决一个涉及大量插入列表的编程问题。然而,问题的细节与问题无关。

为了解决这个问题,我最终编写了某种绳索数据结构,试图从我在互联网上找到的稀缺资源中拼凑出一个实现。

我最终写了这个:

from dataclasses import dataclass
import typing as t

from hypothesis import given, strategies as st

T = t.TypeVar("T")


@dataclass
class Branch(t.Generic[T]):
    weight: int
    left: t.Union[t.List[T], "Branch[T]"]
    right: t.Union[t.List[T], "Branch[T]"]


Rope = t.Union[t.List[T], Branch[T]]


def insert(rope: Rope, index: int, item: T) -> Rope:
    if isinstance(rope, list):  # leaf
        if index == len(rope):  # at the end, just append
            rope.append(item)
            return rope

        # split
        left, right = rope[:index], [item, *rope[index:]]
        return Branch(len(left), left, right)
    else:  # branch
        if rope.right and rope.weight <= index:
            rope.right = insert(rope.right, index - rope.weight, item)
        else:
            rope.weight += 1
            rope.left = insert(rope.left, index, item)

        return rope


def straighten(rope: Rope) -> t.Iterator[T]:
    if isinstance(rope, list):
        yield from rope
    else:
        yield from straighten(rope.left)
        yield from straighten(rope.right)


@st.composite
def instructions(draw, elements=st.integers(0, 255)):
    items = draw(st.lists(elements, min_size=1))
    result = []
    for i, item in enumerate(items):
        insertion_point = draw(st.integers(min_value=0, max_value=i))
        result.append((insertion_point, item))
    return result


@given(instructions())
def test_correctness(seq: t.List[t.Tuple[int, int]]) -> None:
    manual: t.List[int] = []
    for (i, item) in seq:
        manual.insert(i, item)

    rope: Rope = []
    for (i, item) in seq:
        rope = insert(rope, i, item)

    assert manual == list(
        straighten(rope)
    ), f"{manual!r} != {rope!r} => {list(straighten(rope))!r}"


if __name__ == "__main__":
    test_correctness()
    print("All good")

如您所见,代码通过了假设,因此它解决了问题。

但是,我无法理解为什么我编写的这段代码是正确的,因为尝试仅使用分支和叶子(也就是说,不使用列表)来重写它会产生不正确的实现:

from dataclasses import dataclass
import typing as t

from hypothesis import given, strategies as st

T = t.TypeVar("T")


@dataclass
class Branch(t.Generic[T]):
    weight: int
    left: t.Union[T, "Branch[T]"]
    right: t.Union[T, "Branch[T]"]


Rope = t.Union[T, Branch[T]]


def insert(rope: Rope, index: int, item: T) -> Rope:
    if isinstance(rope, Branch):  # leaf
        if rope.right and rope.weight <= index:
            rope.right = insert(rope.right, index - rope.weight, item)
        else:
            rope.weight += 1
            rope.left = insert(rope.left, index, item)

        return rope

    if index == 0:
        left, right = item, rope
    else:
        left, right = rope, item

    return Branch(1, left, right)


def straighten(rope: Rope) -> t.Iterator[T]:
    if isinstance(rope, Branch):
        yield from straighten(rope.left)
        yield from straighten(rope.right)
    else:
        yield rope


@st.composite
def instructions(draw, elements=st.integers(0, 255)):
    items = draw(st.lists(elements, min_size=1))
    result = []
    for i, item in enumerate(items):
        insertion_point = draw(st.integers(min_value=0, max_value=i))
        result.append((insertion_point, item))
    return result


@given(instructions())
def test_correctness(seq: t.List[t.Tuple[int, int]]) -> None:
    it = iter(seq)
    _, head = next(it)

    rope: Rope = head
    manual = [head]
    for (i, item) in it:
        manual.insert(i, item)
        rope = insert(rope, i, item)

        straight_rope: t.List[int] = list(straighten(rope))
        assert manual == straight_rope, f"{manual!r} != {straight_rope!r} ({rope!r})"


if __name__ == "__main__":
    test_correctness()
    print("All good")

因此,如果有人对这种数据结构有更深入的了解,我将不胜感激这两种实现之间的区别以及为什么一种有效而另一种无效。先感谢您。

4

1 回答 1

0

您的代码失败,因为 in if rope.right and rope.weight <= index:,条件评估为Falsewhen rope.right == 0。在您的第二个代码中,当正确的节点是值为 0 的叶子时会发生这种情况,并且会导致该值被插入到错误的子树中。

根本没有检查的理由rope.right,所以这个检查应该只是if rope.weight <= index:.

在你的两个rope实现中还有另一个潜在的问题:你正在改变节点,而rope实现通常将节点视为不可变并创建新节点以进行任何更改。这不会影响您的测试,因为您一次只有一个绳索实例。会导致问题的示例:

r = create_some_rope()
r2 = r
r = insert(r, idx, val)
# now r2 may or may not have changed, depending on its structure.

如果您添加绳索连接,它也会导致问题。在这种情况下,如果将绳索连接到自身(或自身的子树),则任何进一步的操作都可能无意中影响绳索的多个部分。

于 2021-08-17T15:10:19.507 回答