0

给定两个点 P、Q 和一个 delta,我定义了等价关系 ~=,其中 P ~= Q if EuclideanDistance(P,Q)<= delta。现在,给定n个点的集合S,在示例中 S = (A, B, C, D, E, F) 和 n = 6(事实点实际上是线段的端点可以忽略不计),是否有一种算法可以在平均情况下,复杂性是否优于 O(n^2) 来找到集合的一个分区(子集的代表元素不重要)?

到目前为止,试图找到这个问题的理论定义是不成功的:k-means clustering、最近邻搜索和其他在我看来是不同的问题。图片显示了我需要在我的应用程序中执行的操作。

有什么提示吗?谢谢

替代文字

编辑:虽然实际问题(给定某种不变量的点附近的集群)在平均情况下应该比 O(n^2) 更好地解决,但我的问题定义存在严重缺陷:=~不是等价关系因为一个简单的事实,它不尊重传递性。我认为这是这个问题不容易解决并且需要先进技术的主要原因。将很快发布我的实际解决方案:当附近的点都满足定义的 =~ 时应该工作。当两极分开的点不尊重这种关系但它们与聚集点的重心有关时,可能会失败。它适用于我的输入数据空间,可能不适用于您的。有谁知道这个问题的完整正式解决方案(有解决方案)?

4

5 回答 5

1

重述问题的一种方法如下:给定一组n2D 点,对于每个点,找到以 为中心p的直径圆所包含的点集。deltap

一个简单的线性搜索给出了O(n^2)你提到的算法。

在我看来,这是在最坏的情况下能做到的最好的事情。当集合中的所有点都包含在直径 <= 的圆内时delta,每个n查询都必须返回O(n)点,从而给出O(n^2)整体复杂性。

但是,应该能够在更合理的数据集上做得更好。看看这个(特别是关于空间分区的部分)和KD-trees。后者应该O(n^2)在合理的情况下给你一个子算法。

可能有一种不同的方式来看待这个问题,一种会带来更好的复杂性;我无法想到任何事情。

于 2010-12-05T23:10:52.127 回答
0

This is a C# KdTree implementation that should solve the "Find all neighbors of a point P within a delta". It makes heavy use of functional programming techniques (yes, I love Python). It's tested but I still have doubts doubts in understanding _TreeFindNearest(). The code (or pseudo code) to solve the problem "Partition a set of n points given a ~= relation in better than O(n^2) in the average case" is posted in another answer.

/*
Stripped C# 2.0 port of ``kdtree'', a library for working with kd-trees.
Copyright (C) 2007-2009 John Tsiombikas <nuclear@siggraph.org>
Copyright (C) 2010 Francesco Pretto <ceztko@gmail.com>

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this
   list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
   this list of conditions and the following disclaimer in the documentation
   and/or other materials provided with the distribution.
3. The name of the author may not be used to endorse or promote products
   derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO
EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT
OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY
OF SUCH DAMAGE.
*/

using System;
using System.Collections.Generic;
using System.Text;

namespace ITR.Data.NET
{
    public class KdTree<T>
    {
        #region Fields

        private Node _Root;
        private int _Count;
        private int _Dimension;
        private CoordinateGetter<T>[] _GetCoordinate;

        #endregion // Fields

        #region Constructors

        public KdTree(params CoordinateGetter<T>[] coordinateGetters)
        {
            _Dimension = coordinateGetters.Length;
            _GetCoordinate = coordinateGetters;
        }

        #endregion // Constructors

        #region Public methods

        public void Insert(T location)
        {
            _TreeInsert(ref _Root, 0, location);
            _Count++;
        }

        public void InsertAll(IEnumerable<T> locations)
        {
            foreach (T location in locations)
                Insert(location);
        }

        public IEnumerable<T> FindNeighborsRange(T location, double range)
        {
            return _TreeFindNeighborsRange(_Root, 0, location, range);
        }

        #endregion // Public methods

        #region Tree traversal

        private void _TreeInsert(ref Node current, int currentPlane, T location)
        {
            if (current == null)
            {
                current = new Node(location);
                return;
            }

            int nextPlane = (currentPlane + 1) % _Dimension;

            if (_GetCoordinate[currentPlane](location) <
                    _GetCoordinate[currentPlane](current.Location))
                _TreeInsert(ref current._Left, nextPlane, location);
            else
                _TreeInsert(ref current._Right, nextPlane, location);
        }

        private IEnumerable<T> _TreeFindNeighborsRange(Node current, int currentPlane,
            T referenceLocation, double range)
        {
            if (current == null)
                yield break;

            double squaredDistance = 0;
            for (int it = 0; it < _Dimension; it++)
            {
                double referenceCoordinate = _GetCoordinate[it](referenceLocation);
                double currentCoordinate = _GetCoordinate[it](current.Location);
                squaredDistance +=
                    (referenceCoordinate - currentCoordinate)
                    * (referenceCoordinate - currentCoordinate);
            }

            if (squaredDistance <= range * range)
                yield return current.Location;

            double coordinateRelativeDistance =
                _GetCoordinate[currentPlane](referenceLocation)
                    - _GetCoordinate[currentPlane](current.Location);
            Direction nextDirection = coordinateRelativeDistance <= 0.0
                ? Direction.LEFT : Direction.RIGHT;
            int nextPlane = (currentPlane + 1) % _Dimension;
            IEnumerable<T> subTreeNeighbors =
                _TreeFindNeighborsRange(current[nextDirection], nextPlane,
                    referenceLocation, range);
            foreach (T location in subTreeNeighbors)
                yield return location;

            if (Math.Abs(coordinateRelativeDistance) <= range)
            {
                subTreeNeighbors =
                    _TreeFindNeighborsRange(current.GetOtherChild(nextDirection),
                        nextPlane, referenceLocation, range);
                foreach (T location in subTreeNeighbors)
                    yield return location;
            }
        }

        #endregion // Tree traversal

        #region Node class

        public class Node
        {
            #region Fields

            private T _Location;
            internal Node _Left;
            internal Node _Right;

            #endregion // Fields

            #region Constructors

            internal Node(T nodeValue)
            {
                _Location = nodeValue;
                _Left = null;
                _Right = null;
            }

            #endregion // Contructors

            #region Children Indexers

            public Node this[Direction direction]
            {
                get { return direction == Direction.LEFT ? _Left : Right; }
            }

            public Node GetOtherChild(Direction direction)
            {
                return direction == Direction.LEFT ? _Right : _Left;
            }

            #endregion // Children Indexers

            #region Properties

            public T Location
            {
                get { return _Location; }
            }

            public Node Left
            {
                get { return _Left; }
            }

            public Node Right
            {
                get { return _Right; }
            }

            #endregion // Properties
        }

        #endregion // Node class

        #region Properties

        public int Count
        {
            get { return _Count; }
            set { _Count = value; }
        }

        public Node Root
        {
            get { return _Root; }
            set { _Root = value; }
        }

        #endregion // Properties
    }

    #region Enums, delegates

    public enum Direction
    {
        LEFT = 0,
        RIGHT
    }

    public delegate double CoordinateGetter<T>(T location);

    #endregion // Enums, delegates
}
于 2010-12-07T17:21:59.470 回答
0

对于每个点,计算距原点的距离 D(n),这是一个 O(n) 操作。

使用 O(n^2) 算法查找 D(ab) < delta 的匹配项,跳过D(a)-D(b) > delta。

平均而言,由于跳过了(希望很大)数字,因此结果必须优于 O(n^2)。

于 2010-12-07T12:12:37.760 回答
0

下面的 C# 方法与 KdTree 类、Join()(枚举作为参数传递的所有集合)和 Shuffled()(返回传递集合的打乱版本)方法一起解决了我的问题。referenceVectors当向量与 相同时,可能存在一些有缺陷的情况(阅读问题中的编辑)vectorsToRelocate,就像我在我的问题中所做的那样。

public static Dictionary<Vector2D, Vector2D> FindRelocationMap(
    IEnumerable<Vector2D> referenceVectors,
    IEnumerable<Vector2D> vectorsToRelocate)
{
    Dictionary<Vector2D, Vector2D> ret = new Dictionary<Vector2D, Vector2D>();

    // Preliminary filling
    IEnumerable<Vector2D> allVectors =
        Utils.Join(referenceVectors, vectorsToRelocate);
    foreach (Vector2D vector in allVectors)
        ret[vector] = vector;

    KdTree<Vector2D> kdTree = new KdTree<Vector2D>(
        delegate(Vector2D vector) { return vector.X; },
        delegate(Vector2D vector) { return vector.Y; });
    kdTree.InsertAll(Utils.Shuffled(ret.Keys));

    HashSet<Vector2D> relocatedVectors = new HashSet<Vector2D>();
    foreach (Vector2D vector in referenceVectors)
    {
        if (relocatedVectors.Contains(vector))
            continue;

        relocatedVectors.Add(vector);

        IEnumerable<Vector2D> neighbors =
            kdTree.FindNeighborsRange(vector, Tolerances.EUCLID_DIST_TOLERANCE);

        foreach (Vector2D neighbor in neighbors)
        {
            ret[neighbor] = vector;
            relocatedVectors.Add(neighbor);
        }
    }

    return ret;
}
于 2010-12-09T22:33:35.757 回答
0

绝对是Quadtree的问题。

您还可以尝试对每个坐标进行排序并使用这两个列表(排序为n*log(n),并且您只能检查满足 的点dx <= delta && dy <= delta。此外,您可以将它们放在具有两级指针的排序列表中:一个用于解析 OX 和OY的另一个。

于 2010-12-05T22:58:05.710 回答