I have a dataset of refineries in Texas (available as GeoJSON here - https://pastebin.com/R0D9fif9):

Name,Latitude,Longitude
Marathon Petroleum,29.374722,-94.933611
Marathon Petroleum,29.368733,-94.903253
Valero,29.367617,-94.909515
LyondellBasell,29.71584,-95.234814
Valero,29.722213,-95.255198
Exxon,29.743865,-95.009208
Shell,29.720425,-95.12495
Petrobras,29.722466,-95.208807

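I want to create a printed map from these points, but at the given resolution they are too close together.

Since every refinery should be mentioned in the legend, I can't cluster them. So I would like to:

  • Get the centroid - this is easy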

    import csv
    from shapely.geometry import MultiPoint

    # Read the CSV into a dict mapping column name -> list of values
    with open('refineries.csv', newline='') as infile:
        reader = csv.DictReader(infile)
        data = {}
        for row in reader:
            for header, value in row.items():
                data.setdefault(header, []).append(value)

    # CSV values are strings, so convert to float; order is (lon, lat)
    listo = list(zip(map(float, data['Longitude']), map(float, data['Latitude'])))
    points1 = MultiPoint(listo)

    # The same points, hard-coded
    points = MultiPoint([(-94.933611, 29.374722), (-94.903253, 29.368733), (-94.909515, 29.367617), (-95.234814, 29.71584), (-95.255198, 29.722213), (-95.009208, 29.743865), (-95.12495, 29.720425), (-95.208807, 29.722466)])

    print(points.centroid)
    
  • Move all points away from the centroid until a minimum distance between all points is reached

Can you help me with this? Thanks in advance!

1 Answer

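It depends on how you want to move the points away from the centroid. One approach is to calculate the great-circle (geodesic) distance and azimuth of each point relative to the centroid, then rescale all the distances by a common factor so that the distance between the two closest points is at least a specified threshold. In the example below, pyproj is used to calculate the azimuths and distances.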

import csv
from shapely.geometry import MultiPoint
from pyproj import Geod

# read the CSV into a dict mapping column name -> list of values
# (the 'rU' mode was removed in Python 3.11; newline='' is what the csv module expects)
with open('refineries.csv', newline='') as infile:
    reader = csv.DictReader(infile)
    data = {}
    for row in reader:
        for header, value in row.items():
            if header not in data:
                data[header] = []
            data[header].append(value)

# CSV values are strings, so convert to float; order is (lon, lat)
listo = list(zip(map(float, data['Longitude']), map(float, data['Latitude'])))

def scale_coords(coords, required_dist = 1000.):
    # coords: list of (lon, lat) tuples in degrees; required_dist in metres
    g = Geod(ellps = 'WGS84')

    num_of_points = len(coords)

    #calculate centroid
    C = MultiPoint(coords).centroid

    #determine the minimum geodesic distance (in metres) among all pairs of points
    dist_min = float('inf')
    for i in range(num_of_points):
        lon_i, lat_i = coords[i]
        for j in range(i+1, num_of_points):
            lon_j, lat_j = coords[j]
            _,_,dist = g.inv(lon_i, lat_i, lon_j, lat_j)
            dist_min = min(dist_min, dist)

    #nothing to do if the points are already far enough apart
    if dist_min >= required_dist:
        return coords

    #coincident points stay coincident under scaling (and would divide by zero)
    if dist_min == 0:
        raise ValueError('coincident points cannot be separated by scaling')

    #move each point along its azimuth from the centroid, scaling its distance
    coords_scaled = [None]*num_of_points
    scaling = required_dist / dist_min
    for i, (lon_i, lat_i) in enumerate(coords):
        az,_,dist = g.inv(C.x, C.y, lon_i, lat_i)
        lon_f,lat_f,_ = g.fwd(C.x, C.y, az, dist*scaling)
        coords_scaled[i] = (lon_f, lat_f)

    return coords_scaled
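
To get a feel for the scaling factor involved, the closest pair in the question's data can be checked by hand. The sketch below is not part of the code above: it assumes a spherical Earth and uses a haversine formula from the standard library instead of pyproj's ellipsoidal geodesic, so the distances will differ slightly. The names `haversine_m`, `min_pairwise_distance` and `refineries` are made up for this illustration.

```python
import math
from itertools import combinations

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; spherical approximation (assumption)

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def min_pairwise_distance(coords):
    """Smallest distance in metres over all pairs of (lon, lat) points."""
    return min(haversine_m(*p, *q) for p, q in combinations(coords, 2))

# (lon, lat) pairs taken from the question's CSV
refineries = [
    (-94.933611, 29.374722), (-94.903253, 29.368733), (-94.909515, 29.367617),
    (-95.234814, 29.71584), (-95.255198, 29.722213), (-95.009208, 29.743865),
    (-95.12495, 29.720425), (-95.208807, 29.722466),
]

required_dist = 1000.0
dist_min = min_pairwise_distance(refineries)
# never shrink the layout: a factor below 1 means the points are already far enough apart
scaling = max(1.0, required_dist / dist_min)
print(f"closest pair: {dist_min:.0f} m, scaling factor: {scaling:.2f}")
```

Over an area this small, multiplying every point's distance from the centroid by the same factor multiplies the pairwise distances by approximately that factor as well, which is why a single ratio of the required distance to the smallest pairwise distance is enough.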

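Alternatively, this could be combined with an approach in which you also relax the azimuths. In principle, that would allow a smaller scaling factor for the "radial" distances. However, it would also slightly distort the "visual distribution" of the points. In addition, the approach above could be "improved" by leaving outliers out of the rescaling, i.e., points that are already far enough from the centroid and have no neighbors nearby.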
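
One possible reading of the outlier idea is to first flag only the points whose nearest neighbor is closer than the required distance, and leave the rest where they are. The sketch below is only an illustration under that assumption. It is not part of the code above, it uses a spherical haversine distance from the standard library rather than pyproj, and the names `haversine_m`, `crowded_indices` and `refineries` are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; spherical approximation (assumption)

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def crowded_indices(coords, required_dist=1000.0):
    """Indices of points whose nearest neighbour is closer than required_dist metres."""
    if len(coords) < 2:
        return []  # a single point has no neighbours to crowd it
    crowded = []
    for i, p in enumerate(coords):
        nearest = min(haversine_m(*p, *q) for j, q in enumerate(coords) if j != i)
        if nearest < required_dist:
            crowded.append(i)
    return crowded

# (lon, lat) pairs taken from the question's CSV
refineries = [
    (-94.933611, 29.374722), (-94.903253, 29.368733), (-94.909515, 29.367617),
    (-95.234814, 29.71584), (-95.255198, 29.722213), (-95.009208, 29.743865),
    (-95.12495, 29.720425), (-95.208807, 29.722466),
]

print(crowded_indices(refineries))
```

Only the flagged points would then need to be moved. Which centroid to scale them around, and how to make sure they do not end up too close to the points left in place, are design choices this sketch leaves open.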

answered 2018-09-18T10:40:48.290