encoding - proper/best type for storing latitude and longitude

Question

In a system level programming language like C, C++ or D, what is the best type/encoding for storing latitude and longitude?

The options I see are:

IEEE-754 FP as degrees or radians
degrees or radians stored as a fixed-point value in a 32- or 64-bit int
mapping of an integer range to the degree range: -> deg = (360/2^32)*val
degrees, minutes, seconds and fractional seconds stored as bit fields in an int
a struct of some kind.

The easy solution (FP) has the major downside that it has highly non-uniform resolution (somewhere in England it can measure in microns, over in Japan, it can't). It also has all the usual issues of FP comparison and whatnot. The other options require extra effort in different parts of the data's life cycle (generation, presentation, calculations, etc.).

One interesting option is a floating-precision type in which, as the latitude increases, latitude gets more bits and longitude gets fewer (since the meridians get closer together towards the poles).

BTW: 32 bits gives you an E/W resolution at the equator of about 0.3 in. This is close to the scale that high grade GPS setups can work at (IIRC they can get down to about 0.5 in in some modes).

OTOH if the 32 bits is uniformly distributed over the earth's surface, you can index squares of about 344m on a side, 5 Bytes give 21m, 6B->1.3m and 8B->5mm.

I don't have a specific use in mind right now but have worked with this kind of thing before and expect to again, at some point.

score 44 · Accepted Answer

The easiest way is just to store it as a float/double in degrees. Positive for N and E, negative for S and W. Just remember that minutes and seconds are out of 60 (so 31°45'N is 31.75). It's easy to understand what the values are by looking at them and, where necessary, conversion to radians is trivial.
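
As a small illustration of that convention, here is a C sketch; the helper names and the hemisphere-letter parameter are my own choices for the example.

```c
/* Hypothetical helper: degrees/minutes/seconds plus a hemisphere letter
   to signed decimal degrees (N and E positive, S and W negative). */
static double dms_to_degrees(int deg, int min, double sec, char hemisphere)
{
    double value = deg + min / 60.0 + sec / 3600.0;
    return (hemisphere == 'S' || hemisphere == 'W') ? -value : value;
}

/* Conversion to radians for use with the math.h trig functions. */
static double degrees_to_radians(double deg)
{
    return deg * (3.14159265358979323846 / 180.0);
}
```

For example, dms_to_degrees(31, 45, 0.0, 'N') gives 31.75, matching the figure above.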

Calculations on latitudes and longitudes such as the Great Circle distance between two coordinates rely heavily on trigonometric functions, which typically use doubles. Any other format is going to rely on another implementation of sine, cosine, atan2 and square root, at a minimum. Arbitrary precision numbers (eg BigDecimal in Java) won't work for this. Something like the int where 2^32 is spread uniformly is going to have similar issues.

The point of uniformity has come up in several comments. On this I shall simply note that the Earth, with respect to longitude, isn't uniform. One arc-second longitude at the Arctic Circle is a shorter distance than at the Equator. Double precision floats give sub-millimetre precision anywhere on Earth. Is this not sufficient? If not, why not?

It'd also be worth noting what you want to do with that information as the types of calculations you require will have an impact on what storage format you use.

score 21 · Accepted Answer

Longitudes and latitudes are not generally known to any greater precision than a 32-bit float. So if you're concerned about storage space, you can use floats. But in general it's more convenient to work with numbers as doubles.

Radians are more convenient for theoretical math. (For example, the derivative of sine is cosine only when you use radians.) But degrees are typically more familiar and easier for people to interpret, so you might want to stick with degrees.

score 13 · Accepted Answer

According to this Wikipedia article on Decimal Degrees, a decimal representation with 8 decimal places should be more than sufficient.

0 decimal places, 1.0 = 111 km
...
7 decimal places, 0.0000001 = 1.11 cm
8 decimal places, 0.00000001 = 1.11 mm

score 7 · Accepted Answer

http://www.esri.com/news/arcuser/0400/wdside.html
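At the equator, an arc-second of longitude is approximately equal to an arc-second of latitude, which is 1/60 of a nautical mile (or 101.27 feet or 30.87 meters).

A 32-bit float contains 23 explicit bits of data.
180 * 3600 requires log2(648000) = 19.305634287546711769425914064259 bits of data. Note that the sign bit is stored separately, so we only need to account for 180 degrees.
The following calculation applies if the value 648000 is normalized to some power of 2.
After subtracting the log2(648000) bits from 23, we have 3.694365712453288230574085935741 extra bits left for sub-second data.
That is 2 ^ 3.694365712453288230574085935741 = 12.945382716049382716049382716053 parts per second.
Therefore a float data type can have a precision of 30.87 / 12.945382716049382716049382716053 ~= 2.38 meters at the equator.

The calculation above is exact if you normalize the 180-degree value to some power of 2. Otherwise, assuming the sub-degree precision is stored after the decimal point, the floating-point representation will physically use all 8 bits for the degree part. This leaves 15 bits for sub-degree precision. Then 15 - log2(3600) leaves 3.1862188087829629413518832531256 bits for sub-second data, or 3.3914794921875 ~= 3.39 meters of precision at the equator. That is about a meter worse than what normalization would provide.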

score 4 · Accepted Answer

Might the problems you mentioned with floating point values become an issue? If the answer is no, I'd suggest just using the radians value in double precision - you'll need it if you'll be doing trigonometric calculations anyway.

If there might be an issue with precision loss when using doubles or you won't be doing trigonometry, I'd suggest your solution of mapping to an integer range - this will give you the best resolution, can easily be converted to whatever display format your locale will be using and - after choosing an appropriate 0-meridian - can be used to convert to floating point values of high precision.

PS: I've always wondered why there seems to be no one who uses geocentric spherical coordinates - they should be reasonably close to the geographical coordinates, and won't require all this fancy math on spheroids to do computations; for fun, I wanted to convert Gauss-Krüger coordinates (which are in use by the German Katasteramt) to GPS coordinates - let me tell you, that was ugly: one uses the Bessel ellipsoid, the other WGS84, and the Gauss-Krüger mapping itself is pretty crazy on its own...

score 4 · Accepted Answer

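Which encoding is "best" really depends on your goals/requirements.

If you are performing arithmetic, floating-point latitude and longitude are often quite convenient. Other times Cartesian coordinates (i.e. x, y, z) can be more convenient. For example, if you only care about points on the surface of the Earth, you could use an n-vector.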

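As for long-term storage, IEEE floating point will waste bits on ranges you don't care about (for lat/lon) or on precision you may not care about in the case of Cartesian coordinates (unless you want very good precision at the origin for whatever reason). You can of course map either type of coordinates to ints of your preferred size, such that the entire range of the ints covers the range you are interested in at the resolution you care about.

There are of course other things to think about besides not wasting bits in the encoding. For example, Geohashes (https://en.wikipedia.org/wiki/Geohash) have the nice property that it is easy to find other geohashes in the same area. (Most will have the same prefix, and you can compute the prefix the others will have.) Unfortunately, they maintain the same precision in degrees of longitude near the equator as near the poles. I'm currently using 64-bit geohashes for storage, which gives about 3 m resolution at the equator.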
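
To show where the shared-prefix property comes from, here is a C sketch of a geohash-style 64-bit code built by repeated bisection, with longitude and latitude bits interleaved (longitude first). The function name is made up, and this is not necessarily the exact layout I use in storage; real Geohash strings additionally base-32 encode the bits.

```c
#include <stdint.h>

/* Hypothetical 64-bit geohash-style code: 32 longitude bits interleaved
   with 32 latitude bits, most significant bit first. */
static uint64_t geohash64(double lat, double lon)
{
    double lat_lo = -90.0, lat_hi = 90.0;
    double lon_lo = -180.0, lon_hi = 180.0;
    uint64_t code = 0;

    for (int i = 0; i < 64; i++) {
        /* Even steps split the longitude interval, odd steps the latitude one. */
        double *lo = (i % 2 == 0) ? &lon_lo : &lat_lo;
        double *hi = (i % 2 == 0) ? &lon_hi : &lat_hi;
        double value = (i % 2 == 0) ? lon : lat;
        double mid = (*lo + *hi) / 2.0;

        code <<= 1;
        if (value >= mid) {
            code |= 1;   /* value lies in the upper half */
            *lo = mid;
        } else {
            *hi = mid;   /* value lies in the lower half */
        }
    }
    return code;
}
```

Each pair of bits narrows the cell by half in both directions, so two points in the same cell agree on all the leading bits up to that cell size.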

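The Maidenhead Locator System has some similar characteristics, but seems better suited to communicating locations between humans than to storing them on a computer. (Storing MLS strings would waste a lot of bits for some rather trivial error detection.)

The one system I found that does handle the poles differently is the Military Grid Reference System, although it too seems more oriented towards human communication. (And converting from or to lat/lon seems like a pain.)

Depending on what exactly you want, you could use something similar to the Universal Polar Stereographic coordinate system near the poles, along with something more computationally sane than UTM for the rest of the world, and use at most one bit to indicate which of the two systems you are using. I say at most one bit because it is unlikely that most of the points you care about would be near the poles. For example, you could use "half a bit" by saying that 11 indicates use of the polar system, while 00, 01 and 10 indicate use of the other system and are part of the representation.

Sorry this is a bit long, but I wanted to save what I had learned recently. Sadly, I have not found any standard, sane and efficient way to represent a point on Earth with uniform precision.

Edit: I found another approach that looks a lot more like what you wanted, since it more directly takes advantage of the lower precision needed for longitude closer to the poles. It turns out there is a lot of research on storing normal vectors. Encoding Normal Vectors using Optimized Spherical Coordinates describes such a system for encoding normal vectors while maintaining a minimum level of accuracy, but it could just as well be used for geographical coordinates.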

score 4 · Accepted Answer

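Great question!

I know this question is 9 years old now, and I only know part of the answer you were looking for, but I just came here with a similar question, and a lot has changed since it was asked, such as the available hardware and GPSes. I work with this subject frequently in firmware dealing with different kinds of GPSes in different kinds of applications, and have lost count of the hours (and days) I have spent working out "the best design" for the various applications I have worked with or developed.

As always, different solutions come with benefits and costs and, ultimately, a "best design" is always a "best fit" of the benefits and costs against the system requirements. Here are some things I have had to consider when asking the same question:

CPU Time Cost

If the CPU does not have a built-in floating-point co-processor (as is the case with many microcontrollers), then dealing with 'float', 'double' and 'long double' can be extremely costly. For example, on one 16-bit microcontroller I work with regularly, a multiplication using 'double' values costs 326 CPU clock cycles and a division costs 1193 clock cycles. Very expensive!

Accuracy Trade-Off

At the equator, for a 'float' (IEEE-754 32-bit floating-point value) that has to represent a signed degree value, and assuming 7 "clean" significant decimal digits can be represented, a change of one least-significant decimal digit (e.g. from 179.9999 to 180.0000) represents a distance of about 11.12 meters. This may or may not meet hard system accuracy requirements. For a 'double' (with 15 "clean" significant decimal digits, thus a change from 179.999999999999 to 180.000000000000) it represents about 0.00011 mm.

Input Accuracy Limitations

If you are dealing with input from a GPS, how many digits of real accuracy are you getting, and how many do you need to preserve?

Development Time Costs

An IEEE-754 64-bit double-precision value ('double') and a 32-bit single-precision value ('float') are very convenient to deal with in C, since math libraries for both come with virtually every C compiler and are usually very reliable. If your CPU comes with a hardware floating-point processor, this is an easy choice.

RAM and Storage Costs

If you have to keep a large number of these values in RAM (or in storage, e.g. MySQL), the available RAM (and storage space) might affect the workability of the solution.

Available Data vs Required Data

One example I am dealing with at this writing (the reason I came here) is a u-blox M8 GPS, which is able to give me binary GPS information (saving the CPU overhead of translating ASCII NMEA sentences). In this binary format (called the "UBX protocol") latitude and longitude are represented as signed 32-bit integers, which can represent accuracy (at the equator) down to about 1.11 cm. For example, -105.0269805 degrees longitude is represented as -1050269805 (using all 32 bits), and one LSb change represents about 1.11 cm of latitude anywhere, and 1.11 cm of longitude at the equator (less at higher latitudes, in proportion to the cosine of the latitude). The application this GPS is in performs navigation tasks, and the (already existing and well-tested) code for them requires 'double' data types. Unfortunately, converting this integer to an IEEE-754 64-bit 'double' cannot be done simply by moving the base-2 bits of the integer into the internal representation bits of the 'double', because the shift to be performed is a base-10 decimal shift. Were it a base-2 shift instead, the bits of the integer could be moved into the bit fields of the 'double' with very little translation required. But alas, that is not the case with the signed integer I have. So it is going to cost me a multiplication on a CPU that has no hardware floating-point processor: 326 CPU clock cycles.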

double   ldLatitude;
int32_t  li32LatFromGps;
ldLatitude = (double)li32LatFromGps * 0.0000001;

Note that this multiplication was chosen over this:

ldLatitude = (double)li32LatFromGps / 10000000.0;

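because 'double' multiplication is about 3.6 times faster than 'double' division on the CPU I am dealing with. Such is life in the microcontroller world. :-)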
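
Where a value only needs to be logged or displayed, the trip through 'double' can be avoided altogether. The following is a minimal sketch of that idea, not code from my firmware; the function name is invented, and it assumes the value is in units of 1e-7 degrees as described above.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: print a value in units of 1e-7 degrees as decimal
   degrees using integer arithmetic only. Returns the snprintf result. */
static int format_deg_e7(int32_t value_e7, char *buf, size_t size)
{
    /* Negate in unsigned arithmetic so INT32_MIN cannot overflow. */
    uint32_t magnitude = value_e7 < 0 ? 0u - (uint32_t)value_e7
                                      : (uint32_t)value_e7;
    return snprintf(buf, size, "%s%" PRIu32 ".%07" PRIu32,
                    value_e7 < 0 ? "-" : "",
                    magnitude / 10000000u, magnitude % 10000000u);
}
```

The sign is handled separately so that values between -1 and 0 degrees keep their minus sign; for -1050269805 the buffer receives "-105.0269805".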

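What would be brilliant (and may happen in the future if I can spare the time on weekends) is if the navigation tasks could be done directly with the 32-bit signed integers! Then no conversion would be needed... But would it cost more to do the navigation tasks with such integers? In CPU cost, it would probably be much more efficient. Development time cost? That is another question, especially with a well-tested system already in place that uses IEEE-754 64-bit 'double' values! Plus, there is already-existing software that provides map data (using 'double' degree values), and that software would have to be converted to use signed integers as well, which is not an overnight task!

One very interesting option is to use the raw latitude/longitude integers directly (without translation) to represent intersections between approximations of "rectangles" (actually trapezoids, which become triangles at the poles). At the equator these rectangles would measure approximately 1.11 cm east-west by 1.11 cm north-south, whereas at the latitude of, say, London, England, they would measure approximately 0.69 cm east-west by 1.11 cm north-south. That may or may not be easy to deal with, depending on what the application needs.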

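Anyway, I hope these thoughts and this discussion help others who are looking at this topic in search of "the best design" for their system.

Kind regards, Vic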

score 3 · Accepted Answer

0.3 inch resolution is getting down to the point where earthquakes over a few years make a difference. You may want to reconsider why you believe you need such fine resolution worldwide.

Some of the spreading centres in the Pacific Ocean change by as much as 15 cm/year.

score 3 · Accepted Answer

A Java program for computing the maximum rounding error (in meters) from casting lat/long values into Float/Double:

import java.util.*;
import java.lang.*;
import com.javadocmd.simplelatlng.*;
import com.javadocmd.simplelatlng.util.*;

public class MaxError {
  public static void main(String[] args) {
    Float flng = 180f;
    Float flat = 0f;
    LatLng fpos = new LatLng(flat, flng);
    double flatprime = Float.intBitsToFloat(Float.floatToIntBits(flat) ^ 1);
    double flngprime = Float.intBitsToFloat(Float.floatToIntBits(flng) ^ 1);
    LatLng fposprime = new LatLng(flatprime, flngprime);

    double fdistanceM = LatLngTool.distance(fpos, fposprime, LengthUnit.METER);
    System.out.println("Float max error (meters): " + fdistanceM);

    Double dlng = 180d;
    Double dlat = 0d;
    LatLng dpos = new LatLng(dlat, dlng);
    double dlatprime = Double.longBitsToDouble(Double.doubleToLongBits(dlat) ^ 1);
    double dlngprime = Double.longBitsToDouble(Double.doubleToLongBits(dlng) ^ 1);
    LatLng dposprime = new LatLng(dlatprime, dlngprime);

    double ddistanceM = LatLngTool.distance(dpos, dposprime, LengthUnit.METER);
    System.out.println("Double max error (meters): " + ddistanceM);
  }
}

Output:

Float max error (meters): 1.7791213425235692
Double max error (meters): 0.11119508289500799

score 1 · Accepted Answer

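If by "storing" you mean "holding in memory", the real question is: what are you going to do with them?

I suspect that before these coordinates do anything interesting, they will have been funnelled as radians through the functions in math.h. Unless you plan on implementing quite a few transcendental functions that operate on Deg/Min/Secs packed into a bit field.

So why not keep things simple and just store them as IEEE-754 degrees or radians at whatever precision you require?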

score 1 · Accepted Answer

The following code packs WGS84 coordinates losslessly into a single 64-bit long (i.e. 8 bytes):

using System;
using System.Collections.Generic;
using System.Text;

namespace Utils
{
    /// <summary>
    /// Lossless conversion of OSM coordinates to a simple long.
    /// </summary>
    unsafe class CoordinateStore
    {
        private readonly double _lat, _lon;
        private readonly long _encoded;

        public CoordinateStore(double lon,double lat)
        {
            // Ensure valid lat/lon
            if (lon < -180.0) lon = 180.0+(lon+180.0); else if (lon > 180.0) lon = -180.0 + (lon-180.0);
            if (lat < -90.0) lat = 90.0 + (lat + 90.0); else if (lat > 90.0) lat = -90.0 + (lat - 90.0);

            _lon = lon; _lat = lat;

            // Move to 0..(180/90)
            var dlon = (decimal)lon + 180m;
            var dlat = (decimal)lat + 90m;

            // Calculate grid
            var grid = (((int)dlat) * 360) + ((int)dlon);

            // Get local offset
            var ilon = (uint)((dlon - (int)(dlon))*10000000m);
            var ilat = (uint)((dlat - (int)(dlat))*10000000m);

            var encoded = new byte[8];
            fixed (byte* pEncoded = &encoded[0])
            {
                ((ushort*)pEncoded)[0] = (ushort) grid;
                ((ushort*)pEncoded)[1] = (ushort)(ilon&0xFFFF);
                ((ushort*)pEncoded)[2] = (ushort)(ilat&0xFFFF);
                pEncoded[6] = (byte)((ilon >> 16)&0xFF);
                pEncoded[7] = (byte)((ilat >> 16)&0xFF);

                _encoded = ((long*) pEncoded)[0];
            }
        }

        public CoordinateStore(long source)
        {
            // Keep the packed value so that Encoded is valid for decoded instances too
            _encoded = source;

            // Extract grid and local offset
            int grid;
            decimal ilon, ilat;
            var encoded = new byte[8];
            fixed(byte *pEncoded = &encoded[0])
            {
                ((long*) pEncoded)[0] = source;
                grid = ((ushort*) pEncoded)[0];
                ilon = ((ushort*)pEncoded)[1] + (((uint)pEncoded[6]) << 16);
                ilat = ((ushort*)pEncoded)[2] + (((uint)pEncoded[7]) << 16);
            }

            // Recalculate 0..(180/90) coordinates
            var dlon = (uint)(grid % 360) + (ilon / 10000000m);
            var dlat = (uint)(grid / 360) + (ilat / 10000000m);

            // Returns to WGS84
            _lon = (double)(dlon - 180m);
            _lat = (double)(dlat - 90m);
        }

        public double Lon { get { return _lon; } }
        public double Lat { get { return _lat; } }
        public long   Encoded { get { return _encoded; } }


        public static long PackCoord(double lon,double lat)
        {
            return (new CoordinateStore(lon, lat)).Encoded;
        }
        public static KeyValuePair<double, double> UnPackCoord(long coord)
        {
            var tmp = new CoordinateStore(coord);
            return new KeyValuePair<double, double>(tmp.Lat,tmp.Lon);
        }
    }
}

Source: http://www.dupuis.me/node/35

score 1 · Accepted Answer

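As @Roland Pihlakas has already pointed out, it all comes down to the precision you need for the coordinates.

Let me just suggest another point of view:

The Earth's equatorial circumference is about 40,000 km;
this equals 40 million meters, i.e. 4 billion centimeters;
a 32-bit variable holds 2^32, or about 4.2 billion, distinct values, which is slightly more than the number of centimeters in that circumference.
This means that if we choose 32-bit integer values for latitude and longitude, we can address a point on Earth with < 1 cm precision.
With floating-point values:
- float32 contains 23 significant bits => ~4.7 meters precision
- float64 contains 52 significant bits => < 1 mm precision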
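
The arithmetic behind these figures is just the circumference divided by the number of distinct values. A C sketch, with a made-up function name and the rounded 40,000 km circumference from above:

```c
#include <math.h>

/* Assumption, as in the estimate above: a 40,000 km circumference. */
#define CIRCUMFERENCE_M 4.0e7

/* Size in metres of one step when the circumference is split into 2^bits parts. */
static double step_size_m(int bits)
{
    return CIRCUMFERENCE_M / ldexp(1.0, bits);
}
```

This gives roughly 0.93 cm for 32 bits and roughly 4.77 m for 23 bits, consistent with the list.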

score 0 · Accepted Answer

You can use the decimal data type:

CREATE TABLE IF NOT EXISTS `map` (
  `latitude` decimal(18,15) DEFAULT NULL,
  `longitude` decimal(18,15) DEFAULT NULL 
);

score 0 · Accepted Answer

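Having come across this question while searching for an answer myself, here is another possible scheme based on some precedent.

The Network Working Group's RFC 3825 proposed a coordinate-based geographic location option for DHCP (i.e. the system that hands out IP addresses on a network). See https://tools.ietf.org/rfc/rfc3825.txt

In that scheme, latitude and longitude are encoded in degrees as fixed-point values, where the first 9 bits are the signed degrees, 25 bits are the fractional degrees, and 6 bits are used for the accuracy. The value of the accuracy bits indicates how many of the 25 fractional bits are considered to be accurate (e.g. coordinates collected via a consumer GPS vs a high-precision surveyor's GPS). Using WGS84, the precision is 8 decimal digits, which is good to about a millimeter regardless of where you are on the globe.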

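As some others have posted, floating-point encoding really isn't a good fit for this kind of thing. Yes, it can represent a very large number of decimal places, but the accuracy is either ignored or has to be dealt with somewhere else. For example, printing a float or a double with full floating-point precision produces a number whose decimal digits are very unlikely to be remotely accurate. Likewise, simply outputting a float or a double with 8 or 10 digits of precision will, because of how floating-point numbers are computed, often not be a true representation of the source value (e.g. why 1.2-1.0 does not equal 0.2 in floating-point arithmetic).

For a humorous example of why you should care about coordinate system precision, see https://xkcd.com/2170/.

Granted, the 40-bit encoding used in RFC 3825 is hardly convenient in a 32- or 64-bit world, but this style can easily be extended to a 64-bit number in which 9 bits are used for the signed degrees and 6 bits for the accuracy, leaving 49 bits for the fractional part. This yields 15 decimal digits of precision, which is basically more than anyone should ever need (see the humorous example).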

score 0 · Accepted Answer

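The best precision for the smallest size is int32.

Storing a longitude double at 7 decimal places (1.11 cm error) gives you a number within +/-1,800,000,000, which fits nicely in an int32; you just need to multiply the double by 10M: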

int32_t lng = (int32_t)(double_lng * 10000000);

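Explanation (Wikipedia)

The equator is divided into 360 degrees of longitude, so each degree at the equator represents 111,319.5 metres (111.32 km). As one moves away from the equator towards a pole, however, the length of one degree of longitude is multiplied by the cosine of the latitude, so the distance decreases and approaches zero at the pole. The number of decimal places required for 1 cm precision at the equator is 7. If you need to store 180° with 7 decimal places in an integer, the result is the number 1,800,000,000, which is within the range of a 32-bit integer.

As you can see in Google Maps, when you click on any place, Google gives you floating-point values with 6 decimal places, which fit in a 32-bit integer.

Comparison:

vs double -> half the size
vs float -> float does not have enough precision
vs the 24-bit suggestion: no 32- or 64-bit processor can address 24 bits directly; you have to fetch three bytes, convert them to an int32 or a double and then operate on that, wasting many cycles and many lines of code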
