python - How to extract ECMWF ERA-5 data along animal track points?

Question

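I want to extract/interpolate air temperature values from the ERA5 hourly pressure-level data for each track location (space [xy], time [t] and altitude).

I retrieved the ERA5 dataset with the CDS Toolbox as shown below, but I don't know how to extract the value at each point. I tried the Toolbox function 'ct.observation.interp_from_grid()', without success.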

import cdstoolbox as ct

# Initialise the application
@ct.application(title='my trial to retrieve and annotate movement data')

# Define a download output for the application
@ct.output.download()

# Define application function
def application():
    """Define a function that extracts hourly Air Temperature in 2018 for track points and provides a download link."""

    # Retrieve hourly air temperature
    data = ct.catalogue.retrieve(
        'reanalysis-era5-pressure-levels',
        {
            'variable': 'temperature',
            'product_type': 'reanalysis',
            'pressure_level': [
                '900', '925', '950',
            ],
            'year': 2018,
            'month': '05',
            'day': '01',
            'time': [
                '00:00', '01:00', '02:00', '03:00',
                '04:00', '05:00', '06:00', '07:00',
                '08:00', '09:00', '10:00', '11:00',
                '12:00', '13:00', '14:00', '15:00',
                '16:00', '17:00', '18:00', '19:00',
                '20:00', '21:00', '22:00', '23:00',
            ],
            'area': [
                48, 111, 47,
                112,
            ],
        }
    )
    # Interpolate data for track points
    indexers = {'lon': [146.29, 147.10], 'lat': [-6.689, -7.644], 'plev':['891', '653'], 'time':['2019-03-23 18:52:29', '2019-03-23 21:52:30']}
    points = ct.observation.interp_from_grid(data, method='nearest', drop='False', wrap_lon='False', **indexers)    
    
    print(points)
    return points

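Alternatively, I could download the ERA5 data first and then use the extract function of the raster package in R. However, I would rather not download huge datasets to my computer (possibly hundreds of GB, or even TB), because my track points span large spatial and temporal scales.

Here are some dummy track points, for demonstration only.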

structure(list(Latitude = c(-6.689718, -7.644683, -8.31021, -9.177921, 
-9.493564), Longitude = c(146.297638, 147.107101, 148.211472, 
148.670151, 149.00795), timestamp = c("2019-03-23 15:52:14", 
"2019-03-23 18:52:29", "2019-03-23 21:52:30", "2019-03-24 00:52:29", 
"2019-03-24 03:52:15"), altitude_hPa = c(891, 653, 521, 910, 
711)), class = "data.frame", row.names = c(NA, -5L))
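For comparison, the download-then-extract alternative mentioned above could look roughly like this in Python. This is only a minimal sketch under loud assumptions: the ERA5 subset is taken to be already loaded into a NumPy array with axes (time, level, lat, lon), random values on a made-up grid stand in for the real data, and nearest_index is a hypothetical helper, not part of any library. It uses nearest-neighbour lookup (nearest in hPa for the level), not true interpolation.

```python
import numpy as np

# Synthetic stand-in for a downloaded ERA5 subset (grid and values are made up).
times = np.arange('2019-03-23T00', '2019-03-25T00', dtype='datetime64[h]')
levels = np.array([500, 650, 700, 900, 925])        # pressure levels, hPa
lats = np.arange(-10.0, -5.75, 0.25)                # degrees north
lons = np.arange(146.0, 149.25, 0.25)               # degrees east
rng = np.random.default_rng(0)
temperature = 250 + 40 * rng.random(
    (times.size, levels.size, lats.size, lons.size)
)

# Dummy track points from the question.
track = {
    'lat': [-6.689718, -7.644683, -8.31021, -9.177921, -9.493564],
    'lon': [146.297638, 147.107101, 148.211472, 148.670151, 149.00795],
    'time': ['2019-03-23 15:52:14', '2019-03-23 18:52:29',
             '2019-03-23 21:52:30', '2019-03-24 00:52:29',
             '2019-03-24 03:52:15'],
    'plev': [891, 653, 521, 910, 711],
}


def nearest_index(axis, values):
    """Return, for every requested value, the index of the closest axis entry."""
    axis = np.asarray(axis, dtype='float64')
    values = np.asarray(values, dtype='float64')
    return np.abs(axis[None, :] - values[:, None]).argmin(axis=1)


# Compare times as seconds since the Unix epoch so all axes are plain floats.
grid_seconds = times.astype('datetime64[s]').astype('float64')
track_seconds = np.array(track['time'], dtype='datetime64[s]').astype('float64')

ti = nearest_index(grid_seconds, track_seconds)
pi = nearest_index(levels, track['plev'])
yi = nearest_index(lats, track['lat'])
xi = nearest_index(lons, track['lon'])

# One temperature value per track point (pointwise, not an outer product).
values = temperature[ti, pi, yi, xi]
for t, p, v in zip(times[ti], levels[pi], values):
    print(t, p, round(float(v), 2))
```

The four index arrays are passed together so that NumPy picks one grid cell per track point. This does not avoid the download itself, which is the concern raised above.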

I would appreciate any suggestions or alternative approaches.

Thanks in advance,

Bat

score 1 · Accepted Answer

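Hi Bat,

I had not come across cdstoolbox until now, but prompted by your demo request I took a closer look at it (using cdstoolbox-remote; very handy!). I traced the problem to the interp_from_grid method, which contains the following lines: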

if 'time' in indexers:
   indexers['time'] = indexers['time'].astype('float64')

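If indexers contains "time", the method tries to cast it to float64, which does not work for the list of str in your demo. To work around this, I converted the "time" list into an array of numpy.datetime64 values. Something like: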

numpy.array(['2019-03-23 18:52:29', '2019-03-23 21:52:30'], dtype = 'datetime64')

This resolves the "AttributeError: 'list' object has no attribute 'astype'" error (the array can now be cast to float64). However, the array is not JSON serializable, so a new error is raised instead.
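Both failure modes can be reproduced locally without cdstoolbox. The snippet below is a minimal sketch that only uses NumPy and the standard json module. It assumes, without confirmation, that the Toolbox request is serialized in a way that rejects NumPy arrays just as json.dumps does. The last step only shows that a plain list of floats is serializable; it is not a confirmed fix, since a plain list would again lack .astype.

```python
import json

import numpy as np

times = ['2019-03-23 18:52:29', '2019-03-23 21:52:30']

# A plain list has no .astype method, which reproduces the first error.
try:
    times.astype('float64')
except AttributeError as err:
    first_error = str(err)
    print('list:', first_error)

# A datetime64 array can be cast to float64 (seconds since the Unix epoch here).
stamps = np.array(times, dtype='datetime64')
as_float = stamps.astype('float64')
print('as float64:', as_float)

# The json module cannot encode a NumPy array.
try:
    json.dumps({'time': stamps})
except TypeError as err:
    second_error = type(err).__name__
    print('ndarray:', second_error, err)

# A plain Python list of floats, by contrast, serializes without trouble.
payload = json.dumps({'time': as_float.tolist()})
print(payload)
```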

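At this point I am a bit lost: does interpolation in time work at all? The method (I did not have time to dig deeper) seems to handle "time" in some way, but I could not find an example on the cdstoolbox website. Is interpolation to points in time even possible?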

All the best, R
