
I have a 4-D dataset (as xr.DataArray) with dimensions temperature, datasource, time, and altitude.

How can I create a scatter plot of temperature(src0, z) vs. temperature(src1, z), so that I can select the altitude via a slider?

My current problem is that when I convert the data to an hv.Table, I get (among others) one column datasource and one column temperature, and I cannot figure out how to plot temperature(datasource=='src0') vs. temperature(datasource=='src1').


EDIT:

To clarify: I have a 4-D dataset DATA (an xr.DataArray) with dimensions data_variable, datasource, time, and altitude.

data_variable has 2 entries, temperature and humidity.

datasource has 2 entries, model and measurement.

There are 6 altitudes and ~2000 times.

How can I create a scatter plot which has

  • on the x-axis the data for the datasource model
  • on the y-axis the data for the datasource measurement

such that altitude and data_variable can be selected with a slider?

1 Answer

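If I understand your question correctly, you want to plot scatter values of temperature over time, comparing the two data sources and indexing by altitude?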

import holoviews as hv

# Load the data into a holoviews Dataset
# (data_array is your xr.DataArray)
ds = hv.Dataset(data_array)

# Create Scatter objects plotting time vs. temperature
# and group by altitude and datasource
scatter = ds.to(hv.Scatter, 'time', 'temperature',
                groupby=['altitude', 'datasource'], dynamic=True)

# Now overlay the datasource dimension and display
scatter.overlay('datasource')

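I hope I understood your question correctly, but based on this basic pattern you should be able to plot the data in whatever arrangement you want.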

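EDIT: Based on your edit, the main issue is that HoloViews expects each data_variable to be in a separate array, so the single stacked array has to be split into one variable per datasource/data_variable combination. In pandas terms, this is a reshape of the kind associated with pd.melt.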
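To make the reshape concrete in pandas terms, here is a minimal sketch that is not part of the original answer. It assumes a small hypothetical long-format table (`long_df`, with made-up values) of the kind you get when the array is flattened into a table with one `datasource` column and one `temperature` column. Pivoting on `datasource` gives one column per source, which is the layout a model-vs-measurement scatter needs; `pd.melt` goes the other way, from the wide layout back to the long one.

```python
import pandas as pd

# Hypothetical long-format table: one row per (time, altitude, datasource).
# The values are invented purely for illustration.
long_df = pd.DataFrame({
    "time": [0, 0, 1, 1],
    "altitude": [100, 100, 100, 100],
    "datasource": ["model", "measurement", "model", "measurement"],
    "temperature": [280.0, 281.5, 279.0, 278.2],
})

# Long -> wide: one column per datasource.
# pivot_table averages duplicate (time, altitude, datasource) rows;
# there are none here, so the values pass through unchanged.
wide_df = long_df.pivot_table(
    index=["time", "altitude"], columns="datasource", values="temperature"
).reset_index()
wide_df.columns.name = None  # drop the leftover 'datasource' axis label

# Wide -> long again with pd.melt, showing that the two layouts
# carry the same information.
round_trip = pd.melt(
    wide_df,
    id_vars=["time", "altitude"],
    value_vars=["model", "measurement"],
    var_name="datasource",
    value_name="temperature",
)

print(wide_df)
```

With the wide table, the `model` and `measurement` columns can serve directly as the x and y values of a scatter plot, with `altitude` left over as the dimension to select on.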

import numpy as np
import xarray as xr
import holoviews as hv

# Define data array like yours
dataarray = xr.DataArray(np.random.rand(10, 10, 2, 2), name='variable',
                   coords=[('time', range(10)), ('altitude', range(10)),
                           ('datasource', ['model', 'measurement']),
                           ('data_variable', ['humidity', 'temperature'])])

# Groupby datasource and data_variable, combining the resultant array into a Dataset with 4 data variables
group_dims = ['datasource', 'data_variable']
grouped = hv.Dataset(dataarray, datatype=['xarray']).groupby(group_dims)
# Each group is renamed to '<datasource> <data_variable>', e.g. 'model temperature'
dataset = xr.merge([da.data.rename({'variable': ' '.join(key)}).drop(group_dims)
                    for key, da in grouped.items()])

# Scatter model vs. measurement temperature, grouped by altitude
ds = hv.Dataset(dataset)
scatter = ds.to(hv.Scatter, 'model temperature', 'measurement temperature', 'altitude')

Note, however, that I ran into a bug while testing this, for which I have now opened a PR (see here).

answered 2017-03-20T14:16:41.160