Apologies for the tongue twister of a title.

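In short, I'm trying to apply dtw to a data frame of tree-ring series that I have. Eventually I'd like to apply dtw to every column, comparing each column with the rest of the dataset, but for now I'm just trying to work out the logic, and it has me confused. Right now I have one array (one column) and an array of arrays (the other columns) that I want to compare it against, one at a time. Since I have 46 columns, doing this by hand would take a lot of time, so I'm looking for a way to print the distance between each pair of columns.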

I have my single array, which is column 1 (a1):

array([2231.121954586618, 2191.32688635395, 2153.33037342928,
       2167.460745065675, 2182.327272147529, 2148.104497944283,
       2114.629371754906, 2093.254599793933, 2013.228738264795,
       1960.124018035272, 1956.115012446374, 2004.772102502964,
       1996.031697793075, 1984.117922837165, 1927.018245950742,
       1889.983294062236, 1857.106663618318, 1855.521387844768,
       1854.30527162405, 1843.946144942001, 1834.918111326537,
       1786.367506785417, 1764.596236951255, 1765.789120636587,
       1768.225728544412, 1801.390137110182, 1820.438710725669,
       1821.776101512033, 1814.626915671021, 1789.410699262131,
       1752.680382970908, 1774.240633213347, 1793.576383615812,
       1802.430943044276, 1810.653920721653, 1832.59203921635,
       1836.215188930494, 1804.727265942576, 1807.798802135772,
       1853.273004232627, 1875.641068893134, 1880.352238594259,
       1845.111091114404, 1807.281434172499, 1802.326163448382,
       1779.565520429905, 1827.148896035324, 1860.634653074935],
      dtype=object)

and the array of arrays, which is columns 2:46 (a1_compare):

array([[2338.980748451803, 2313.115476761541, 2266.320969548615, ...,
        1971.777882561555, 2004.912406403344, 2005.090872507429],
       [5085.120869045766, 4994.508983933459, 4926.377921200292, ...,
        3810.539158921751, 3757.139414193585, 3698.921580852207],
       [1441.5932022738868, 1441.5932022738868, 250.2478024965511, ...,
        2864.532339498514, 2775.946234841519, 2764.567521984336],
       ...,
       [822.4370926086343, 848.1167402384477, 887.7301546370533, ...,
        1549.347739499023, 1592.226581401639, 1577.883355154341],
       [1508.596325796503, 1593.192415483712, 1587.73520115259, ...,
        1467.943298815971, 1556.004468001763, 1528.921150058964],
       [1300.0305814488, 1369.177320180398, 1480.576904436118, ...,
        1379.66588731831, 1367.312665162758, 1328.830519316272]],
      dtype=object)

Finally, here is the code I tried to compare them with:

def compare1(array1, array_arrays):
    for i in array_arrays:
        distance, path = fastdtw(a1, i, dist = manhattan_distance)
    return distance

But this returns only a single value:

compare1(a1, a1_compare)

12271.277

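whereas I want one distance per array. The first should be 4164.2393701224755, but I want all the others as well. Any suggestions on how to do this without comparing each column/array separately?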

1 Answer

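If I've interpreted your question correctly, your for loop computes and then overwrites distance for each sub-array of a1_compare, so only the value from the last iteration is returned. There are many ways to save the result of each iteration, but the most sensible one to me is to allocate an empty array with the same length as a1_compare and save each distance at the appropriate index of the output array: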

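import numpy as np
from fastdtw import fastdtw

def compare1(array1, array_arrays):
    # manhattan_distance is the distance function from the question; it is not defined here.
    distances = np.empty(len(array_arrays))  # create an empty container of the right size
    # enumerate is an easy way to get an index at the same time as the value itself
    for i, value in enumerate(array_arrays):
        # use the array1 parameter rather than the global a1
        distance, path = fastdtw(array1, value, dist=manhattan_distance)
        distances[i] = distance  # save this iteration's result
    return distances  # return all of them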
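
Since the end goal in the question is to compare every column with every other column, the same "collect instead of overwrite" idea might be extended as in the sketch below. To keep it runnable without extra packages, it uses a small exact DTW written in plain Python as a stand-in for fastdtw, so the numbers will not necessarily match what fastdtw reports. The names dtw_distance, compare_one_to_many and pairwise_distances are made up for illustration, manhattan_distance is assumed to be a plain absolute difference, and the data is a toy example rather than the tree-ring values.

```python
from itertools import combinations


def manhattan_distance(x, y):
    """Assumed distance between two scalar samples: absolute difference."""
    return abs(x - y)


def dtw_distance(a, b, dist=manhattan_distance):
    """Exact O(len(a) * len(b)) DTW; a hypothetical stand-in for fastdtw."""
    inf = float("inf")
    # cost[i][j] holds the cheapest alignment cost of a[:i] with b[:j]
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = dist(a[i - 1], b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[len(a)][len(b)]


def compare_one_to_many(series, others):
    """Return one DTW distance per series in `others`, in order."""
    return [dtw_distance(series, other) for other in others]


def pairwise_distances(columns):
    """Return {(i, j): distance} for every unordered pair of columns."""
    return {
        (i, j): dtw_distance(columns[i], columns[j])
        for i, j in combinations(range(len(columns)), 2)
    }


# Toy data for illustration only, not the values from the question.
columns = [
    [1.0, 2.0, 3.0],
    [1.0, 2.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
]
one_to_many = compare_one_to_many(columns[0], columns[1:])
all_pairs = pairwise_distances(columns)
```

With a real data frame, each inner list would be one column. The arrays in the question have dtype=object, so converting them to floats first (for example with astype(float)) is probably a safe step before passing them to any distance routine.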
Answered 2020-07-19T03:12:29.303