python - Reshaping a Python list to match the input layer (data preprocessing - Keras - LSTM - MoCap)

Question

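Good day,

I'm trying to train an LSTM using multiple Excel files (motion capture data) as input. Each Excel file represents one body motion, and I'd like to train the network using multiple motions in both the training set and the test set. Below is an example of a single Excel file:

As for the input shape, it is (1, 2751, 93), which breaks down as samples: 1, time steps: 2751, features: 93.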
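
For reference, here is a minimal sketch of how one recording maps onto that (samples, time steps, features) layout. It uses random numbers in place of a real motion file, and the names `recording` and `batch` are made up for illustration.

```python
import numpy as np

# Synthetic stand-in for one motion file: 2751 time steps, 93 features
rng = np.random.default_rng(0)
recording = rng.normal(size=(2751, 93))

# Add a leading "samples" axis so the array is (samples, time steps, features)
batch = np.expand_dims(recording, axis=0)
print(batch.shape)  # (1, 2751, 93)
```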

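The independent variables (x) are the human body joints and their positions, and the dependent variable (y) is the label of each motion.

Thanks in advance!

Edit: added detailed code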

# Multiple sheets: load every CSV in the working directory into a list of DataFrames
import glob
import os

import pandas as pd

# Collect the file name of each motion recording
motionName = []
for ds in glob.glob("*.csv"):
    head, tail = os.path.split(ds)
    motionName.append(tail)
    print('Motion Name: ', tail)

num_rows = 300          # number of rows (time steps) read from each file
samples = 0
datasets = []           # one DataFrame per motion
activityIndex = []      # label (file name) of each motion
list_num_features = []  # column labels of each DataFrame
for activity in motionName:
    # header=None with skiprows=1 skips the first row and numbers the columns 0..n-1
    data = pd.read_csv(activity, nrows=num_rows, header=None, skiprows=1)
    datasets.append(data)
    list_num_features.append(list(data.columns))
    activityIndex.append(activity)
    samples += 1
print('activityIndex : {} '.format(activityIndex))
# Print the first rows of every loaded DataFrame
for name, data in zip(motionName, datasets):
    print(name)
    print(data.head())

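Output:

So the expected output when calling "df.head()" looks similar to this output:

What I want is to be able to get/print each record (row) separately when needed. I can do this when loading a single DataFrame with the sample code below, but it fails when I load multiple DataFrames into a list and then try to apply the same steps to each DataFrame in a loop.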

# Single Sheet
import pandas as pd
dataset = pd.read_csv('motion.csv')
index = dataset.index
print(len(index))
num_rows = len(index)
dataset.head()

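Edit: clarifying the question!

Simply put, what I have right now is the following:

8 DataFrames stored in a list (list shape (8,))
Each DataFrame has shape (300, 93)

What I want to do is, for example, reshape this list to (8, 300, 93) so that it matches the input layer of the neural network.

However, I keep getting the following error: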

ValueError: cannot reshape array of size 8 into shape (8,300,93)

I'd appreciate a clarification if possible, since it isn't clear to me why I'm getting this error.

Thanks in advance!

score 0 · Accepted Answer

I wrote this function to handle the preprocessing and overcome the reshaping issue. The function also encodes the labels (y) using scikit-learn's 'LabelEncoder()'.

## Data Preprocessing
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

def preprocess_df(df, start, quantity, numRows, df_name):
    # df is a list of CSV file names; the files df[start:quantity] are loaded
    x = []
    features = []
    y = []
    label_encoder = LabelEncoder()
    for i in range(start, quantity):
        data = pd.read_csv('{}'.format(df[i]), nrows=numRows, skiprows=1)
        y.append(df[i])  # the file name serves as the label
        x.append(data)
        if i == start:
            # Column names are taken from the first file only
            features = list(data.columns)
        # Position within this set, so x[k] stays valid even when start > 0
        k = i - start
        print('({}/{}) x[{}]: {}'.format(k+1, (quantity - start), k, x[k].shape))
    print('{} set (x) shape: {}, {} set (y) shape: {}'.format(df_name, np.array(x).shape, df_name, np.array(y).shape))
    # Encode the string labels as integers
    y = np.array(label_encoder.fit_transform(y))
    return np.array(x), y, np.array(features)
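
As a complement, here is a minimal sketch of the stacking step on its own, using synthetic DataFrames in place of the 8 motion files (the random data and the name `frames` are assumptions for illustration). A reshape to (8, 300, 93) needs 8 × 300 × 93 = 223,200 elements, so it cannot succeed on an array of size 8. An array of that size typically arises when the list is converted to a 1-D array of objects, for example when the frames do not all share one shape. Stacking the frames' underlying arrays with `np.stack` builds the 3-D array directly and raises an error if any shape differs.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the 8 motion files: each frame is (300, 93)
rng = np.random.default_rng(0)
frames = [pd.DataFrame(rng.normal(size=(300, 93))) for _ in range(8)]

# Every frame must have the same shape before stacking
shapes = {df.shape for df in frames}
if len(shapes) != 1:
    raise ValueError('frames have different shapes: {}'.format(shapes))

# Stack along a new leading axis: (samples, time steps, features)
x = np.stack([df.to_numpy(dtype='float32') for df in frames])
print(x.shape)  # (8, 300, 93)
```

If the shape check fails, reading the same number of rows from every file, or trimming or padding the shorter recordings, is one way to make the frames uniform before stacking.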
