I'm trying to feed custom data (MIDI vectors) into the PCA class of the sklearn library.
Below is the current shape of my data.
data
[[[ 4. 56. ] # [rhythm1 melody1]
[ 2. 56. ] # [rhythm2 melody2]
[ 2. 55. ] # [rhythm3 melody3]
[ 2.5 55. ] # ...
[ 1.5 -1. ] # ...
[ 4. -1. ] # ...
[ 4. -1. ] # ...
[ 4. -1. ] # ...
[ 4. -1. ] # ...
[ 4. -1. ]] # [rhythm n melody n]
# next is another MIDI file's Rhythm & Melody
[[ 4. 56. ]
[ 2. 56. ]
[ 2. 55. ]
[ 2.5 55. ]
[ 1.5 -1. ]
[ 4. -1. ]
[ 4. -1. ]
[ 4. -1. ]
[ 4. -1. ]
[ 4. -1. ]]]
I know that sklearn expects its input to be a 2D array of shape (n_samples, n_features). However, I want to reduce the dimensionality so that each rhythm and melody pair is combined into a single value.
In the end, I want my data to look like this:
data
[[x1 # dimension reduction about (rhythm1 & melody1)
x2 # dimension reduction about (rhythm2 & melody2)
x3 # dimension reduction about (rhythm3 & melody3)
x4 # ...
x5 # ...
x6 # ...
x7 # ...
x8 # ...
x9 # ...
x10] # dimension reduction about (rhythm10 & melody10)
# next is another MIDI file's Rhythm & Melody
[y1
y2
y3
y4
y5
y6
y7
y8
y9
y10]]
Here is my code.
from sklearn.decomposition import PCA

def PCA_preprocessing(data, n_components=2):
    # Fit PCA on the data and return the projected data
    pca = PCA(n_components=n_components)
    pca.fit(data)
    PCA_data = pca.transform(data)
    return PCA_data

num_seq = data.shape[1] # num_seq = 10
PCA_data = PCA_preprocessing(data, n_components=num_seq)
Running it raises this error:
ValueError: Found array with dim 3. Estimator expected <= 2.
How can I solve this problem? Thank you for reading.
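For reference, here is a minimal sketch of one approach that might produce the layout I described. It assumes it is acceptable to treat every (rhythm, melody) pair from every file as one sample, fit a single PCA on all of them, and keep one component per pair. The helper name `pca_per_step` is hypothetical, and the two identical files are just the sample values from above. Note that `n_components` cannot exceed the number of features (2 here), so `n_components=num_seq` would not work on the flattened data either.

```python
import numpy as np
from sklearn.decomposition import PCA


def pca_per_step(data, n_components=1):
    """Reduce each (rhythm, melody) pair to n_components values (hypothetical helper)."""
    n_files, n_seq, n_features = data.shape
    # Flatten files and time steps into rows: (n_files * n_seq, n_features)
    flat = data.reshape(-1, n_features)
    pca = PCA(n_components=n_components)
    reduced = pca.fit_transform(flat)
    # Restore the per-file layout: (n_files, n_seq, n_components)
    reduced = reduced.reshape(n_files, n_seq, n_components)
    # Drop the trailing axis when only one component is kept
    return reduced[..., 0] if n_components == 1 else reduced


# Sample values copied from the question: two files, ten steps each
one_file = [[4.0, 56.0], [2.0, 56.0], [2.0, 55.0], [2.5, 55.0], [1.5, -1.0],
            [4.0, -1.0], [4.0, -1.0], [4.0, -1.0], [4.0, -1.0], [4.0, -1.0]]
data = np.array([one_file, one_file])

PCA_data = pca_per_step(data)
print(PCA_data.shape)  # (2, 10)
```

Fitting one PCA across all files keeps the projection consistent between files. Whether that is appropriate, or whether rhythm and melody should be scaled first because their ranges differ, is a separate decision this sketch does not address.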