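I'm pretty stuck right now. I've been looking around and experimenting with audio comparison, and I've found a fair amount of material along with many references to different libraries and approaches.
So far I've used Audacity to export a 3-minute WAV file called "long.wav", then split off its first 30 seconds into a file called "short.wav". I figured that somewhere along the way I could log the data for each file to a text file (log.txt) from Java and at least see some visual similarity between the values. Here's some code.
Main method: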
    int totalFramesRead = 0;
    File fileIn = new File(filePath);
    BufferedWriter writer = new BufferedWriter(new FileWriter(outPath));
    writer.flush();
    writer.write("");
    try {
        AudioInputStream audioInputStream =
            AudioSystem.getAudioInputStream(fileIn);
        int bytesPerFrame =
            audioInputStream.getFormat().getFrameSize();
        if (bytesPerFrame == AudioSystem.NOT_SPECIFIED) {
            // some audio formats may have unspecified frame size
            // in that case we may read any amount of bytes
            bytesPerFrame = 1;
        }
        // Set an arbitrary buffer size of 1024 frames.
        int numBytes = 1024 * bytesPerFrame;
        byte[] audioBytes = new byte[numBytes];
        try {
            int numBytesRead = 0;
            int numFramesRead = 0;
            // Try to read numBytes bytes from the file.
            while ((numBytesRead =
                    audioInputStream.read(audioBytes)) != -1) {
                // Calculate the number of frames actually read.
                numFramesRead = numBytesRead / bytesPerFrame;
                totalFramesRead += numFramesRead;
                // Here, do something useful with the audio data that's
                // now in the audioBytes array...
                if (totalFramesRead <= 4096 * 100) {
                    Complex[][] results = PerformFFT(audioBytes);
                    int[][] lines = GetKeyPoints(results);
                    DumpToFile(lines, writer);
                }
            }
        } catch (Exception ex) {
            // Handle the error...
        }
        audioInputStream.close();
    } catch (Exception e) {
        // Handle the error...
    }
    writer.close();
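One detail worth noting in the loop above: `read` can return fewer bytes than the buffer holds (the final read usually does), yet the whole `audioBytes` array is passed on each time. The following is only a sketch of cutting a byte array into fixed-size chunks based on the number of valid bytes rather than the buffer length. `Chunker`, `split`, and the `chunkSize` parameter are hypothetical names; `chunkSize` stands in for `Harvester.CHUNK_SIZE`, whose value is not shown in this post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class Chunker {
    /**
     * Splits the first validLength bytes of data into chunks of exactly
     * chunkSize bytes. A trailing partial chunk is dropped, so no chunk
     * ever contains stale bytes left over from an earlier read.
     */
    public static List<byte[]> split(byte[] data, int validLength, int chunkSize) {
        List<byte[]> chunks = new ArrayList<>();
        for (int start = 0; start + chunkSize <= validLength; start += chunkSize) {
            chunks.add(Arrays.copyOfRange(data, start, start + chunkSize));
        }
        return chunks;
    }
}
```

With this shape, a buffer smaller than one chunk yields an empty list instead of silently producing no FFT output, which makes a mismatch between buffer size and chunk size easier to spot.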
Then the FFT:
    public static Complex[][] PerformFFT(byte[] data) throws IOException
    {
        final int totalSize = data.length;
        int amountPossible = totalSize/Harvester.CHUNK_SIZE;
        //When turning into frequency domain we'll need complex numbers:
        Complex[][] results = new Complex[amountPossible][];
        //For all the chunks:
        for(int times = 0;times < amountPossible; times++) {
            Complex[] complex = new Complex[Harvester.CHUNK_SIZE];
            for(int i = 0;i < Harvester.CHUNK_SIZE;i++) {
                //Put the time domain data into a complex number with imaginary part as 0:
                complex[i] = new Complex(data[(times*Harvester.CHUNK_SIZE)+i], 0);
            }
            //Perform FFT analysis on the chunk:
            results[times] = FFT.fft(complex);
        }
        return results;
    }
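`PerformFFT` above feeds each raw byte to the FFT as if it were one sample. If the files are 16-bit PCM (an assumption, since the export settings are not stated here), every sample spans two bytes, and the individual bytes do not describe the waveform on their own. A minimal sketch of decoding such data into samples first, assuming signed little-endian 16-bit PCM; `PcmDecoder` and `decode16BitLittleEndian` are hypothetical names:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class PcmDecoder {
    /**
     * Decodes the first length bytes of data as signed 16-bit little-endian
     * PCM and scales each sample into the range [-1.0, 1.0).
     * A trailing odd byte is ignored. For stereo data the returned samples
     * stay interleaved (left, right, left, right, ...).
     */
    public static double[] decode16BitLittleEndian(byte[] data, int length) {
        int sampleCount = length / 2;
        double[] samples = new double[sampleCount];
        ByteBuffer buffer = ByteBuffer.wrap(data, 0, sampleCount * 2)
                                      .order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < sampleCount; i++) {
            samples[i] = buffer.getShort() / 32768.0;
        }
        return samples;
    }
}
```

The decoded values could then take the place of `data[...]` as the real part of each `Complex`. The actual sample size and byte order can be checked with `audioInputStream.getFormat()` before relying on this.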
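At this point I've tried logging everywhere: the audio bytes before conversion, the Complex values, and the FFT results.
The problem: no matter which values I log, the log.txt for each WAV file is completely different. I don't understand it. Given that I cut short.wav out of long.wav (and they have all the same properties), there should be a very strong similarity between the raw WAV byte[] data, or the Complex[][] FFT data, or something along the way.
If the data isn't even close to similar at any point in these calculations, how could I possibly go about comparing these files?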
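For reference, the raw-data check described above can be sketched directly, without any FFT: read the decoded audio bytes of both files and compare their common prefix. This is only an illustration under stated assumptions. `RawAudioCompare` and its methods are hypothetical names, and `maxSampleDifference` assumes signed 16-bit little-endian PCM.

```java
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.ByteArrayOutputStream;
import java.io.File;

public class RawAudioCompare {
    /** Reads all audio bytes of a file (the container header is not included). */
    public static byte[] readAudioBytes(File file) throws Exception {
        try (AudioInputStream in = AudioSystem.getAudioInputStream(file)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n); // keep only the bytes actually read
            }
            return out.toByteArray();
        }
    }

    /** Index of the first differing byte in the common prefix, or -1 if none. */
    public static int firstMismatch(byte[] a, byte[] b) {
        int limit = Math.min(a.length, b.length);
        for (int i = 0; i < limit; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        return -1;
    }

    /** Largest absolute difference between 16-bit samples in the common prefix. */
    public static int maxSampleDifference(byte[] a, byte[] b) {
        int sampleCount = Math.min(a.length, b.length) / 2;
        int max = 0;
        for (int i = 0; i < sampleCount; i++) {
            int sa = (short) ((a[2 * i] & 0xFF) | (a[2 * i + 1] << 8));
            int sb = (short) ((b[2 * i] & 0xFF) | (b[2 * i + 1] << 8));
            max = Math.max(max, Math.abs(sa - sb));
        }
        return max;
    }
}
```

Calling `readAudioBytes` on `new File("long.wav")` and `new File("short.wav")` and passing the results to `maxSampleDifference` gives a single number to look at. A small value (a few units) would suggest the samples match apart from low-level noise, for example dither applied on export, which is one possible cause and has not been verified for these files. A large value would point to a real offset or format difference.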
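I know I'm missing quite a bit of knowledge about audio analysis, which is why I've come to the boards for help. Thanks for any information, help, or fixes you can offer!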