
I'm working on a project to recognize simple audio patterns. I have two data sets, each made up of between 4 and 32 note/duration pairs. One set is predefined; the other comes from an incoming data stream. Two strongly correlated data sets often differ in length but have roughly the same "shape". My goal is to come up with some sort of ranking of how well the two data sets correlate/match.

I have converted the incoming frequencies to pitch and shifted the incoming data stream's pitch so that its average pitch matches that of the predefined data set. I also stretch/compress the incoming data set's durations to match the overall duration of the predefined set. Here are two graphical examples of data that should be ranked as strongly correlated:

http://s2.postimage.org/FVeG0-ee3c23ecc094a55b15e538c3a0d83dd5.gif

(Sorry, as a new user I couldn't directly post images)

I'm doing this on an 8-bit microcontroller, so resources are minimal. Speed is less of an issue; a second or two of processing isn't a deal breaker.

It wouldn't surprise me if there is an obvious solution; I've just been staring at the problem too long. Any ideas?

Thanks in advance...


2 Answers


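I can't see the graphs, but... Divide the spectrum into a number of bins. You are probably already doing this, but your bins may be too fine. Depending on your application, consider dividing the spectrum into 16 or 32 bins, probably logarithmically spaced, since that is how we hear. Then compare the power in each bin. For example, compare the ratio of the power at 500 Hz to the power at 1000 Hz in the first sample with the same ratio in the second sample. This eliminates any problem caused by the samples having unequal amplitudes.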
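A minimal sketch of that comparison in integer-only C, under some assumptions: the per-bin power values are already available as `uint16_t`, there are 16 bins, and instead of comparing pairwise bin ratios each bin is scaled to its share of the total power (which cancels overall amplitude in the same way). `NUM_BINS`, `normalize_bins`, and `spectrum_distance` are hypothetical names used only for illustration.

```c
#include <stdint.h>

#define NUM_BINS 16  /* assumed bin count; the answer suggests 16 or 32 */

/* Scale each bin to its share of the total power (0..255),
 * so that the overall amplitude of the sample cancels out. */
static void normalize_bins(const uint16_t *power, uint8_t *share)
{
    uint32_t total = 0;
    for (uint8_t i = 0; i < NUM_BINS; i++) {
        total += power[i];
    }
    for (uint8_t i = 0; i < NUM_BINS; i++) {
        share[i] = total ? (uint8_t)(((uint32_t)power[i] * 255u) / total) : 0;
    }
}

/* Sum of absolute differences between the normalized spectra.
 * 0 means the same relative shape; larger means less similar. */
uint16_t spectrum_distance(const uint16_t *a, const uint16_t *b)
{
    uint8_t sa[NUM_BINS];
    uint8_t sb[NUM_BINS];
    uint16_t dist = 0;

    normalize_bins(a, sa);
    normalize_bins(b, sb);
    for (uint8_t i = 0; i < NUM_BINS; i++) {
        dist += (uint16_t)(sa[i] > sb[i] ? sa[i] - sb[i] : sb[i] - sa[i]);
    }
    return dist;
}
```

With this scheme, a sample that is simply twice as loud as another yields a distance of 0, while swapping the power between two bins yields a non-zero distance.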

answered 2010-03-29T18:32:50.033

One-dimensional signal matching is usually done with a convolution function. However, that can be processor-intensive.
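For a rough idea of the cost, here is a sketch of the closely related cross-correlation approach: slide one sequence over the other and keep the shift with the largest sum of products. It assumes both melodies have been resampled to pitch values at uniform time steps and that the average pitch has been subtracted, so values are signed offsets that fit in `int8_t`. The names `correlate_at_lag` and `best_lag` are hypothetical.

```c
#include <stdint.h>

/* Sum of products of a[i] and b[i + lag] over the overlapping region. */
int32_t correlate_at_lag(const int8_t *a, uint8_t na,
                         const int8_t *b, uint8_t nb, int8_t lag)
{
    int32_t sum = 0;
    for (uint8_t i = 0; i < na; i++) {
        int16_t j = (int16_t)i + lag;
        if (j >= 0 && j < (int16_t)nb) {
            sum += (int32_t)a[i] * b[j];
        }
    }
    return sum;
}

/* Try every lag in [-max_lag, max_lag] and return the one with the
 * highest correlation. On ties the smallest lag wins. */
int8_t best_lag(const int8_t *a, uint8_t na,
                const int8_t *b, uint8_t nb, int8_t max_lag)
{
    int8_t best = (int8_t)(-max_lag);
    int32_t best_sum = correlate_at_lag(a, na, b, nb, best);

    /* int16_t loop counter avoids overflow when max_lag is 127 */
    for (int16_t lag = -max_lag + 1; lag <= max_lag; lag++) {
        int32_t s = correlate_at_lag(a, na, b, nb, (int8_t)lag);
        if (s > best_sum) {
            best_sum = s;
            best = (int8_t)lag;
        }
    }
    return best;
}
```

The work grows with the sequence length times the number of lags tried, which is why this can be heavy on a small processor for long inputs, although sequences of at most 32 notes keep it modest.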

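A simpler algorithm would be to first check whether the duration of each note is roughly equal in the two signals. Then check whether the two signals have the same next-frequency pattern. By next-frequency pattern, I mean reducing the ordered list of frequencies to an ordered list of whether each following frequency is higher or lower. So something that goes from 500 Hz to 1000 Hz to 700 Hz to 400 Hz would simply become higher-lower-lower. This might be good enough, depending on your purposes.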
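The higher/lower comparison could be sketched like this in C. It assumes both sequences have already been reduced to the same number of notes `n` and that pitches are stored as `int16_t`; the tolerance parameter `tol` (for treating near-equal notes as "same") and the function names are additions for illustration, not part of the description above.

```c
#include <stdint.h>

/* Direction of one step: +1 higher, -1 lower, 0 unchanged within tol. */
static int8_t step_direction(int16_t from, int16_t to, int16_t tol)
{
    int16_t d = (int16_t)(to - from);
    if (d > tol) {
        return 1;
    }
    if (d < -tol) {
        return -1;
    }
    return 0;
}

/* Count how many of the n-1 steps move in the same direction in both
 * sequences. A result of n-1 means the contours are identical. */
uint8_t contour_matches(const int16_t *a, const int16_t *b,
                        uint8_t n, int16_t tol)
{
    uint8_t matches = 0;
    for (uint8_t i = 1; i < n; i++) {
        if (step_direction(a[i - 1], a[i], tol) ==
            step_direction(b[i - 1], b[i], tol)) {
            matches++;
        }
    }
    return matches;
}
```

Dividing the match count by `n - 1` gives a crude score between 0 and 1 that could serve as the ranking. Because only directions are compared, the two inputs do not need to use the same units or the same key.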

answered 2010-06-19T05:04:36.727