I am using Librosa's native beat_track function as follows:

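from librosa.beat import beat_track

# Pass the signal and sample rate by keyword; recent librosa releases require keyword arguments here.
tempo, beat_frames = beat_track(y=audio, sr=sampling_rate)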

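The song's actual tempo is 146 BPM, but the function estimates 73.5 BPM. I do understand that 73.5 * 2 = 147 BPM, which is close to the true value. How can we achieve the following: 1. Know when to scale the estimate up or down. 2. Improve accuracy through some form of signal preprocessing.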

I am still learning DSP, so I may not be familiar with all of the concepts. Any guidance is appreciated. Thanks.

1 Answer

What you observe is the so-called "octave error", i.e., the estimate is wrong by a factor of 2, 1/2, 3, or 1/3. It is quite a common problem in global tempo estimation. A great, classic introduction to global tempo estimation can be found in An Experimental Comparison of Audio Tempo Induction Algorithms. The article also introduces the common metrics Acc1 and Acc2.
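To make the two metrics concrete, here is a minimal sketch of how they are commonly defined: Acc1 counts an estimate as correct if it lies within a tolerance (usually 4%) of the reference tempo, and Acc2 additionally accepts the octave-related factors 2, 3, 1/2 and 1/3. The function names and the helper `octave_factor` are my own for illustration and do not come from librosa or any other library.

```python
# Factors at which an estimate still counts as "correct" under Acc2.
OCTAVE_FACTORS = (1.0, 2.0, 3.0, 1.0 / 2.0, 1.0 / 3.0)


def acc1(estimate_bpm, reference_bpm, tolerance=0.04):
    """True if the estimate is within +/- tolerance of the reference tempo."""
    return abs(estimate_bpm - reference_bpm) <= tolerance * reference_bpm


def acc2(estimate_bpm, reference_bpm, tolerance=0.04):
    """Like acc1, but octave errors (x2, x3, x1/2, x1/3) are also accepted."""
    return any(
        acc1(estimate_bpm, reference_bpm * factor, tolerance)
        for factor in OCTAVE_FACTORS
    )


def octave_factor(estimate_bpm, reference_bpm, tolerance=0.04):
    """Hypothetical helper: the factor relating estimate to reference, or None."""
    for factor in OCTAVE_FACTORS:
        if acc1(estimate_bpm, reference_bpm * factor, tolerance):
            return factor
    return None


# The numbers from the question: 73.5 BPM estimated, 146 BPM annotated.
print(acc1(73.5, 146.0))           # False
print(acc2(73.5, 146.0))           # True
print(octave_factor(73.5, 146.0))  # 0.5
```

With the values from the question, the estimate fails Acc1 but passes Acc2 with a factor of 0.5, which is exactly the half-tempo octave error. Note that this only diagnoses the error when a reference tempo is available; it does not tell you which octave to pick for an unlabeled track.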

Since the publication of that article, many researchers have tried to solve the octave-error problem. The most promising approaches (from my very biased point of view) are A Single-Step Approach to Musical Tempo Estimation Using a Convolutional Neural Network by myself (you might also want to check out this later paper, which uses a simpler NN architecture) and Multi-Task Learning of Tempo and Beat: Learning One to Improve the Other by Böck et al.

Both approaches use convolutional neural networks (CNNs) to analyze spectrograms. While a CNN could also be implemented on top of librosa, librosa currently lacks the programmatic infrastructure to do this easily. Another audio analysis framework seems to be a step ahead in this regard: Essentia, which is capable of running TensorFlow models.

Answered 2020-05-06T08:02:20.697