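The key piece of information I found was in this answer, and in others on the same page, which discuss how to use fast Fourier transforms in scipy to find the cross-correlation curve.
If your wave file (the input) is a tuple of two numpy arrays (left, right), each zero-padded by at least its own length (apparently to stop it aligning circularly), then the code follows from Gustavo's answer. I think you need to appreciate that FFTs assume time invariance, which means that if you want any kind of time-based tracking of the signal you need to "bite off" small samples of data.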
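To illustrate why the zero padding matters, here is a minimal sketch that does the correlation with numpy's raw FFT routines rather than `fftconvolve`. The function name `signed_lag_fft` and the toy signals are my own, purely for illustration. Without padding, a real lag of 7 frames wraps around and looks like a lag of -1.

```python
import numpy as np

def signed_lag_fft(a, b, pad=True):
    """Lag (in frames) by which a trails b, from an FFT cross-correlation.

    With pad=False the FFT length equals the chunk length, so the
    correlation is circular and large lags wrap around.
    """
    n = len(a)
    size = 2 * n if pad else n  # rfft zero-pads the input up to `size`
    spec = np.fft.rfft(a, size) * np.conj(np.fft.rfft(b, size))
    cor = np.fft.irfft(spec, size)
    k = int(np.argmax(np.abs(cor)))
    # indices in the upper half of the output stand for negative lags
    return k if k < size // 2 else k - size

# toy chunk: a click at the start of b shows up 7 frames later in a
b = np.zeros(8)
b[0] = 1.0
a = np.zeros(8)
a[7] = 1.0

padded_lag = signed_lag_fft(a, b, pad=True)     # the true lag
wrapped_lag = signed_lag_fft(a, b, pad=False)   # wrapped by the circular FFT
```

As far as I know, `fftconvolve` computes a linear (non-circular) convolution, so the explicit padding is mainly a concern when you multiply the spectra yourself like this.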
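To keep things simple, I would:
- assume everything is in one plane (no height factor)
- forget the difference between sound from the front and sound from behind (you can't tell them apart)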
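You also need to use the distance between the two microphones to make sure you aren't picking up echoes (time delays longer than the delay for a 90 degree source).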
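One way to apply that check, as a rough sketch: work out the longest delay the microphone spacing allows, and only look for the correlation peak inside that window. The helper names `max_lag_frames` and `bounded_lag` are hypothetical, and the 340 m/s speed of sound is the same assumption the code in this answer uses.

```python
import numpy as np

SPEED_OF_SOUND = 340.0  # m/s, assumed

def max_lag_frames(width, framerate, v=SPEED_OF_SOUND):
    """Largest physically possible delay, in frames, for mics `width` metres apart."""
    return int(np.floor(width / v * framerate))

def bounded_lag(cor, zero_index, max_lag):
    """Peak of `cor` within +/- max_lag frames of the zero-lag index."""
    lo = max(zero_index - max_lag, 0)
    hi = min(zero_index + max_lag + 1, len(cor))
    return lo + int(np.argmax(cor[lo:hi])) - zero_index

# toy correlation curve: a direct-path peak at lag +10 and a stronger echo at lag +40
cor = np.zeros(101)
cor[60] = 5.0
cor[90] = 10.0

limit = max_lag_frames(width=0.2, framerate=48000)   # 0.2 m spacing, 48 kHz
lag = bounded_lag(cor, zero_index=50, max_lag=limit)
```

A plain `argmax` over the whole curve would pick the echo here; restricting the search keeps the direct-path peak.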
import wave
import struct
from numpy import array, concatenate, argmax
from numpy import abs as nabs
from scipy.signal import fftconvolve
from matplotlib.pyplot import plot, show
from math import asin, degrees

def crossco(wav):
    """Returns cross correlation function of the left and right audio. It
    uses a convolution of left with the right reversed which is the
    equivalent of a cross-correlation.
    """
    cor = nabs(fftconvolve(wav[0], wav[1][::-1]))
    return cor

def trackTD(fname, width, chunksize=5000):
    track = []
    # open the wave file using Python's built-in wave library
    wav = wave.open(fname, 'r')
    # get the info from the file; this is kind of ugly and non-PEPish
    (nchannels, sampwidth, framerate, nframes, comptype, compname) = wav.getparams()
    # only loop while a whole chunk is left (tell() and nframes both count frames)
    while wav.tell() <= nframes - chunksize:
        # read chunksize frames as a sequence of bytes
        frames = wav.readframes(chunksize)
        # unpack that sequence into ints (assumes 16-bit little-endian samples)
        out = struct.unpack_from("<%dh" % (chunksize * nchannels), frames)
        # convert 2 channels to numpy arrays
        if nchannels == 2:
            # the left channel is the 0th and even-numbered elements
            left = array(out[0::2])
            # the right is all the odd elements
            right = array(out[1::2])
        else:
            left = array(out)
            right = left
        # zero pad each channel with zeroes as long as the source
        left = concatenate((left, [0] * chunksize))
        right = concatenate((right, [0] * chunksize))
        chunk = (left, right)
        # if the volume is very low (800 or less), assume 0 degrees
        if nabs(left).max() < 800:
            a = 0.0
        else:
            # otherwise compute how many frames of delay there are in this
            # chunk; zero lag sits at index len(right) - 1 of the full output
            cor = argmax(crossco(chunk)) - (2 * chunksize - 1)
            # calculate the time
            t = float(cor) / framerate
            # get the angle assuming v = 340 m/s: sin(a) = (t * v) / width
            sina = t * 340 / width
            # clamp so delays longer than width / v (echoes) don't break asin
            sina = max(-1.0, min(1.0, sina))
            a = degrees(asin(sina))
        # add the last angle delay value to a list
        track.append(a)
    wav.close()
    # plot the list
    plot(track)
    show()
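To do this on the fly, I guess you'd need an incoming stereo source that you can "listen" to for a short time (I used 1000 frames = 0.0208 s), then calculate and repeat.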
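A rough sketch of that listen, calculate, repeat loop, under several assumptions of my own. The blocks come from a simulated source (`simulated_blocks` is a stand-in for whatever audio input you actually have), the mics are 0.2 m apart, and the rate is 48 kHz, which is what 1000 frames = 0.0208 s implies. For blocks this short I've swapped in `numpy.correlate` for `fftconvolve`; it gives the same curve.

```python
import numpy as np
from math import asin, degrees

def angle_for_block(left, right, width, framerate, v=340.0, quiet=800):
    """Angle in degrees for one short stereo block (planar, front/back ambiguous)."""
    if np.abs(left).max() < quiet:
        return 0.0  # too quiet to trust, assume straight ahead
    cor = np.abs(np.correlate(left, right, mode="full"))
    # zero lag is at index len(right) - 1; negative means the right channel is late
    lag = int(np.argmax(cor)) - (len(right) - 1)
    sina = max(-1.0, min(1.0, lag / framerate * v / width))
    return degrees(asin(sina))

def track_stream(blocks, width, framerate):
    """Yield one angle per incoming (left, right) block."""
    for left, right in blocks:
        yield angle_for_block(left, right, width, framerate)

def simulated_blocks(delay, nblocks, blocksize=1000, seed=0):
    """Hypothetical source: noise whose right channel trails the left by `delay` > 0 frames."""
    rng = np.random.default_rng(seed)
    x = 5000.0 * rng.standard_normal(nblocks * blocksize + delay)
    left, right = x[delay:], x[:-delay]
    for i in range(0, nblocks * blocksize, blocksize):
        yield left[i:i + blocksize], right[i:i + blocksize]

angles = list(track_stream(simulated_blocks(delay=10, nblocks=5),
                           width=0.2, framerate=48000))
```

With a real input you would replace `simulated_blocks` with something that yields the two channels of each freshly captured block.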
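[Edit: found that you can easily use the fftconvolve function to do the correlation, by passing it the time-reversed series of one of the two channels]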
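A quick way to convince yourself of that equivalence is to compare against numpy's direct cross-correlation on random data. This is just a sanity check with made-up inputs, and it only holds as written for real-valued signals.

```python
import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(0)
left = rng.standard_normal(64)
right = rng.standard_normal(64)

# convolution with one series reversed in time...
via_fft = fftconvolve(left, right[::-1])
# ...against the direct "full" cross-correlation
direct = np.correlate(left, right, mode="full")

same = bool(np.allclose(via_fft, direct))
```

Both outputs have length `2 * 64 - 1`, with zero lag at index 63.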