Building on this answer, you can get a good performance boost by using numpy.fromstring or numpy.fromfile. See also this answer.
Here is what I did:
import numpy as np

def interpret_wav(raw_bytes, n_frames, n_channels, sample_width, interleaved=True):
    if sample_width == 1:
        dtype = np.uint8  # unsigned char
    elif sample_width == 2:
        dtype = np.int16  # signed 2-byte short
    else:
        raise ValueError("Only supports 8 and 16 bit audio formats.")

    # np.frombuffer replaces the deprecated binary mode of np.fromstring.
    # It returns a read-only view of raw_bytes instead of copying it.
    channels = np.frombuffer(raw_bytes, dtype=dtype)

    if interleaved:
        # Channels are interleaved, i.e. sample N of channel M follows
        # sample N of channel M-1 in the raw data.
        channels.shape = (n_frames, n_channels)
        channels = channels.T
    else:
        # Channels are not interleaved. All samples from channel M-1 occur
        # before all samples from channel M.
        channels.shape = (n_channels, n_frames)

    return channels
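To make the interleaved case concrete, here is a small sketch of the same reshape-and-transpose step on a hand-built buffer. The sample values and the three-frame stereo layout are made up for illustration, and it uses reshape rather than assigning to shape.

```python
import numpy as np

# Three stereo frames of 16-bit samples, interleaved as L0, R0, L1, R1, L2, R2.
# These values are invented purely for the demonstration.
raw_bytes = np.array([1, -1, 2, -2, 3, -3], dtype="<i2").tobytes()

samples = np.frombuffer(raw_bytes, dtype="<i2")  # flat view over the bytes
channels = samples.reshape(3, 2).T               # shape (n_channels, n_frames)

left, right = channels
print(left.tolist())   # [1, 2, 3]
print(right.tolist())  # [-1, -2, -3]
```

Each row of the result is one channel, so indexing channels[0] gives every left sample in order.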
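Assigning a new value to shape raises an error if the reshape would require copying the data in memory. That is a good thing, because you want to use the data in place (less time and memory overall). ndarray.T also avoids copying where possible (i.e. it returns a view), but I am not sure how you would make certain that it does not copy.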
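One way to check after the fact is np.shares_memory, which reports whether two arrays overlap in memory. The following is only a sketch on a placeholder buffer of silence, not part of the function above.

```python
import numpy as np

# Placeholder input: 3 frames x 2 channels x 2 bytes, all zeros.
raw_bytes = bytes(12)

flat = np.frombuffer(raw_bytes, dtype=np.int16)
channels = flat.reshape(3, 2).T

# True means the transposed array still points at the original buffer.
is_view = np.shares_memory(channels, flat)

# For contrast: forcing a C-contiguous layout has to copy the transposed data.
copied = np.ascontiguousarray(channels)
is_copy = not np.shares_memory(copied, flat)

print(is_view, is_copy)
```

If is_view ever came back False, a copy was made somewhere along the way.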
Reading directly from the file with np.fromfile would be even better, but you would have to skip the header using a custom dtype. I have not tried this.
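A rough sketch of that idea follows. It uses the offset argument of np.fromfile instead of a custom dtype, and it assumes a canonical 44-byte PCM header with 16-bit little-endian samples. The function name read_pcm16 and the header_size parameter are hypothetical. Files that carry extra chunks before the data chunk would need real header parsing.

```python
import numpy as np

def read_pcm16(path, n_channels, header_size=44):
    """Read 16-bit PCM samples from a WAV file, skipping a fixed-size header.

    Assumption: the header is exactly header_size bytes (44 for a plain
    RIFF/fmt/data file) and everything after it is interleaved sample data.
    """
    # offset is given in bytes and skips the header before reading samples.
    samples = np.fromfile(path, dtype="<i2", offset=header_size)
    # One row per frame, then transpose to get one row per channel.
    return samples.reshape(-1, n_channels).T
```

For example, read_pcm16("stereo.wav", 2) would return an array of shape (2, n_frames), where "stereo.wav" stands in for your own file.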