python - Python: Writing a wav file into a numpy float array

Question

ifile = wave.open("input.wav")

How can I now write this file into a numpy float array?

score 33 · Accepted Answer

>>> import numpy
>>> from scipy.io.wavfile import read
>>> a = read("adios.wav")
>>> numpy.array(a[1], dtype=float)
array([ 128.,  128.,  128., ...,  128.,  128.,  128.])

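read returns a (rate, data) tuple, so a[1] is the sample array. The samples are normally bytes or integers; here we simply convert them to floats.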

You can read about it here: https://docs.scipy.org/doc/scipy/reference/tutorial/io.html#module-scipy.io.wavfile
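The array returned by scipy.io.wavfile.read keeps the integer dtype stored in the file, and a plain cast to float leaves the values unscaled (the 128. values above are consistent with an 8-bit unsigned file, where 128 is silence). The following is a minimal sketch, assuming integer PCM data, of scaling by dtype instead. The helper name to_float and the file name tone.wav are made up for illustration.

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile


def to_float(data):
    """Scale integer PCM samples to floats in roughly [-1.0, 1.0)."""
    if data.dtype == np.uint8:
        # 8-bit WAV data is unsigned, with silence at 128
        return (data.astype(np.float64) - 128.0) / 128.0
    if np.issubdtype(data.dtype, np.integer):
        # Signed PCM: divide by the magnitude of the most negative value
        return data.astype(np.float64) / -float(np.iinfo(data.dtype).min)
    # Already floating point (e.g. a 32-bit float WAV)
    return data.astype(np.float64)


# Write a short 16-bit test tone so the example is self-contained
path = os.path.join(tempfile.mkdtemp(), "tone.wav")
rate_out = 8000
t = np.arange(rate_out) / rate_out
tone = (0.5 * 32767 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)
wavfile.write(path, rate_out, tone)

rate, data = wavfile.read(path)
samples = to_float(data)
```

With a real file, only the last two lines are needed, with path pointing at the file to read.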

score 14 · Accepted Answer

Seven years after the question was asked...

import wave
import numpy

# Read file to get buffer
ifile = wave.open("input.wav")
samples = ifile.getnframes()
audio = ifile.readframes(samples)

# Convert buffer to float32 using NumPy (assumes 16-bit PCM samples)
audio_as_np_int16 = numpy.frombuffer(audio, dtype=numpy.int16)
audio_as_np_float32 = audio_as_np_int16.astype(numpy.float32)

# Normalise float32 array so that values are between -1.0 and +1.0
max_int16 = 2**15
audio_normalised = audio_as_np_float32 / max_int16
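The snippet above hard-codes int16, so it only suits 16-bit files. Below is a rough sketch of the same approach that checks getsampwidth() first and picks the dtype and divisor accordingly. The helper name read_wav_as_float and the file name example.wav are hypothetical, 24-bit files are not handled, and integer PCM is assumed throughout.

```python
import os
import tempfile
import wave

import numpy as np


def read_wav_as_float(path):
    """Return (samples, frame_rate); samples are float32 in [-1.0, 1.0)."""
    with wave.open(path, "rb") as wav_file:
        sample_width = wav_file.getsampwidth()  # bytes per sample
        frame_rate = wav_file.getframerate()
        raw = wav_file.readframes(wav_file.getnframes())

    if sample_width == 1:
        # 8-bit WAV data is unsigned, centred on 128
        data = np.frombuffer(raw, dtype=np.uint8).astype(np.float32) - 128.0
        return data / 128.0, frame_rate
    if sample_width == 2:
        data = np.frombuffer(raw, dtype="<i2").astype(np.float32)
        return data / 32768.0, frame_rate
    if sample_width == 4:
        data = np.frombuffer(raw, dtype="<i4").astype(np.float32)
        return data / 2147483648.0, frame_rate
    raise ValueError(f"unsupported sample width: {sample_width} bytes")


# Build a tiny 16-bit mono file to read back
path = os.path.join(tempfile.mkdtemp(), "example.wav")
pcm = np.array([0, 16384, -16384, 32767, -32768], dtype="<i2")
with wave.open(path, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(8000)
    out.writeframes(pcm.tobytes())

samples, rate = read_wav_as_float(path)
```

For multi-channel files the returned array is still interleaved, one value per channel per frame.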

score 3 · Accepted Answer

Use the librosa package and simply load the wav file into a numpy array:

y, sr = librosa.load(filename)

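This loads and decodes the audio as a time series y, represented as a one-dimensional NumPy floating-point array. The variable sr contains the sampling rate of y, that is, the number of samples per second of audio. By default, all audio is mixed to mono and resampled to 22050 Hz at load time. This behaviour can be overridden by supplying additional arguments to librosa.load().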

More information is in the Librosa library documentation.

score 0 · Accepted Answer

I don't have enough reputation to comment under @Matthew Walker's answer, so I am posting a new answer to add an observation to Matt's answer: max_int16 should be 2**15-1, not 2**15.

Better still, I think the normalisation line should be replaced with:

audio_normalised = audio_as_np_float32 / numpy.iinfo(numpy.int16).max
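Because the int16 range is asymmetric (-32768 to 32767), neither divisor maps both extremes exactly onto -1.0 and +1.0, so the choice is a trade-off. The small sketch below simply evaluates both divisors at the two extremes; the variable names are illustrative only.

```python
import numpy as np

info = np.iinfo(np.int16)
extremes = np.array([info.min, info.max], dtype=np.float64)  # -32768, 32767

# Divisor 32768: the minimum maps to exactly -1.0, the maximum to just under +1.0
scaled_by_2_15 = extremes / 2**15

# Divisor 32767: the maximum maps to exactly +1.0, the minimum to slightly below -1.0
scaled_by_max = extremes / info.max
```

Dividing by 2**15 keeps every value inside [-1.0, 1.0), while dividing by the int16 maximum reaches +1.0 at the cost of a minimum marginally outside that range.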

If the audio is stereo (i.e. two channels), the left and right values are interleaved, so to get a stereo array you can use the following:

channels = ifile.getnchannels()
audio_stereo = numpy.empty((len(audio_normalised) // channels, channels))
audio_stereo[:, 0] = audio_normalised[0::2]
audio_stereo[:, 1] = audio_normalised[1::2]

I believe this answers @Trees's question in the comments section.
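The same de-interleaving can also be written as a single reshape, which works for any channel count rather than just two. This is a sketch on a small hand-made array, assuming the sample count is an exact multiple of the number of channels; the names interleaved and audio_frames are illustrative.

```python
import numpy as np

channels = 2
# Interleaved samples: L0, R0, L1, R1, ...
interleaved = np.array([0.1, -0.1, 0.2, -0.2, 0.3, -0.3], dtype=np.float32)

# One row per frame, one column per channel
audio_frames = interleaved.reshape(-1, channels)
left = audio_frames[:, 0]
right = audio_frames[:, 1]
```

reshape returns a view where possible, so no samples are copied.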
