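I want to run a Python script in the background and use pyaudio to record sound files when the microphone level reaches a certain threshold. This is for a monitor on a two-way radio network, so we only want to record transmitted audio.

Tasks in mind:

  • Record audio input on a n% gate threshold

  • stop recording after so many seconds of silence

  • keep recording for so many seconds after audio

  • Phase 2: input data into MySQL database to search the recordings

I am looking at a file structure similar to

/home/Recodings/2013/8/23/12-33.wav would be a recording of the transmission on 23/08/2013 @ 12:33

I have used the code from

Detect and record a sound with python

I'm at a bit of a loss where to go from here, and a little guidance would be greatly appreciated.

Thank you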

4 Answers

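The current top answer is a bit outdated and only works for Python 2. Here is a version updated for Python 3. It wraps the functions into a class and packages everything into one simple, easy-to-use version. Note that there is one key difference between the top answer and my script:

The top answer's script records one file and then stops, while my script keeps recording whenever noise is detected and dumps the recordings into a directory.

The main idea of both scripts is very similar:

Step 1: "Listen" until the rms becomes greater than the threshold

Step 2: Start recording and set a timer for when to stop recording, == TIMEOUT_LENGTH

Step 3: If the rms breaks the threshold again before the timer runs out, reset the timer

Step 4: Once the timer has expired, write the recording to the directory and go back to step 1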

import pyaudio
import math
import struct
import wave
import time
import os

Threshold = 10

SHORT_NORMALIZE = (1.0/32768.0)
chunk = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 16000
swidth = 2

TIMEOUT_LENGTH = 5

f_name_directory = r'C:\Users\Jason\PyCharmProjects\AutoRecorder\records'

class Recorder:

    @staticmethod
    def rms(frame):
        count = len(frame) // swidth  # number of 16-bit samples (integer division)
        fmt = "%dh" % count  # struct format: `count` signed shorts
        shorts = struct.unpack(fmt, frame)

        sum_squares = 0.0
        for sample in shorts:
            n = sample * SHORT_NORMALIZE
            sum_squares += n * n
        rms = math.pow(sum_squares / count, 0.5)

        return rms * 1000

    def __init__(self):
        self.p = pyaudio.PyAudio()
        self.stream = self.p.open(format=FORMAT,
                                  channels=CHANNELS,
                                  rate=RATE,
                                  input=True,  # capture only; no playback stream is needed
                                  frames_per_buffer=chunk)

    def record(self):
        print('Noise detected, recording beginning')
        rec = []
        current = time.time()
        end = time.time() + TIMEOUT_LENGTH

        while current <= end:

            data = self.stream.read(chunk)
            if self.rms(data) >= Threshold:
                # Still noisy: push the stop time back (step 3)
                end = time.time() + TIMEOUT_LENGTH

            current = time.time()
            rec.append(data)
        self.write(b''.join(rec))

    def write(self, recording):
        n_files = len(os.listdir(f_name_directory))

        filename = os.path.join(f_name_directory, '{}.wav'.format(n_files))

        wf = wave.open(filename, 'wb')
        wf.setnchannels(CHANNELS)
        wf.setsampwidth(self.p.get_sample_size(FORMAT))
        wf.setframerate(RATE)
        wf.writeframes(recording)
        wf.close()
        print('Written to file: {}'.format(filename))
        print('Returning to listening')



    def listen(self):
        print('Listening beginning')
        while True:
            data = self.stream.read(chunk)  # renamed from `input` to avoid shadowing the builtin
            rms_val = self.rms(data)
            if rms_val > Threshold:
                self.record()

a = Recorder()

a.listen()
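
To try the four steps without a microphone, the same timer logic can be run over precomputed levels. This is a minimal sketch, not part of the script above: `segment_by_threshold`, `levels`, `chunk_seconds`, and the sample values are hypothetical, and time is counted in chunks rather than read from `time.time()`.

```python
def segment_by_threshold(levels, threshold, timeout_length, chunk_seconds):
    """Return (start, end) chunk-index pairs (end exclusive), one per recording."""
    segments = []
    start = None      # index where the current recording began; None while listening
    deadline = 0.0    # simulated time at which the current recording stops
    for i, level in enumerate(levels):
        now = i * chunk_seconds
        if start is not None and now > deadline:
            # Step 4: timer expired, close the recording and go back to listening
            segments.append((start, i))
            start = None
        if start is None:
            # Step 1: listen until the level exceeds the threshold
            if level > threshold:
                # Step 2: start recording and arm the timer
                start = i
                deadline = now + timeout_length
        elif level >= threshold:
            # Step 3: noise again before the timer ran out, so reset the timer
            deadline = now + timeout_length
    if start is not None:
        # Input ended while still recording
        segments.append((start, len(levels)))
    return segments


levels = [0, 0, 20, 0, 20, 0, 0, 0, 0, 30, 0]
print(segment_by_threshold(levels, threshold=10, timeout_length=2.0, chunk_seconds=1.0))
# prints [(2, 7), (9, 11)]
```

The second burst at index 4 arrives before the first timer runs out, so both bursts end up in one recording; the burst at index 9 starts a new one.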
Answered 2018-05-15T00:22:08.533

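Some time ago I wrote code for some of these steps:

  • Record audio input on a n% gate threshold

A: Keep a Boolean variable for "silence". Compute the RMS of each chunk and compare it with an RMS threshold to decide whether silence is true or false.

  • stop recording after so many seconds of silence

A: You need to compute a timeout. Take the frame rate, the chunk size, and the number of seconds you want, and compute the timeout as a number of chunks: (FrameRate / chunk * Max_Seconds)

  • keep recording for so many seconds after audio

A: When silence is false (RMS > Threshold), take the last chunk of audio data (LastBlock) and just keep recording :-)

  • Phase 2: input data into MySQL database to search the recordings

A: This step is up to you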
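
The timeout in the second item is a count of chunks rather than a number of seconds. A small worked example with the values used in this answer's source code (`RATE = 16000`, `chunk = 1024`, `Max_Seconds = 10`); the helper name `chunks_for_timeout` is hypothetical:

```python
def chunks_for_timeout(frame_rate, chunk, max_seconds):
    # Each read of `chunk` frames covers chunk / frame_rate seconds of audio,
    # so max_seconds of audio takes frame_rate / chunk * max_seconds reads.
    return int(float(frame_rate) / chunk * max_seconds)


print(chunks_for_timeout(16000, 1024, 10))  # prints 156
```

Under Python 2 integer division, `RATE / chunk` is 15, so the same formula gives 150 there, and the source code adds 2 to that count.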

Source code:

import pyaudio
import math
import struct
import sys  # needed for sys.exit() in listen()
import wave

# Energy threshold. Note that rms() below returns a normalized RMS scaled
# by 1000, not a dB value, so tune this number by trial and error.
Threshold = 30

SHORT_NORMALIZE = (1.0/32768.0)
chunk = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 16000
swidth = 2
Max_Seconds = 10
TimeoutSignal=((RATE / chunk * Max_Seconds) + 2)
silence = True
FileNameTmp = '/home/Recodings/2013/8/23/12-33.wav'
Time=0
all =[]

def GetStream(chunk):
    return stream.read(chunk)
def rms(frame):
    count = len(frame)/swidth
    format = "%dh"%(count)
    # short is 16 bit int
    shorts = struct.unpack( format, frame )

    sum_squares = 0.0
    for sample in shorts:
        n = sample * SHORT_NORMALIZE
        sum_squares += n*n
    # compute the rms 
    rms = math.pow(sum_squares/count,0.5);
    return rms * 1000

def WriteSpeech(WriteData):
    stream.stop_stream()
    stream.close()
    p.terminate()
    wf = wave.open(FileNameTmp, 'wb')
    wf.setnchannels(CHANNELS)
    wf.setsampwidth(p.get_sample_size(FORMAT))
    wf.setframerate(RATE)
    wf.writeframes(WriteData)
    wf.close()

def KeepRecord(TimeoutSignal, LastBlock):
    all.append(LastBlock)
    for i in range(0, TimeoutSignal):
        try:
            data = GetStream(chunk)
        except:
            continue
        # I changed this (new indentation)
        all.append(data)

    print "end record after timeout";
    data = ''.join(all)
    print "write to File";
    WriteSpeech(data)
    silence = True
    Time=0
    listen(silence,Time)     

def listen(silence,Time):
    print "waiting for Speech"
    while silence:
        try:
            input = GetStream(chunk)
        except:
            continue
        rms_value = rms(input)
        if (rms_value > Threshold):
            silence=False
            LastBlock=input
            print "hello ederwander I'm Recording...."
            KeepRecord(TimeoutSignal, LastBlock)
        Time = Time + 1
        if (Time > TimeoutSignal):
            print "Time Out No Speech Detected"
            sys.exit()

p = pyaudio.PyAudio()

stream = p.open(format = FORMAT,
    channels = CHANNELS,
    rate = RATE,
    input = True,
    output = True,
    frames_per_buffer = chunk)

listen(silence,Time)
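
`FileNameTmp` above is hard-coded to the example path from the question. Below is a sketch of how a per-transmission path in that layout could be built from a timestamp. `recording_path` and `base_dir` are hypothetical names, and the unpadded month and day plus the `HH-MM` file name are read off the single example path in the question, so treat the exact format as an assumption.

```python
import os
from datetime import datetime


def recording_path(base_dir, when):
    """Build base_dir/YYYY/M/D/HH-MM.wav for the given datetime."""
    folder = os.path.join(base_dir, str(when.year), str(when.month), str(when.day))
    filename = "{:02d}-{:02d}.wav".format(when.hour, when.minute)
    return os.path.join(folder, filename)


# Example timestamp taken from the question: 23/08/2013 @ 12:33
path = recording_path("/home/Recodings", datetime(2013, 8, 23, 12, 33))
print(path)
```

The dated folder has to exist before `wave.open` can write into it, for example by calling `os.makedirs(os.path.dirname(path))` first.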
Answered 2013-08-24T21:14:23.227

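So you just need the getLevel(data) function? A quick hack would be:

def getLevel(data):
    sqrsum = 0
    # bytearray yields integer byte values on both Python 2 and Python 3
    for b in bytearray(data):
        sqrsum += b * b
    return sqrsum

This should increase with volume. Set the threshold appropriately through trial and error.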
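
One caveat with summing raw byte values: if the stream is signed 16-bit (`paInt16`, as in the other answers), a small negative sample such as -1 is stored as the bytes 0xFF 0xFF, so quiet audio can still produce a large sum. A minimal sketch that decodes the samples first, assuming little-endian 16-bit mono data; `get_level_int16` is a hypothetical name and not part of the answer above:

```python
import struct


def get_level_int16(data):
    """Sum of squared signed 16-bit samples; larger for louder audio."""
    count = len(data) // 2  # two bytes per sample
    samples = struct.unpack("<%dh" % count, data[:count * 2])
    return sum(s * s for s in samples)


# Synthetic test data: same waveform shape at two amplitudes
quiet = struct.pack("<4h", 1, -1, 1, -1)
loud = struct.pack("<4h", 1000, -1000, 1000, -1000)
print(get_level_int16(quiet), get_level_int16(loud))  # prints 4 4000000
```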

Answered 2013-08-23T18:24:06.400

For those who have trouble installing pyaudio because portaudio.h is missing, you can do this:

sudo apt-get install portaudio19-dev python-pyaudio python3-pyaudio

Answer taken from: portaudio.h: No such file or directory

Answered 2021-07-18T11:46:20.163