I am trying to write a Python program where the user supplies a video file and the program mutes/beeps out the curse/bad words in it and outputs a filtered video file, basically a profanity filter. My plan is to first convert the video file to .wav format, then apply the audio profanity filter to that .wav file, and finally write the filtered .wav file back into the video (a rough sketch of the conversion step is right below).
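This is roughly how I plan to handle the video-to-audio conversion and the final remux, calling the ffmpeg command-line tool through subprocess (I assume ffmpeg is installed; input.mp4, audio.wav, filtered.wav and output_filtered.mp4 are just placeholder names):

import subprocess

# extract the audio track of the video as a 16-bit PCM .wav file
subprocess.run([
    "ffmpeg", "-y", "-i", "input.mp4",              # input video (placeholder)
    "-vn",                                          # drop the video stream
    "-acodec", "pcm_s16le", "-ar", "44100", "-ac", "2",
    "audio.wav",
], check=True)

# ... run the profanity filter on audio.wav and write filtered.wav ...

# mux the filtered audio back, copying the original video stream untouched
subprocess.run([
    "ffmpeg", "-y", "-i", "input.mp4", "-i", "filtered.wav",
    "-map", "0:v:0", "-map", "1:a:0",               # video from file 0, audio from file 1
    "-c:v", "copy",                                 # no re-encoding of the video
    "output_filtered.mp4",
], check=True)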
So far I am able to split the audio file into chunks and use the speech_recognition library to extract text from each 5-second chunk. However, if a word overlaps two chunks I cannot detect it, and I have not yet managed to apply the check I want: if the chunk's text contains a word from the curse-word list, reduce that chunk's dB level so it is effectively muted (suggestions for beeping instead of muting are also welcome; a rough beep idea is sketched below).
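For the beep part, this is a rough sketch of what I am considering: if the recognised text of a chunk contains a curse word, replace that chunk with a 1 kHz tone of the same length generated with pydub (curse_words is the list loaded in my code below; censor_chunk is just a name I made up):

from pydub.generators import Sine

def censor_chunk(chunk, text, curse_words):
    """Return the chunk unchanged, or a beep of the same length if the
    recognised text contains any word from curse_words."""
    words = text.lower().split()
    if any(w.strip(".,!?") in curse_words for w in words):
        # 1 kHz sine tone, same duration as the chunk, at a moderate volume
        beep = Sine(1000).to_audio_segment(duration=len(chunk), volume=-20)
        # match the original chunk's parameters so the chunks can be re-joined later
        return beep.set_frame_rate(chunk.frame_rate).set_channels(chunk.channels)
    return chunk

If plain muting turns out to be enough, something like chunk - 60 (lowering the chunk by 60 dB) should do the same job without the tone.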
I am not sure whether my approach is even correct. I just want an audio profanity filter in Python, and up to now I have only managed to make the chunks and extract the text.
import speech_recognition as sr
import os
from pydub.utils import make_chunks
from pydub import AudioSegment
from pydub.playback import play
import codecs
import re
import fileinput
curse_words = []
# Curse words list
with codecs.open("words_final.txt", "r") as f0:
    sentences_lines = f0.read().splitlines()
    for sentences in sentences_lines:
        curse_words.append(sentences)
# print(curse_words)
# create a speech recognition object
r = sr.Recognizer()
# a function that splits the audio file into chunks
# and applies speech recognition
def get_large_audio_transcription(path):
"""
Splitting the large audio file into chunks
and apply speech recognition on each of these chunks
"""
# open the audio file using pydub
sound = AudioSegment.from_wav(path)
chunk_length_ms = 5000 # pydub calculates in millisec
chunks = make_chunks(sound, chunk_length_ms) #Make chunks of one sec
folder_name = "audio-chunks"
# create a directory to store the audio chunks
if not os.path.isdir(folder_name):
os.mkdir(folder_name)
whole_text = ""
# process each chunk
for i, audio_chunk in enumerate(chunks, start=1):
# export audio chunk and save it in
# the `folder_name` directory.
chunk_filename = os.path.join(folder_name, f"chunk{i}.wav")
audio_chunk.export(chunk_filename, format="wav")
# recognize the chunk
with sr.AudioFile(chunk_filename) as source:
audio_listened = r.record(source)
# try converting it to text
try:
text = r.recognize_google(audio_listened,language="en-US")
wav_file=AudioSegment.from_file(chunk_filename, format = "wav")
# Reducing volume by 5
silent_wav_file = AudioSegment.silent(duration=8000)
# Playing silent file
play(silent_wav_file)
except sr.UnknownValueError as e:
print("Error:", str(e))
else:
text = f"{text.capitalize()}. "
print(chunk_filename, ":", text)
whole_text += text
# return the text for all chunks detected
return whole_text
path = "Welcome.wav"
print("\nFull text:", get_large_audio_transcription(path))
# Will implement a loop to sum all the chunks and make a final filtered .wav file
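For that final step I imagine something along these lines, assuming the processed (beeped/muted or untouched) chunks are collected in a list while looping; filtered_chunks and join_chunks are placeholder names:

from pydub import AudioSegment

def join_chunks(filtered_chunks, out_path="filtered.wav"):
    """Concatenate the processed chunks and export them as one .wav file."""
    filtered_audio = AudioSegment.empty()
    for chunk in filtered_chunks:
        filtered_audio += chunk
    filtered_audio.export(out_path, format="wav")
    return out_path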