Here are my Google Speech to Text AI settings.
Here is the output file from Speech to Text AI: https://justpaste.it/speechtotext2
Here is the output file from YouTube's auto-generated captions: https://justpaste.it/ytautotranslate
Here is the video link: https://www.youtube.com/watch?v=IOMO-kcqxJ8&ab_channel=SoftwareEngineeringCourses-SECourses
Here is the audio file of the video that was provided to Google Speech AI: https://storage.googleapis.com/text_speech_furkan/machine_learning_lecture_1.flac
Here are the SRT files with timestamps:
SRT from YouTube: https://drive.google.com/file/d/1yPA1m0hPr9VF7oD7jv5KF7n1QnV3Z82d/view?usp=sharing
SRT from the Google Speech to Text API (timestamps taken from YouTube's): https://drive.google.com/file/d/1AGzkrxMEQJspYenCbohUM4iuXN7H89wH/view?usp=sharing
I compared a number of sentences, and YouTube's auto-generated captions are clearly better.
For example:
Google Speech to Text: Represent the **doctor** representation is one of the hardest part of computer AI you will learn about more about that in the future lessons.
What does this mean? Do you think this means that we are not just focused on behavior and **into doubt**. It is more about the reasoning when a human takes an action. There is a reasoning behind it.
YouTube auto-generated captions: represent the **data** representation is one of the hardest part of computer ai you will we will learn more about that in the future lessons
what does this mean do you think this means that we are not just focused on behavior and **input** it is more about the reasoning when a human takes an action there is a reasoning behind it
I checked many cases, and YouTube guesses the correct words far more often. How is this possible?
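To make the comparison less anecdotal, the two transcripts can be diffed word by word. The following is only a minimal sketch using Python's standard `difflib`; the helper names `normalize` and `word_differences` are made up for illustration, and the two strings are the first example sentence from above with the bold markers removed.

```python
import difflib
import re


def normalize(text):
    """Lowercase and drop punctuation so only word choices are compared."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_differences(a, b):
    """Return (words_from_a, words_from_b) pairs where the transcripts disagree."""
    wa, wb = normalize(a), normalize(b)
    matcher = difflib.SequenceMatcher(None, wa, wb, autojunk=False)
    return [
        (" ".join(wa[i1:i2]), " ".join(wb[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# First example sentence from each system (copied from the post).
speech_api = (
    "Represent the doctor representation is one of the hardest part of "
    "computer AI you will learn about more about that in the future lessons."
)
youtube = (
    "represent the data representation is one of the hardest part of "
    "computer ai you will we will learn more about that in the future lessons"
)

for left, right in word_differences(speech_api, youtube):
    print(f"{left!r} -> {right!r}")
```

Running the same function over the full transcript files would list every place where the two systems disagree, although deciding which side is correct still requires listening to the audio or having a reference transcript.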
Here is the command I used to extract the audio from the video: ffmpeg -i "input.mkv" -af aformat=s16:48000 output.flac