
I am doing a POC, and my requirement is to implement a feature like "OK Google" or "Hey Siri" in the browser.

I am using the Chrome browser's Web Speech API. The problem I noticed is that I can't run the recognition continuously: it terminates automatically after a certain period of time, and I understand this is due to security concerns. As a workaround, I restart the SpeechRecognition from its end event whenever it terminates, but this is not the best way to implement such a solution. For example, if I run two instances of the same application in different browser tabs, or if another application in my browser also uses speech recognition, neither application behaves as expected. I am looking for the best approach to solve this problem.
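
For reference, my current workaround looks roughly like this (just a sketch of the hack described above):

// restart recognition whenever it ends -- the hack described above
const recognition = new webkitSpeechRecognition();
recognition.continuous = true;
recognition.onend = () => {
  recognition.start(); // starts again after Chrome's automatic timeout
};
recognition.start();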

Thanks in advance.


1 Answer


Since your problem is that you can't run the SpeechRecognition continuously for long periods, one approach is to start the SpeechRecognition only when you get some input on the microphone.

This way, you only start the SR when there is some input, and you look for your magic_word.
If the magic_word is found, then you will be able to use the SR normally for your other tasks.

This can be detected with the WebAudioAPI, which is not subject to the time limits that the SR suffers from. You can feed it with a LocalMediaStream from MediaDevices.getUserMedia.

For more information on this, check this answer, which is used in the script below.

Here is how you can attach it to a SpeechRecognition:

const magic_word = 'YOUR_MAGIC_WORD'; // replace with your own trigger word

// initialize our SpeechRecognition object
let recognition = new webkitSpeechRecognition();
recognition.lang = 'en-US';
recognition.interimResults = false;
recognition.maxAlternatives = 1;
recognition.continuous = true;

// detect the magic word
recognition.onresult = e => {
  // extract all the transcripts
  const transcripts = [].concat.apply([], [...e.results]
    .map(res => [...res]
      .map(alt => alt.transcript)
    )
  );
  if (transcripts.some(t => t.indexOf(magic_word) > -1)) {
    // do something awesome, like starting your own command listeners
  }
  else {
    // didn't understand...
  }
};
// called when we detect silence
function stopSpeech() {
  recognition.stop();
}
// called when we detect sound
function startSpeech() {
  try { // calling it twice will throw...
    recognition.start();
  }
  catch (e) {}
}
// request a LocalMediaStream
navigator.mediaDevices.getUserMedia({ audio: true })
  // add our listeners
  .then(stream => detectSilence(stream, stopSpeech, startSpeech))
  .catch(e => console.log(e.message));


function detectSilence(
  stream,
  onSoundEnd = _=>{},
  onSoundStart = _=>{},
  silence_delay = 500,
  min_decibels = -80
  ) {
  const ctx = new AudioContext();
  const analyser = ctx.createAnalyser();
  const streamNode = ctx.createMediaStreamSource(stream);
  streamNode.connect(analyser);
  analyser.minDecibels = min_decibels;

  const data = new Uint8Array(analyser.frequencyBinCount); // will hold our data
  let silence_start = performance.now();
  let triggered = false; // trigger only once per silence event

  function loop(time) {
    requestAnimationFrame(loop); // we'll loop every 60th of a second to check
    analyser.getByteFrequencyData(data); // get current data
    if (data.some(v => v)) { // if there is data above the given db limit
      if (triggered) {
        triggered = false;
        onSoundStart();
      }
      silence_start = time; // set it to now
    }
    if (!triggered && time - silence_start > silence_delay) {
      onSoundEnd();
      triggered = true;
    }
  }
  loop(performance.now()); // seed the loop with an initial timestamp
}
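
As a hypothetical illustration (startCommandMode and its body are not part of the code above), the "do something awesome" branch could start your own command listener by swapping the onresult handler, like this:

function startCommandMode() {
  recognition.onresult = e => {
    // read only the latest result and treat it as a command
    const command = e.results[e.results.length - 1][0].transcript.trim();
    console.log('Command received:', command);
    // dispatch the command to your own handlers here
  };
}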

And as a plunker, since neither StackSnippets' nor jsfiddle's iframes will allow gUM, there are two versions...

answered 2017-11-14T05:09:48.620