
I'm using Firebase Cloud Functions to transcribe user-uploaded audio files, based on the example code for longRunningRecognize:

// Detects speech in the audio file. This creates a recognition job that you
// can wait for now, or get its result later.
const [operation] = await client.longRunningRecognize(request);

// Get a Promise representation of the final result of the job
const [response] = await operation.promise();

This code works fine for short audio files that can be transcribed within the 9-minute maximum execution time of a Firebase Cloud Function. However, 1) many of my roughly hour-long user-uploaded files don't get transcribed that quickly, and 2) it seems wasteful to have a Cloud Function billed for every tenth of a second it spends just waiting for an API response.

I think the obvious fix here would be for Google's Speech-to-Text API to support webhooks.

Until that happens, how can I serialize and deserialize the SpeechClient operation so I can get the result of this transcription job later from a scheduled function?

Specifically, I'm looking for something that would work like the made-up SERIALIZE and DESERIALIZE functions in this example:

// start speech recognition job:
const [operation] = await client.longRunningRecognize(request);
const serializedOperation = operation.SERIALIZE();
db.doc("jobs/job1").set(serializedOperation);

// get the result later in a scheduled function:
const snap = await db.doc("jobs/job1").get();
const serializedOperation = snap.data();
const operation = DESERIALIZE(serializedOperation);
const [response] = await operation.promise();

2 Answers


LongRunningRecognize returns an Operation, and each Operation has a unique name.

You could save the Operation name somewhere and then, at a later time, call GetOperation with that name to check on the job.

Answered 2021-02-11T06:12:25.120

Thank you Brendan for the pointer to GetOperation. That was the linchpin I needed to figure this out.

Serializing an operation is trivially easy: just read the operation.name property, which is a string that uniquely identifies the operation.

Deserializing an operationName with the @google-cloud/speech node library was SO FRUSTRATINGLY DIFFICULT to work out, but I finally got there.

To check on the status of an Operation and get its result from an operation.name, use client.checkLongRunningRecognizeProgress(operation.name) like this:

const operation = await client.checkLongRunningRecognizeProgress(operationName);
if(operation.done) {
  console.log(JSON.stringify(operation.result));
} else {
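  // Still running: the metadata reports how far along the job is.
  const {progressPercent, startTime, lastUpdateTime} = operation.metadata;
  console.log(`Progress: ${progressPercent}%`);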
}
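
Putting the two halves together, a minimal sketch of the full round trip might look like the following. The helper names startTranscription and checkTranscription are hypothetical, client is assumed to be a SpeechClient from @google-cloud/speech, and jobRef is assumed to be a Firestore document reference such as db.doc("jobs/job1"). Handling of failed operations is left out.

```javascript
// Hypothetical helper: start the job and persist only the operation name.
// `client` is assumed to be a SpeechClient; `jobRef` a Firestore document reference.
async function startTranscription(client, jobRef, request) {
  const [operation] = await client.longRunningRecognize(request);
  // The name is a plain string, so it can be stored as-is.
  await jobRef.set({ operationName: operation.name });
  return operation.name;
}

// Hypothetical helper: call this later, e.g. from a scheduled function.
async function checkTranscription(client, jobRef) {
  const snap = await jobRef.get();
  const { operationName } = snap.data();
  const operation = await client.checkLongRunningRecognizeProgress(operationName);
  if (!operation.done) {
    // Not finished yet: report progress and try again on the next run.
    return { done: false, progressPercent: operation.metadata?.progressPercent };
  }
  return { done: true, result: operation.result };
}
```

The first function returns as soon as the job is created, so no Cloud Function sits idle while the audio is being transcribed; the scheduled function only pays for the short status check.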
Answered 2021-02-14T03:36:36.110