I am building an application for the Amazon Echo in Python. When I say something the Amazon Echo cannot recognize, my skill exits and returns to the home screen. I want to prevent this and instead have the Echo repeat what it just said.

To get at least part of the way there, I tried calling a function that says something when the session ends or bad input is detected.

def on_session_ended(session_ended_request, session):
    """
    Called when the user ends the session.
    Is not called when the skill returns should_end_session=true
    """
    print("on_session_ended requestId=" + session_ended_request['requestId'] +
          ", sessionId=" + session['sessionId'])
    return get_session_end_response()

However, I just get an error from the Echo, and this on_session_ended function is never entered.

So how do I do error catching and handling on the Amazon Echo?

Update 1: I reduced the utterances and the number of intents using my custom slot to one. Now the user should only say A, B, C, or D. If they say anything outside that range, the intent is still triggered, but with no slot value, so I can do some error checking based on whether the slot value exists. This does not seem like the best approach, though. And when I tried adding an intent with no slots and its own utterances, anything that did not match any of my other intents defaulted to that new intent. How do I solve these problems?
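
Roughly, the check I have in mind looks like the sketch below. It reuses the build_response / build_speechlet_response helpers and the get_answer_response function shown in Update 2; the wrapper name handle_answer_intent and the prompt wording are only illustrative.

def handle_answer_intent(intent_slots, attributes):
    """Sketch: re-prompt instead of ending the session when the Answer slot is empty."""
    slot = (intent_slots or {}).get("Answer", {})
    if "value" not in slot:
        # Nothing usable was recognized: repeat the prompt and keep the session open.
        speech = "Sorry, I didn't catch that. Please answer with A, B, C, or D."
        return build_response(
            attributes or {},
            build_speechlet_response("", speech, speech, False))
    # The slot was filled: hand off to the normal answer handling.
    return get_answer_response(intent_slots, attributes)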

Update 2: Here are the relevant parts of my code.

The intent handler:

def lambda_handler(event, context):
    print("Python START -------------------------------")
    print("event.session.application.applicationId=" +
          event['session']['application']['applicationId'])

    if event['session']['new']:
        on_session_started({'requestId': event['request']['requestId']},
                           event['session'])

    if event['request']['type'] == "LaunchRequest":
        return on_launch(event['request'], event['session'])
    elif event['request']['type'] == "IntentRequest":
        return on_intent(event['request'], event['session'])
    elif event['request']['type'] == "SessionEndedRequest":
        return on_session_ended(event['request'], event['session'])


def on_session_started(session_started_request, session):
    print("on_session_started requestId=" + session_started_request['requestId']
          + ", sessionId=" + session['sessionId'])


def on_launch(launch_request, session):
    """ Called when the user launches the skill without specifying what they want """
    print("on_launch requestId=" + launch_request['requestId'] +
          ", sessionId=" + session['sessionId'])
    # Dispatch to your skill's launch
    return create_new_user()


def on_intent(intent_request, session):
    """ Called when the user specifies an intent for this skill """

    print("on_intent requestId=" + intent_request['requestId'] +
          ", sessionId=" + session['sessionId'])

    intent = intent_request['intent']
    intent_name = intent['name']
    attributes = session["attributes"] if 'attributes' in session else None
    intent_slots = intent['slots'] if 'slots' in intent else None

    # Dispatch to skill's intent handlers

    # TODO : Authenticate users
    #   TODO : Start session in a different spot depending on where user left off

    if intent_name == "StartQuizIntent":
        return create_new_user()

    elif intent_name == "AnswerIntent":
        return get_answer_response(intent_slots, attributes)

    elif intent_name == "TestAudioIntent":
        return get_audio_response()

    elif intent_name == "AMAZON.HelpIntent":
        return get_help_response()

    elif intent_name == "AMAZON.CancelIntent":
        return get_session_end_response()

    elif intent_name == "AMAZON.StopIntent":
        return get_session_end_response()

    else:
        return get_session_end_response()


def on_session_ended(session_ended_request, session):
    """
    Called when the user ends the session.
    Is not called when the skill returns should_end_session=true
    """
    print("on_session_ended requestId=" + session_ended_request['requestId'] +
          ", sessionId=" + session['sessionId'])
    return get_session_end_response()

Then there are the functions that actually get called, plus the response builders. I have redacted some code for privacy. I have not built out all of the display response text fields yet, and a uuid is hardcoded so that I do not have to worry about authentication for now.

import json
import time
import urllib.request

# --------------- Functions that control the skill's behavior ------------------

####### GLOBAL SETTINGS ########
utility_background_image = "https://i.imgur.com/XXXX.png"


def get_welcome_response():
    """ Returns the welcome message if a user invokes the skill without specifying an intent """
    session_attributes = {}
    card_title = ""
    speech_output = ("Hello and welcome ... quiz .... blah blah ...")
    reprompt_text = "Ask me to start and we will begin the test!"
    should_end_session = False

    # visual responses
    primary_text = ''  # TODO
    secondary_text = ''  # TODO

    return build_response(session_attributes,
                          build_speechlet_response(card_title, speech_output, reprompt_text,
                                                   should_end_session,
                                                   build_display_response(utility_background_image,
                                                                          card_title, primary_text,
                                                                          secondary_text)))


def get_session_end_response():
    """ Returns the ending message if a user errs or exits the skill """
    session_attributes = {}
    card_title = ""
    speech_output = "Thank you for your time!"
    reprompt_text = ''
    should_end_session = True

    # visual responses
    primary_text = ''  # TODO
    secondary_text = ''  # TODO

    return build_response(session_attributes,
                          build_speechlet_response(card_title, speech_output, reprompt_text,
                                                   should_end_session,
                                                   build_display_response(utility_background_image,
                                                                          card_title, primary_text,
                                                                          secondary_text)))


def get_audio_response():
    """ Tests the audio capabilities of the echo """
    session_attributes = {}
    card_title = ""  # TODO : keep no 'welcome'?
    speech_output = ""
    reprompt_text = ""
    should_end_session = False

    # visual responses
    primary_text = ''  # TODO
    secondary_text = ''  # TODO

    return build_response(session_attributes,
                          build_speechlet_response(card_title, speech_output, reprompt_text,
                                                   should_end_session, build_audio_response()))


def create_new_user():
    """ Creates a new user that the server will recognize and whose action will be stored in db """
    url = "http://XXXXXX:XXXX/create_user"
    response = urllib.request.urlopen(url)
    data = json.loads(response.read().decode('utf8'))
    uuid = data["uuid"]
    return ask_question(uuid)


def query_server(uuid):
    """ Requests to get num_questions number of questions from the server """
    url = "http://XXXXXXXX:XXXX/get_question_json?uuid=%s" % (uuid)  # TODO : change needs to to be uuid
    response = urllib.request.urlopen(url)
    data = json.loads(response.read().decode('utf8'))

    if data["status"]:
        question = data["data"]["question"]
        quid = data["data"]["quid"]
        next_quid = data["data"]["next_quid"]  # TODO : will we need any of this?
        topic = data["data"]["topic"]
        type = data["data"]["type"]
        media_type = data["data"]["media_type"]  # either 'IMAGE', 'AUDIO', or 'VIDEO'
        answers = data["data"]["answer"]  # list of answers stored in order they should be spoken
        images = data["data"]["image"]  # list of images that correspond to order of answers list
        audio = data["data"]["audio"]
        video = data["data"]["video"]

        question_data = {"status": True, "data":{"question": question, "quid": quid, "answers": answers,
                         "media_type": media_type, "images": images, "audio": audio, "video": video}}
        if next_quid == "None":  # the server encodes "no next question" as the string "None"
            return None
        return question_data
    else:
        return {"status": False}


def ask_question(uuid):
    """ Returns a quiz question to the user since they specified a QuizIntent """
    question_data = query_server(uuid)

    if question_data is None:
        return get_session_end_response()

    card_title = "Ask Question"
    speech_output = ""
    session_attributes = {}
    should_end_session = False
    reprompt_text = ""

    # visual responses
    display_title = ""
    primary_text = ""
    secondary_text = ""

    question = ""  # keep defined so the display builder below works even on a server error
    images = []
    answers = []

    if question_data["status"]:
        session_attributes = {
            "quid": question_data["data"]["quid"],
            "uuid": "df876c9d-cd41-4b9f-a3b9-3ccd1b441f24",
            "question_start_time": time.time()
        }

        question = question_data["data"]["question"]
        answers = question_data["data"]["answers"]  # answers are shuffled when pulled from server
        images = question_data["data"]["images"]
        # TODO : consider different media types

        speech_output += question
        reprompt_text += ("Please choose an answer using the official NATO alphabet. For example," +
                          " A is alpha, B is bravo, and C is charlie.")

    else:
        speech_output += "Oops! This is embarrassing. There seems to be a problem with the server."
        reprompt_text += "I don't exactly know where to go from here. I suggest restarting this skill."

    return build_response(session_attributes, build_speechlet_response(card_title, speech_output,
            reprompt_text, should_end_session,
             build_display_response_list_template2(title=question, image_urls=images, answers=answers)))


def send_quiz_responses_to_server(uuid, quid, time_used_for_question, answer_given):
    """ Sends the users responses back to the server to be stored in the database """
    url = ("http://XXXXXXXX:XXXX/send_answers?uuid=%s&quid=%s&time=%s&answer_given=%s" %
          (uuid, quid, time_used_for_question, answer_given))
    response = urllib.request.urlopen(url)
    data = json.loads(response.read().decode('utf8'))
    return data["status"]


def get_answer_response(slots, attributes):
    """ Returns a correct/incorrect message to the user depending on their AnswerIntent """

    # get time, quid, and uuid from attributes
    question_start_time = attributes["question_start_time"]
    quid = attributes["quid"]
    uuid = attributes["uuid"]

    # get answer from slots
    try:
        answer_given = slots["Answer"]["value"].lower()
    except KeyError:
        return get_session_end_response()

    # calculate a rough estimate of the time it took to answer question
    time_used_for_question = str(int(time.time() - question_start_time))

    # record response data by sending it to the server
    send_quiz_responses_to_server(uuid, quid, time_used_for_question, answer_given)

    return ask_question(uuid)


def get_help_response():
    """ Returns a help message to the user since they called AMAZON.HelpIntent """
    session_attributes = {}
    card_title = ""
    speech_output = "" # TODO
    reprompt_text = "" # TODO
    should_end_session = False

    return build_response(session_attributes,
            build_speechlet_response(card_title, speech_output, reprompt_text, should_end_session,
             build_display_response(utility_background_image, card_title)))


# --------------- Helpers that build all of the responses ----------------------


def build_hint_response(hint):
    """
    Builds the hint response for a display.

    For example, Try "Alexa, play number 1" where "play number 1" is the hint.
    """
    return {
        "type": "Hint",
        "hint": {
            "type": "RichText",
            "text": hint
        }
    }


def build_display_response(url='', title='', primary_text='', secondary_text='', tertiary_text=''):
    """
    Builds the display template for the echo show to display.

    Echo show screen is 1024px x 600px

    For additional image size requirements, see the display interface reference.
    """
    return [{
        "type": "Display.RenderTemplate",
        "template": {
            "type": "BodyTemplate1",
            "token": "question",
            "title": title,
            "backgroundImage": {
                "contentDescription": "Question",
                "sources": [
                    {
                        "url": url
                    }
                ]
            },
            "textContent": {
                "primaryText": {
                    "type": "RichText",
                    "text": primary_text
                },
                "secondaryText": {
                    "type": "RichText",
                    "text": secondary_text
                },
                "tertiaryText": {
                    "type": "RichText",
                    "text": tertiary_text
                }
            }
        }
    }]


def build_list_item(url='', primary_text='', secondary_text='', tertiary_text=''):
    return {
        "token": "question_item",
        "image": {
            "sources": [
                {
                    "url": url
                }
            ],
            "contentDescription": "Question Image"
        },
        "textContent": {
            "primaryText": {
                "type": "RichText",
                "text": primary_text
            },
            "secondaryText": {
                "text": secondary_text,
                "type": "PlainText"
            },
            "tertiaryText": {
                "text": tertiary_text,
                "type": "PlainText"
            }
        }
    }


def build_display_response_list_template2(title='', image_urls=[], answers=[]):
    list_items = []
    for image, answer in zip(image_urls, answers):
        list_items.append(build_list_item(url=image, primary_text=answer))

    return [{
        "type": "Display.RenderTemplate",
        "template": {
            "type": "ListTemplate2",
            "token": "question",
            "title": title,
            "backgroundImage": {
                "contentDescription": "Question Background",
                "sources": [
                    {
                        "url": "https://i.imgur.com/HkaPLrK.png"
                    }
                ]
            },
            "listItems": list_items
        }
    }]


def build_audio_response(url):  # TODO add a display response here as well
    """ Builds audio response. I.e. plays back an audio file with zero offset """
    return [{
        "type": "AudioPlayer.Play",
        "playBehavior": "REPLACE_ALL",
        "audioItem": {
            "stream": {
                "token": "audio_clip",
                "url": url,
                "offsetInMilliseconds": 0
            }
        }
    }]


def build_speechlet_response(title, output, reprompt_text, should_end_session, directive=None):
    """ Builds speechlet response and puts display response inside """
    return {
        'outputSpeech': {
            'type': 'PlainText',
            'text': output
        },
        'card': {
            'type': 'Simple',
            'title': title,
            'content': output
        },
        'reprompt': {
            'outputSpeech': {
                'type': 'PlainText',
                'text': reprompt_text
            }
        },
        'shouldEndSession': should_end_session,
        'directives': directive
    }


def build_response(session_attributes, speechlet_response):
    """ Builds the complete response to send back to Alexa """
    return {
        'version': '1.0',
        'sessionAttributes': session_attributes,
        'response': speechlet_response
    }

Update 3: I updated the intents so that there is now one custom intent that takes a custom slot, and another custom intent that uses no slots. These custom intents also have their own sample utterances; the intents and utterances are listed below. When I start the skill, it works fine. Then, when I say/type "zoo zoo zoo" to test bad input, I get an error. The request and response for "zoo zoo zoo" are listed below. I am looking for a good way to catch this bad-input error and restore the skill to its previous state.

The intents:

...
{
      "intent": "TestAudioIntent"
},
{
      "slots": [
        {
          "name": "Answer",
          "type": "LETTER"
        }
      ],
      "intent": "AnswerIntent"
},
...

The sample utterances:

AnswerIntent {Answer}
AnswerIntent I think it is {Answer}
TestAudioIntent test the audio

The sample JSON request:

{
  "session": {
    "new": false,
    "sessionId": "SessionId.574f0b74-be17-4f79-bbd6-ce926a1bf856",
    "application": {
      "applicationId": "XXXXXXXX"
    },
    "attributes": {
      "quid": "7fa9fcbf-35db-4bbd-ac73-37977bcef563",
      "question_start_time": 1515691612.7381804,
      "uuid": "df876c9d-cd41-4b9f-a3b9-3ccd1b441f24"
    },
    "user": {
      "userId": "XXXXXXXX"
    }
  },
  "request": {
    "type": "IntentRequest",
    "requestId": "EdwRequestId.23765cb0-f327-4f52-a9a3-b9f92a375a5f",
    "intent": {
      "name": "TestAudioIntent",
      "slots": {}
    },
    "locale": "en-US",
    "timestamp": "2018-01-11T17:26:57Z"
  },
  "context": {
    "AudioPlayer": {
      "playerActivity": "IDLE"
    },
    "System": {
      "application": {
        "applicationId": "XXXXXXXX"
      },
      "user": {
        "userId": "XXXXXXXX"
      },
      "device": {
        "supportedInterfaces": {
          "Display": {
            "templateVersion": "1",
            "markupVersion": "1"
          }
        }
      }
    }
  },
  "version": "1.0"
}

I get the following test error as the response:

The remote endpoint could not be called, or the response it returned was invalid.
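
As far as I can tell, that message is the generic error shown when the Lambda function raises an exception or returns a response the service cannot parse. In the meantime, to keep the skill from exiting while I debug, I could guard the dispatch with something like the sketch below, which falls back to a re-prompt built with the existing helpers. safe_lambda_handler is a made-up name I would point the Lambda handler setting at, and this assumes an unhandled exception is what ends the skill.

def safe_lambda_handler(event, context):
    """Sketch: wrap the existing dispatch so an unhandled exception re-prompts
    instead of surfacing as 'The remote endpoint could not be called ...'."""
    try:
        return lambda_handler(event, context)
    except Exception as err:  # deliberately broad while debugging
        print("Unhandled error: %r" % err)
        speech = "Sorry, I didn't get that. Please try again."
        # Keep the session (and its attributes) so the quiz can continue.
        attributes = event.get('session', {}).get('attributes') or {}
        return build_response(
            attributes,
            build_speechlet_response("", speech, speech, False))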

1 Answer


What I ended up doing is something similar to Amazon's dialog management system. If the user says something that does not fill the slot, I re-prompt them with the question. My goal is to record the user's statement/answer after every utterance, so I did not use the built-in dialog management. In addition, I used Amazon's slot synonyms for all of my slots to make my interaction model more robust.
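
Roughly, that flow looks like the sketch below. It reuses the helpers and session attributes from the question; the wording and the exact structure are illustrative rather than my production code.

def get_answer_response(slots, attributes):
    """Sketch: re-prompt on an unfilled Answer slot instead of ending the session."""
    answer_slot = (slots or {}).get("Answer", {})
    if "value" not in answer_slot:
        # The slot was not filled (e.g. "zoo zoo zoo"): ask again and keep the
        # same session attributes so the current question stays active.
        speech = ("I didn't catch an answer. Please reply with the NATO alphabet, "
                  "for example alpha, bravo, charlie, or delta.")
        return build_response(attributes,
                              build_speechlet_response("", speech, speech, False))

    answer_given = answer_slot["value"].lower()
    time_used_for_question = str(int(time.time() - attributes["question_start_time"]))
    send_quiz_responses_to_server(attributes["uuid"], attributes["quid"],
                                  time_used_for_question, answer_given)
    return ask_question(attributes["uuid"])

The slot synonyms are declared on the values of the custom slot type (LETTER here) in the interaction model, so variants such as "a", "alpha", or "letter a" can all resolve to one canonical answer.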

I still don't know whether this is the best approach, but it is a starting point and it seems to work fine...

Answered on 2018-01-18T09:15:53.727