Developing our first Alexa Skill: Accessible Pubs

Chema Jan. 22, 2019

nlp skill alexa osm develop lambda aws

Alexa arrived in Spain at the end of 2018, and there are already thousands of voice apps, called Skills, available. We did not want to miss the opportunity to experiment with its SDK and create a simple Skill of our own. Here is how we did it and how the experience went.

The Skill idea

Given our commitment to accessibility, we wanted our new app not only to use voice as its control and information interface, but also to provide accessibility information to its users.

As many of you know, we work actively with OpenStreetMap and the APIs in its ecosystem.

We wanted to do something that touched both aspects.

After mulling it over for a while, we had the app concept: a finder of accessible pubs and bars.

The operation is very simple. The user opens the skill and asks for accessible bars nearby (if geolocation is enabled) or in a city of their choice. Alexa recognizes the "intent" (we will get to that later) and invokes the corresponding backend webhook. The backend calls the OSM APIs (Nominatim/Overpass) looking for bars that are accessible (initially only those accessible by wheelchair). Finally, the results are presented ordered by proximity, three at a time.
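To give an idea of the Overpass side, here is a minimal sketch in Python of the kind of query the backend makes. The endpoint, search radius, and tag filter here are our illustrative choices, not necessarily the exact query the skill runs.

import requests

OVERPASS_URL = "https://overpass-api.de/api/interpreter"

def get_accessible_bars(lat, lon, radius_m=1000):
    """Fetch wheelchair-accessible bars/pubs around a point from Overpass."""
    query = """
    [out:json][timeout:25];
    node["amenity"~"bar|pub"]["wheelchair"="yes"](around:{r},{lat},{lon});
    out body;
    """.format(r=radius_m, lat=lat, lon=lon)
    resp = requests.post(OVERPASS_URL, data={"data": query})
    resp.raise_for_status()
    return resp.json()["elements"]

For example, get_accessible_bars(40.4168, -3.7038) would return the wheelchair-accessible bars within a kilometer of central Madrid.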

Online SDK

Amazon needs developers. It needs to build and nurture an ecosystem of Skills that are useful to its users. That is why it takes care of every detail to make developers feel welcome and encouraged to create Skills.

Proof of this is that you can create an Alexa Skill using only the web browser. No need to download anything, no containers, nothing: just the web console of the Alexa SDK.

Once you register a Skill in the Alexa SDK console, the creation process is divided into three phases:

  • Build: The first step is to define the magic word: the word or words that will launch your skill, known as the invocation name. Then you have to define the "intents". An intent is an action your skill can execute, and it must have one or several sample expressions so that Alexa can recognize it. This is where you have to do a good job of analysis and use your imagination to add all the likely phrasings; it goes a long way toward improving the usability of your skill. Of course, the expressions of an intent must have high cohesion among themselves and high independence from the expressions of the other intents (see the interaction-model sketch after this list). When the intents/expressions model is clear, clicking on "Build" generates a recognition model; it may take several minutes, and the model is opaque to us, internal to Alexa. As for the backend, we have to add either the webhook URL that Alexa will call with the activated intent or the AWS Lambda endpoint where our code lives.
  • Test: Needless to say, you have to test the skill thoroughly. The Alexa developer console makes this easy: it lets you test directly through the PC microphone or by typing text, as if it were a chatbot. If you want to test skills with geolocation and you do not have an Alexa speaker, you can try your apps in development directly with the Alexa app for Android and iOS.
  • Distribution: Publishing a skill in the marketplace is simple, especially compared with Google Play or the App Store. Mainly you need a good description, icons, and links to the privacy policy and terms of use. With that done, after clicking on distribute, the app enters the "In Review" status. Ours was under review for approximately 3 weeks (disclaimer: the New Year holidays fell in the middle). If all goes well, Alexa sends an email when the app is available to everyone.
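To make the Build phase more concrete, this is roughly what an interaction model looks like, using the two intents of our skill (described in the next section). It is only a sketch: the invocation name, the AMAZON.EUROPE_CITY slot type, and the sample lists are our assumptions, not the published model.

{
  "interactionModel": {
    "languageModel": {
      "invocationName": "bares accesibles",
      "intents": [
        {
          "name": "GetPlacesSearchIntent",
          "slots": [{ "name": "city", "type": "AMAZON.EUROPE_CITY" }],
          "samples": ["locales accesibles en {city}", "pubs en {city}"]
        },
        {
          "name": "GetPlacesNearIntent",
          "slots": [],
          "samples": ["por aquí cerca", "en este barrio"]
        }
      ]
    }
  }
}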

We have not gone into the details of the process; there is good documentation about it on the Internet.

Instead, we will focus on the specifics of the skill we developed.

Intents

Apart from Alexa's built-in intents (CancelIntent, YesIntent, NoIntent, HelpIntent, etc.), we created two specific ones:

  • GetPlacesSearchIntent: This intent searches for accessible bars in whatever city the user mentions. The city is captured in a slot we call 'city', which is passed as an argument when the webhook is invoked. The result is the list of accessible bars, ordered by distance from the city center and paged three at a time. Some of the expressions used are: "acceso a silla de ruedas en {city}" (wheelchair access in {city}), "locales accesibles en {city}" (accessible venues in {city}), "pubs en {city}", etc.
  • GetPlacesNearIntent: Similar to the previous one, but it uses the geolocation of the user's device to search nearby. In this case, the results also include each venue's distance in meters. Some of the expressions used are: "en este barrio" (in this neighborhood), "por aquí cerca" (near here), "restaurantes por mi alrededor" (restaurants around me), etc.
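For GetPlacesNearIntent, the device coordinates arrive in the request context through Alexa's Geolocation interface (the skill has to request the location permission). Here is a minimal sketch of how they can be read with the ASK SDK for Python; the helper name is ours.

def get_device_coordinates(handler_input):
    """Return (lat, lon) from the Geolocation interface, or None."""
    geo = handler_input.request_envelope.context.geolocation
    if geo is None or geo.coordinate is None:
        # No permission granted, or the device has no position to share.
        return None
    return (geo.coordinate.latitude_in_degrees,
            geo.coordinate.longitude_in_degrees)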

Screenshot of the Alexa console where we specify the intents

The backend: AWS Lambda

For the backend we started from the Python "facts" skill template for AWS Lambda.

This part of the development is the ugliest, because you have to keep another window open with AWS Lambda for the development and then go back to the Alexa console to test.

In AWS Lambda you have to bundle the dependencies (typically listed in requirements.txt) as if they were packages of your app (like "vendor" in Composer or "node_modules" in npm). A good way to keep them from cluttering version control is to put them in a "vendor"-style directory and exclude it.

It is a somewhat tedious job: we have to use pip to install the packages locally, move them to vendor, compress the folder, and upload it to the AWS console.
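The cycle looks roughly like this. It is a sketch: the directory and archive names are our choices, and the handler has to add vendor/ to sys.path at startup so the imports resolve.

# install the dependencies into a local vendor-style directory
pip install -r requirements.txt -t ./vendor

# bundle the handler code together with its dependencies
zip -r skill.zip lambda_function.py vendor/

# then upload skill.zip from the AWS Lambda console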

Regarding the code, here is a simplified snippet from Accessible Pubs.

import logging

from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.skill_builder import SkillBuilder
from ask_sdk_core.utils import is_intent_name
from ask_sdk_model.ui import SimpleCard

SKILL_NAME = "Accessible Pubs"
logger = logging.getLogger(__name__)


class GetPlacesSearchHandler(AbstractRequestHandler):
    """Handler for GetPlacesSearchIntent."""
    def can_handle(self, handler_input):
        # type: (HandlerInput) -> bool
        return is_intent_name("GetPlacesSearchIntent")(handler_input)

    def handle(self, handler_input):
        # type: (HandlerInput) -> Response
        logger.info("In GetPlacesSearchHandler")
        attribute_manager = handler_input.attributes_manager
        session_attr = attribute_manager.session_attributes
        page = 0
        lastQuery = session_attr.get('lastQuery', None)

        # Read the 'city' slot captured by the interaction model.
        qCity = None
        for slotName, slotValue in handler_input.request_envelope.request.intent.slots.items():
            if slotName == 'city':
                qCity = slotValue.value
                break
        if qCity is None:
            # "Sorry, it seems I did not catch the city, can you repeat it? Thanks"
            speech = 'Vaya, parece que no he entendido bien la ciudad, ¿puedes repetirlo? gracias'
            handler_input.response_builder.speak(speech).ask(speech).set_card(SimpleCard(SKILL_NAME, speech))
            return handler_input.response_builder.response

        # Reset the paging when the user asks about a different city.
        if lastQuery != qCity:
            page = 0
        session_attr['lastQuery'] = qCity
        logger.info("City: {}".format(qCity))

        # Geocode the city name (Nominatim) ...
        city = getCity(qCity)
        if city is None:
            # "Sorry, I cannot find any city called {}, can you try again?"
            speech = 'Vaya, no encuentro ninguna ciudad que se llame {}, ¿puedes probar otra vez?'.format(qCity)
            handler_input.response_builder.speak(speech).ask(speech).set_card(SimpleCard(SKILL_NAME, speech))
            return handler_input.response_builder.response

        # ... and query Overpass for the accessible venues around it.
        ret = getResults(*city)
        logger.info("Num results: {}".format(len(ret)))
        # Building the spoken list of venues (paged three at a time) is omitted here.
        return handler_input.response_builder.response

sb = SkillBuilder()
...
sb.add_request_handler(GetPlacesSearchHandler())
sb.add_request_handler(GetPlacesNearHandler())
...
lambda_handler = sb.lambda_handler()

Testing the skill

The most complicated part of testing the skill is knowing well how Alexa works. If you have never used Alexa, it is worth trying it first as a user: learn how it behaves, install other skills, and play with them.

The beginning of development can be very frustrating if you do not know the basic concepts of Alexa, such as the wake word, the built-in intents, etc.

Screenshot of Accessible Pubs: testing the skill on Alexa for Android

Conclusions

The experience has been very good, although we believe there is still a lot of work to do on the backend side.

First in improving the development cycle, and then in the model for contextual conversations. Right now Alexa behaves like "Speech-To-Text with speakers".

It would be interesting for the SDK itself to provide these tools; they would allow developing skills that are much more complete and apparently smarter. Surely we will see it soon.

They also need to improve things like the turnaround time of the Skills review process.

For our part, we will closely follow the evolution of the Alexa ecosystem and its SDK. Will this year bring the revolution? We will see.

You can install "Accessible Pubs" from the Amazon Alexa Skills marketplace (Spain).
 
 
