As a software developer based out of Toronto (and one of the organizers of Voice Tech TO), I recently took a trip down to NYC to attend one of Amazon’s famous Alexa Dev Days. For those unfamiliar with the event, Alexa Dev Days are recurring, multi-city developer workshops that are free and open to the public. Full of helpful info, they are geared toward the design and development of third-party skills for Alexa-enabled devices.
It’s quite a trip from Toronto to New York just to take a one-day workshop, but looking back on all the things I learned and discovered, it was 100% worth it.
In a nutshell, the event covered three major topics: designing voice-first experiences, building skills with Amazon’s tooling, and a preview of upcoming Alexa features.
Let’s go through these one by one.
Since designing for the ear is so different from designing for the eye, the Alexa team conducted a number of small exercises to get people into the right headspace.
First Exercise
After grouping us into teams, the Alexa instructors challenged us to design a simple recommendation skill. We had to figure out: what, exactly, does our skill recommend? And how, conversationally, do users arrive at those recommendations?
My team chose to build a bike recommendation engine.
As we learned, everything begins with writing the conversation in its ideal form, commonly referred to as the “happy path.” Using nothing but plain old paper, this is the dialogue we came up with:
User (U): Open bike assistant.
Alexa (A): Hi, I can help you find a bike. Are you looking for a city bike or a mountain bike?
(U): A city bike.
(A): Okay. How often do you ride your bike on average?
(U): Twice a week.
(A): Got it. What is your budget?
(U): Around $200.
(A): Okay. Searching… I would recommend a carbon road bicycle from Amazon at $220. Would you like me to send the information to your phone?
(U): Yes.
(A): Sending now.
Second Exercise
On the basis of our happy path, our next exercise was to formally document the user’s overall intention, along with the kinds of information the skill needs in order to satisfy that intention: the different input types, or slots. Finally, we had to identify the actual inputs that could legitimately fill each of those slots: the values. As you’ll see in the next section, this matters because, when you build the actual skill, you can’t just type up and submit a transcript of your desired conversation. You need to express the program in terms of intentions, slots, and legitimate values.
User Intention: An appropriate bike recommendation
Slots: Bike type, frequency, budget
Values: City bike or mountain bike (for bike type); once a week, twice a week, daily, or often (for frequency); any number (for budget)
User Intention: Receiving the recommendation on your phone
Values: Yes, No
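To give a sense of how this documentation translates into something Alexa can consume, here is a rough sketch of an interaction model in the Alexa Skills Kit JSON format. The intent, slot, and type names are illustrative, not what my team actually submitted:

```json
{
  "interactionModel": {
    "languageModel": {
      "invocationName": "bike assistant",
      "intents": [
        {
          "name": "RecommendBikeIntent",
          "slots": [
            { "name": "bikeType", "type": "BIKE_TYPE" },
            { "name": "frequency", "type": "RIDE_FREQUENCY" },
            { "name": "budget", "type": "AMAZON.NUMBER" }
          ],
          "samples": [
            "i am looking for a {bikeType}",
            "i ride {frequency} and my budget is {budget} dollars"
          ]
        }
      ],
      "types": [
        {
          "name": "BIKE_TYPE",
          "values": [
            { "name": { "value": "city bike", "synonyms": ["road bike", "commuter bike"] } },
            { "name": { "value": "mountain bike", "synonyms": ["trail bike"] } }
          ]
        },
        {
          "name": "RIDE_FREQUENCY",
          "values": [
            { "name": { "value": "once a week" } },
            { "name": { "value": "twice a week" } },
            { "name": { "value": "daily" } },
            { "name": { "value": "often" } }
          ]
        }
      ]
    }
  }
}
```

Note how each slot maps to a type, and each type enumerates the legitimate values we identified on paper — exactly the intention/slot/value breakdown from the exercise.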
At this point in the exercise, the Alexa team emphasized the need to perform and test our skill with real users in order to learn more about their intentions and add/remove slots and values accordingly.
After designing your happy path and identifying the relevant slots and values, the workshop’s focus shifted toward actually developing the skill. The Alexa team walked us through a number of tools to help with that, loosely divided into “graphical” tools and more traditional “console” tools.
“Graphical” skill building, remarkably, does not require any coding, occurring entirely in your web browser using a combination of the Alexa Skills Kit Developer Console, Skill Code Generator, and AWS Lambda Console (don’t be misled by the word “console” here — these are GUIs). Let’s go through these one by one.
Alexa Skills Kit Developer Console
Many of you may already know this tool, though it recently got a pretty solid facelift. Basically, it lets developers define intents, slots, and values through its GUI.
Notice in the above image that there’s a column for synonyms: for each slot, you can define values and add synonyms that would resolve to the same value.
You can also quickly define what information is mandatory for each utterance and automate the question that will be prompted to the user if that info is missing (below).
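Under the hood, marking a slot as required produces a dialog-model entry in your interaction model that pairs the slot with an elicitation prompt. A sketch of what that looks like, with illustrative slot and prompt names:

```json
{
  "dialog": {
    "intents": [
      {
        "name": "RecommendBikeIntent",
        "slots": [
          {
            "name": "budget",
            "type": "AMAZON.NUMBER",
            "elicitationRequired": true,
            "prompts": { "elicitation": "Elicit.Slot.Budget" }
          }
        ]
      }
    ]
  },
  "prompts": [
    {
      "id": "Elicit.Slot.Budget",
      "variations": [
        { "type": "PlainText", "value": "Got it. What is your budget?" }
      ]
    }
  ]
}
```

If the user’s utterance is missing the budget, Alexa automatically asks the elicitation prompt before handing the completed intent to your skill.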
Skill Code Generator
As the name suggests, this tool takes the interaction model you defined in the developer console and generates the corresponding skill code: an index.js file ready to be deployed as a Lambda function.
AWS Lambda Console
AWS Lambda Console lets you define the Lambda function that will fulfill the skill’s request: in our case, finding an appropriate bicycle recommendation from the immense sea of bikes out there. The previously generated index.js can then be uploaded to your Lambda function et voilà, the skill is ready to try out in the Alexa Skills Kit Developer Console or on a configured device.
If GUIs aren’t your favourite, or if versioning and continuous deployment are necessary for your project (they often are), an old-fashioned terminal is the choice for you. For the developers in the room, Amazon presented their command-line tool, ask-cli, which lets builders manage skill configuration directly from the terminal.
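As a rough sketch, a typical ask-cli workflow looks something like this (assuming Node.js is installed and your AWS credentials are configured; the skill name is illustrative):

```shell
# Install the CLI and link it to your Amazon developer account
npm install -g ask-cli
ask init

# Scaffold a new skill project, then push the model and Lambda code
ask new --skill-name bike-assistant
ask deploy

# Try the skill from the terminal without a device
ask simulate --locale en-US --text "open bike assistant"
```

Because the interaction model and Lambda code live as files in the project, they can be checked into version control and wired into a CI/CD pipeline — exactly what the GUI workflow makes awkward.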
Whether you’re a designer or a developer, a product manager or simply a voice enthusiast, Amazon provides the tools for you to make a skill.
Another interesting moment of the day was the presentation of some upcoming features Amazon has in store for Alexa, relevant to developers no less than users.
Amazon is offering a new type of voice experience through an extended product line called Alexa Gadgets. The first product, the Echo Button, was released in the US in December 2017. It’s meant to enhance gaming on Alexa: in a trivia game, for example, Echo Buttons let users race each other to answer questions. An API allowing third-party skills to interact with these gadgets will be made available soon.
When a user makes a request, network latency or slow query processing can leave them waiting a significant amount of time. As of November 2017, Alexa supports progressive responses, which let a skill send interim speech to the user while it processes the real request.
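Mechanically, a progressive response is a VoicePlayer.Speak directive that your skill POSTs to the Alexa Directives API while the handler is still working. A dependency-free sketch of building that payload (the HTTP call itself is omitted; the request ID below is a made-up example):

```javascript
// Build the payload for a progressive response (illustrative sketch).
// In a live skill you would POST this to the apiEndpoint from the
// incoming request, authorized with its apiAccessToken, while your
// handler keeps working on the real answer.
function buildProgressiveResponse(requestId, speechText) {
  return {
    header: { requestId: requestId },
    directive: {
      type: "VoicePlayer.Speak",
      speech: speechText
    }
  };
}
```

For the bike skill, the interim speech might be something like “Searching for the right bike…”, spoken while the Lambda queries the catalogue.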
As of November 2017, AWS is offering a series of features around Alexa for Business. It provides a suite of management tools (like fleet management for company devices) and easily integrates with enterprise software (like conferencing and help desk). Developers also have the opportunity to build skills for corporations in a private skill store using a wider range of APIs.
Currently, Domino’s pizza is the only service that can notify users on Alexa. However, a notification option for all skills is in the works. It’s a sensitive topic, but a developer preview will be available soon.
To handle less predictable user utterances, Alexa added a SearchQuery slot type to the Skills Kit. Applied to a specific slot, it allows open-ended inputs, which can greatly simplify the design of a skill.
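In the interaction model, this just means typing a slot as AMAZON.SearchQuery instead of a custom type with enumerated values. A sketch, with an illustrative intent name (note that sample utterances for such a slot need a carrier phrase around it):

```json
{
  "name": "FindBikeAdviceIntent",
  "slots": [
    { "name": "query", "type": "AMAZON.SearchQuery" }
  ],
  "samples": [
    "search for {query}",
    "tell me about {query}"
  ]
}
```

Instead of predicting every phrasing up front, the skill receives whatever free-form text the user said in the query slot and can decide how to handle it in code.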
Though not new, the team showed how integrating Alexa Voice Service (AVS) into a device allows you to interact with it via Alexa. To illustrate their point, they showed how Alexa could be used to control your web browser to find images quickly online.
After a jam-packed day at the W Hotel in midtown Manhattan, the strongest impression I was left with is that Amazon is committed to sharpening their skill-building tools for designers and developers alike, no doubt in order to boost production and provide their users with the best experiences possible. Smart.
If you don’t have a chance to attend one of these Dev Days, feel free to reach out or come to the next meeting of Voice Tech TO. A Toronto-based event that explores Google Home, Amazon Alexa, and the wide world of voice, it’s our own personal Dev Day ;-)