(Left to right: Guy Tonye, Tsun Yin Lip, Aanchal Bakshi, Ryan McLeod, Erin Gallagher)
We recently attended the highly anticipated developer conference Google I/O 2019 in Mountain View, California. The team thoroughly enjoyed hearing from keynote and session speakers in talks ranging from accessibility to augmented reality. Here are some of our biggest takeaways from the event:
Google has made it its goal to make information universally accessible. It is working on getting the next billion users online, especially those in developing countries who don’t necessarily have premium devices or the same internet connectivity. In addition, Google has set a goal to make AI available to everyone by pursuing AI-driven initiatives for civic technologies through projects like Live Relay, Project Euphonia, and Project Diva. Live Relay combines speech-to-text (STT) and text-to-speech (TTS) so that users who cannot speak, are deaf, or are hard of hearing can have phone conversations: they read a transcript of what the other person is saying and type their replies, which are read aloud on the other end of the line.
Project Euphonia is a new initiative that aims to help speech recognition and voice assistants better understand users with speech disabilities. Two related products, Live Transcribe and Live Caption, are speech-to-text (STT) tools that provide instant captions on your phone as you consume all sorts of media, be it a person speaking live, a YouTube video, a podcast, or a video chat app. In addition to these initiatives, there will be an enhanced search screen that lets blind users navigate and find content quickly without having to swipe multiple times on a device.
Project Diva, short for “DIVersely Assisted”, aims to create an assortment of alternative triggers, such as a single-purpose button, to give people some independence and autonomy when using Google Assistant.
The Assistant has been known to lag when interpreting speech, across roughly 700 languages, before executing an action. This typically required around 100GB of language interpretation data, but during the keynote presentation, Google CEO Sundar Pichai said that Google has managed to compress the data to 0.5GB, a roughly 200-fold reduction that lets the device act faster and with lower latency. In addition, Google is opening up the Assistant so that it can access apps directly. If you have an app in a supported category, users can launch it through the Assistant using App Actions. For example, if you ask Google Assistant to order food through Uber Eats, it will pass your order to the app and take you directly to the checkout screen for confirmation.
With quicker response times, Google Assistant can handle complex compound actions through “Continued Conversation”. Saying "Hey Google" every time to invoke the assistant is something we always wished we could skip. Thankfully, Google listened! Now, you can have more interactive and natural conversations without having to say "Hey Google" at the start of every sentence.
Another upcoming capability is having the assistant fill in forms on websites. Google Assistant already has information such as your name and address, so it can fill in the blanks without you having to type. It will also ask for confirmation when sensitive information, such as a credit card, is involved. The ability to set a context for the assistant, like creating personal references through the “You” tab, especially stood out to us. You can give someone in your contact list a nickname, and the assistant will answer questions based on that context. For instance, the next time you ask about the weather at Mom's house, the assistant will already know which city your mom lives in, reducing the number of steps needed to get your answer. In other words, details that are specific to you are applied to a broader, generic question.
The Assistant will use how-to markup on websites to give users custom step-by-step instructions, and it accepts common utterances so that moving through the steps feels more natural. Google has also created a template in Dialogflow: you provide the information, and Google shapes it and manages the entire conversation. Dialogflow is a tool for building conversational agents that can answer questions Google Assistant can’t.
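To make the how-to markup idea more concrete, here is a minimal sketch of what such structured data could look like, assuming the schema.org `HowTo` and `HowToStep` types serialized as JSON-LD. The helper name `how_to_jsonld` and the sample steps are our own illustration, not something shown at the conference.

```python
import json
from typing import Dict, List


def how_to_jsonld(name: str, steps: List[str]) -> Dict:
    """Build a schema.org HowTo document from a title and ordered step texts.

    Hypothetical helper for illustration; only the schema.org type and
    property names (@type, name, step, position, text) come from the vocabulary.
    """
    return {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": name,
        "step": [
            # Positions are 1-based so they match how the steps are read out.
            {"@type": "HowToStep", "position": i, "text": text}
            for i, text in enumerate(steps, start=1)
        ],
    }


if __name__ == "__main__":
    doc = how_to_jsonld(
        "Brew pour-over coffee",
        ["Boil the water.", "Rinse the filter.", "Pour slowly over the grounds."],
    )
    # This JSON would typically be embedded in a page inside a
    # <script type="application/ld+json"> tag.
    print(json.dumps(doc, indent=2))
```

Keeping each step as its own object is what would let an assistant read the instructions one at a time and respond to utterances like "next step".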
To take advantage of smart screens in the home, Google has developed Interactive Canvas, a framework built on top of Google Assistant. Although it is currently in developer preview, it gives users and developers the opportunity to have a more immersive and interactive voice experience. This is really exciting for voice games.
Many people are familiar with Incognito Mode on Google Chrome. It’s a simple privacy tool that prevents Chrome from maintaining a record of what you search for and which sites you visit while you browse. Now, Google wants to make this tool available in many of their other applications including Google Maps and the Search widget.
Another update is a new feature called Auto-Delete. In addition to letting users see the data that was collected, Auto-Delete gives them the option to clear the data stored on Google’s web and app platforms. Android will also include more features that alert users to which data is used by which app and flag applications that are recording private data. The web application will let you schedule data deletion so that it happens without you having to go in and delete the data yourself.
The usual approach to training machine learning (ML) models requires sending potentially personal data to a server (e.g., the controversy around Alexa recording voice samples for learning purposes). Federated Learning is a way to do ML at scale with privacy built in, because it does not require sending personal data to Google’s servers. Google has implemented a set of rules to ensure that the user's privacy is not compromised. It works by running training on the device (e.g., your phone) and sending only the resulting model updates to the server. The actual personal data doesn't leave your phone, and the updates are aggregated across a large population of devices.
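The train-on-device, send-only-updates loop described above can be sketched in a few lines. This is a toy illustration in the spirit of federated averaging, not Google's implementation: the one-parameter model, the learning rate, and the function names `local_update`, `federated_round`, and `train` are all assumptions made for the example.

```python
from typing import List, Tuple

Example = Tuple[float, float]  # one (x, y) pair that stays on a single device


def local_update(weight: float, data: List[Example],
                 lr: float = 0.05, epochs: int = 5) -> float:
    """Run a few gradient steps for y ~ weight * x on one device's private data.

    Returns only the change in the weight; the raw examples are never returned.
    """
    w = weight
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w - weight


def federated_round(weight: float, devices: List[List[Example]]) -> float:
    """Server step: average the device updates, weighted by example count."""
    total = sum(len(d) for d in devices)
    avg_delta = sum(local_update(weight, d) * len(d) for d in devices) / total
    return weight + avg_delta


def train(devices: List[List[Example]], rounds: int = 50,
          weight: float = 0.0) -> float:
    """Repeat federated rounds starting from an initial shared weight."""
    for _ in range(rounds):
        weight = federated_round(weight, devices)
    return weight


if __name__ == "__main__":
    # Three simulated phones, each holding private samples of y = 3x.
    phones = [
        [(1.0, 3.0), (2.0, 6.0)],
        [(3.0, 9.0)],
        [(0.5, 1.5), (1.5, 4.5), (2.5, 7.5)],
    ]
    print(round(train(phones), 4))
```

In this sketch the server only ever sees one number per device per round. Real deployments add further protections on top of this, which the toy example leaves out.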
The example shown at Google I/O was word suggestion on Gboard, where acronyms like “OMW” (on my way) are offered as suggestions based on usage. Rather than collecting and associating data with an individual, Google wants to learn from the data as a collective. OMW, for instance, may be widely used by people around you even though you’ve never typed it; your phone may already know the shortcut and is just waiting for you to use it. This shows that ML computation can be done using data that lives at the edge, on users' devices. As you type, Gboard suggests the words it thinks you are likely to use, including words that wouldn’t necessarily be in the dictionary. By picking up the acronyms people actually use, federated learning lets suggestions reflect your device usage and the things that matter to you.
Augmented Reality (AR) is being taken to the next level. When you use Google Search, you can now see an option to view the search result in AR. During the keynote presentation, the speaker searched for ‘shark’ and, sure enough, a full-sized shark appeared in AR! One application of this is the ability to better understand the scale of objects that are not in your direct physical space. Apart from this, a lot of other interesting features were introduced in ARCore, such as new APIs that enhance the shadows, reflections, and lighting on virtual objects, to the point where it can become difficult to distinguish between what’s real and what’s not. It was especially exciting to see Environmental HDR dynamically adjust the lighting on an AR object to match the lighting as it changes in the real space. Any device that receives the ARCore update should be able to take advantage of this feature.
Luckily for the public, Google Maps AR was released as an exclusive trial on Pixel phones. This feature allows users to see physical directions in the AR space – think big arrows pointing down the street, indicating where you have to turn. Even though it is not meant to be used as a primary form of navigation, it can nonetheless help people find their bearings when they get off of a train or bus.
Apart from these two features, the conference had demos of previously known AR capabilities, albeit in much more polished form. These ranged from fun stations where you could shoot lasers from your eyes, to watching a garden grow in real time, all the way to posing and taking pictures with Marvel heroes in AR. Google is certainly not letting go of its AR pursuits, and it is a space the tech community should keep an eye on as new features like Google Maps AR and AR Search roll out.
Google I/O 2019 was an amazing showcase of Google’s new hardware, software, and various updates for its existing apps and services. It was a great experience and we can’t wait for what’s in store at Google I/O 2020!
Stay tuned for updates on where we'll be next!