The engine behind Amazon Alexa is one of the machine learning technologies powering a new suite of artificial intelligence tools announced by AWS this week.
At the AWS re:Invent conference in Las Vegas this week, Amazon Web Services announced three Artificial Intelligence (AI) services that make it easy for developers to build apps that can understand natural language, turn text into lifelike speech, have conversations using voice or text, analyze images, and recognise faces, objects, and scenes.
Amazon Lex, Amazon Polly, and Amazon Rekognition are based on the same highly scalable Amazon technology built by the thousands of deep learning and machine learning experts across the company. AWS says that Amazon AI services all provide high-quality, high-accuracy AI capabilities that are scalable and cost-effective. Amazon AI services are fully managed services, so there are no deep learning algorithms to build, no machine learning models to train, and no up-front commitments or infrastructure investments required. This promises to free developers to focus on defining and building a new generation of apps that can see, hear, speak, understand, and interact with the world around them.
Amazon Web Services provided the following information:
Until now, very few developers have been able to build, deploy, and broadly scale apps with AI capabilities because doing so required access to vast amounts of data, and specialized expertise in machine learning and neural networks. Effectively applying AI involves extensive manual effort to develop and tune many different types of machine learning and deep learning algorithms (e.g. automatic speech recognition, natural language understanding, image classification), collect and clean the training data, and train and tune the machine learning models. And this process must be repeated for every object, face, voice, and language feature in an application. Amazon AI services eliminate all of this heavy lifting, making AI broadly accessible to all app developers by offering Amazon’s powerful and proven deep learning algorithms and technologies as fully managed services that any developer can access through an API call or a few clicks in the AWS Management Console. Amazon AI services make the full power of Amazon’s natural language understanding, speech recognition, text-to-speech, and image analysis technologies available at any scale, for any app, on any device, anywhere.
“The combination of better algorithms and broad access to massive amounts of data and cost-effective computing power provided by the cloud is making AI a reality for application developers. AWS is home to some of the most innovative and creative AI applications in use today,” said Raju Gulabani, VP, Databases, Analytics, and AI, AWS. “Thousands of machine learning and deep learning experts across Amazon have been developing AI technologies for years to predict what customers might like to read, to drive efficiencies in our fulfillment centers through robotics and computer vision technologies, and to give customers our AI-powered virtual assistant, Alexa. Now, we are making the technology underlying these innovations available to any developer in the form of three fully managed Amazon AI services that are easy to use, powerful, and cost effective. We are excited to see how customers use Amazon Lex, Amazon Polly, and Amazon Rekognition to build a new generation of apps that have human-like intelligence and can see, hear, speak, and interact with people and their environments.”
Intelligent conversations with Amazon Lex
Amazon Lex is a new service for building conversational interfaces using voice and text. It is built on the same automatic speech recognition (ASR) and natural language understanding (NLU) technologies that power Amazon Alexa. Amazon Lex makes it easy to bring sophisticated, natural language capabilities to virtually any app. Developers can build and test bots (conversational apps that perform automated tasks like checking the weather or booking flights) directly from the AWS Management Console by typing in a few sample phrases (e.g., “find a flight,” or “book a flight”) along with instructions for getting the required parameters to complete the task (e.g., travel date and destination) and the corresponding clarifying questions to ask the user (e.g., “when do you want to travel?” and “where do you want to go?”). Amazon Lex takes care of the rest, building the language model and asking the follow-up questions needed to complete the task. Because Amazon Lex is integrated with AWS Lambda, developers can configure Amazon Lex to invoke the appropriate backend service (e.g., the flight booking service) through an AWS Lambda function. Developers can also use pre-built enterprise connectors that execute AWS Lambda functions to answer questions like “what are my top 10 accounts in Salesforce.com,” by fetching data from enterprise systems like Salesforce, Microsoft Dynamics, Marketo, Zendesk, QuickBooks and HubSpot.
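As a rough illustration of the Lambda integration described above, the sketch below shows what a fulfillment function for the flight-booking example might look like. It assumes the event and response JSON layout documented for Lex (V1) Lambda functions; the intent name "BookFlight" and the slot names "TravelDate" and "Destination" are hypothetical, and the booking itself is only simulated.

```python
# Minimal sketch of an AWS Lambda fulfillment handler for a Lex bot.
# Assumption: the Lex (V1) Lambda event/response format. The intent name
# "BookFlight" and the slot names below are hypothetical examples.


def close(message, fulfillment_state="Fulfilled"):
    """Build a response telling Lex that the conversation is finished."""
    return {
        "dialogAction": {
            "type": "Close",
            "fulfillmentState": fulfillment_state,
            "message": {"contentType": "PlainText", "content": message},
        }
    }


def lambda_handler(event, context):
    """Entry point that Lex invokes once it has collected the slots."""
    intent = event["currentIntent"]
    slots = intent.get("slots") or {}

    if intent["name"] != "BookFlight":
        return close("Sorry, I cannot help with that yet.", "Failed")

    travel_date = slots.get("TravelDate")
    destination = slots.get("Destination")
    if not travel_date or not destination:
        # Lex normally asks the clarifying questions itself; this is a guard.
        return close("I still need a travel date and a destination.", "Failed")

    # A real handler would call the flight booking backend here.
    return close(f"Your flight to {destination} on {travel_date} is booked.")
```

The handler only reads the collected parameters and reports a result, because asking the follow-up questions is left to Amazon Lex.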
Bots built using Amazon Lex can be used anywhere: from web applications, to chat and messenger apps like Slack and Facebook Messenger, or through voice in apps on mobile or connected devices. Amazon Lex handles the authentication required by different platforms and simplifies the user interface design by not requiring developers to write custom code for each platform. Moreover, developers do not have to worry about scaling their infrastructure as Amazon Lex scales automatically as traffic to a bot increases, and developers pay only for the calls made to the Amazon Lex API.
Capital One offers a broad spectrum of financial products and services to consumers, small businesses, and commercial clients through a variety of channels. “As a heavy user of AWS, Amazon Lex’s seamless integration with other AWS services like AWS Lambda and Amazon DynamoDB is really appealing,” said Firoze Lafeer, Chief Technology Officer, Capital One Labs, Capital One. “A highly scalable solution, Amazon Lex also offers potential to speed time to market for a new generation of voice and text interactions, such as our recently launched Capital One skill for Alexa.”
OhioHealth is a nationally recognized healthcare organization with a network of 11+ hospitals in 47 counties. “We are excited about utilizing evolving speech recognition and natural language processing technology to enhance the lives of our customers. Amazon Lex represents a great opportunity for us to deliver a new experience to our patients,” said Michael Krouse, Senior Vice President Operational Support and Chief Information Officer, OhioHealth. “Everything we do at OhioHealth is ultimately about providing the right care to our patients at the right time and in the right place. Amazon Lex’s next generation technology and the innovative applications we are developing while using it will help provide an enhanced customer experience. We are just scratching the surface of what is possible.”
HubSpot is a marketing and sales software leader. “HubSpot’s GrowthBot is an all-in-one chatbot which helps marketers and sales people be more productive by providing access to relevant data and services using a conversational interface. With GrowthBot, marketers can get help creating content, researching competitors, and monitoring their analytics. Through Amazon Lex, we’re adding sophisticated natural language processing capabilities that helps GrowthBot provide a more intuitive UI for our users,” said Dharmesh Shah, Chief Technology Officer and Founder, HubSpot. “Amazon Lex lets us take advantage of advanced AI and machine learning without having to code the algorithms ourselves.”
Twilio helps businesses make communications relevant and contextual by making it possible to easily embed real-time communication and authentication capabilities directly into software applications. “Developers and businesses use Twilio to build apps that can communicate with customers in virtually every corner of the world,” said Benjamin Stein, Director of Messaging Products, Twilio. “Amazon Lex will provide developers with an easy-to-use modular architecture and comprehensive APIs to enable building and deploying conversational bots on mobile platforms. We look forward to seeing what our customers build using Twilio and Amazon Lex.”
Intelligent Speech with Amazon Polly
Amazon Polly makes it easy for developers to add natural-sounding speech capabilities to existing applications like newsreaders and e-learning platforms, or create entirely new categories of speech-enabled products – from mobile apps to devices and appliances. Amazon Polly is easy to use; developers can send text to Amazon Polly using the SDK or from within the AWS Management Console and Polly immediately returns an audio stream that can be played directly or stored in a standard audio file format. With 47 lifelike voices and support for 24 languages, developers can choose from both male and female voices with a variety of accents to make applications for users around the globe. And Amazon Polly’s fluid pronunciation of text content means applications deliver high-quality voice output across a wide variety of text formats. Amazon Polly is scalable, returning high-quality speech fast, even when converting large volumes of text to speech. With Amazon Polly, developers pay only for the text they convert, and they can cache generated speech and replay it as many times as they like with no restrictions.
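The send-text, receive-audio flow and the caching allowance described above could be sketched as follows. The `SpeechCache` class and the in-memory dictionary are hypothetical illustrations, not part of any AWS SDK; `polly_synthesize` uses the boto3 `synthesize_speech` call and needs AWS credentials, a configured Region, and network access, and the voice "Joanna" is just one example choice.

```python
import hashlib


class SpeechCache:
    """Cache synthesized audio so each distinct text is converted only once."""

    def __init__(self, synthesize):
        # synthesize: any callable (text, voice) -> audio bytes
        self._synthesize = synthesize
        self._store = {}

    def get_audio(self, text, voice="Joanna"):
        # Key on both voice and text, since the same text sounds different per voice.
        key = hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()
        if key not in self._store:
            self._store[key] = self._synthesize(text, voice)
        return self._store[key]


def polly_synthesize(text, voice):
    """Convert text to MP3 bytes with Amazon Polly via boto3."""
    import boto3  # imported lazily so the cache can be tried without boto3

    response = boto3.client("polly").synthesize_speech(
        Text=text, OutputFormat="mp3", VoiceId=voice
    )
    return response["AudioStream"].read()
```

With credentials in place, `SpeechCache(polly_synthesize).get_audio("Hello")` would request the audio once and serve later calls for the same text from memory.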
The Washington Post is a Pulitzer Prize-winning media and technology company that publishes more than 1200 stories a day. “We’ve long been interested in providing audio versions of our stories, but have found that existing text-to-speech solutions are not cost-effective for the speech quality they offer,” said Joseph Price, Senior Product Manager, The Washington Post. “With the arrival of Amazon Polly and its high-quality voices, we look forward to offering readers more rich and versatile ways to experience our content.”
GoAnimate is a cloud-based, animated video creation platform, designed to allow business people with no background in animation to quickly and easily create animated videos. “Amazon Polly gives GoAnimate users the ability to immediately give voice to the characters they animate using our platform. This is especially helpful in scenarios where live voiceover is either resource or time prohibitive, such as when developing a video in many languages, or within pre-production to speed the approval process,” said Alvin Hung, CEO and Founder, GoAnimate. “The speech from Amazon Polly is integrated seamlessly with our rich set of pre-animated assets, which reinforces GoAnimate’s ease of use and affords our customers both efficiency and speed to market.”
Intelligent Image Analysis with Amazon Rekognition
Amazon Rekognition enables developers to quickly and easily build applications that analyze images, and recognize faces, objects, and scenes. Amazon Rekognition uses deep learning technologies to automatically identify objects and scenes, such as vehicles, pets, or furniture, and provides a confidence score that lets developers tag images so that application users can search for specific images using key words. Amazon Rekognition can locate faces within images and detect attributes, such as whether or not the face is smiling or the eyes are open. Amazon Rekognition also supports advanced facial analysis functionalities such as face comparison and facial search. Using Rekognition, developers can build an application that measures the likelihood that faces in two images are of the same person, making it possible to verify a user against a reference photo in near real-time. Similarly, developers can create collections of millions of faces (detected in images) and can search for a face similar to their reference image in the collection. Amazon Rekognition removes the complexity and overhead required to develop and manage expensive image processing pipelines by making comprehensive image classification, detection, and management capabilities available in a simple, cost-effective, and reliable AWS service. There are no upfront costs for Amazon Rekognition; developers pay only for the images they analyze and the facial feature vectors they store.
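To show how a confidence score can drive image tagging, here is a small sketch that turns a label response into search tags. It assumes a response shaped like the output of the boto3 Rekognition `detect_labels` call (a `Labels` list whose entries carry `Name` and `Confidence`); the helper name `tags_from_labels`, the 80 percent threshold, and the sample values are made up for illustration.

```python
def tags_from_labels(response, min_confidence=80.0):
    """Return lower-cased tag names for labels at or above the threshold,
    ordered from most to least confident."""
    labels = response.get("Labels", [])
    confident = [label for label in labels if label["Confidence"] >= min_confidence]
    confident.sort(key=lambda label: label["Confidence"], reverse=True)
    return [label["Name"].lower() for label in confident]


# Hand-written sample in the assumed response shape (not real service output).
sample_response = {
    "Labels": [
        {"Name": "Fireplace", "Confidence": 97.2},
        {"Name": "Pool", "Confidence": 55.0},
        {"Name": "Yard", "Confidence": 88.4},
    ]
}

print(tags_from_labels(sample_response))  # ['fireplace', 'yard']
```

Raising or lowering `min_confidence` trades fewer, more reliable tags against broader but noisier search results.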
Redfin is a full-service brokerage that uses modern technology to help people buy and sell houses. “Redfin users love to browse images of properties on our site and mobile apps, and we want to make it easier for our users to sift through hundreds of millions of listing and images,” says Yong Huang, Director of Big Data & Analytics, Redfin. “Amazon Rekognition generates a rich set of tags directly from images of properties. This makes it relatively simple to build a smart search feature that helps customers discover houses based on their specific needs, such as a fireplace, yard, or swimming pool. And since Rekognition accepts Amazon S3 URLs, it is a huge time-saver to detect objects, scenes, and faces without having to move images around.”
SmugMug is a safe and beautiful home for photos that stores billions of beautiful photos for millions of amazing customers every day. “SmugMug customers want to spend their time making more memories, not manually managing their photo collection,” said Don MacAskill, Co-Founder, Chief Executive Officer, and Chief Geek, SmugMug. “Amazon Rekognition will allow us to automatically identify the content in customers’ photos, unlocking a host of features that will allow them and their visitors to have more time to focus on enjoying life and celebrating their photos.”
Deep Learning and AI on AWS
Amazon Polly is available today in US East (N. Virginia), US East (Ohio), US West (Oregon), and EU (Dublin) Regions, and will expand to additional Regions in the coming months. Amazon Rekognition is available in US East (N. Virginia), US West (Oregon), and EU (Dublin) Regions, and will expand to additional Regions in the coming months. Customers can sign up for the Amazon Lex preview starting today.
In addition to these services, AWS recently announced it is investing significantly in MXNet, an open source distributed deep learning framework, initially developed by Carnegie Mellon University and other top universities, by contributing code and improving the developer experience. MXNet will enable machine learning scientists to build scalable deep learning models that can significantly reduce the training time for their applications. For more information on AWS support for MXNet, visit: http://www.allthingsdistributed.com/2016/11/mxnet-default-framework-deep-learning-aws.html.
AWS also makes it easy for developers to run their own deep learning and machine learning workloads to build their own AI platform on top of AWS. Amazon Elastic Compute Cloud (Amazon EC2), with its broad set of instance types and GPUs with large amounts of memory, is ideal for deep learning training. P2 instances, launched in September 2016, were designed for large-scale machine learning and deep learning with up to 8 NVIDIA Tesla K80 Accelerators, each running a pair of NVIDIA GK210 GPUs that have 12 GiB of memory and 2,496 parallel processing cores. Customers can also make use of AWS’s Deep Learning AMI, which contains six pre-configured and pre-tested deep learning frameworks including all dependencies, NVIDIA drivers, and data science tools like Jupyter and Anaconda. In addition, AWS CloudFormation templates are available for training deep neural networks at scale in just a few clicks.
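For a sense of scale, the P2 figures quoted above can be multiplied out. The short calculation below uses only the numbers in the paragraph and assumes the memory and core counts apply to each GK210 GPU.

```python
# Figures taken from the paragraph above (largest P2 configuration).
ACCELERATORS = 8            # NVIDIA Tesla K80 accelerators
GPUS_PER_ACCELERATOR = 2    # each K80 runs a pair of GK210 GPUs
CORES_PER_GPU = 2496        # parallel processing cores per GPU (assumed per GPU)
MEMORY_PER_GPU_GIB = 12     # GPU memory per GPU (assumed per GPU)

total_gpus = ACCELERATORS * GPUS_PER_ACCELERATOR
total_cores = total_gpus * CORES_PER_GPU
total_memory_gib = total_gpus * MEMORY_PER_GPU_GIB

print(total_gpus, total_cores, total_memory_gib)  # 16 39936 192
```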
AppDate: Prepare for space
In this week’s AppDate, SEAN BACHER highlights Space Nation Navigator, Hitman Sniper, Snake Mask, Memrise, WhatsApp Web, and Carrot Weather.
Space Nation Navigator
Space Nation Navigator is a bit of a strange app. It is part game, part exercise and part educational. On the game side, users have to navigate the Mars Rover, put the International Space Station back into orbit or move their Martians to safety before a sand storm hits Mars. When it comes to exercise, Space Nation Navigator provides users with a range of exercises and Yoga videos to prepare them for space travel and working in an anti-gravity environment. The education aspect teaches users about the planets, and star constellations, and then offers quizzes on what has been taught.
Platform: Android and iOS
Cost: A free download.
Stockists: Visit the store linked to your device.
Memrise
Memrise takes a new approach to help people learn new languages. Instead of providing a user with random phrases and words to memorise, the app connects you with a person already fluent in the language you want to learn. In turn, the person you are speaking to wants to learn the language in which you are fluent. Once your profile is filled out and languages selected, it connects you with people around the world who are interested in your language, and then allows you to chat with them in real-time. Memrise also lets one learn new languages through games, chatbots and grammarbots that help with spelling, tenses and pronunciations.
Platform: Android and iOS
Cost: A free download.
Stockists: Visit the store linked to your device.
Hitman Sniper
Hitman Sniper is loosely based on the Agent 47 movie released a few years ago. The game offers players the ability to hone their shooting skills through a range of training courses and, once they think they are ready, they can start taking out the bad guys. Things start off easy enough, but they get more and more difficult as one progresses through the 150 missions on offer. One will also have to upgrade various gun components, like scopes, magazine capacities and silencers, to make the missions a little easier. Hitman Sniper lets users buy 16 to tackle each of the missions – either with real money or via the points accumulated by completing missions. Money and points can also be used to upgrade firearms.
Platform: Android and iOS
Cost: R7 – with a range of in-app purchases.
Stockists: Visit the store linked to your device.
Snake Mask
The iconic Snake game that was preinstalled on most older Nokia phones has had a complete make-over. It now uses Facebook’s AR technology, meaning that you have to navigate the snake around obstacles in your home or office, all the while collecting coins and stars that change the snake’s speed and length. Unfortunately, Snake Mask is only available on Nokia’s new range of smartphones. However, it should not take long before it slithers onto other devices.
Platform: New Nokia smartphones running Android.
Cost: Free to use through the Facebook app installed on the device.
Stockists: Available through the Facebook app.
WhatsApp Web
Although this is by no means a new app, it is an extremely useful one, and one that not many people know about. Tapping out WhatsApps on your phone is easy enough, but thanks to WhatsApp Web it can be even easier. Open the WhatsApp Web page under WhatsApp and you will see a QR code. Scan this code through WhatsApp on your mobile and you will be shown a replica of what you would normally see on your phone. You can then type and reply to messages using your computer instead of having to stop everything and unlock your phone every time a message comes through. WhatsApp Web is great if you share your computer with other people, as it automatically disconnects when the browser is closed. However, WhatsApp also offers an app that, when installed, will stay connected to your phone unless you manually remove it.
Platform: Any up-to-date Internet browser
Cost: Free to use and install
Stockists: Visit www.WhatsApp.com
Carrot Weather
There are thousands of weather apps on the Internet these days and all of them do the same thing: inform you of the weather in your area. However, Carrot Weather has taken what is just another app and turned it into something fun. By fun, I mean sarcastic, rude and completely politically incorrect. A user starts off by selecting religious and political views. It then asks about personality, ranging from friendly to homicidal to overkill, which includes profanity. So, for instance, instead of waking up to the standard partly cloudy forecast, Carrot Weather will display something like: “It’s only partly sunny, the sun is a total effing failure.” It also has a range of insults that it throws at you whenever you open the app, some of them downright insulting, so it is definitely not for those who are easily offended. The app’s user interface is very simple, displaying a week’s daily forecast and hourly forecasts for the day selected.
Platform: Android and iOS
Cost: Free to download but with adverts. The premium, advert-free version costs R12 per month.
Stockists: Visit the store linked to your device.
SA Start-up reinvents PABX
For any South African business, the idea of setting up or changing a telephonic switchboard system is the stuff of nightmares. Dealing with expensive hardware and hearing things like QSIG and VOIP is not what you’d call exciting. But now there is an app.
Enter BuzzBox (www.buzzboxcloud.co.za), a web-based telephone switchboard that is aimed at small and medium sized businesses wanting to take the hassle and cost out of the company switchboard. Whether you are a small one-man operation or a larger organisation with staff working remotely, BuzzBox is the best switchboard solution.
What sets BuzzBox apart from anything else on the market is its easy-to-use dashboard. It puts you in control of everything from picking your phone number to setting up voice prompts and managing your business-hours schedule.
BuzzBox was developed when the startup behind it, Jini-Guru, needed such a service for its own use across multiple continents. “When we started Jini-Guru we could not find a seamless online process that would allow us to set up a full web-based switchboard, so we decided to build one for ourselves,” says Mike Smits, Director at Jini-Guru.
He says a lot of startups today are tech savvy and know how to use apps and the services that go with them. “It’s the uberisation of services and it’s driving demand for instant service activation.”
BuzzBox works as an app on both iOS and Android but users wanting a desk phone option can choose from a variety of devices on offer or use their existing VOIP phones.
Setting up a BuzzBox account takes 5 minutes. During registration you upload your FICA documents (ID and proof of residence) and pick your phone number before the account is created. Companies that want to keep an existing number can do so too.
The real magic happens when you log on to the BuzzBox Dashboard. The main screen displays a summary of statistics for your account while the left-hand menu provides you quick access to various configuration settings and reports.
Setting up new extensions or external numbers is done with a few clicks and you can even set up various departments which is a great way to route a call to various people in a department, like sales or support.
The intuitive user interface also makes it easy to set up hold-music and voice prompts. You can add voice prompts by recording them straight to your phone; just make sure you use a clear voice with quiet surroundings for the best customer experience.
One of the main features of BuzzBox is call recording, which allows an organisation to record calls to meet legal requirements, as a law firm might need to, or for customer service purposes such as support. Companies can opt in to this service and it is free to use. Recordings are stored online and are fully encrypted, so only you can listen to or download them. Storage costs R1 for every 1000 minutes of stored recordings.
Other features include call forwarding and scheduling. The latter allows you to set office hours for your organisation which will divert calls to an after-hours messaging service. You also have the option to enable routing to an employee who is on call after hours.
BuzzBox also has a reseller program for companies wanting to offer this as a switchboard solution to their existing customers.
The cost of this service is R89 p/m for the first phone number, which includes your first extension for free. Thereafter you’ll pay R89 p/m per extension. Calls between extensions are free, but you pay per second for all outgoing phone calls. More info on pricing can be found here: https://buzzboxcloud.co.za/pricing/
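As a quick worked example of the pricing above, the sketch below estimates the fixed monthly fee for a given number of extensions. The function name is hypothetical, and the figure excludes per-second call charges and recording storage, which the article does not quantify per call.

```python
def monthly_fixed_cost(extensions, price_per_month=89):
    """Fixed monthly fee in rand: R89 for the first number (first extension
    included), plus R89 for each additional extension. Call charges excluded."""
    if extensions < 1:
        raise ValueError("at least one extension is required")
    first_number = price_per_month
    additional_extensions = (extensions - 1) * price_per_month
    return first_number + additional_extensions


print(monthly_fixed_cost(1))  # 89
print(monthly_fixed_cost(5))  # 445
```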
BuzzBox is running a launch promotion that offers the first line and extension free for 12 months, so you pay only for calls. Use promo code “feoifyaa” during sign-up to apply your discount.