Google Cloud AI Product Review, Part 2

Written by Hossein Javedani Sadaei | 12/19/18 9:00 PM

Most tech-savvy people these days know about the potential of cloud computing and how the cloud has already changed businesses by storing data efficiently and balancing existing workloads. Many organizations have adopted Google Cloud Platform for building AI solutions, largely because it comes from Google, one of the most prominent IT companies. In part 1 of this series, I provided a review of some of Google's AI products. Today, we'll review the remainder.

Cloud Talent Solution

According to Talent Daily, Google's public release of Cloud Talent Solution is a further sign of its vast investment in AI and deep learning, and of the rapidly growing application of these technologies to talent acquisition and management.

Image courtesy of Google Cloud

Cloud Talent Solution combines pre-trained APIs for talent technology providers and enterprise hiring organizations. Applying AI, it more accurately matches job seekers to open positions, and it also powers job searches, profile searches, and more. Companies may try Cloud Talent Solution out for free for up to 10,000 queries per month (minimal costs apply beyond 10,000 queries).

Google offers this new recruiting and hiring platform both to help employers by streamlining organizational duties and to help applicants by matching their talents and preferences to open positions, making the candidate's search and the recruiter's process easier. The primary points of this service are as follows:

Supports enterprise-wide talent acquisition technology through features like commute searches and profile searches;
Employs ML technology to assess job content and job seeker intentions;
Understands and interprets vague job descriptions, job search queries, and profile exploration queries by applying superior ML techniques;
Allows recruiters to give users personalized job search experiences by matching them with jobs that fall within their desired commute time and mode of transit;
Matches veterans transitioning from military careers with relevant civilian jobs; and
Delivers accurate results in profile search with the aid of an advanced ML model.
Cloud Talent Solution ties in neatly with another Google product, Google Hire, which is the recruiting software for G Suite.
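
To make the commute search mentioned above more concrete, here is a minimal sketch that only builds the JSON body for a job search limited by commute time and mode of transit. It assumes the v3 REST method `jobs:search` and its `commuteFilter` fields as I understand them; the project ID, the request metadata values, and the helper name `build_commute_search` are hypothetical. It does not send the request, which would need Google Cloud credentials.

```python
import json


def build_commute_search(project_id, query, latitude, longitude,
                         max_minutes=30, commute_method="TRANSIT"):
    """Return (url, body) for a job search limited by commute time."""
    url = f"https://jobs.googleapis.com/v3/projects/{project_id}/jobs:search"
    body = {
        # requestMetadata describes the end user; these values are placeholders.
        "requestMetadata": {
            "domain": "example.com",
            "sessionId": "hypothetical-session",
            "userId": "hypothetical-user",
        },
        "jobQuery": {
            "query": query,
            "commuteFilter": {
                "commuteMethod": commute_method,  # "DRIVING" or "TRANSIT"
                # Durations are sent as a number of seconds followed by "s".
                "travelDuration": f"{max_minutes * 60}s",
                "startCoordinates": {"latitude": latitude, "longitude": longitude},
            },
        },
    }
    return url, body


if __name__ == "__main__":
    url, body = build_commute_search("my-project", "data scientist", 37.42, -122.08)
    print(url)
    print(json.dumps(body, indent=2))
```

The body would be sent as an authenticated POST to the printed URL; only the request shape is shown here.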

Dialogflow Enterprise Edition

Dialogflow Enterprise Edition, a service that Google acquired, is used to create conversational interfaces for various devices, apps, websites, etc., giving companies an easy and natural way to communicate with clients. It is powered by advanced AI methods that identify the intention and context of client queries so that the company's conversational interface can produce effective and accurate replies.

Image courtesy of Google Cloud.

Chatbots can be produced very easily with Dialogflow. They can be extended across multiple platforms, support multiple languages, and autocorrect spelling errors. Additionally, Dialogflow delivers automated phone service capabilities: users who call your company's new phone number will speak directly with your Dialogflow agent (currently available only in beta). This service is built on natural language understanding technology.
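
As a rough illustration of how a client query reaches an agent, the sketch below builds a text query for the v2 `detectIntent` REST method and pulls the matched intent and reply out of a response. The project ID, session ID, and helper names are hypothetical, and the request is only constructed, not sent, since sending it would require credentials for a real agent.

```python
import json


def build_detect_intent(project_id, session_id, text, language_code="en"):
    """Return (url, body) for a text query sent to a Dialogflow agent."""
    url = ("https://dialogflow.googleapis.com/v2/"
           f"projects/{project_id}/agent/sessions/{session_id}:detectIntent")
    body = {"queryInput": {"text": {"text": text, "languageCode": language_code}}}
    return url, body


def summarize_reply(response):
    """Extract the matched intent, its confidence, and the reply text."""
    result = response.get("queryResult", {})
    return {
        "intent": result.get("intent", {}).get("displayName"),
        "confidence": result.get("intentDetectionConfidence"),
        "reply": result.get("fulfillmentText"),
    }


if __name__ == "__main__":
    url, body = build_detect_intent("my-project", "session-123", "What are your opening hours?")
    print(url)
    print(json.dumps(body, indent=2))
```

Reusing the same session ID across calls is what lets the agent keep the context of a conversation between queries.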

Although this service is powerful, comparable services from other giant cloud providers, such as IBM's Watson conversational AI, Amazon's Lex, and Microsoft's LUIS, currently have more users and greater popularity.

Cloud Video Intelligence API

This service makes videos easily searchable and discoverable by extracting metadata, recognizing key nouns, and annotating the content of the video. Using a REST API, it's possible to search every moment of every video file to locate each occurrence of a key noun and to evaluate its importance. Applying this service is easy, as you can see from demos on the internet that include code. Reza Mirkhani offers another good example of using this API.
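
A minimal sketch of that workflow, assuming the v1 `videos:annotate` REST method and its label annotation response fields as I understand them: one helper builds the annotation request, and another looks up where a given noun appears in the returned shot labels. The bucket path and helper names are hypothetical, and no request is actually sent.

```python
import json

VIDEO_ENDPOINT = "https://videointelligence.googleapis.com/v1/videos:annotate"


def build_video_request(input_uri, features=("LABEL_DETECTION",)):
    """Build the JSON body asking for annotations of a video in Cloud Storage."""
    return {"inputUri": input_uri, "features": list(features)}


def find_label(annotation_result, noun):
    """Return (start, end, confidence) for each shot labelled with `noun`."""
    hits = []
    for label in annotation_result.get("shotLabelAnnotations", []):
        if label["entity"]["description"].lower() != noun.lower():
            continue
        for seg in label["segments"]:
            span = seg["segment"]
            # A zero start offset may be omitted from the JSON response.
            hits.append((span.get("startTimeOffset", "0s"),
                         span.get("endTimeOffset"),
                         seg.get("confidence")))
    return hits


if __name__ == "__main__":
    body = build_video_request("gs://my-bucket/my-video.mp4")
    print("POST", VIDEO_ENDPOINT)
    print(json.dumps(body, indent=2))
```

Annotation runs as a long-running operation, so a real client would poll for completion before passing each entry of the result's `annotationResults` list to `find_label`.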

Image courtesy of Hacker Noon.

The following image demonstrates how this service can capture important information from each frame of a video.

See the following example of our implementation of this service on the Gbike and Dinosaur sample video.

Cloud Vision API

This API connects your company to Google’s image recognition abilities. Its API/REST interface is similar to images.google.com, but it does much more than show similar images to users. If you submit an image of a face, for example, this API can detect whether the image shows a dog, a cat, a human, etc., and it can also identify the parts of the face in the image (see the following example of using this API).

It also attempts to identify whether the figure pictured is happy or sad, working in security or retail, etc. With this API, it is possible to do the following:

Facial detection (identifies each face within a picture, its emotions, whether or not the figures are wearing hats, etc.; it does not, however, support facial recognition.)
Landmark detection
Logo detection
Label detection
Optical Character Recognition (OCR, which detects text, etc.)
Image features (e.g. primary colors used, cropping suggestions, etc.)
Web detection (pulls similar images from the internet)
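
Each capability in the list above corresponds to a feature type in a request to the API's `images:annotate` REST method. The sketch below only builds the JSON body for one image; the helper name and the placeholder image bytes are hypothetical, and sending the request would additionally require an API key or OAuth credentials.

```python
import base64
import json

VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_vision_request(image_bytes, feature_types, max_results=5):
    """Build the JSON body that asks for several detections on one image."""
    return {
        "requests": [
            {
                # Image bytes are sent base64-encoded inside the JSON body.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [
                    {"type": feature, "maxResults": max_results}
                    for feature in feature_types
                ],
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for the contents of a real photo.
    body = build_vision_request(b"placeholder image bytes",
                                ["LANDMARK_DETECTION", "TEXT_DETECTION"])
    print("POST", VISION_ENDPOINT)
    print(json.dumps(body, indent=2))
```

Other feature types include `FACE_DETECTION`, `LOGO_DETECTION`, `LABEL_DETECTION`, `IMAGE_PROPERTIES`, `CROP_HINTS`, and `WEB_DETECTION`, so one request can combine several of the detections listed above.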

I implemented this API for landmark testing, and it successfully identified the Petronas Twin Towers in Kuala Lumpur (see following output).

The following test is another example of how this API is a powerful OCR tool. Notice how the image is noisy, which means it would be hard for an AI system to determine the correct characters.

This API can be used from the following languages: C#, Go, Java, Node.js, PHP, Python and Ruby.

Cloud Speech-to-Text

Google Cloud Speech-to-Text converts audio files to text by using strong neural network models in an API. At the time of writing this blog, this service works for 120 languages (and variants) to support companies worldwide. This API enables voice command-and-control, transcribes audio from call centers, and more. With Google's ML technology, it can also handle both real-time streaming and prerecorded audio. Cloud Speech-to-Text:

Works for 120 languages and variants;
Automatically recognizes which language is being spoken;
Renders text transcriptions in real time or stores transcriptions in a file for both short-form and long-form audio;
Correctly transcribes proper nouns and appropriately formats dates, phone numbers, etc.; and
Offers pre-built models suited to your needs.
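
For prerecorded short-form audio, a request to the v1 `speech:recognize` REST method looks roughly like the sketch below. The helper names are hypothetical, the audio is assumed to be 16 kHz LINEAR16 (uncompressed PCM), and the request is only built, not sent.

```python
import base64
import json

SPEECH_ENDPOINT = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_request(audio_bytes, language_code="en-US", sample_rate_hertz=16000):
    """Build the JSON body for transcribing a short audio clip."""
    return {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        # Audio bytes are sent base64-encoded inside the JSON body.
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


def first_transcripts(response):
    """Return the top-ranked transcript of each result in a response."""
    return [
        result["alternatives"][0]["transcript"]
        for result in response.get("results", [])
        if result.get("alternatives")
    ]


if __name__ == "__main__":
    body = build_recognize_request(b"placeholder audio bytes")
    print("POST", SPEECH_ENDPOINT)
    print(json.dumps(body["config"], indent=2))
```

Long-form and real-time audio use different methods (asynchronous and streaming recognition), which are not shown here.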

The following table compares Google Cloud Speech-to-Text's prices with those of similar APIs from Azure and AWS, based on information from TechTarget.

Cloud Text-to-Speech

This API "enables developers to synthesize natural-sounding speech with 30 voices, available in 14 languages and variants." Cloud Text-to-Speech produces high-fidelity audio using Google's neural networks combined with DeepMind's research in WaveNet. Business owners can use this API across several devices and applications to foster interactions that feel real. Text-to-Speech:

Is powered by machine learning from Google;
Offers exclusive access to DeepMind's WaveNet Voices;
Allows users to choose from 30+ voices;
Is easy to integrate with applications and devices that are already in use; and
Supports several common use cases.
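
A minimal sketch of a synthesis request, assuming the v1 `text:synthesize` REST method: one helper builds the JSON body, and another decodes the base64 audio that comes back. The helper names are hypothetical and `en-US-Wavenet-D` is just one example of a WaveNet voice name; nothing is sent here.

```python
import base64
import json

TTS_ENDPOINT = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_synthesize_request(text, language_code="en-US", voice_name=None,
                             audio_encoding="MP3"):
    """Build the JSON body for turning `text` into speech."""
    voice = {"languageCode": language_code}
    if voice_name:
        # Naming a voice is optional; without it a default for the language is used.
        voice["name"] = voice_name
    return {
        "input": {"text": text},
        "voice": voice,
        "audioConfig": {"audioEncoding": audio_encoding},
    }


def decode_audio(response):
    """Return the raw audio bytes from a synthesize response."""
    return base64.b64decode(response["audioContent"])


if __name__ == "__main__":
    body = build_synthesize_request("Hello from Text-to-Speech",
                                    voice_name="en-US-Wavenet-D")
    print("POST", TTS_ENDPOINT)
    print(json.dumps(body, indent=2))
```

The bytes returned by `decode_audio` can be written straight to a file with the extension matching the requested encoding.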

Cloud Natural Language API

This API "reveals the structure and meaning of text both through powerful pre-trained machine learning models in an easy to use REST API and through custom models that are easy to build with AutoML Natural Language BETA." To show some use cases of this API, we applied it to the following sample text: "Arthur Samuel, an American pioneer in the field of computer gaming and artificial intelligence, coined the term Machine Learning in 1959 while at IBM." Here are the results:

As stated on Google Cloud's website, Cloud Natural Language makes it easy to analyze the sentiment behind social media posts, assess the intent of customer conversations in call centers and messaging apps, and obtain other information surrounding people, places, events, and more in text documents, blog posts, news articles, etc.
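
For reference, the sketch below shows how entity or sentiment analysis of the sample sentence could be requested through the v1 REST methods `documents:analyzeEntities` and `documents:analyzeSentiment`. The helper name is hypothetical, and the code only builds the URL and JSON body rather than calling the service.

```python
import json

LANGUAGE_BASE = "https://language.googleapis.com/v1/documents"

SAMPLE_TEXT = ("Arthur Samuel, an American pioneer in the field of computer gaming "
               "and artificial intelligence, coined the term Machine Learning "
               "in 1959 while at IBM.")


def build_language_request(text, method="analyzeEntities"):
    """Return (url, body) for one analysis method on plain text."""
    url = f"{LANGUAGE_BASE}:{method}"
    body = {
        "document": {"type": "PLAIN_TEXT", "content": text},
        # The encoding type controls how character offsets in the reply are counted.
        "encodingType": "UTF8",
    }
    return url, body


if __name__ == "__main__":
    for method in ("analyzeEntities", "analyzeSentiment"):
        url, body = build_language_request(SAMPLE_TEXT, method)
        print("POST", url)
        print(json.dumps(body, indent=2))
```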

Cloud Translation API

The Translation API dynamically translates between supported languages using Neural Machine Translation. It is a single, programmatic interface that is extremely responsive. Websites and applications can integrate with the Cloud Translation API so that source text can be translated into a new language. This API can also detect the language of text whose source language is unknown. Developers who don't have extensive machine learning expertise can train custom models. Best of all, Google's research teams are constantly updating the API to add languages, create new language pairings, and increase accuracy. As enumerated on the Google Cloud website, the main features of this service are:

Language detection
Programmatic access
Continuous updates
Text translation
Adjustable quota
Affordable, easy pricing
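
Text translation and language detection from the list above can be sketched as two small request bodies for the v2 REST interface. The helper names are hypothetical, and the requests are built but not sent, since sending them requires an API key or OAuth credentials.

```python
import json

TRANSLATE_ENDPOINT = "https://translation.googleapis.com/language/translate/v2"
DETECT_ENDPOINT = TRANSLATE_ENDPOINT + "/detect"


def build_translate_request(text, target, source=None):
    """Build the JSON body for translating `text` into `target`."""
    body = {"q": text, "target": target, "format": "text"}
    if source:
        # When the source is omitted, the service detects it automatically.
        body["source"] = source
    return body


def build_detect_request(text):
    """Build the JSON body for detecting the language of `text`."""
    return {"q": text}


if __name__ == "__main__":
    print("POST", TRANSLATE_ENDPOINT)
    print(json.dumps(build_translate_request("Good morning", "ms"), indent=2))
    print("POST", DETECT_ENDPOINT)
    print(json.dumps(build_detect_request("Selamat pagi"), indent=2))
```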

Cloud Inference API

Companies of a certain size that measure clicks, sensor readings, and other event-driven data can produce datasets that reach millions or even billions of rows. Building a system that can obtain insights from this amount of data is no simple feat, but Google's new plug-and-play solution, Cloud Inference API, makes it much easier. Time-series analysis is essential to the day-to-day operation of many organizations. Common examples include analyzing foot traffic and conversion for retailers, identifying fraud, identifying correlations in real time across sensor data, and creating high-quality recommendations. With Cloud Inference API (currently in alpha), it is possible to infer insights in real time from time-series datasets.

Conclusion

In the previous blog, I reviewed some of the main Google Cloud AI products, including Cloud AutoML, Cloud TPU, Cloud ML Engine, and BigQuery ML, and provided some details about each. In this blog, we've reviewed some other APIs that can be used to build useful AI applications using voice, video, text, etc. We've also shared some examples and outputs of the models presented.
