Using the cloud to develop artificial intelligence applications is now common at medium and large companies, and among cloud service providers, Google holds a large share of this market.

Introduction

Here at Avenue Code, it's been a while since our data science team adopted Google as one of the main service providers for its data science projects. Over time, we've gained some insights into Google Cloud's AI products and tools, and we'd like to share our experience with our readers. This two-part blog series will introduce these tools, present comparison studies, and provide useful learning resources. Let's start with Cloud AutoML.

Cloud AutoML

In general, AutoML was created to automate algorithm selection, hyperparameter tuning, iterative modeling, and model assessment. At the Cloud Next 2018 conference, Google released Cloud AutoML in beta. This service dramatically reduces the number of steps required to train and tune a machine learning model.
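To give a feel for what "algorithm selection, hyperparameter tuning, and model assessment" mean in practice, here is a minimal toy sketch in plain Python. It is not how Cloud AutoML works internally; the two candidate algorithms, the search space, and the synthetic data are all our own assumptions for illustration. It simply tries every (algorithm, hyperparameter) pair and keeps the one with the lowest validation error.

```python
import random

def fit_mean(train, _lam):
    """Baseline algorithm: always predict the mean of the training targets."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def fit_ridge(train, lam):
    """One-feature ridge regression; lam is the regularization strength."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    slope = sxy / (sxx + lam)
    return lambda x: mean_y + slope * (x - mean_x)

def mse(model, data):
    """Model assessment: mean squared error on a held-out set."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

def auto_select(train, valid, search_space):
    """Try every (algorithm, hyperparameter) pair; keep the best on validation."""
    best = None
    for name, (fit, lams) in search_space.items():
        for lam in lams:
            error = mse(fit(train, lam), valid)
            if best is None or error < best[2]:
                best = (name, lam, error)
    return best

# Synthetic data (an assumption for this sketch): y = 2x + 1 plus noise.
rng = random.Random(0)
points = [(x, 2.0 * x + 1.0 + rng.gauss(0, 0.5)) for x in range(40)]
rng.shuffle(points)
train, valid = points[:30], points[30:]

# Hypothetical search space: algorithm name -> (fit function, hyperparameter grid).
search_space = {
    "mean": (fit_mean, [None]),
    "ridge": (fit_ridge, [0.0, 1.0, 10.0, 100.0]),
}
best_name, best_lam, best_error = auto_select(train, valid, search_space)
print(best_name, best_lam, round(best_error, 3))
```

An AutoML service automates this kind of loop over far larger search spaces, which is why it removes so many manual steps.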

[Figure: image courtesy of Google Cloud]

Cloud AutoML beta is a good example of a machine learning product that lets developers with minimal machine learning expertise train high-quality models specific to their business requirements by leveraging Google’s transfer learning. AutoML consists of three main modules: AutoML Vision, AutoML Natural Language, and AutoML Translation.

AutoML Vision

AutoML Vision is a graphical drag-and-drop tool that lets users leverage Google’s cloud computing backend to train custom object identification and image recognition models. It is leaving alpha and entering public beta.

One customer of this product is Disney, which has been using Cloud AutoML so that its customers can find and shop for products related to specific Disney characters.

[Figure: Cloud AutoML, custom machine learning. Image courtesy of Analytics Vidhya]

AutoML Natural Language

This API lets more people build custom machine learning models that group content into a custom set of categories. Custom classification models are useful when you already have a defined set of classes that your content should be sorted into. You can also use the AutoML Natural Language UI to upload training data and to train and test a custom model.

[Figure: image courtesy of Google Cloud]

AutoML Translation

Google is keeping a competitive edge in the neural machine translation (NMT) race with AutoML Translation, a new addition to its AutoML series of machine learning products. AutoML Translation lets Google Cloud Platform customers use their own training data to customize and train domain-specific translation models. Many languages are already supported by this API, and the list can be found here.

Other AutoML Providers

Even though Google AutoML is one of the most successful services in this domain, other companies and startups have been working on similar services. The following table provides a short list of these organizations. More information and details can be found here.

[Table: other AutoML providers. Image courtesy of Applied AI]

Cloud TPUs

A tensor processing unit (TPU) is an AI accelerator application-specific integrated circuit (ASIC) developed by Google specifically for neural network machine learning. The TPU was announced in 2016 at Google I/O, when the company said that TPUs had already been in use inside its data centers for over a year. Cloud TPUs are a family of hardware accelerators that Google designed and optimized to speed up TensorFlow machine learning workloads, both training and inference. Cloud TPUs are meant to give the best performance per dollar for targeted TensorFlow workloads and to let machine learning engineers and researchers iterate more quickly. The following are the main features of this product:

  • TPUs can accelerate machine learning
  • TPUs have been built for AI on Google Cloud
  • TPUs make faster iteration possible 
  • TPUs can be used with state-of-the-art methods

[Figure: image courtesy of Google Cloud]

Cloud TPU Version 2

Although the first TPU focused on efficiently running machine learning models for tasks like language translation, AlphaGo's Go strategy, search, and image recognition, the more intensive task of training these models was done separately on top-end GPUs and CPUs. TPU2 is intended to both train and run machine learning models, which cuts out the GPU/CPU bottleneck. The authors of the RiseML blog measured throughput in images per second on synthetic data (i.e., training data created on the fly) at various batch sizes, and here's what they found:

[Figure: throughput in images per second at various batch sizes. Image courtesy of RiseML Blog]

Also, the following analysis of cost efficiency outlines TPU2's considerable advantages:

[Figure: cost efficiency analysis. Image courtesy of RiseML Blog]

With this pricing, Cloud TPU is the clear winner on cost efficiency.
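A cost-efficiency comparison of this kind boils down to simple arithmetic: throughput multiplied by the seconds in an hour, divided by the hourly price. The sketch below shows that calculation in Python. The accelerator names, throughputs, and prices are hypothetical placeholders chosen for easy arithmetic; they are not the RiseML measurements or actual Google Cloud prices.

```python
def images_per_dollar(images_per_second, price_per_hour):
    """Images processed for each dollar of accelerator time."""
    if price_per_hour <= 0:
        raise ValueError("price_per_hour must be positive")
    return images_per_second * 3600 / price_per_hour

# Hypothetical numbers for illustration only; NOT measured benchmark results.
accelerators = {
    "accelerator_a": {"images_per_second": 1000.0, "price_per_hour": 5.0},
    "accelerator_b": {"images_per_second": 800.0, "price_per_hour": 8.0},
}

efficiency = {
    name: images_per_dollar(**spec) for name, spec in accelerators.items()
}
best = max(efficiency, key=efficiency.get)

for name, value in efficiency.items():
    print(f"{name}: {value:,.0f} images per dollar")
print("most cost-efficient:", best)
```

Note that a faster accelerator is not automatically the cheaper one per image; what matters is the ratio of throughput to price.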

Cloud Machine Learning Engine

Google Cloud Machine Learning Engine is a managed service for building and running machine learning models at scale, and it covers a broad range of ML tasks. It is not a SaaS program that you can just upload data to and start using. It is integrated with other Google Cloud data platform products, such as Cloud Storage, Cloud Dataflow, and Cloud Datalab. In short, this service is a computing platform that does not do ML jobs by itself; developers have to implement their models by writing code.

Cloud ML Engine Features:
  • Automatic Resource Provisioning
  • Server-Side Preprocessing
  • Hypertune
  • Integrated
  • Multiple Frameworks
  • Portable Models

The main competitor of Cloud ML Engine in the market is SageMaker from Amazon, so let's compare the two. Both AWS and Google Cloud provide Jupyter notebooks whose backend runs on a cloud VM with machine learning frameworks and cloud services pre-installed. There are some differences between the two, however:

  • ML Engine supports only the TensorFlow, scikit-learn, and XGBoost frameworks. On SageMaker, you can use MXNet and your own custom libraries as well.
  • You get new versions of TensorFlow on ML Engine weeks to months before you get them on SageMaker.
  • If you plan to use automatic hyperparameter optimization, this works better on ML Engine, both in terms of results and time.

Other comparisons are detailed in the table below:

[Table: comparison of Cloud ML Engine and SageMaker]

Major companies that have adopted Google ML Engine include Airbus, Home Depot, Snapchat, Evernote, Niantic, Telus, Accenture, and Pivotal. 

The following links provide some important resources for learning more about Google Cloud ML Engine:


BigQuery ML Beta

Google has announced a beta version of BigQuery ML, a new capability that lets users build some machine learning models inside the Google BigQuery cloud data warehouse using SQL commands. BigQuery ML lets data scientists and data analysts create and apply ML models on large-scale structured or semi-structured data directly inside BigQuery, using simple SQL, in just a few seconds. The main features of BigQuery ML are:

  • BigQuery ML makes predictive analytics accessible and simple
  • This service accelerates time to insights
  • BigQuery ML scales automatically without the need for manual setting

Limitations of BigQuery ML

This technology, however, is restricted in the number of model types it supports. So far, BigQuery ML supports only two: linear regression models, which forecast numerical values (for example, a stock price), and binary logistic regression models, which can classify an email as spam or handle other relatively simple classification tasks. BigQuery ML may not satisfy data scientists who prefer to use additional tools and methods to build models. On the other hand, it lets data analysts who know SQL well but have limited familiarity with more advanced ML tooling start developing models without having to learn a new language or adopt further analytics tools to solve some simple problems (see the example below).

[Figure: BigQuery ML example. Image courtesy of Medium]

The main advantage of BigQuery ML is that it integrates machine learning capabilities directly into a data warehouse's SQL interface.
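To make this concrete, here is a minimal sketch of what the SQL might look like for the spam classification case mentioned above. We build the statements as strings in Python so the snippet runs without a Google Cloud account; the resulting SQL could then be pasted into the BigQuery console. The dataset, table, and column names (my_dataset.emails, is_spam, and so on) are hypothetical, and the two helper functions are our own, not part of any Google library.

```python
def create_model_sql(model, model_type, label, source_table, features):
    """Build a BigQuery ML CREATE MODEL statement as a string."""
    # The beta described in this post covers these two model types only.
    if model_type not in ("linear_reg", "logistic_reg"):
        raise ValueError("model_type must be 'linear_reg' or 'logistic_reg'")
    columns = ", ".join(features + [label])
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = '{model_type}', input_label_cols = ['{label}']) AS\n"
        f"SELECT {columns}\n"
        f"FROM `{source_table}`"
    )

def predict_sql(model, source_table, features):
    """Build an ML.PREDICT query that scores rows with a trained model."""
    columns = ", ".join(features)
    return (
        "SELECT *\n"
        f"FROM ML.PREDICT(MODEL `{model}`,\n"
        f"  (SELECT {columns} FROM `{source_table}`))"
    )

# Hypothetical names, for illustration only.
features = ["subject_length", "num_links", "sender_known"]
train_sql = create_model_sql(
    "my_dataset.spam_model", "logistic_reg", "is_spam",
    "my_dataset.emails", features,
)
score_sql = predict_sql("my_dataset.spam_model", "my_dataset.new_emails", features)

print(train_sql)
print()
print(score_sql)
```

Training and prediction are each a single SQL statement, which is the main appeal for analysts who already work in SQL every day.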

Conclusion

In this post, we reviewed some of the main Google Cloud AI products, including Cloud AutoML, Cloud TPU, Cloud ML Engine, and BigQuery ML. We also compared these products with similar services from Google's competitors. In the next post, we'll examine additional Google Cloud AI tools.


Author

Hossein Javedani Sadaei

Hossein Javedani Sadaei is a Senior Data Scientist at Avenue Code with a post-doctorate in big data mining and a PhD in statistics. He works mostly with machine learning and deep learning in retail, telecommunications, energy, and stock markets. His main expertise is developing scalable machine learning and deep learning algorithms using fuzzy logic.

