Data Science on the Google Cloud Platform: Implementing End-to-End Real-Time Data Pipelines: From Ingest to Machine Learning

Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build using Google Cloud Platform (GCP). This hands-on guide shows data engineers and data scientists how to implement an end-to-end data pipeline with cloud native tools on GCP.

Throughout this updated second edition, you'll work through a sample business decision by employing a variety of data science approaches. Follow along by building a data pipeline in your own project on GCP, and discover how to solve data science problems in a transformative and more collaborative way.

You'll learn how to:

Employ best practices in building highly scalable data and ML pipelines on Google Cloud
Automate and schedule data ingest using Cloud Run
Create and populate a dashboard in Data Studio
Build a real-time analytics pipeline using Pub/Sub, Dataflow, and BigQuery
Conduct interactive data exploration with BigQuery
Create a Bayesian model with Spark on Cloud Dataproc
Forecast time series and do anomaly detection with BigQuery ML
Aggregate within time windows with Dataflow
Train explainable machine learning models with Vertex AI
Operationalize ML with Vertex AI Pipelines

464 pages, Kindle Edition

Published March 29, 2022

80 people are currently reading
184 people want to read

About the author

Valliappa Lakshmanan

17 books, 25 followers

Ratings & Reviews

Community Reviews

5 stars: 24 (38%)
4 stars: 23 (37%)
3 stars: 11 (17%)
2 stars: 3 (4%)
1 star: 1 (1%)
Displaying 1 - 14 of 14 reviews
Henne
159 reviews, 74 followers
September 18, 2019
A good overview of data science and machine learning techniques using 'big data' technologies on GCP; a good companion to the GCP Data Engineering courses on Coursera.
Ben
198 reviews, 15 followers
March 1, 2019
Somewhat of an unfair rating of two stars as I do not think I was among the intended audience for this book or at least didn't have the right expectations for this book. I have a background in software engineering, have used GCP for software engineering purposes, but do not have a data science background. To me, the book seemed like a mix of concepts, product descriptions, code snippets, and a single real-world example that, in mixing these, did not deliver an interesting, instructive message on any of the individual parts. It didn't really spend enough time at the conceptual level for me to feel like I understand the data science concepts any better. The command-line and code snippets didn't seem like useful knowledge as they are easily looked up in a reference and not "reusable" knowledge. I was also bored to death of the airline delays example by the end of the book :) I struggled to generalize the information in the book. Given my expectations, I likely would have been better off picking up a book on the introductory concepts of data science than this book.
14 reviews, 3 followers
July 30, 2020
This book was of great help with the Data Engineering exam of Google Cloud.
Andrew Breza
517 reviews, 32 followers
July 17, 2022
As my employer prepares to move to GCP, I've been studying the platform's capabilities and getting excited about what it can do. The other GCP books I've read have covered the platform at a high level, discussing how the different services fit together. This book is much more applied, taking a concrete problem and working through a different aspect of it in each chapter.

My only critique of the book is that the example problem is straightforward enough that most of the firepower the author throws at it is overkill. An R script on a reasonably powerful laptop would have probably had only a slightly higher error rate.
William Anderson
134 reviews, 25 followers
May 4, 2018
While this is a great intro to some of the basics and offerings of GCP that can be leveraged for data science, the book is aimed much more at explaining the pieces of the platform and getting up and running than at anything in depth. While cloud native solutions such as Cloud Dataflow are touched on, each could have its own book going through architecture and integrations in more depth. Nonetheless, a solid intro book.
Waits
2 reviews, 2 followers
January 11, 2022
Really bad book. Very disordered thoughts, very long paragraphs talking about being a Data Engineer or simple visualizations, and little about the basic fundamentals of GCP.

Lack of clarity, and the examples did not work properly. I was not even able to finish the book. I am still not quite sure what the purpose of this book was, but the title does not match the reality.
Douglas
160 reviews, 13 followers
January 16, 2018
A level-headed end-to-end process for data science and engineering in the cloud (not just Google Cloud). The author was a teammate of mine when joining the company and he should be very proud of this work.
Paul
238 reviews
April 2, 2021
Useful step-by-step guide to do a simple Data Science project on Google Cloud Platform, including where to get some initial public data to work with, how to create the components on Google Cloud Platform, how to analyze the results, and related things.
Ntombizakhona Mabaso
107 reviews, 6 followers
December 7, 2024
This book dives deep into tools like BigQuery and TensorFlow, making it an excellent resource for data science enthusiasts.

While I found it quite lengthy (perhaps because data science isn’t my primary interest), I appreciated the hands-on code examples and the thoughtful suggestions, resources, and articles provided throughout.

The appendix on 'Considerations for Sensitive Data within Machine Learning Datasets' is particularly noteworthy—worth revisiting multiple times for its invaluable insights. A comprehensive guide for those looking to master data science on Google Cloud.
Chaouki
77 reviews, 1 follower
October 6, 2023
In a time when every ML book has Fashion-MNIST and god-awful repetitive themes, this book stands head and shoulders above any book I have read in its style and concept presentation. Like an expensive perfume that, when you smell it, you say it is worth every penny. This book is a breeze. If you've had enough of toy datasets and want to get to know, really get to know, Google Cloud, this book is a great pick. Buy and thank me later.
Denis Kotnik
64 reviews, 1 follower
January 29, 2023
I think the use case (prediction of on-time airplane arrivals) in the book could have been less complex. I missed having more architectural diagrams: I was lost at moments and needed to re-read pages in order to understand the sources and sinks of the data pipelines.
In short: good book for a GCP data engineer/scientist (with a bit of Google advertisement between the lines).
Anh Dang
10 reviews, 6 followers
August 31, 2021
The book is practical and sufficiently informative, but focusing on one language, Python, would be better. A separate book focusing on Java could be published.
Josua Naiborhu
87 reviews, 4 followers
January 11, 2023
I am amazed by how the author explained the concepts in a concise way with a hands-on use case. Really loved this book.