Summary Centralized data warehouses, the long-time de facto standard for housing data for analytics, are rapidly giving way to multi-faceted cloud data platforms. Companies that embrace modern cloud data platforms benefit from an integrated view of their business using all of their data and can take advantage of advanced analytic practices to drive predictions and as yet unimagined data services. Designing Cloud Data Platforms is a hands-on guide to envisioning and designing a modern scalable data platform that takes full advantage of the flexibility of the cloud. As you read, you’ll learn the core components of a cloud data platform design, along with the role of key technologies like Spark and Kafka Streams. You’ll also explore setting up processes to manage cloud-based data, keep it secure, and analyze it with advanced analytic and BI tools.
Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.
About the technology Well-designed pipelines, storage systems, and APIs eliminate the complicated scaling and maintenance required with on-prem data centers. Once you learn the patterns for designing cloud data platforms, you’ll maximize performance no matter which cloud vendor you use.
About the book In Designing Cloud Data Platforms, Danil Zburivsky and Lynda Partner reveal a six-layer approach that increases flexibility and reduces costs. Discover patterns for ingesting data from a variety of sources, then learn to harness pre-built services provided by cloud vendors.
What's inside
- Best practices for structured and unstructured data sets
- Cloud-ready machine learning tools
- Metadata and real-time analytics
- Defensive architecture, access, and security
About the reader For data professionals familiar with the basics of cloud computing, and Hadoop or Spark.
About the author Danil Zburivsky has over 10 years of experience designing and supporting large-scale data infrastructure for enterprises across the globe. Lynda Partner is the VP of Analytics-as-a-Service at Pythian, and has been on the business side of data for over 20 years.
Table of Contents
1 Introducing the data platform
2 Why a data platform and not just a data warehouse
3 Getting bigger and leveraging the Big 3: Amazon, Microsoft Azure, and Google
4 Getting data into the platform
5 Organizing and processing data
6 Real-time data processing and analytics
7 Metadata layer architecture
8 Schema management
9 Data access and security
10 Fueling business value with data platforms
A good book for understanding why you need a cloud data platform and how it compares with on-prem. It covers building one with AWS, Azure, or Google Cloud, detailing every step and what to watch out for. Good for a beginner like me.
**Designing Cloud Data Platforms** by Manning offers a useful overview for data engineers, software architects and experienced developers working with cloud data solutions.
The book effectively describes various architectural approaches, providing a balance of theoretical explanation and practical considerations. The inclusion of pros and cons for each architecture is a helpful element.
The identification and description of relevant product offerings from AWS, Azure, and Google Cloud are also beneficial, given the extensive range of services available.
The book addresses practical challenges such as data deduplication and file organization for batch processing with a pragmatic perspective.
The authors' experience in the field is apparent and contributes to a knowledgeable presentation of the material.
In conclusion, "Designing Cloud Data Platforms" is a warmly recommended read for all architects and developers looking to build robust and scalable data solutions in the cloud. It offers a strong blend of theory, practical guidance, and real-world insights that make it an invaluable addition to any technical library.
If you are interested in how cloud data platforms work today, this book is definitely for you. The authors dive deep into how data is ingested or collected into the system, how it goes through different pipelines and procedures, and how it produces interesting data in the end.
I like that it actually has an architectural view that can be applied across a broad range of technologies and uses, and it's refreshing to have a book with so little buzz. While reading the book I had a couple of aha moments about my own experiences, things that could be improved, etc.
One problem with this book is that it mentions a lot of technologies and sometimes goes into their details, even modifying the architectures to account for the specifics of a particular technology. While there is value in connecting the presented content to an implementation, it ties a large part of the book to the current time frame.
The book is thoroughly detailed. It showcases a modern, loosely-coupled architecture that the authors developed after their experience with numerous data platforms.
Every layer of the architecture is explained comprehensively, ensuring readers can get started while sidestepping common pitfalls. The authors delve into potential tools for each layer, discussing options from the three leading cloud vendors as well as insights into implementing certain components in-house.
A comprehensive, clear, theoretical guide. It provides a solid introduction to architectures and tools across the entire spectrum of cloud data solutions. The authors present batch and real-time processing on the Azure, Google, and Amazon cloud platforms. The book is complete when it comes to designing a data platform. It is more for beginners, but it is still well written and helps you organize your knowledge.
This book can be a reasonable introduction for readers not aware of Data Lake architecture. But if you know the basics already, I don’t think you will learn a lot of new stuff. Their “architecture diagram”, where they tried to put all batch and real-time components together, is completely unusable (for me). Not enough practical examples and e2e scenarios.
Excellent book. There is so much helpful information that I will reread. This book is very beneficial for software architects and developers who will build cloud data-driven applications.