Graph data closes the gap between the way humans and computers view the world. While computers rely on static rows and columns of data, people navigate and reason about life through relationships. This practical guide demonstrates how graph data brings these two approaches together. By working with concepts from graph theory, database schema, distributed systems, and data analysis, you'll arrive at a unique intersection known as graph thinking.
Authors Denise Koessler Gosnell and Matthias Broecheler show data engineers, data scientists, and data analysts how to solve complex problems with graph databases. You'll explore templates for building with graph technology, along with examples that demonstrate how teams think about graph data within an application.
Build an example application architecture with relational and graph technologies
Use graph technology to build a Customer 360 application, the most popular graph data pattern today
Dive into hierarchical data and troubleshoot a new paradigm that comes from working with graph data
Find paths in graph data and learn why your trust in different paths motivates and informs your preferences
Use collaborative filtering to design a Netflix-inspired recommendation system
Disclaimer: I'm a technical proofer for this book, so I have access to the full text before it's published.
This book attempts to show when to use graph databases, how to model data as a graph, how to perform queries, etc. It's more a conceptual book than an in-depth tutorial for the Gremlin language and the Apache TinkerPop framework: before diving into them you need to understand when to use graph databases, what the alternatives are, etc. Once you understand that, you can switch to other books, such as Practical Gremlin and/or Graph Databases in Action, which provide more in-depth coverage of TinkerPop/Gremlin.
Very few books attempt to explore, traverse, and select the application of theory to the real world, with its pros and cons. Either books are too theoretical, and it's a lifetime's work to implement them correctly, or they show the practical implementation aspects while only loosely relating them to passing theoretical connections. If anyone wants to understand sound theoretical connections to the realities and vagaries of implementing those theories in production to make profits, this is the book to go for. Remember that this book correctly identifies/classifies the profundity of all aspects of practical implementation, and chooses its grounds carefully. So this is for building profitable data products.
The book fails to warn up front that it is more about DataStax than about graph databases in general. Consider reading it if you are using DataStax as your main graph database.
The introduction to graph modeling is nice and instructive.
The book outlines the pros and cons of graph schema modeling and queries, and has an extensive number of examples. The best part was the 10+ tips on graph modeling, such as: vertex-edge-vertex should read like a sentence or phrase from your queries. The downside was the graph storage engine selected as the use case for the book's examples. Most of the query examples were shaped by Cassandra's table structure limitations; I believe a simpler graph engine could have been selected.
(Finished it a while ago but had forgotten to update this.) Love the fact that chapters are evenly distributed. GSL is great; it gave me ideas on how to implement it with mermaid.js. What I'm missing: Who uses Cassandra? What are the Gremlin alternatives (e.g. CQL, or the Python ORM for Cassandra)? Why Cassandra and not Neo4j? Code changes could be clearer with the changed lines highlighted.
This is firmly in the realm of an introductory book promoting the use of the DataStax distributed graph database built on Cassandra. As an introduction, it provides good guidance on thinking about and modelling data using a graph approach instead of the more familiar tabular approach. However, many of the specific code examples are verbose and irrelevant if you do not plan on using DataStax.