My name is Shmulik Kachlon. I am an architecture specialist working for one of the four big banks in Australia. For the last few years, I have led the architecture and development of complex solutions with real-time capabilities. The difference between the work I did up until a few months ago and what I am working on now is the high volume of transactions that we need to process per second, and also the ability to process some additional business wisdom for customers on the back of each transaction. That led us to realise that we needed to investigate real-time computation systems, and Storm was one of the candidates.
Recently I was tasked with leading a proof of concept using a real-time computation system for one of our projects. We needed to quickly experiment with various solutions for real-time computation, and Twitter Storm was chosen for that purpose.
How did the book help me?
Before reading the book I was a complete outsider to the open source world in general and to real-time computation systems in particular. Reading the book put me in a position where I know enough to start building commercial solutions that require a real-time computation system. The book is written and organised in such a way that even people without much knowledge of real-time computation and complex event processing systems can get up-skilled quickly enough to provide a solution that caters for the business needs.
After reading only a few chapters of the book, we were in a position to start writing a proof of concept that integrated with financial systems and provided important information to our demo customers in real time, something that was hard or even impossible to achieve before embarking on the Storm journey. My feeling is that the book will certainly be a significant milestone in what we will be building for real-time event processing and analytics.
Key takeaways:
There are a few points that I would like to give feedback on and commend the author for executing well:
• Easy to read. Written in a simple way for technical people who don't necessarily come from Java or the real-time computation world
• Exposes you to the open source world in general and to Storm in particular. We got ourselves up and running with a Storm cluster running our business logic and handling the high volume of transactions we had to cater for
• Setting up Storm clusters and topologies is made really easy if you follow the simple yet great instructions in the book
• The book provides a very clear understanding of Storm's components and their role in the Storm architecture
• We developed a sound ability to design a lambda architecture based on best practices
• Integration with Hadoop MapReduce is made easy and is explained in a great way. You can see why these two systems go hand in hand
• Last but not least, before reading the book I had read too much information about Storm on the internet and still wasn't in a position to start building the right solution for the enterprise I work for
In summary, the book gave us insight not only into the powerful world of real-time computation and event processing, catering for high volumes of transactions with guaranteed message delivery, but also into how to combine that with the big data world in order to achieve an end-to-end solution which will serve the business for many more years and will be a strategic component of the overall enterprise architecture.
There's a ton of material here, and some of the recipes would definitely be useful to anyone trying to build out their first Storm topology, but I had a few problems with the book:
1) The first chapter doesn't introduce basic Storm concepts, instead directing the reader to review project documentation before continuing with the book. While I don't disagree that Storm's documentation is useful and plentiful, I expect a book to at least attempt a summary of important concepts so that it may stand on its own in the face of future project changes.
2) The book relies heavily on poorly defined buzzwords to motivate Storm (indeed, many of the ideas the book relies on in the first chapter are difficult to define clearly in 2013).
3) The specific technology choices the book suggests (Java, Eclipse) are not what I would suggest to new users of Storm.
4) I found much of the writing to be unpolished, and stumbled over enough grammatically awkward sentences that I found the reading experience tarnished.
If you really need to walk through a variety of Storm projects and are interested in the ecosystem of technology you're likely to encounter while implementing Storm topologies, this book may be for you. If you're looking for a clear introduction to Storm concepts and concise examples of how to apply them, look elsewhere.
I read this book because I am really interested in real-time processing and because I think Storm is a really nice framework.
The book is written in 'cookbook' style, so lots of chapters on really specific subjects. The book starts out with a simple hello world example, but things get gradually more complex.
The book doesn't just cover programming with Storm; it also goes into good detail on subjects such as unit and integration tests, and it even details how to do continuous delivery and deployment on Amazon EC2.
I can definitely recommend this book to anyone who is interested in real-time processing. Whether you're a Storm expert or still new to the framework, this book is for you!
The book presents a couple of problems and how to solve them in Storm. Or, at least, that's the premise.
The whole problem is that each problem is described at a very high level, like "process the logs". While this sounds alright, the book never goes on to explain how the logs are stored, their format and so on. So you get a very high-level solution, and you have no idea how good it could be because you have no idea what form the data takes. The data simply exists, the Storm topology processes it, and that's it.
Another problem is that, because all solutions are written in Java and Java is too damn verbose, instead of showing the whole code, the book resorts to "Go to the file X, use the IDE to automatically add the imports and add this function." You never get a clear picture of what a complete bolt looks like.
The biggest issue with cookbooks is that a lot of them are only a list of examples explaining how to use a specific tool, written because the documentation and the community are not enough in themselves to get started.
As a consequence, natural questions arise. How do you know whether the examples are still valid when the undocumented API might have evolved? Where can additional information be found? Is it even worth considering this tool if there is no real community, given that this means its future is really uncertain?
Storm is not a project that appeared just yesterday. The documentation (GitHub wiki) is enough in itself, and the platform is already used in production by a number of companies (Twitter, Yahoo! or Groupon). Nathan Marz, the creator of Storm, even explained how the Twitter experience made Storm even more solid. And for those interested, the talk is available via InfoQ: http://www.infoq.com/presentations/st....
In short, writing such a cookbook for Storm would not be useful and the author acknowledges it.
The real strength of the book is the author's experience in using Storm as one element of a bigger solution.
This book is about:
* how to use the right level of abstraction (standard Java API or Trident)
* how to save information before and after Storm processing (Kafka, Redis, Cassandra, HDFS)
* how to deploy the cluster automatically (Vagrant, Puppet, VirtualBox, Pallet, AWS)
* how to set up a two-speed architecture (the lambda architecture as described by Nathan Marz)
* how to use machine learning (Storm-Pattern, Storm-R, Trident-ML)
As you can see, it covers lots of use cases. The book is still a cookbook. It won't explain everything. So if you are looking for Storm documentation, it is not a good pick. But if you are looking at using Storm in a project and you need to be able to quickly test and experiment with various solutions, this book is definitely a must-read.
About me
I started using Hadoop at the beginning of 2010 and I am now a Hadoop consultant and trainer for Xebia. Storm is another way to do distributed computing (in a 'real-time' fashion). And for those who know about the lambda architecture, it is obvious that Hadoop and Storm are not alternative solutions but really two complementary tools, each solving a different side of the same problem. That's why I have always been interested in Storm too, and this book was a good way to get up to speed.
Disclaimer
I received a free copy from Packt before writing this review, but that's the only relation between me and the publisher.
The "Storm Real-Time Processing Cookbook" by Quinton Anderson is a comprehensive set of recipes for getting the most out of a Twitter Storm deployment. One thing that really differentiates the author’s recipes is the focus on the enabling technologies that work together with Storm to provide a complete solution. He discusses building a Storm cluster with Puppet, using Kafka and Redis as data sources for Storm, using R for machine learning, Maven for builds, etc. It is tough to get full value out of Storm outside the ecosystem in which it works, and the author makes it possible for someone new to Storm to solve real-world problems.
The coverage is quite comprehensive, and includes bleeding-edge use cases. For instance, the sections on Storm and Hadoop integration, and Real-Time Online Machine Learning were fascinating, and represent the first recipes I've seen for implementing these advanced use cases in Storm.
If there were one section I would add, it would be on operating Storm in production. The author rightly focuses on testing, Continuous Integration and Deployment, but does not mention what options exist for tuning Storm and monitoring its performance in production. An additional chapter on this topic would make it even easier for folks to embrace Storm as a production-ready system.
One other minor criticism I have is that the code examples can be a little tough to read. Since much of the work is in Java, and Java can be quite verbose, syntax highlighting or better justification would go a long way to making the examples more readable.
Overall, this is a great book, and I would highly recommend it to anyone looking to learn Storm.
Recently, I can't shake the feeling that PacktPub goes for quantity instead of quality - the majority of their recent books are just basic tutorials you could find on the web (without paying). It just feels like no one really put in any real work (except for basic editing) and there's not much value added in those books.
But. BUT. This book is NOT like that. It's a very good introduction to a real-time processing solution named Storm - some describe it as the Hadoop of the RTP world. It's not easy or straightforward, as we're talking about some serious distributed computing, so just forget about "Hello World" grade examples. But this book does its job - it's good and precise in its descriptions, and the examples are entertaining and engaging. And you can usually map them to the solutions you'd think about in your own environment (as they are not too general and, at the same time, not too specific).
Cons? Two: 1.) Some may not like mixing Storm with plenty of other technologies that are not crucial in the context of learning Storm -> like Clojure, Drools etc. I found it awesome (even if I haven't done anything in Clojure yet), but hey - it's just me :) 2.) The Continuous Delivery chapter didn't really fit the book. I got the idea, but it seems very loosely attached to the rest of the book.
This book has been an interesting read. It is more about code and practicality than pages of theory. It teaches you all the necessary concepts and features needed to launch a full-fledged project on your own. That being said, this is not a book for beginners. You need to have some basic idea of the various big data technologies out there to get some understanding of using Storm.
Bear in mind that this book also assumes that you have a working knowledge of Java. Even if you don't, you could read the book for the concepts and good practices for building real-world applications, but you must know Java and the tools around it to get the most out of this book.
Understanding and checking out Storm is a must if you're into big data processing in real time, machine learning and the like.
I went through this book. It is quite explanatory and really very descriptive.
I would recommend it for everyone from beginner to expert level.
I have not played with the examples, but the code is still quite self-explanatory and gives a detailed idea of everything from how to get started through to running well in production and monitoring.
This book gives detailed information on the Java API and on saving information in HDFS, Cassandra and other NoSQL systems. It also explains in detail how to deploy in a cluster environment.
About me
I have been working on Hadoop and NoSQL for quite some time, especially on Cloudera and Cassandra NoSQL. For someone who is interested in getting into big data, it's a good way to start.
I received a free copy from Packt before writing this review, but that's the only relation between me and the publisher.
This book is a great way to start with Storm. It describes the steps you need to set up development and deployment environments and, in "recipe" format, guides you through creating basic workflow elements in Storm terms. It is not a comprehensive reference, but this book is not trying to be. Another thing I like about this book is that it's only 250 pages long - the material is not diluted with a bunch of non-essential stuff.
Hi all, my name is Ruchovets Oleg. I am a big data architect. Over the last couple of months we started working with real-time analytics and of course paid attention to Storm. There is a very limited number of books about Storm, and I found the Storm Real-Time Processing Cookbook (http://goo.gl/UzWUt5) very useful, especially the parts related to unit testing. The part related to real-time machine learning is also very promising.