
Programming Hive: Data Warehouse and Query Language for Hadoop

About the Book: Programming Hive: Data Warehouse and Query Language for Hadoop

Need to move a relational database application to Hadoop? This comprehensive guide introduces you to Apache Hive, Hadoop's data warehouse infrastructure. You'll quickly learn how to use Hive's SQL dialect, HiveQL, to summarize, query, and analyze large datasets stored in Hadoop's distributed filesystem.

This example-driven guide shows you how to set up and configure Hive in your environment, provides a detailed overview of Hadoop and MapReduce, and demonstrates how Hive works within the Hadoop ecosystem. You'll also find real-world case studies that describe how companies have used Hive to solve unique problems involving petabytes of data.

Use Hive to create, alter, and drop databases, tables, views, functions, and indexes
Customize data formats and storage options, from files to external databases
Load and extract data from tables, and use queries, grouping, filtering, joining, and other conventional query methods
Gain best practices for creating user-defined functions (UDFs)
Learn Hive patterns you should use and anti-patterns you should avoid
Integrate Hive with other data processing programs
Use storage handlers for NoSQL databases and other datastores
Learn the pros and cons of running Hive on Amazon's Elastic MapReduce

Contents
Chapter 1 Introduction
Chapter 2 Getting Started
Chapter 3 Data Types and File Formats
Chapter 4 HiveQL: Data Definition
Chapter 5 HiveQL: Data Manipulation
Chapter 6 HiveQL: Queries
Chapter 7 HiveQL: Views
Chapter 8 HiveQL: Indexes
Chapter 9 Schema Design
Chapter 10 Tuning
Chapter 11 Other File Formats and Compression
Chapter 12 Developing
Chapter 13 Functions
Chapter 14 Streaming
Chapter 15 Customizing Hive File and Record Formats
Chapter 16 Hive Thrift Service
Chapter 17 Storage Handlers and NoSQL
Chapter 18 Security
Chapter 19 Locking
Chapter 20 Hive Integration with Oozie
Chapter 21 Hive and Amazon Web Services (AWS)
Chapter 22 HCatalog
Chapter 23 Case Studies
Glossary
Appendix
References
Colop

368 pages, Paperback

First published January 1, 2012

39 people are currently reading
100 people want to read

About the author

Edward Capriolo

3 books, 1 follower

Ratings & Reviews


Community Reviews

5 stars: 16 (17%)
4 stars: 41 (44%)
3 stars: 28 (30%)
2 stars: 6 (6%)
1 star: 2 (2%)
Displaying 1 - 8 of 8 reviews
2 reviews
December 1, 2021
This is a rushed book with a bunch of useless information and copy-paste from the Hive wiki.
Cripplingly outdated.
I'm still trying to finish it though.
Rick
22 reviews
January 11, 2013
This could have been a much better book had it not been for the apparent haste with which O'Reilly rushed it out the door before (really) doing a final edit. The book is riddled with typographical errors, my favorite being the "dangling" second paragraph of Chapter 17, "Storage Handlers and NoSQL", which ends with: "For example, a Hive query could be run that selects a data table that is backed by sequence files, however it could output" (no kidding).

The overall content is worthwhile, but you have been forewarned, it's not as well edited as other books from O'Reilly. Three stars, solely by content.
6 reviews
August 23, 2016
Really good book to get into Hive and dive deeper. The installation is somewhat outdated but mind you, this book is a few years old. And I'm on a Mac, which I think is still not officially supported. Trying to build something with Hive is filled with uncertainty as I am never 100% sure if it fails because I'm not on Linux or because my queries are wrong.
But still, great book to get into Hive. Can't wait for the second edition coming out early 2017.
Karl
221 reviews, 26 followers
October 16, 2014
Maybe 2.5 stars? Not as clear as other O'Reilly texts, and with a ton of mistakes, both in text and code snippets. Clearly a rush job. Still, it'll get you going in terms of being a *user* of Hive. If you want to be an administrator, I'd look to other sources - and make sure you have a solid Java background.