Hadoop Beginner's Guide Free Download

Author: Garry Turkington
ISBN: 1849517304
New from $44.99
Format: PDF

About the Author

Garry Turkington

Garry Turkington has 14 years of industry experience, most of which has been focused on the design and implementation of large-scale distributed systems. In his current roles as VP Data Engineering at Improve Digital and the company’s lead architect, he is primarily responsible for the realization of systems that store, process, and extract value from the company's large data volumes. Before joining Improve Digital, he spent time at Amazon UK, where he led several software development teams building systems that process the Amazon catalog data for every item worldwide. Prior to this, he spent a decade in various government positions in both the UK and the USA.

He has BSc and PhD degrees in computer science from the Queen's University of Belfast in Northern Ireland and an MEng in Systems Engineering from Stevens Institute of Technology in the USA.

Product Details

Paperback: 398 pages
Publisher: Packt Publishing (February 22, 2013)
Language: English
ISBN-10: 1849517304
ISBN-13: 978-1849517300
Product Dimensions: 9.1 x 7.5 x 1.2 inches
Shipping Weight: 1.8 pounds

'Hadoop Beginner's Guide' by Garry Turkington is a book that walks beginners through understanding Hadoop and how to go about using it.

The first two chapters are introductory and cover what Hadoop is and how to install it. The second chapter also walks you through writing a couple of basic MapReduce jobs.

Chapters 3, 4 and 5 take you deeper into MapReduce. They start out with small, simple code and then move on to more advanced topics such as joining different data sets.

Chapters 6 and 7 are on the administrative side and help you understand what to do when things start breaking and how to keep things running smoothly.

Chapters 8, 9 and 10 take you on a tour of some of the tools in the Hadoop ecosystem. Hive (chapter 8) and Sqoop (chapter 9) are tools that you will find yourself working with if you're more of a relational person or you need to connect Hadoop to your RDBMS. Flume (chapter 10) is a way of moving log data from remote servers into HDFS (among other things).

The last chapter talks about various things such as the different vendor distributions out there, other tools in the ecosystem and where you can find more information to continue on your journey.

As others have stated, there are some errors in the book's code. These are easily overcome by looking at the errata or doing a quick Google search. Sure, it would be great if the book were perfect, but then I don't think I've ever read a book that didn't have errors in it. Besides, trying to fix the errors can sometimes teach you more than you would have learned if everything was just copied and pasted.

I read this book "out of context", meaning that I didn't have an interesting problem solvable by MapReduce at hand and a dire need to learn Hadoop at the time of reading. Instead, I took time to read this book with the purpose of determining whether it's a good beginner book or not. All in all, I'd say that it is. The author really succeeds in creating a context for Hadoop and its ecosystem.

From the second chapter onwards, Hadoop is gradually introduced using very detailed instructions. The general format is to list every single command the user needs to type along with its output, so the book is full of terminal session listings. All such listings are followed by sections called "What just happened?" that explain in detail the purpose of the commands and their output. This is quite helpful for readers who cannot tell what is happening from the session listing alone; readers who can may safely skip these sections.

The above approach should enable any reader, regardless of level of experience, to follow along and do the exercises or labs, which is a good thing for a beginner book. I have a remark about this though: the session dumps could have been proofread better! I can't say that I read them through a magnifying glass, but still I found quite a few errors.

As for the contents, the book never shows the monster! In my opinion, the introductory chapter fails to actually establish a case for Hadoop and MapReduce. Yes, it's about big data, scaling and problems and so on, but I couldn't find a logical transition to Hadoop as a solution to these problems. Instead, chapter two illustrates the framework with a distributed calculation of pi and the word counting program (Hadoop's version of the "Hello world" program).
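For readers who have not seen it, the word counting job boils down to a map step that emits a (word, 1) pair for every word and a reduce step that sums the pairs for each word. Below is a minimal single-process sketch of that idea in Python. It is not the book's code and does not use Hadoop's API; the function names (`map_words`, `shuffle`, `reduce_counts`, `word_count`) are made up for illustration, and the grouping that Hadoop performs between the two steps is only simulated here with a dictionary.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key; stands in for what the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(word, counts):
    """Reduce step: sum all the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_words(line)]
    return dict(reduce_counts(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["hello hadoop", "hello world"]))
```

In a real cluster the map and reduce functions would run on many machines over data stored in HDFS; the sketch only shows the shape of the computation.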
