This book is a practical guide to using the Apache Hadoop projects, including MapReduce, HDFS, Apache Hive, Apache HBase, Apache Kafka, Apache Mahout and Apache Solr. From setting up the environment to running sample applications, each chapter is a practical tutorial on using an Apache Hadoop ecosystem project. While several books on Apache Hadoop are available, most are based on the main projects, MapReduce and HDFS, and none discusses the other Apache Hadoop ecosystem projects and how they all work together as a cohesive big data development platform.

What you'll learn
How to set up the environment in Linux for Hadoop projects using the Cloudera Hadoop distribution CDH 5
How to run a MapReduce job
How to store data with Apache Hive and Apache HBase
How to index data in HDFS with Apache Solr
How to develop a Kafka messaging system
How to develop a Mahout user recommender system
How to stream logs to HDFS with Apache Flume
How to transfer data from a MySQL database to Hive, HDFS and HBase with Sqoop
How to create a Hive table over Apache Solr
A unique, more in-depth practical book on Hadoop's ecosystem.
Hadoop and big data are important topics for today's programmers, developers and database admins.
In-depth book covering topics that are not covered elsewhere, and how they all work together
Provides practical examples
Presents one of the two most popular big data frameworks, Hadoop
Deepak Vohra
Hadoop framework big data cloud database HBase Apache Hadoop Apache HBase Apache open source