Elasticsearch for Hadoop

Integrate Elasticsearch into Hadoop to effectively visualize and analyze your data

Vishal Shukla

Book Details

ISBN 13: 9781785288999
Paperback: 222 pages

Book Description

The Hadoop ecosystem is a de facto standard for processing terabytes and petabytes of data. Lucene-based Elasticsearch is becoming an industry standard for its full-text search and aggregation capabilities. Elasticsearch-Hadoop bridges Elasticsearch and the Hadoop ecosystem so that you get the best of both worlds. Combined with Kibana, this stack makes it easy to draw insights quickly from the massive amounts of data in your Hadoop ecosystem.

In this book, you'll learn to use Elasticsearch, Kibana and Elasticsearch-Hadoop effectively to analyze and understand your HDFS and streaming data.

You begin with an in-depth look at setting up Hadoop, Elasticsearch, Marvel, and Kibana. You will then learn to import Hadoop data into Elasticsearch by writing a MapReduce job in a real-world example. This is followed by a comprehensive look at Elasticsearch essentials, such as full-text search analysis, queries, filters, and aggregations, after which you will learn to create various visualizations and interactive dashboards using Kibana. The book also covers classifying real-world streaming data and identifying trends in it using Storm and Elasticsearch. You will gain insight into the key concepts of Elasticsearch and Elasticsearch-Hadoop in distributed mode, along with advanced configurations and some common configuration presets you may need for production deployments. You will also get a “Go production checklist” and a high-level view of cluster administration for post-production. Towards the end, you will learn to integrate Elasticsearch with other Hadoop ecosystem tools, such as Pig, Hive, and Spark.

Table of Contents

Chapter 1: Setting Up Environment
Setting up Hadoop for Elasticsearch
Setting up Elasticsearch
Running the WordCount example
Exploring data in Head and Marvel
Summary
Chapter 2: Getting Started with ES-Hadoop
Understanding the WordCount program
Going real — network monitoring data
Writing the NetworkLogsMapper job
Getting data from Elasticsearch to HDFS
Summary
Chapter 3: Understanding Elasticsearch
Knowing Search and Elasticsearch
Talking to Elasticsearch
Controlling the indexing process
Elastic searching
Aggregations
Summary
Chapter 4: Visualizing Big Data Using Kibana
Setting up and getting started
Discovering data
Summary
Chapter 5: Real-Time Analytics
Getting started with the Twitter Trend Analyser
Injecting streaming data into Storm
Analyzing trends
Classifying tweets using percolators
Summary
Chapter 6: ES-Hadoop in Production
Elasticsearch in a distributed environment
The ES-Hadoop architecture
Configuring the environment for production
Administration of clusters
Summary
Chapter 7: Integrating with the Hadoop Ecosystem
Pigging out Elasticsearch
SQLizing Elasticsearch with Hive
Cascading with Elasticsearch
Giving Spark to Elasticsearch
ES-Hadoop on YARN
Summary

What You Will Learn

  • Set up the Elasticsearch-Hadoop environment
  • Import HDFS data into Elasticsearch with MapReduce jobs
  • Perform full-text search and aggregations efficiently using Elasticsearch
  • Visualize data and create interactive dashboards using Kibana
  • Check and detect anomalies in streaming data using Storm and Elasticsearch
  • Inject and classify real-time streaming data into Elasticsearch
  • Get production-ready for Elasticsearch-Hadoop based projects
  • Integrate with Hadoop ecosystem tools such as Pig, Storm, Hive, and Spark
