Understand, design, build, and optimize your big data search engine with Hadoop and Apache Solr
About This Book
- Explore different approaches to making Solr work on big data ecosystems besides Apache Hadoop
- Improve search performance while working with big data
- A practical guide that covers interesting, real-life use cases for big data search along with sample code
Who This Book Is For
This book is aimed at developers, designers, and architects who would like to build big data enterprise search solutions for their customers or organizations. No prior knowledge of Apache Hadoop and Apache Solr/Lucene technologies is required.
What You Will Learn
- Understand Apache Hadoop, its ecosystem, and Apache Solr
- Explore industry-based architectures for big data enterprise search, along with their applicability and benefits
- Integrate Apache Solr with big data technologies such as Cassandra to enable better scalability and high availability for big data
- Optimize the performance of your big data search platform as your data scales
- Write MapReduce tasks to index your data
- Configure your Hadoop instance to handle real-world big data problems
- Work with Hadoop and Solr using real-world examples to benefit from their practical usage
- Use Apache Solr as a NoSQL database
In Detail
Together, Apache Hadoop and Apache Solr help organizations resolve the problem of information extraction from big data by providing excellent distributed faceted search capabilities.
This book will help you learn everything you need to know to build a distributed enterprise search platform and to optimize it so that it makes the most of the available resources. Starting with the basics of Apache Hadoop and Solr, the book moves on to advanced search optimization topics, with some interesting real-world use cases and sample Java code.
This is a step-by-step guide that will teach you how to build a high-performance enterprise search that scales with your data, using Hadoop and Solr.
Hrishikesh Vijay Karambelkar
Hrishikesh Vijay Karambelkar is an enterprise architect with more than 14 years of combined technical and entrepreneurial experience. His core expertise spans multiple subjects, including big data, enterprise search, semantic web, link data analysis, and analytics; he also enjoys architecting solutions for the next generation of product development for IT organizations. He spends most of his time at work solving challenging problems faced by the software industry. Currently, he is working as the Director of Data Capabilities at The Digital Group. In the past, Hrishikesh has worked in the domain of graph databases; some of his work has been published at international conferences such as VLDB and ICDE. He has also written Scaling Apache Solr, published by Packt Publishing. He enjoys travelling, trekking, and taking pictures of birds living in the dense forests of India. He can be reached at http://hrishikesh.karambelkar.co.in/.