How does Elasticsearch work? And why should you use it? Let's take a tour of this notable search engine and explain how you can begin to use it to transform your daily activities and improve data clarity.
If you're human, surely you've needed to search for something (or someone...) on the world wide web, right? Maybe you were looking for documentation, an article, or even a code snippet on Stack Overflow to fix that bug of yours.
Now, remember the frustrating feeling of simply not finding what you're looking for? This is the problem Elasticsearch sets out to solve: search!
To explain Elasticsearch in simple terms, we can say that it will help us find what we are looking for.
Image courtesy of CodeChain.
Elasticsearch is a distributed data store that searches and analyzes different types of data in near real time. It is an open source search engine built on top of Apache Lucene. The move from using Apache Lucene directly to using Elasticsearch on top of it reflects a broader trend in software engineering: tools evolve to shorten the learning curve. Simplicity and productivity are best friends.
Elasticsearch supports many programming languages, including:
Elasticsearch handles large volumes of data without degrading performance. Because it exposes a REST API, it can be used from any system, regardless of platform. Furthermore, Elasticsearch is highly scalable: it can grow from a single server to many servers working together.
Elasticsearch performs inverted index searches, which work as follows:
This indexing process is what makes Elasticsearch a near-real-time search engine.
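To make the inverted index idea concrete, here is a minimal sketch in plain Python. It is not how Lucene or Elasticsearch implement it internally; the function names (`tokenize`, `build_inverted_index`, `search`) and the sample documents are invented for illustration. The point is only that the index maps each term to the documents containing it, so a search looks up terms instead of scanning every document.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(documents):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    # Intersect the document sets of all query terms.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents, keyed by document ID.
documents = {
    1: "Elasticsearch is a search engine",
    2: "Lucene is a search library",
    3: "Kibana visualizes data",
}
index = build_inverted_index(documents)
print(sorted(search(index, "search engine")))  # prints [1]
```

The real engine adds much more on top (text analysis, relevance scoring, compressed on-disk structures), but the term-to-documents lookup is the core idea.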
It's relevant to note that Elasticsearch is not a business intelligence tool, but you can perform queries by aggregating data to generate graphs and thereby get some important insights for your business.
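As a rough sketch of what such an aggregation query can look like, the snippet below builds a request body for a `terms` aggregation, which counts documents per distinct value of a field. The index name `orders` and the field `category` are hypothetical, and the field is assumed to be mapped as a keyword; the helper names are made up for this example.

```python
import json
import urllib.request


def build_terms_aggregation(field, size=5):
    """Build a search body that counts documents per value of `field`."""
    return {
        "size": 0,  # return only aggregation results, no individual hits
        "aggs": {
            "by_" + field: {
                "terms": {"field": field, "size": size}
            }
        },
    }


def run_search(base_url, index_name, body):
    """POST the body to the _search endpoint and return the parsed JSON."""
    request = urllib.request.Request(
        f"{base_url}/{index_name}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


body = build_terms_aggregation("category")
print(json.dumps(body, indent=2))

# With a running node and a hypothetical "orders" index:
# result = run_search("http://localhost:9200", "orders", body)
# buckets = result["aggregations"]["by_category"]["buckets"]
```

Each bucket in the response pairs a field value with a document count, which is the kind of data a chart in Kibana is built from.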
One of the biggest benefits of using Elasticsearch is that it is highly scalable and highly available thanks to its structure of nodes and clusters. For reference: a cluster is a group of connected node instances across which tasks, searches, and indexing are distributed. The nodes in an Elasticsearch cluster can be assigned different jobs or responsibilities, as illustrated below:
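If you have a node running, one way to see the cluster structure for yourself is the `_cluster/health` endpoint of the REST API. The sketch below assumes a node on `http://localhost:9200` without authentication; the helper names are invented for this example.

```python
import json
import urllib.request


def cluster_health_url(base_url):
    """Build the URL of the cluster health endpoint."""
    return base_url.rstrip("/") + "/_cluster/health"


def describe_health(health):
    """Turn a cluster health response into a one-line summary."""
    return "{name}: status={status}, nodes={nodes}".format(
        name=health.get("cluster_name", "?"),
        status=health.get("status", "?"),
        nodes=health.get("number_of_nodes", "?"),
    )


def fetch_cluster_health(base_url="http://localhost:9200"):
    """GET the cluster health document and return it as a dict."""
    with urllib.request.urlopen(cluster_health_url(base_url)) as response:
        return json.load(response)


# With a running node (assumed to listen on localhost:9200):
# print(describe_health(fetch_cluster_health()))
```

The `status` field is `green`, `yellow`, or `red`, and `number_of_nodes` shows how many nodes have joined the cluster.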
Beyond the high scalability and availability enabled by this structure, Elasticsearch is remarkable because:
Kibana
Kibana is an Elasticsearch data analysis and visualization tool that works as a dashboard for data presentation. It provides options for building queries and presenting results. Kibana also provides some options for managing Elasticsearch, such as authentication and security.
Logstash
As its name suggests, Logstash was originally created to process log records and send them to Elasticsearch, but today it has evolved into a more complete tool. It is now used to synchronize data from different sources, such as a MySQL database, with Elasticsearch. Logstash also performs data enrichment, such as looking up geographic data from an IP address.
Beats
Beats is a collection of agents that can be installed in specific places to send information to Elasticsearch. For example, you can have a Windows agent that sends data related to the operating system (such as memory consumption, CPU usage, logs, and various other information), and the same can be done for Linux and for services such as WildFly or NGINX.
There are many ways to install Elasticsearch, including through Docker, but for now we will focus on the standard approach: downloading the installer for your operating system from the Elasticsearch website.
Please note: to install Elasticsearch, you must already have Java 8 installed on your computer. If you are not familiar with Java, be aware that you need the JRE (the runtime that runs Java applications), which you may already have on your machine. You can learn more about these downloads here.
In the image above, you can see that we're using version 6.4.2 of Elasticsearch for Windows.
There is no need to change directories.
Image courtesy of DevMedia.
Select "Install as a service."
Image courtesy of DevMedia.
Here you have access to advanced settings.
At this point, you can install additional plugins if that is of interest to you.
And voilà, this is our final installation screen!
When we call Elasticsearch's own REST API on the default port 9200, we get a response like this one (version 6.4):
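If you would rather check this from a script than from the browser, the sketch below requests the root endpoint and prints a short summary. It assumes the node listens on `http://localhost:9200` without authentication; the helper names are made up for this example, and the field names (`name`, `cluster_name`, `version.number`, `tagline`) are the ones the root endpoint returns.

```python
import json
import urllib.error
import urllib.request


def summarize_node_info(info):
    """Pick the most useful fields out of the root endpoint response."""
    version = info.get("version", {})
    return "node={name} cluster={cluster} version={number}".format(
        name=info.get("name", "?"),
        cluster=info.get("cluster_name", "?"),
        number=version.get("number", "?"),
    )


def fetch_node_info(base_url="http://localhost:9200"):
    """GET the root endpoint; return the parsed JSON, or None if unreachable."""
    try:
        with urllib.request.urlopen(base_url, timeout=5) as response:
            return json.load(response)
    except urllib.error.URLError:
        # The node is probably not running or not listening on this address.
        return None


# With the service installed and started:
# info = fetch_node_info()
# print(summarize_node_info(info) if info else "Elasticsearch is not reachable")
```

If the call returns `None`, check that the Windows service is started before looking for other causes.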
To install Kibana, follow the steps outlined on the Elasticsearch site and extract the zip file. Note that the default port for Kibana is 5601. After installation, just run kibana.bat and open localhost:5601 in your browser to see this:
You can load sample data to learn more about Kibana and understand its features, but we won't cover that in this post.
In today's post, we gave a quick introduction to the objectives, benefits, inner workings, and ecosystem of Elasticsearch, and we now have Elasticsearch installed on our operating system.
If you have questions or just want to share your favorite thing about Elasticsearch, please feel free to engage in this discussion by adding your comment below!