Elasticsearch Full Text Search – Explained With Implementation Example

Elasticsearch Full Text Search (FTS) goes beyond data collection and storage. It lets you analyse data and build powerful search functionality on top of it.

To implement a full-text search, we want a database that not only stores the data efficiently but also analyses it and provides the results quickly no matter the amount of data or the complexity of the query.

In this blog, we will see what Elasticsearch is and how we can use it for full-text search.

What Is Elasticsearch Full Text Search?

Elasticsearch is an open-source, RESTful search engine built on the Apache Lucene library. It is fast and can return the results of complex queries within a fraction of a second. By ranking the most relevant results first, Elasticsearch FTS improves the search experience of your application.

Key Features Of Elasticsearch Full Text Search

There are some key advantages of Elasticsearch full-text search which make it stand out from other search engines:

1. Document-Based

In Elasticsearch full-text search, data is stored as JSON documents, which simplifies data management and querying. Documents can be structured or unstructured, and there is no need to adhere to a fixed schema like in RDBMS.

2. Scalable  

Elasticsearch FTS distributes the documents of an index among several shards. A shard is a self-contained Lucene index. With a cluster of computers, you can spread these smaller parts (shards) across numerous machines. As your capacity requirements grow, you can expand your cluster by adding more machines (nodes) and assigning additional shards to them.
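
To make the idea of spreading documents over shards more concrete, here is a minimal sketch in Python. It hashes a document id and takes the result modulo the number of primary shards, which is the general idea behind document routing. Note the assumptions: `pick_shard` is a hypothetical helper, and `zlib.crc32` is only a stand-in for the hash Elasticsearch really uses, so the shard numbers will not match a real cluster.

```python
import zlib
from collections import Counter


def pick_shard(doc_id: str, num_primary_shards: int) -> int:
    """Map a document id to a shard number (illustrative only)."""
    # Assumption: crc32 stands in for Elasticsearch's own routing hash.
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


# Route 1000 made-up document ids to 2 primary shards and count them.
doc_ids = [f"doc-{i}" for i in range(1000)]
distribution = Counter(pick_shard(doc_id, 2) for doc_id in doc_ids)
print(distribution)
```

Because the shard is derived from the id and the shard count, the same document always lands on the same shard as long as the number of primary shards stays the same.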

3. Real-Time

With Elasticsearch as a full-text search engine, you can execute complex queries extremely fast. It is also near real-time: a newly indexed document typically becomes searchable within about a second (the default index refresh interval).

4. Highly Available

Elasticsearch is a distributed platform. An index is split into shards, and each shard holds a subset of the index's documents. Elasticsearch can create replicas of these shards, which hold exact copies of the documents in the primary shard, so the data stays available even if a node fails.

5. Developer Friendly

Elasticsearch provides RESTful APIs, which makes it easy to use. Client libraries are available for all the major languages and are supported by active open-source communities. It also has a query DSL that makes it easy to write complex queries and tune them precisely.

ELK (Elasticsearch Logstash Kibana) Stack

Before diving into how we implement Elasticsearch for full-text search, let’s look at the ELK (Elasticsearch, Logstash, Kibana) stack. This stack is made up of several components:

1. Elasticsearch

Elasticsearch is the heart of the Elastic stack. It is responsible for storing data and indexing it so that powerful full-text searches can be run against it.

2. Logstash

Logstash is a data processing pipeline. It can receive data from different systems, be it orders, messages, logs, etc. Logstash receives and transforms data, sending it to various systems such as Elasticsearch, Kafka queues, or HTTP endpoints. 

3. Beats

Beats are lightweight data shippers. Their purpose is to send data to Logstash or Elasticsearch. Different Beats are available, such as Filebeat, which collects and ships log files, and Metricbeat, which collects system and service metrics such as memory and CPU usage.

4. Kibana

Kibana is an analytics and visualisation platform that uses the data stored in Elasticsearch for analytics and visualisation. It’s a dashboard through which you can create graphs, charts, etc.     

5. X-Pack 

X-Pack is a package of additional features that can be added to Elasticsearch and Kibana, such as security (authentication and authorisation), monitoring (CPU, disk, memory), alerting, and reporting.

We can use the components of the ELK stack (Elasticsearch, Logstash, and Kibana) individually or combine them in a single application.

Implementation Of Elasticsearch FTS

Setup of Elasticsearch engine:

First Step – Install Elasticsearch on your machine, using the official setup guide. 

Second Step – The default port for Elasticsearch is 9200. So after installation, open http://localhost:9200/ in your browser.

Third Step – Check that you get a JSON response containing information about your cluster.

If you get the JSON, all is well and we can now store some data.
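
The same check can be scripted. The sketch below uses only the Python standard library: `fetch_root` reads the JSON served at the root URL (it needs a running node, so it is not called here), and `summarize` pulls out the cluster name and version. Both function names are hypothetical, and the `sample` string is a made-up, shortened response used only so that the example runs without a cluster.

```python
import json
from urllib.request import urlopen


def fetch_root(base_url: str = "http://localhost:9200/") -> str:
    """Return the raw JSON text served at the cluster root (needs a running node)."""
    with urlopen(base_url, timeout=5) as response:
        return response.read().decode("utf-8")


def summarize(root_json: str) -> str:
    """Extract the cluster name and version number from the root response."""
    info = json.loads(root_json)
    return f"cluster={info['cluster_name']} version={info['version']['number']}"


# Made-up sample values; a real response contains more fields.
sample = (
    '{"name": "node-1", "cluster_name": "elasticsearch", '
    '"version": {"number": "7.17.0"}, "tagline": "You Know, for Search"}'
)
print(summarize(sample))  # prints: cluster=elasticsearch version=7.17.0
```

Against a live node you would call `summarize(fetch_root())` instead of passing the sample.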

Understanding How Data is Stored:

Elasticsearch is a NoSQL database. It stores data in the form of JSON documents that contain different fields. Each field has a data type such as text, float, or boolean. Multiple JSON documents can belong to an index. An index in Elasticsearch is roughly comparable to a database in SQL. In short:

MySQL          => Databases => Tables => Columns/Rows

Elasticsearch  => Indices   => Documents with Properties
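
To illustrate the comparison, here is a small Python sketch that turns relational-style rows into the kind of JSON documents Elasticsearch stores. The column names, the `id` column, and the `row_to_document` helper are assumptions made up for this example.

```python
import json

# Hypothetical table: column names plus two rows.
columns = ("id", "title", "released")
rows = [
    (1, "The Hobbit: An Unexpected Journey", 2012),
    (2, "Journey to the Center of the Earth", 2008),
]


def row_to_document(column_names, row):
    """Pair each column name with its value to form one JSON-style document."""
    return dict(zip(column_names, row))


documents = [row_to_document(columns, row) for row in rows]
for document in documents:
    print(json.dumps(document))
```

Each row becomes one self-describing document, and the column names become the document's properties.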

Create Index:

Let’s create an index and store some documents in it. This can be done with a curl request to the base URL, i.e. http://localhost:9200/, or with Kibana’s Dev Tools console, which is handy for beginners.

> PUT /movies

That’s it. This creates an index named movies with default settings. We can also provide the settings explicitly by specifying them in the body of the request, like this:

> PUT /movies
{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1
  }
}

Here, we told Elasticsearch to create 2 primary shards and one replica for each of them, so there will be 4 shards in total. When the cluster has more than one node, this arrangement of primary and replica shards keeps the data available if a node goes down.
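
The shard arithmetic is easy to check in code. This is a small sketch, with hypothetical helper names, that builds the same settings body and computes the total number of shards as primaries × (1 + replicas).

```python
import json


def index_settings(primaries: int, replicas: int) -> dict:
    """Build the request body for creating an index with explicit settings."""
    return {
        "settings": {
            "number_of_shards": primaries,
            "number_of_replicas": replicas,
        }
    }


def total_shards(primaries: int, replicas: int) -> int:
    """Each primary shard exists once plus once per replica."""
    return primaries * (1 + replicas)


body = index_settings(2, 1)
print("PUT /movies")
print(json.dumps(body, indent=2))
print("total shards:", total_shards(2, 1))  # prints: total shards: 4
```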

Indexing a Document:

A POST request to an index’s _doc endpoint adds a JSON document to that index and assigns it an automatically generated id. Here’s an example that inserts a JSON document into the “movies” index:

> POST movies/_doc/
{
  "title": "The Hobbit: An Unexpected Journey",
  "released": 2012
}

The index operation also creates a dynamic mapping if one does not already exist. By default, the above request would assign a text mapping type (with an additional keyword sub-field) to title and a long mapping type to released.
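
As a rough mental model of dynamic mapping, the Python sketch below guesses a field type from each JSON value. It is a simplified illustration and not Elasticsearch’s actual implementation: the real rules also add a keyword sub-field for strings and can detect dates, and `infer_type` and `infer_mapping` are hypothetical names.

```python
def infer_type(value) -> str:
    """Guess a mapping type from a JSON value (simplified)."""
    # bool must be checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "text"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"no rule for {type(value).__name__}")


def infer_mapping(document: dict) -> dict:
    """Build a mapping-like structure for every field in the document."""
    return {
        "properties": {
            field: {"type": infer_type(value)} for field, value in document.items()
        }
    }


movie = {"title": "The Hobbit: An Unexpected Journey", "released": 2012}
print(infer_mapping(movie))
```

For the movie above, the sketch yields text for title and long for released, the same types described in the paragraph before it.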

We can also provide the mappings explicitly by using the mapping API of Elasticsearch. Here’s how that would look:

> PUT movies/_mapping
{
  "properties": {
    "title": {
      "type": "text"
    },
    "released": {
      "type": "long"
    }
  }
}

Let’s put some more documents in our movies index:

> POST movies/_doc/
{
  "title": "Journey 2: The Mysterious Island",
  "released": 2012
}

> POST movies/_doc/
{
  "title": "Journey to the Center of the Earth",
  "released": 2008
}
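
When there are several documents to add, they can also be sent in one request through the _bulk endpoint, which expects newline-delimited JSON: one action line followed by one document line, with a trailing newline at the end. The sketch below only builds that payload in Python; `bulk_body` is a hypothetical helper, and actually sending the payload (with the Content-Type header application/x-ndjson) is left out.

```python
import json

movies = [
    {"title": "Journey 2: The Mysterious Island", "released": 2012},
    {"title": "Journey to the Center of the Earth", "released": 2008},
]


def bulk_body(index: str, documents: list) -> str:
    """Build a newline-delimited payload: an action line, then the document."""
    lines = []
    for document in documents:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(document))
    # The bulk format requires the payload to end with a newline.
    return "\n".join(lines) + "\n"


payload = bulk_body("movies", movies)
print("POST /_bulk")
print(payload, end="")
```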

There are two ways to perform a full-text search using Elasticsearch:

URI search – URI search is simple. Here you provide the search parameters in the URL.

> GET movies/_search?q=Journey

URI search returns all the documents that contain the search term Journey in any of their fields.
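
If you build such URLs in code, the search term should be URL-encoded. Here is a minimal Python sketch using the standard library; `uri_search_url` and the base URL default are assumptions for illustration. The second call uses the field:term form to restrict the search to the title field.

```python
from urllib.parse import urlencode


def uri_search_url(index: str, query: str, base_url: str = "http://localhost:9200") -> str:
    """Build a URI search URL with a safely encoded q parameter."""
    return f"{base_url}/{index}/_search?{urlencode({'q': query})}"


print(uri_search_url("movies", "Journey"))
# prints: http://localhost:9200/movies/_search?q=Journey
print(uri_search_url("movies", "title:Journey"))
# prints: http://localhost:9200/movies/_search?q=title%3AJourney
```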

Request Body Search – Here you send the query as JSON in the body of the request. A request body search looks like this:

> GET movies/_search
{
  "query": {
    "match": {
      "title": "Unexpected Journey"
    }
  }
}
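
To see why a match query for “Unexpected Journey” can return more than one movie, it helps to know that the query text is split into terms and, by default, a document matches if it contains any of them. The Python sketch below imitates that behaviour on our three titles. It is a simplification under stated assumptions: `analyze` is a crude stand-in for the standard analyzer, `matches` is a hypothetical helper, and no relevance scoring is done.

```python
import re

titles = [
    "The Hobbit: An Unexpected Journey",
    "Journey 2: The Mysterious Island",
    "Journey to the Center of the Earth",
]


def analyze(text: str) -> list:
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches(field_text: str, query: str, operator: str = "or") -> bool:
    """Check whether any (or) or all (and) query terms occur in the field."""
    field_terms = set(analyze(field_text))
    found = [term in field_terms for term in analyze(query)]
    return all(found) if operator == "and" else any(found)


or_hits = [t for t in titles if matches(t, "Unexpected Journey")]
and_hits = [t for t in titles if matches(t, "Unexpected Journey", operator="and")]
print(len(or_hits))  # prints: 3
print(and_hits)      # prints: ['The Hobbit: An Unexpected Journey']
```

All three titles contain “journey”, so all three match under the default behaviour; only the first contains both terms, which is also why it is the most relevant result.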

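The output of the above query would look like this (the sample was captured from an index named blog_test, which is why _index does not read movies):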

{
  "took" : 599,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.1707046,
    "hits" : [
      {
        "_index" : "blog_test",
        "_type" : "_doc",
        "_id" : "GfVwKGwBq9qdXZAqo8PX",
        "_score" : 1.1707046,
        "_source" : {
          "title" : "The Hobbit: An Unexpected Journey",
          "released" : 2012
        }
      },
      {
        "_index" : "blog_test",
        "_type" : "_doc",
        "_id" : "GvVyKGwBq9qdXZAq1cOd",
        "_score" : 0.14028297,
        "_source" : {
          "title" : "Journey 2: The Mysterious Island",
          "released" : 2012
        }
      },
      {
        "_index" : "blog_test",
        "_type" : "_doc",
        "_id" : "GPVoKGwBq9qdXZAqA8Pv",
        "_score" : 0.12180667,
        "_source" : {
          "title" : "Journey to the Center of the Earth",
          "released" : 2008
        }
      }
    ]
  }
}

Notice that Elasticsearch has assigned a score (_score) to each result. The higher the score, the more relevant the document is to the search query. By default, Elasticsearch returns the top 10 documents. You can use the from and size parameters to define the starting offset and the number of records to return, respectively. You can also sort and filter the results further using different parameters in the search query.
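
Here is a small Python sketch of how from, size and sort fit into a request body. The `paged_search` helper and its page-number convention (pages start at 1) are assumptions made for this example; the body it builds can be sent with GET movies/_search.

```python
import json


def paged_search(field: str, text: str, page: int, page_size: int = 10, sort=None) -> dict:
    """Build a match query body for the given 1-based page."""
    body = {
        "from": (page - 1) * page_size,  # number of hits to skip
        "size": page_size,               # number of hits to return
        "query": {"match": {field: text}},
    }
    if sort is not None:
        body["sort"] = sort
    return body


# Second page of two results each, newest release first, then by relevance.
body = paged_search(
    "title", "Journey", page=2, page_size=2,
    sort=[{"released": "desc"}, "_score"],
)
print(json.dumps(body, indent=2))
```

With a page size of 2, page 2 starts at offset 2, so this request would skip the first two hits and return the next two.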

Elasticsearch Full-text Search Example

Let’s say we have an Elasticsearch index called “books” with the following mapping:

PUT /books
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text"
      },
      "author": {
        "type": "text"
      },
      "description": {
        "type": "text"
      }
    }
  }
}

Now, let’s index some example documents:

POST /books/_doc/1
{
  "title": "The Catcher in the Rye",
  "author": "J.D. Salinger",
  "description": "A story about a teenage boy named Holden Caulfield"
}

POST /books/_doc/2
{
  "title": "To Kill a Mockingbird",
  "author": "Harper Lee",
  "description": "A novel set in the American South during the Great Depression"
}

POST /books/_doc/3
{
  "title": "1984",
  "author": "George Orwell",
  "description": "A dystopian novel set in a totalitarian society"
}

You can use the match query in Elasticsearch to perform a full-text search. For example, let’s search for books that contain the word “novel” in the description:

GET /books/_search
{
  "query": {
    "match": {
      "description": "novel"
    }
  }
}

This query returns all documents whose description contains the word “novel”. In our case, it returns “To Kill a Mockingbird” and “1984”. None of the titles contains the word, so the same query against the title field would return no hits.

You can also search across multiple fields. Let’s search the title and description fields for the terms “teenage” and “boy”:

GET /books/_search
{
  "query": {
    "multi_match": {
      "query": "teenage boy",
      "fields": ["title", "description"]
    }
  }
}

This query returns the document with the title “The Catcher in the Rye”, because its description contains both “teenage” and “boy”. By default, a document matches if any of the terms appears in any of the listed fields.
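
A multi_match query can be tuned further, for example by boosting one field with the field^boost syntax or by requiring all terms with the operator parameter. The Python sketch below only builds such a request body; the `multi_match` helper and the boost value of 2 are assumptions chosen for illustration.

```python
import json


def multi_match(text: str, fields: list, operator: str = "or") -> dict:
    """Build a multi_match request body for the given fields."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": list(fields),
                "operator": operator,
            }
        }
    }


# Matches in the title count double; all terms must be present.
body = multi_match("teenage boy", ["title^2", "description"], operator="and")
print(json.dumps(body, indent=2))
```

The printed body can be sent with GET /books/_search, just like the query above.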

That’s a basic Elasticsearch full-text search example. You can further customise the search queries and utilise various features provided by Elasticsearch to suit your specific requirements.

Key Takeaways:

In this article, we have seen what Elasticsearch is, how the data is indexed and how you can incorporate Elasticsearch for a full-text search in your project. Using Elasticsearch helps you fine-tune your queries and covers complex scenarios based on your business needs and objectives. 

This blog covers a fundamental Elasticsearch full-text search example and the basics you need to index and search your data efficiently. If you have any queries related to Elasticsearch FTS data modelling and query optimisation, our team of experts is here to help. Connect with us now!

Shaonika Saha

April 2, 2020
