MongoDB Atlas Search: Super Simple, Very Powerful

Search is a core feature of modern web apps, and the results need to be relevant to what the user asked for. Building that kind of search is hard, and none of the databases I had used supported it “out of the box” until I found **MongoDB Atlas Search**. It’s really easy to use and it sits on top of my data, so I didn’t have to build a separate pipeline as I would with other search services.

I was trying to build a search engine for Hadith (wiki) called Ask Hadith. I used MongoDB because it’s fast to build with and I expected a lot of searches. But then I got stuck on the search itself. I used the [$text](docs.mongodb.com/manual/reference/operator/..) operator for full-text searching and it was OK, but it lacked complex queries, phrases, highlights and much more. Still, I used it for the first phase of Ask Hadith. I posted on Reddit and Hacker News and found that people were interested in such a search engine, so I started to improve it, both in code quality and in search. In the meantime, Marcus Eagan opened an issue in the GitHub repo and suggested that I have a look at Atlas Search. I took a look and found it really easy to use, with support for phrases, fuzzy searching, highlighting, complex queries, analyzers and much more! I was impressed, tried it out and ended up using it as my search engine for now. I will share my journey, the benefits of Atlas Search, my queries and also some extra queries you can use in your web app.

Starting with my journey, I first had to open an account on MongoDB Atlas. It is easy to get started and it offers a generous free account. Here is the link: https://www.mongodb.com/cloud/atlas.

Transfer data to Atlas

Previously I had my data on Mlab, so I had to transfer it to Atlas. To do that I used the mongoexport and mongoimport commands, which is really easy. You can download the database tools from here: https://www.mongodb.com/try/download/database-tools?tck=docs_databasetools. Then export the collection from the old host and import it into Atlas:

➜ mongoexport -h HOST:PORT -d DATABASE -c COLLECTION -u username -p password -o hadith.json

➜ mongoimport -h HOST:PORT -d DATABASE -c COLLECTION -u username -p password --file hadith.json

Create a search index

Then I needed to create a search index before I could start searching. I used the default index definition to get going quickly.

{
  "mappings": {
    "dynamic": true
  }
}

But I could have used this static index instead to save space:

{
  "mappings": {
    "dynamic": false,
    "fields": {
      "body_en": {
        "type": "string",
        "analyzer": "lucene.standard"
      },
      "chapter_en": {
        "type": "string",
        "analyzer": "lucene.standard"
      }
    }
  }
}

This uses the Standard Analyzer, which is language-neutral. I could also use the Keyword Analyzer for Hadith numbers, so people can search directly for a specific Hadith rather than doing a text search. But I stayed with the default index for now and will use these analyzers in the next phase. You can learn more about them here.
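As a rough sketch of what that next-phase index could look like, the static mapping can be generated in Python and then pasted into the Atlas index editor. The helper name `build_index_definition` and the choice of `hadith_no` as the keyword field are my assumptions for illustration, and the sketch also assumes `hadith_no` is stored as a string. `lucene.standard` and `lucene.keyword` are built-in analyzer names.

```python
import json


def build_index_definition(text_fields, keyword_fields):
    """Build a static Atlas Search index mapping as a plain dict.

    Hypothetical helper: text fields get the standard analyzer,
    keyword fields are indexed as a single exact token.
    """
    fields = {}
    for name in text_fields:
        fields[name] = {"type": "string", "analyzer": "lucene.standard"}
    for name in keyword_fields:
        fields[name] = {"type": "string", "analyzer": "lucene.keyword"}
    # "dynamic": False means only the fields listed here are indexed.
    return {"mappings": {"dynamic": False, "fields": fields}}


index_definition = build_index_definition(
    text_fields=["body_en", "chapter_en"],
    keyword_fields=["hadith_no"],  # assumption: stored as a string
)
print(json.dumps(index_definition, indent=2))
```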

Full text search in Atlas

Now let’s start with my queries. Using Python, I initially wrote this full text search query—

db[collection].aggregate(
    [
        {
            "$search": {
                "text": {
                    "query": term,
                    "path": ["body_en", "chapter_en"],
                    "fuzzy": {"maxEdits": 1, "maxExpansions": 10},
                },
                "highlight": {"path": ["body_en", "chapter_en"]},
            }
        },
        {"$limit": 50},
        {
            "$project": {
                "_id": 0,
                "collection_id": 1,
                "collection": 1,
                "hadith_no": 1,
                "book_no": 1,
                "book_en": 1,
                "chapter_no": 1,
                "chapter_en": 1,
                "narrator_en": 1,
                "body_en": 1,
                "book_ref_no": 1,
                "hadith_grade": 1,
                "score": {"$meta": "searchScore"},
                "highlights": {"$meta": "searchHighlights"},
            }
        },
    ]
)

This is an aggregation pipeline in MongoDB. We can also put the above list into a variable aggregate_query and get the result like this: result = list(db[collection].aggregate(aggregate_query))

The first stage is $search, and it holds the [text](docs.atlas.mongodb.com/reference/atlas-sear..) operator and the [highlight](docs.atlas.mongodb.com/reference/atlas-sear..) option. text is the full-text search operator in Atlas Search and it uses the index we created earlier. The query field holds the search term; if there are multiple terms, it searches for each one and produces a matching score, which I expose in the $project stage. The results are already sorted by that score, by the way. The path field lists the fields to search in.

And it also offers fuzzy searching, just by adding fuzzy to the text operator. maxEdits inside fuzzy is the maximum number of single-character edits allowed, and maxExpansions is the maximum number of variations to generate. With my settings, for the word cat it will look for cot, eat, can, fat, at, etc., up to 10 variations, and you can see that each one differs by a single character edit.
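To make the two fuzzy settings a bit more concrete, here is a small Python sketch of the idea, not of how Atlas Search works internally. It uses plain Levenshtein distance (Lucene also counts a swap of two adjacent characters as one edit), and the vocabulary list and function names are made up for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: inserts, deletes and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def fuzzy_candidates(term, vocabulary, max_edits=1, max_expansions=10):
    """Keep words within max_edits of term, capped at max_expansions."""
    matches = [w for w in vocabulary if edit_distance(term, w) <= max_edits]
    return matches[:max_expansions]


# Made-up vocabulary standing in for the terms in the index.
vocabulary = ["cot", "eat", "can", "fat", "at", "dog", "coat", "cart", "boat"]
print(fuzzy_candidates("cat", vocabulary))
```

Raising `max_edits` to 2 would also let in words like boat, which is why a small value keeps the results closer to what the user typed.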

The highlight option is great too. It returns an array of elements containing the words that matched, along with the surrounding words for context. In my case, it produced this result:

This is awesome because everything is in my database!
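For readers who want to display the highlights, here is a minimal sketch of turning one searchHighlights element into HTML. Each element carries a `path`, a `score` and a `texts` list whose pieces are typed `hit` or `text`. The function name, the `<mark>` tags and the sample document below are my own assumptions, not output from Ask Hadith.

```python
from html import escape


def render_highlight(highlight, open_tag="<mark>", close_tag="</mark>"):
    """Join the pieces of one highlight, wrapping the matched ones."""
    parts = []
    for piece in highlight["texts"]:
        value = escape(piece["value"])  # avoid injecting raw HTML
        if piece["type"] == "hit":
            parts.append(f"{open_tag}{value}{close_tag}")
        else:
            parts.append(value)
    return "".join(parts)


# Made-up sample in the shape of a searchHighlights element.
sample_highlight = {
    "path": "body_en",
    "score": 1.5,
    "texts": [
        {"value": "There was an outbreak of ", "type": "text"},
        {"value": "plague", "type": "hit"},
        {"value": " in the city.", "type": "text"},
    ],
}
print(render_highlight(sample_highlight))
```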

Phrase search in Atlas

Phrase search is also important for a Hadith search engine. So I tried the phrase search —

db[collection].aggregate(
    [
        {
            "$search": {
                "phrase": {
                    "query": term,
                    "path": "body_en",
                    "slop": 2,
                },
                "highlight": {"path": "body_en"},
            }
        },
        {"$limit": 50},
        {
            "$project": {
                "_id": 0,
                "collection_id": 1,
                "collection": 1,
                "hadith_no": 1,
                "book_no": 1,
                "book_en": 1,
                "chapter_no": 1,
                "chapter_en": 1,
                "narrator_en": 1,
                "body_en": 1,
                "book_ref_no": 1,
                "hadith_grade": 1,
                "score": {"$meta": "searchScore"},
                "highlights": {"$meta": "searchHighlights"},
            }
        },
    ]
)

The implementation is again really simple. slop is the new field here: it is the allowable distance between the words given in the query field. So slop: 2 here means there can be at most two words between the words I put in the query. As always, exact matches score higher. This is the result in my case:

Notice that in the second result the word “an” is inserted between “of” and “epidemic”. So phrase search is really handy for searches like this.
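The effect of slop can be approximated in a few lines of Python. This is a deliberately simplified sketch: it only accepts the phrase words in their original order with at most `slop` extra words between them in total, whereas Lucene's real slop is based on position moves and can also tolerate some reordering. The function name and the sample sentence are made up.

```python
def matches_with_slop(phrase, text, slop=0):
    """True if the phrase words appear in order in text with at most
    `slop` extra words in between, counted over the whole phrase."""
    words = text.lower().split()
    terms = phrase.lower().split()
    if not terms:
        return False

    def search(term_index, previous_position, gaps_used):
        if term_index == len(terms):
            return True  # every phrase word has been placed
        for position in range(previous_position + 1, len(words)):
            gap = position - previous_position - 1
            if gaps_used + gap > slop:
                break  # later positions only widen the gap
            if words[position] == terms[term_index]:
                if search(term_index + 1, position, gaps_used + gap):
                    return True
        return False

    starts = [i for i, word in enumerate(words) if word == terms[0]]
    return any(search(1, start, 0) for start in starts)


sentence = "in the time of an epidemic"  # made-up sample text
print(matches_with_slop("of epidemic", sentence, slop=0))
print(matches_with_slop("of epidemic", sentence, slop=2))
```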

Compound search in Atlas

Until now, both full-text search and phrase search had worked well for my case, so I needed a way to combine them. And Atlas didn’t disappoint me. It supports the compound operator, which is really great. It lets you combine operators and also score across them! This is the feature that made me choose Atlas Search as my search engine. Here is the query I finally used:

db[collection].aggregate(
    [
        {
            "$search": {
                "compound": {
                    "must": [
                        {
                            "text": {
                                "query": term,
                                "path": ["body_en", "chapter_en"],
                                "fuzzy": {"maxEdits": 1, "maxExpansions": 10},
                            }
                        }
                    ],
                    "should": [
                        {
                            "phrase": {
                                "query": term,
                                "path": "body_en",
                                "slop": 2,
                            }
                        }
                    ],
                },
                "highlight": {"path": ["body_en", "chapter_en"]},
            }
        },
        {"$limit": 50},
        {
            "$project": {
                "_id": 0,
                "collection_id": 1,
                "collection": 1,
                "hadith_no": 1,
                "book_no": 1,
                "book_en": 1,
                "chapter_no": 1,
                "chapter_en": 1,
                "narrator_en": 1,
                "body_en": 1,
                "book_ref_no": 1,
                "hadith_grade": 1,
                "score": {"$meta": "searchScore"},
                "highlights": {"$meta": "searchHighlights"},
            }
        },
    ]
)

Inside the compound operator, I used must and should. **must holds the clauses that a document has to match, and should holds the nice-to-have ones: if a `should` clause matches, the score will be higher**. This makes complex queries so simple.

I put the full-text search inside must, which means the terms of my search query are matched individually. Some words in a search query may not be present anywhere in the DB, so I chose this looser match as the must. And my **should clause holds the phrase search**: if the search also finds the phrase, the score is higher. Using this I achieved the following result:

Notice that the word “plague” is matched on its own in the second result. The fuzzy variation “if” for “of” is also matched. You can also put multiple clauses inside should for better scoring, depending on your needs. Along with must and should there are mustNot and filter. Find the details here: https://docs.atlas.mongodb.com/reference/atlas-search/compound/
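Since the three pipelines above repeat the same $limit and $project stages, one way to keep the code tidy is to build the compound pipeline from a function. This is only a sketch of how it could be organized: `build_search_pipeline` and `PROJECTED_FIELDS` are names I made up, while the query itself is the compound query shown earlier.

```python
# Fields returned to the client, taken from the $project stage above.
PROJECTED_FIELDS = [
    "collection_id", "collection", "hadith_no", "book_no", "book_en",
    "chapter_no", "chapter_en", "narrator_en", "body_en", "book_ref_no",
    "hadith_grade",
]


def build_search_pipeline(term, limit=50):
    """Return the compound search pipeline for a search term."""
    projection = {"_id": 0}
    projection.update({field: 1 for field in PROJECTED_FIELDS})
    projection["score"] = {"$meta": "searchScore"}
    projection["highlights"] = {"$meta": "searchHighlights"}

    text_clause = {
        "text": {
            "query": term,
            "path": ["body_en", "chapter_en"],
            "fuzzy": {"maxEdits": 1, "maxExpansions": 10},
        }
    }
    phrase_clause = {"phrase": {"query": term, "path": "body_en", "slop": 2}}

    return [
        {
            "$search": {
                # must: required match; should: boosts the score if it matches
                "compound": {"must": [text_clause], "should": [phrase_clause]},
                "highlight": {"path": ["body_en", "chapter_en"]},
            }
        },
        {"$limit": limit},
        {"$project": projection},
    ]


pipeline = build_search_pipeline("plague")
# With a live connection this would be run as:
# results = list(db[collection].aggregate(pipeline))
```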

My journey with Atlas went smoothly. It supported every kind of search I wanted and delivered results really fast. Most importantly, it’s easy to use. So I moved from Mlab to Atlas and from $text to Atlas $search.

Searching geo locations in Atlas

As I said, I will share some extra queries. Here is one. Say you have a collection of restaurants and you want to search by restaurant name near a specific location. You just need this query:

[
    {
        "$search": {
            "compound": {
                "must": {
                    "text": {
                        "query": "Bangladeshi",
                        "path": "restaurant_name"
                    }
                },
                "should": {
                    "near": {
                        "origin": {
                            "type": "Point",
                            "coordinates": [
                                -73.988713,
                                40.7262672
                            ]
                        },
                        "pivot": 1000,
                        "path": "address.location"
                    }
                }
            }
        }
    },
    {
        "$project": {
            "_id": 0,
            "restaurant_name": 1,
            "address": 1,
            "score": {
                "$meta": "searchScore"
            }
        }
    }
]

So you put the restaurant name text search in the must clause and the location search in the should clause, and that’s it! The near clause in should does not exclude distant restaurants; it raises the score of the ones closer to the origin. I think you can imagine how powerful this is: you can write queries like these to cover many kinds of scenarios. The near operator also supports dates and numbers.
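To get a feel for what pivot does, here is a small Python sketch. As far as I understand the Atlas Search documentation, near scores a document as pivot / (pivot + distance), with distance in meters for geo points, so a document exactly pivot meters away gets half the score of one at the origin. The haversine helper is my own approximation and not necessarily how Atlas measures distance, and the restaurant coordinates are made up.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lon1, lat1, lon2, lat2):
    """Approximate great-circle distance in meters between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def near_score(distance_m, pivot_m):
    """Score in (0, 1]: 1 at the origin, 0.5 at pivot_m meters away."""
    return pivot_m / (pivot_m + distance_m)


origin = (-73.988713, 40.7262672)      # GeoJSON order: longitude, latitude
restaurant = (-73.9800000, 40.730000)  # made-up location

distance = haversine_m(*origin, *restaurant)
print(round(distance), round(near_score(distance, pivot_m=1000), 3))
```

Inside compound, this near score is added to the text score from must, so a larger pivot makes distance matter less in the final ranking.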

Find all the operators here — https://docs.atlas.mongodb.com/reference/atlas-search/operators/. You can find the GitHub repo of Ask Hadith here — https://github.com/Ananto30/ask-hadith for any kind of reference.

I thank **Marcus Eagan** for his immense support, guidance and time. He introduced Atlas Search to me, showed me the indexing process, built the initial query and met me on Zoom. This kind of help is what inspires us to keep going with our software projects. If you don’t know him, check him out! Here’s a snap of us moving with Atlas 😬

Me & Marcus!

Overall, Atlas Search is an “out of the box” service for searching. It is really powerful (with the help of compound search) and I enjoyed its simplicity.

Let me know if you have used Atlas Search in your projects, and share the links in the comments.