Search is a core feature of modern web apps, and the results have to be relevant to what the user asked for. Building that kind of search is hard, and none of the databases I had tried offered it “out of the box” until I found **MongoDB Atlas Search**. It is easy to use and sits on top of my data, so I didn’t have to build a separate pipeline the way other search services require.
I was building a search engine for Hadith (wiki) called Ask Hadith. I chose MongoDB because it is fast to build with and I expected a lot of searches. Then I got stuck on the search itself. I used the [$text](docs.mongodb.com/manual/reference/operator/..) operator for full-text searching. It was OK, but it lacked complex queries, phrases, highlights and much more. Still, I used it for the first phase of Ask Hadith. I posted on Reddit and Hacker News and found that people were interested in such a search engine, so I started improving both the code quality and the search. In the meantime, Marcus Eagan opened an issue in the GitHub repo and suggested that I take a look at Atlas Search. I found it really easy to use, with support for phrases, fuzzy searching, highlighting, complex queries, analyzers and much more! I was impressed, tried it out, and ended up using it as my search engine for now. In this post I will share my journey, the benefits of Atlas Search, my queries, and some extra queries you can use in your own web app.
My journey with Atlas Search
My journey started with opening an account on MongoDB Atlas. It is easy to get started and there is a generous free tier. Here is the link: https://www.mongodb.com/cloud/atlas.
Transfer data to Atlas
Previously I had my data on Mlab, so I had to transfer it to Atlas. I did that with the `mongoexport` and `mongoimport` commands, which is really easy. You can download the database tools from here: https://www.mongodb.com/try/download/database-tools?tck=docs_databasetools. Then export the previous collection and import it into Atlas:
➜ mongoexport -h HOST:PORT -d DATABASE -c COLLECTION -u username -p password -o hadith.json
➜ mongoimport -h HOST:PORT -d DATABASE -c COLLECTION -u username -p password --file hadith.json
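If you would rather script the transfer, here is a minimal sketch that builds the same two commands from Python. It assumes you use connection strings through the `--uri` flag (Atlas hands you a connection string, and the URI has to include the database name) instead of `-h HOST:PORT`. The function names and the `hadith.json` default are only illustrative.

```python
import subprocess


def export_command(uri, collection, out_file):
    """Argument list for mongoexport, using its --uri, --collection and --out flags."""
    return ["mongoexport", "--uri", uri, "--collection", collection, "--out", out_file]


def import_command(uri, collection, in_file):
    """Argument list for mongoimport, using its --uri, --collection and --file flags."""
    return ["mongoimport", "--uri", uri, "--collection", collection, "--file", in_file]


def migrate(source_uri, target_uri, collection, dump_file="hadith.json"):
    """Export from the old cluster, then import the dump into the new one.

    Requires the MongoDB database tools on PATH. check=True makes a failed
    export stop the script before anything is imported.
    """
    subprocess.run(export_command(source_uri, collection, dump_file), check=True)
    subprocess.run(import_command(target_uri, collection, dump_file), check=True)
```

Passing the arguments as a list (rather than one shell string) avoids quoting problems when the password contains special characters.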
Create a search index
Next I needed a search index before I could start searching. I used the default index definition to jump in quickly:
{
  "mappings": {
    "dynamic": true
  }
}
But I could have used this index instead to save space:
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "body_en": {
        "type": "string",
        "analyzer": "lucene.standard"
      },
      "chapter_en": {
        "type": "string",
        "analyzer": "lucene.standard"
      }
    }
  }
}
This uses the Standard Analyzer, which is language-neutral. I could also use the Keyword Analyzer for Hadith numbers, so people can search for a specific Hadith directly rather than by text. For now I stayed with the default index and will use these analyzers in the next phase. You can learn more about them here.
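To sketch what that next phase could look like, the snippet below builds an index definition that keeps the Standard Analyzer for the text fields and adds the Keyword Analyzer (`lucene.keyword`) for the Hadith number. It assumes `hadith_no` is stored as a string, which I have not verified here, and the helper name is made up for illustration.

```python
import json


def build_index_definition(text_fields, keyword_fields):
    """Build a static Atlas Search index definition as a plain dict.

    text_fields are tokenized with lucene.standard; keyword_fields are
    indexed as one exact token with lucene.keyword.
    """
    fields = {}
    for name in text_fields:
        fields[name] = {"type": "string", "analyzer": "lucene.standard"}
    for name in keyword_fields:
        fields[name] = {"type": "string", "analyzer": "lucene.keyword"}
    return {"mappings": {"dynamic": False, "fields": fields}}


# Assumption: hadith_no is a string field in the collection.
definition = build_index_definition(["body_en", "chapter_en"], ["hadith_no"])
print(json.dumps(definition, indent=2))
```

The printed JSON can be pasted into the index editor in the Atlas UI.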
Full text search in Atlas
Now let’s start with my queries. Using Python, I initially wrote this full text search query—
db[collection].aggregate(
    [
        {
            "$search": {
                "text": {
                    "query": term,
                    "path": ["body_en", "chapter_en"],
                    "fuzzy": {"maxEdits": 1, "maxExpansions": 10},
                },
                "highlight": {"path": ["body_en", "chapter_en"]},
            }
        },
        {"$limit": 50},
        {
            "$project": {
                "_id": 0,
                "collection_id": 1,
                "collection": 1,
                "hadith_no": 1,
                "book_no": 1,
                "book_en": 1,
                "chapter_no": 1,
                "chapter_en": 1,
                "narrator_en": 1,
                "body_en": 1,
                "book_ref_no": 1,
                "hadith_grade": 1,
                "score": {"$meta": "searchScore"},
                "highlights": {"$meta": "searchHighlights"},
            }
        },
    ]
)
This is an aggregation pipeline in MongoDB. We can also put the above list into a variable `aggregate_query` and get the result like this:
result = list(db[collection].aggregate(aggregate_query))
The first stage of the pipeline is `$search`, and it holds the [text](docs.atlas.mongodb.com/reference/atlas-sear..) operator and the [highlight](docs.atlas.mongodb.com/reference/atlas-sear..) option. `text` is the full-text search operator in Atlas Search and it uses the index we created earlier. `query` holds the search term; if there are multiple terms, it searches for each of them and produces a matching score, which I expose through the `$project` stage. The results are already sorted by that score, by the way. `path` lists the fields to search in.
It also offers fuzzy searching, just by adding `fuzzy` to the `text` operator. `maxEdits` inside `fuzzy` is the maximum number of single-character edits (a replaced, inserted or deleted character) allowed for a match, and `maxExpansions` is the maximum number of variations to generate and search for. With my settings, for the word cat it will look for cot, eat, can, fat, at, etc., up to 10 variations, each a single character edit away.
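To make “one edit away” concrete, here is a small standalone sketch that computes the classic Levenshtein edit distance and filters candidates by it. This is only an illustration of the idea behind `maxEdits`. It is plain Python, not how Atlas Search finds its variations internally, and the candidate list is made up.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, by dynamic programming."""
    previous = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete char_a
                    current[j - 1] + 1,      # insert char_b
                    previous[j - 1] + cost,  # substitute (or keep) the character
                )
            )
        previous = current
    return previous[-1]


def within_max_edits(term, candidates, max_edits=1):
    """Keep only the candidates that are at most max_edits away from term."""
    return [word for word in candidates if edit_distance(term, word) <= max_edits]


matches = within_max_edits("cat", ["cot", "eat", "can", "fat", "at", "dog"])
print(matches)  # ['cot', 'eat', 'can', 'fat', 'at']
```

Note that “at” qualifies because deleting a character counts as one edit, while “dog” is three edits away and is dropped.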
The `highlight` option is great too. It produces an array of elements with the words that matched, plus the surrounding words for context. In my case, it generated this result:
This is awesome because everything lives in my database!
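To show how such highlights can be used on the web side, here is a minimal sketch that turns one highlight element into an HTML snippet. It relies on the shape Atlas Search returns for `searchHighlights` (each element has a `path`, a `score`, and a `texts` list whose items carry a `value` and a `type` of either `hit` or `text`). The sample document below is invented for the example and is not real output from Ask Hadith.

```python
import html


def render_highlight(highlight, open_tag="<mark>", close_tag="</mark>"):
    """Join the pieces of one highlight, wrapping the matched words in tags."""
    parts = []
    for piece in highlight["texts"]:
        value = html.escape(piece["value"])  # escape before injecting into HTML
        if piece["type"] == "hit":
            parts.append(f"{open_tag}{value}{close_tag}")
        else:
            parts.append(value)
    return "".join(parts)


# Hypothetical highlight element, shaped like a searchHighlights entry.
sample = {
    "path": "body_en",
    "score": 1.0,
    "texts": [
        {"value": "outbreak of ", "type": "text"},
        {"value": "plague", "type": "hit"},
        {"value": " in the town", "type": "text"},
    ],
}

print(render_highlight(sample))  # outbreak of <mark>plague</mark> in the town
```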
Phrase search in Atlas
Phrase search is also important for a Hadith search engine. So I tried the phrase search —
db[collection].aggregate(
    [
        {
            "$search": {
                "phrase": {
                    "query": term,
                    "path": "body_en",
                    "slop": 2,
                },
                "highlight": {"path": "body_en"},
            }
        },
        {"$limit": 50},
        {
            "$project": {
                "_id": 0,
                "collection_id": 1,
                "collection": 1,
                "hadith_no": 1,
                "book_no": 1,
                "book_en": 1,
                "chapter_no": 1,
                "chapter_en": 1,
                "narrator_en": 1,
                "body_en": 1,
                "book_ref_no": 1,
                "hadith_grade": 1,
                "score": {"$meta": "searchScore"},
                "highlights": {"$meta": "searchHighlights"},
            }
        },
    ]
)
The implementation is again really simple. `slop` is the new field here: it is the allowable distance between the words in the `query` field. So `slop: 2` means there can be at most two words between the words I put in the `query`. As always, exact matches score higher. This is the result in my case:
In the second result you can see the word “an” inserted between “of” and “epidemic”. So phrase search is really handy for searches like this.
Compound search in Atlas
So far, both full-text search and phrase search worked great for my case, so I wanted a combination of the two. Atlas didn’t disappoint me. It supports the `compound` operator, which combines operators and scores across them! This is the feature that made me choose Atlas Search as my search engine. Here is the query I ended up with:
db[collection].aggregate(
    [
        {
            "$search": {
                "compound": {
                    "must": [
                        {
                            "text": {
                                "query": term,
                                "path": ["body_en", "chapter_en"],
                                "fuzzy": {"maxEdits": 1, "maxExpansions": 10},
                            }
                        }
                    ],
                    "should": [
                        {
                            "phrase": {
                                "query": term,
                                "path": "body_en",
                                "slop": 2,
                            }
                        }
                    ],
                },
                "highlight": {"path": ["body_en", "chapter_en"]},
            }
        },
        {"$limit": 50},
        {
            "$project": {
                "_id": 0,
                "collection_id": 1,
                "collection": 1,
                "hadith_no": 1,
                "book_no": 1,
                "book_en": 1,
                "chapter_no": 1,
                "chapter_en": 1,
                "narrator_en": 1,
                "body_en": 1,
                "book_ref_no": 1,
                "hadith_grade": 1,
                "score": {"$meta": "searchScore"},
                "highlights": {"$meta": "searchHighlights"},
            }
        },
    ]
)
Inside the `compound` operator, I used `must` and `should`. **`must` holds the clauses that have to match the search terms, and `should` is a nice-to-have: if the `should` clause matches, the score will be higher**. This makes complex queries so simple.
I put the full-text search inside `must`, which means I want the terms of the search query to be matched individually. Some words in the search query may not be present anywhere in the DB, and the `text` operator still matches on the remaining terms, so I chose it to be the `must`. And my **`should` clause holds the phrase search**: if the search also finds the phrase, the score will be higher. Using this I achieved the following result:
In the second result you can see that the word “plague” is matched individually. The fuzzy match “if” for “of” is also matched. You can also put multiple clauses inside `should` for better scoring, depending on your needs. Along with `must` and `should` there are `mustNot` and `filter`. Find the details here: https://docs.atlas.mongodb.com/reference/atlas-search/compound/
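Since the full-text, phrase and compound queries all share the same `$limit` and `$project` stages, one way to keep the code tidy is a small builder function. The sketch below is one possible arrangement, not the code from the Ask Hadith repo. The function and constant names are mine, while the fields and options are the ones used in the queries above.

```python
# Fields returned to the client, shared by every search pipeline.
PROJECT_STAGE = {
    "$project": {
        "_id": 0,
        "collection_id": 1,
        "collection": 1,
        "hadith_no": 1,
        "book_no": 1,
        "book_en": 1,
        "chapter_no": 1,
        "chapter_en": 1,
        "narrator_en": 1,
        "body_en": 1,
        "book_ref_no": 1,
        "hadith_grade": 1,
        "score": {"$meta": "searchScore"},
        "highlights": {"$meta": "searchHighlights"},
    }
}


def build_search_pipeline(term, limit=50):
    """Compound search: fuzzy text match is required, phrase match boosts the score."""
    if not term or not term.strip():
        raise ValueError("search term must not be empty")
    text_clause = {
        "text": {
            "query": term,
            "path": ["body_en", "chapter_en"],
            "fuzzy": {"maxEdits": 1, "maxExpansions": 10},
        }
    }
    phrase_clause = {"phrase": {"query": term, "path": "body_en", "slop": 2}}
    search_stage = {
        "$search": {
            "compound": {"must": [text_clause], "should": [phrase_clause]},
            "highlight": {"path": ["body_en", "chapter_en"]},
        }
    }
    return [search_stage, {"$limit": limit}, PROJECT_STAGE]
```

With a pymongo database handle, the call then shrinks to `list(db[collection].aggregate(build_search_pipeline(term)))`. Rejecting empty terms up front avoids sending a query that cannot match anything.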
My journey with Atlas went smoothly. It supported every kind of search I wanted and delivers results really fast. Most importantly, it’s easy to use. So I moved from Mlab to Atlas and from `$text` to Atlas `$search`.
Searching geo locations in Atlas
As I said, I will share some extra queries. Here is one. Say you have a collection of restaurants and you want to search by restaurant name and rank the ones near a specific location higher. You just need this query:
[
  {
    "$search": {
      "compound": {
        "must": {
          "text": {
            "query": "Bangladeshi",
            "path": "restaurant_name"
          }
        },
        "should": {
          "near": {
            "origin": {
              "type": "Point",
              "coordinates": [
                -73.988713,
                40.7262672
              ]
            },
            "pivot": 1000,
            "path": "address.location"
          }
        }
      }
    }
  },
  {
    "$project": {
      "_id": 0,
      "restaurant_name": 1,
      "address": 1,
      "score": {
        "$meta": "searchScore"
      }
    }
  }
]
So you put the restaurant name text search in the `must` clause and the location search in the `should` clause, and that’s it! I think you can imagine how powerful this is: you can write queries like these to cover all kinds of scenarios. The `near` operator also supports dates and numbers.
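To get a feel for what `pivot` does, here is a tiny sketch of the scoring formula as I understand it from the Atlas Search documentation for `near`: the score is `pivot / (pivot + distance)`, where distance is measured in meters for geo points. The function name is mine, and how this score is combined with the `must` clause’s score is left out.

```python
def near_score(distance, pivot):
    """Score contribution of the near operator: pivot / (pivot + distance)."""
    if pivot <= 0:
        raise ValueError("pivot must be positive")
    if distance < 0:
        raise ValueError("distance must not be negative")
    return pivot / (pivot + distance)


# With pivot = 1000, a restaurant exactly 1000 meters away scores 0.5.
for meters in (0, 500, 1000, 3000):
    print(meters, round(near_score(meters, 1000), 3))
```

So `pivot` is the distance at which the score drops to one half. A smaller pivot makes the ranking favor very close restaurants more strongly.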
Find all the operators here — https://docs.atlas.mongodb.com/reference/atlas-search/operators/. You can find the GitHub repo of Ask Hadith here — https://github.com/Ananto30/ask-hadith for any kind of reference.
I thank **Marcus Eagan** for his immense support, guidance and time. He introduced Atlas Search to me, showed me the indexing process, built the initial query and met me on Zoom. This is what inspires us to keep going with our software projects. If you don’t know him, check him out! Here’s a snap of us moving with Atlas 😬
Me & Marcus!
Overall, Atlas Search is an “out of the box” service for searching. It is really powerful (with the help of compound search) and I enjoyed its simplicity.
Let me know if you have used Atlas Search in your projects, and share the links in the comments.