Improve Search API using django-haystack and whoosh


In most cases, we use the Django REST Framework filtering module to filter data. With a large dataset, however, filtering in the database can become slow, and it is better to have a search engine index the data and run the filtering against that index. There are many search indexing engines available. In this article we will use one of them and see how it can be used in a Django project along with Django REST Framework.

Each indexing engine is implemented differently and exposes its own interface for use from Python. Django Haystack is the Python package that bridges Django and these indexing engines behind a single API.
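To make the idea of an index concrete, here is a minimal sketch of an inverted index in plain Python. It only illustrates the general technique. The function names and sample documents are invented for this example, and real engines such as Whoosh add tokenization rules, scoring and on-disk storage on top of it.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, word):
    """Look a word up directly instead of scanning every document."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical sample data, not taken from the project.
documents = {
    1: "Red cotton shirt",
    2: "Blue denim jacket",
    3: "Red leather jacket",
}
index = build_inverted_index(documents)
print(search(index, "red"))     # [1, 3]
print(search(index, "jacket"))  # [2, 3]
```

The lookup cost depends on the number of matching documents rather than on the size of the whole dataset, which is the main reason to index large datasets.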

The following search engines can be used to implement a search API in a Django project.

  1. Whoosh (we will use this one in this article)

  2. Elasticsearch

  3. Solr

  4. Xapian

Start New Project

To start a new project, please look at this article -

Install the required additional packages

pip install django-haystack
pip install djangorestframework
pip install whoosh


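Update the project settings with the Haystack connection (this relies on import os at the top of the settings file) -

HAYSTACK_CONNECTIONS = {
   'default': {
       'ENGINE': 'haystack.backends.whoosh_backend.WhooshEngine',
       'PATH': os.path.join(os.path.dirname(__file__), 'whoosh_index'),
   },
}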

Also update the INSTALLED_APPS variable -

INSTALLED_APPS = [
   # ... existing apps ...
   'haystack', # haystack application from django-haystack
]
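
The PATH entry decides where Whoosh writes its index files. The sketch below only shows how that expression resolves. The settings file location is a made-up example, and posixpath is used so the result is the same on any operating system.

```python
import posixpath


def whoosh_index_path(settings_file):
    """Return the index directory that sits next to the given settings file."""
    return posixpath.join(posixpath.dirname(settings_file), "whoosh_index")


# Hypothetical project location, used only for illustration.
print(whoosh_index_path("/srv/MarketStore/MarketStore/settings.py"))
# /srv/MarketStore/MarketStore/whoosh_index
```

Since the index is generated data, it is usually reasonable to keep this directory out of version control.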

After updating the settings, we are ready to generate an index and use it, but we still have to define what should be indexed. So, let's go further and write the code to be indexed.

First, start a new app called products

python manage.py startapp products

Write the product model in the products app -

# products/
from django.db import models

class Product(models.Model):
   title = models.CharField(max_length=60)
   description = models.TextField()
   price = models.IntegerField()
   quantity = models.IntegerField()
   created_by = models.ForeignKey('auth.User', on_delete=models.CASCADE)

   def __str__(self):
       return f'{self.title} - {self.quantity}'

Write the search index class in the products app.

# products/
from haystack import indexes
from products.models import Product

class ProductIndex(indexes.SearchIndex, indexes.Indexable):
   text = indexes.CharField(document=True, use_template=True)
   id = indexes.IntegerField(model_attr='id')
   title = indexes.CharField(model_attr="title")
   description = indexes.CharField(model_attr="description")
   price = indexes.IntegerField(model_attr="price")
   quantity = indexes.IntegerField(model_attr="quantity")
   created_by = indexes.CharField(model_attr='created_by__first_name')

   class Meta:
       model = Product
       fields = ["text", "id", "title", "description", "price", "quantity", "created_by"]

   def get_model(self):
       return Product

   def index_queryset(self, using=None):
       """Used when the entire index for model is updated."""
       return self.get_model().objects.all()
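
Because the text field is declared with use_template=True, Haystack renders it from a Django template, which by its convention is named search/indexes/products/product_text.txt for this model and field. The article does not show that file, so the following is only a sketch: the helper name, the choice of the app-level templates directory and the template body (title plus description) are assumptions for illustration.

```python
import tempfile
from pathlib import Path

# Assumed template body: index the title and description as the main document.
TEMPLATE_BODY = "{{ object.title }}\n{{ object.description }}\n"


def write_text_template(app_dir, app_label="products", model_name="product"):
    """Create the template used by a field named 'text' and return its path."""
    template = (
        Path(app_dir) / "templates" / "search" / "indexes"
        / app_label / f"{model_name}_text.txt"
    )
    template.parent.mkdir(parents=True, exist_ok=True)
    template.write_text(TEMPLATE_BODY, encoding="utf-8")
    return template


# Demonstration in a throwaway directory; in a real project, pass the
# path of the products app instead.
with tempfile.TemporaryDirectory() as tmp:
    created = write_text_template(Path(tmp) / "products")
    print(created.relative_to(tmp))
```

Whatever the template renders becomes the main searchable document for each product, so it should contain the fields you want free-text searches to match.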

With the index class in place, the index can be built. To see real indexed data, however, we must first add a few products. To do that, we will register the Product model in the admin and then add a few objects from there.

# products/
from django.contrib import admin
from products.models import Product

admin.site.register(Product)

If you haven't created a superuser yet, create one using the following command.

python manage.py createsuperuser

Now run the server and create a few objects from the admin panel as the superuser. With objects available in the database for the Product model, let's build the index for that data. To build the index, use the following command -

python manage.py update_index

The console output of the above command shows the number of objects that were indexed. Next, we need code to access the indexed data through a REST API, which we will write with DRF.
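
Note that update_index takes a snapshot: products added later are not searchable until the command runs again. One option Haystack provides is its real-time signal processor, enabled through a setting. This is a sketch of a possible configuration rather than part of the original setup, and it trades extra work on every save for a fresher index.

```python
# Optional addition to the project settings.
# Re-index a model instance whenever it is saved or deleted.
HAYSTACK_SIGNAL_PROCESSOR = "haystack.signals.RealtimeSignalProcessor"
```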

We need a serializer class, so create one in the products app and add the following code.

# products/
from rest_framework import serializers

class ProductSearchSerializer(serializers.Serializer):
   id = serializers.IntegerField()
   title = serializers.CharField()
   description = serializers.CharField()
   price = serializers.IntegerField()
   quantity = serializers.IntegerField()
   created_by = serializers.CharField()

Please remember that only fields defined in the index class can be added here. You can leave out fields you don't need in the search results: if you need only id and title, remove the other lines from this class.
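
The reason for this restriction is that the serializer reads each declared field as an attribute of a search result, and a result only carries the values stored in the index. The plain-Python sketch below imitates that lookup. It is not DRF code: the SimpleNamespace stand-in and serialize_result are hypothetical names used only here.

```python
from types import SimpleNamespace


def serialize_result(result, field_names):
    """Copy the named attributes of a search result into a dict."""
    return {name: getattr(result, name) for name in field_names}


# Stand-in for a search result holding indexed values (made-up data).
result = SimpleNamespace(id=1, title="Red shirt", price=500, quantity=3)

# Asking for a subset of the indexed fields works.
print(serialize_result(result, ["id", "title"]))

# Asking for a field that was never indexed fails.
try:
    serialize_result(result, ["id", "discount"])
except AttributeError as error:
    print("missing field:", error)
```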

Next, let's write the required view and then the URL -

# products/
from rest_framework import mixins, viewsets
from haystack.query import SQ, SearchQuerySet
from products.serializers import ProductSearchSerializer

class ProductSearchViewSet(mixins.ListModelMixin, viewsets.GenericViewSet):
   serializer_class = ProductSearchSerializer

   def get_queryset(self, *args, **kwargs):
       params = self.request.query_params
       query = SearchQuerySet().all()
       keywords = params.get('q')
       if keywords:
           # SQ is Haystack's counterpart to Django's Q for combining search filters
           query = query.filter(SQ(title=keywords) | SQ(created_by=keywords))

       # Optional comma-separated list of creator names
       created_by = params.get("created_by")
       if created_by:
           query = query.filter(created_by__in=created_by.split(","))
       return query
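
The created_by filter accepts a comma-separated list, so a request can name several users at once. A plain split(",") keeps stray spaces and empty entries, which would then be passed to the index as values. The helper below is one possible way to tidy the input first. split_csv_param is a hypothetical name, not part of Haystack or DRF.

```python
def split_csv_param(raw_value):
    """Split 'a, b,,c' into ['a', 'b', 'c'], dropping blanks and spaces."""
    if not raw_value:
        return []
    return [part.strip() for part in raw_value.split(",") if part.strip()]


print(split_csv_param("Alice, Bob,,Carol"))  # ['Alice', 'Bob', 'Carol']
print(split_csv_param(None))                 # []
```

In the view, this could replace the split call, for example query.filter(created_by__in=split_csv_param(params.get("created_by"))).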

Add this view to the root URL configuration -

# <project_name>/
"""MarketStore URL Configuration"""
from django.contrib import admin
from django.urls import path
from products.views import ProductSearchViewSet

urlpatterns = [
   path('admin/', admin.site.urls),
   path('products/search/', ProductSearchViewSet.as_view({'get': 'list'})),
]

The search API URL can be anything you need, and instead of using routers you can wire up viewsets this way.

Now you are ready to see the feature in action. Run the server and open the search URL registered above (products/search/) in a browser.

You should see all the created products. To search, pass ?q= followed by your search keywords.
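
For reference, the query strings understood by the view can be built like this. The host and port are assumptions (the address Django's development server uses by default), and the search terms are made-up examples.

```python
from urllib.parse import urlencode

# Assumed local development address; the path matches the route registered above.
BASE_URL = "http://127.0.0.1:8000/products/search/"


def search_url(keywords=None, created_by=None):
    """Build a search URL with the optional q and created_by parameters."""
    params = {}
    if keywords:
        params["q"] = keywords
    if created_by:
        params["created_by"] = ",".join(created_by)
    # Keep commas readable instead of percent-encoding them.
    query = urlencode(params, safe=",")
    return f"{BASE_URL}?{query}" if query else BASE_URL


print(search_url())
print(search_url(keywords="shirt"))
print(search_url(keywords="shirt", created_by=["Alice", "Bob"]))
```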

Thanks for reading this article. I hope you now understand how to use Haystack with Django REST Framework.
