How to fuzzy search with Xata

Explore the world of free text search possibilities with Xata and take a deep dive into fuzzy search.

Written by

Benedicte Raae

Published on

February 13, 2023

One of my favorite features of Xata is the built-in "fuzzy search" functionality. Most database solutions let you search for an exact match. Users these days, though, often expect a more forgiving search, one that will match "olso" to "oslo" and "alez" to "alex".

Fuzzy search to the rescue! 💪

The difference in results with fuzzy search enabled and disabled
Fuzzy search enabled and disabled

FYI: The example code uses a Xata Worker using the Xata SDK + a React component using ReactQuery. However, you may use any type of server-side + client-side setup you are comfortable with.

Fuzzy search is enabled by default

Xata search functionality comes with the fuzziness param set to 1 by default, letting the user make one typo, such as one wrong character ("alez") or one set of swapped characters ("olso") etc. It's a great default, and the one we use for pruneyourfollows.com.

Nonetheless I do like to explicitly state the fuzziness level in the Code, together with the configuration for partially matching words.

import React, { useState } from "react";
import { useDebounce } from "usehooks-ts";
import { useQuery } from "@tanstack/react-query";
import { xataWorker } from "./xata";
 
// Code executed on the server as a Cloudflare Worker
const searchAccount = xataWorker(
  "searchAccount",
  async ({ xata }, { term }) => {
    const results = await xata.search.all(term, {
      tables: [
        {
          table: "accounts",
          target: ["name", "username"],
        },
      ],
      // Fuzziness level
      fuzziness: 1,
      // Partially matching words
      prefix: "phrase",
    });
 
    return results;
  }
);
 
// Code executed in the browser
export default function App() {
  const [term, setTerm] = useState("");
  const debouncedTerm = useDebounce(term, 300);
 
  const { data: results } = useQuery({
    queryKey: ["search", debouncedTerm],
    queryFn: () => {
      return searchAccount({ term: debouncedTerm });
    },
    enabled: Boolean(debouncedTerm),
    placeholderData: [],
  });
 
  return (
    <main>
      <form onSubmit={(event) => event.preventDefault()}>
        <label htmlFor="search-field">Search accounts: </label>
        <input
          id="search-field"
          type="search"
          value={term}
          placeholder="Search accounts: name or username"
          onChange={(event) => {
            setTerm(event.target.value);
          }}
        />
      </form>
 
      <ul>
        {results.map(({ record }) => {
          return (
            <li key={record.username}>
              <a href={`http://twitter.com/${record.username}`}>
                <img src={record.meta.profile_image_url} aria-hidden={true} />
                <span
                  dangerouslySetInnerHTML={{
                    __html: record.name,
                  }}
                />
                <br />
                @
                <span
                  dangerouslySetInnerHTML={{
                    __html: record.username,
                  }}
                />
              </a>
            </li>
          );
        })}
      </ul>
    </main>
  );
}

Other possible levels of fuzziness are 0, to disable it altogether, and 2 to extend the allowed typo tolerance. Higher than 2 is not supported.

Highlight the relevant text

In addition to expecting a more forgiving search, users expect to know precisely why a result matches their search term. With a fuzzy search, that is even more crucial, as it can lead to some unexpected matches.

When your mind thinks "oslo" but you wrote "olso", the "Espen Olson" results might feel strange when not highlighted.

The difference between results with and withuot highlighting
With and without highlighting

Luckily, Xata generates ready-to-use HTML, wrapping the relevant matching text in <em> for us. However, you need to use getMetadata on each record to get access to it.

import React, { useState } from "react";
import { useDebounce } from "usehooks-ts";
import { useQuery } from "@tanstack/react-query";
import { xataWorker } from "./xata";
 
// Code executed on the server as a Cloudflare Worker
const searchAccount = xataWorker(
  "searchAccount",
  async ({ xata }, { term }) => {
    const results = await xata.search.all(term, {
      tables: [
        {
          table: "accounts",
          target: ["name", "username"],
        },
      ],
      fuzziness: 1,
      prefix: "phrase",
    });
 
    // Getting the highlights (and more)
    const enrichedResults = results.map((result) => {
      return {
        ...result,
        ...result.record.getMetadata(),
      };
    });
 
    return enrichedResults;
  }
);
 
//Code executed in the browser
export default function App() {
  const [term, setTerm] = useState("");
  const debouncedTerm = useDebounce(term, 300);
 
  const { data: results } = useQuery({
    queryKey: ["search", debouncedTerm],
    queryFn: () => {
      return searchAccount({ term: debouncedTerm });
    },
    enabled: Boolean(debouncedTerm),
    placeholderData: [],
  });
 
  return (
    <main>
      <form onSubmit={(event) => event.preventDefault()}>
        <label htmlFor="search-field">Search accounts: </label>
        <input
          id="search-field"
          type="search"
          value={term}
          placeholder="Search accounts: name or username"
          onChange={(event) => {
            setTerm(event.target.value);
          }}
        />
      </form>
 
      <ul>
        {results.map(({ record, highlight }) => {
          return (
            <li key={record.username}>
              <a href={`http://twitter.com/${record.username}`}>
                <img src={record.meta.profile_image_url} aria-hidden={true} />
                <span
                  dangerouslySetInnerHTML={{
                    __html: highlight.name || record.name,
                  }}
                />
                <br />
                @
                <span
                  dangerouslySetInnerHTML={{
                    __html: highlight.username || record.username,
                  }}
                />
              </a>
            </li>
          );
        })}
      </ul>
    </main>
  );
}

Search playground

Xata even allows you to play around with search—no code needed—using the Search Engine Playground.

Pro tip: Use the "Get Code Snippet" feature to get a head start while coding.

Xata Playground showing similar code as this article
Xata Playground showing similar code as this article

Where to go from here?

I hope you are excited to dig into fuzzy search with Xata:

If you need any help implementing fuzzy search in your project, join the Xata Discord.

Copyright © 2023 Xatabase Inc.
All rights reserved.

Product

RoadmapFeature requestsPricingStatusAI solutions