Explore the world of free text search possibilities with Xata and take a deep dive into fuzzy search.
Written by Benedicte Raae
Published on February 13, 2023
One of my favorite features of Xata is the built-in "fuzzy search" functionality. Most database solutions let you search for an exact match. Users these days, though, often expect a more forgiving search, one that will match "olso" to "oslo" and "alez" to "alex".
Fuzzy search to the rescue! 💪
FYI: The example code uses a Xata Worker with the Xata SDK, plus a React component with React Query. However, you can use any server-side and client-side setup you are comfortable with.
Xata's search functionality comes with the fuzziness param set to 1 by default, letting the user make one typo, such as one wrong character ("alez") or one set of swapped characters ("olso"). It's a great default, and the one we use for pruneyourfollows.com.
Nonetheless, I like to state the fuzziness level explicitly in the code, together with the configuration for partially matching words.
import React, { useState } from "react";
import { useDebounce } from "usehooks-ts";
import { useQuery } from "@tanstack/react-query";
import { xataWorker } from "./xata";

// Code executed on the server as a Cloudflare Worker
const searchAccount = xataWorker(
  "searchAccount",
  async ({ xata }, { term }) => {
    const results = await xata.search.all(term, {
      tables: [
        {
          table: "accounts",
          target: ["name", "username"],
        },
      ],
      // Fuzziness level
      fuzziness: 1,
      // Partially matching words
      prefix: "phrase",
    });
    return results;
  }
);

// Code executed in the browser
export default function App() {
  const [term, setTerm] = useState("");
  const debouncedTerm = useDebounce(term, 300);
  const { data: results } = useQuery({
    queryKey: ["search", debouncedTerm],
    queryFn: () => {
      return searchAccount({ term: debouncedTerm });
    },
    enabled: Boolean(debouncedTerm),
    placeholderData: [],
  });

  return (
    <main>
      <form onSubmit={(event) => event.preventDefault()}>
        <label htmlFor="search-field">Search accounts: </label>
        <input
          id="search-field"
          type="search"
          value={term}
          placeholder="Search accounts: name or username"
          onChange={(event) => {
            setTerm(event.target.value);
          }}
        />
      </form>
      <ul>
        {results.map(({ record }) => {
          return (
            <li key={record.username}>
              <a href={`http://twitter.com/${record.username}`}>
                <img src={record.meta.profile_image_url} aria-hidden={true} />
                <span
                  dangerouslySetInnerHTML={{
                    __html: record.name,
                  }}
                />
                <br />
                @
                <span
                  dangerouslySetInnerHTML={{
                    __html: record.username,
                  }}
                />
              </a>
            </li>
          );
        })}
      </ul>
    </main>
  );
}
Other possible levels of fuzziness are 0, to disable it altogether, and 2, to extend the allowed typo tolerance. Higher than 2 is not supported.
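To get a feel for what the levels mean, here is a minimal sketch that counts typos between two words, where a wrong, missing, or extra character, or a pair of swapped neighbours, each count as one typo. The `editDistance` and `matchesWithFuzziness` helpers are hypothetical names made up for this illustration. They are not part of the Xata SDK, and Xata's own matching (which also handles things like prefixes and multiple words) may well behave differently.

```javascript
// Illustration only: hypothetical helpers, not Xata's implementation.
// Counts insertions, deletions, substitutions, and swaps of two adjacent
// characters as one typo each (optimal string alignment distance).
function editDistance(a, b) {
  const rows = a.length + 1;
  const cols = b.length + 1;
  const d = Array.from({ length: rows }, () => new Array(cols).fill(0));

  // Turning a prefix into an empty string takes one deletion per character
  for (let i = 0; i < rows; i++) d[i][0] = i;
  for (let j = 0; j < cols; j++) d[0][j] = j;

  for (let i = 1; i < rows; i++) {
    for (let j = 1; j < cols; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      d[i][j] = Math.min(
        d[i - 1][j] + 1, // deletion
        d[i][j - 1] + 1, // insertion
        d[i - 1][j - 1] + cost // substitution (or exact match)
      );
      // Two adjacent characters swapped, e.g. "ls" vs "sl"
      if (i > 1 && j > 1 && a[i - 1] === b[j - 2] && a[i - 2] === b[j - 1]) {
        d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
      }
    }
  }
  return d[a.length][b.length];
}

// A word matches when its typo count is within the fuzziness level
function matchesWithFuzziness(term, word, fuzziness) {
  return editDistance(term.toLowerCase(), word.toLowerCase()) <= fuzziness;
}

console.log(matchesWithFuzziness("olso", "oslo", 0)); // false
console.log(matchesWithFuzziness("olso", "oslo", 1)); // true
console.log(matchesWithFuzziness("alez", "alex", 1)); // true
console.log(matchesWithFuzziness("olsoo", "oslo", 1)); // false
console.log(matchesWithFuzziness("olsoo", "oslo", 2)); // true
```

With this way of counting, "olsoo" needs two fixes to become "oslo" (one swap and one removed character), so it only matches at level 2.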
In addition to expecting a more forgiving search, users expect to know precisely why a result matches their search term. With a fuzzy search, that is even more crucial, as it can lead to some unexpected matches.
When you mean "oslo" but type "olso", an "Espen Olson" result might feel strange if the match is not highlighted.
Luckily, Xata generates ready-to-use HTML, wrapping the relevant matching text in <em> for us. However, you need to use getMetadata on each record to get access to it.
import React, { useState } from "react";
import { useDebounce } from "usehooks-ts";
import { useQuery } from "@tanstack/react-query";
import { xataWorker } from "./xata";

// Code executed on the server as a Cloudflare Worker
const searchAccount = xataWorker(
  "searchAccount",
  async ({ xata }, { term }) => {
    const results = await xata.search.all(term, {
      tables: [
        {
          table: "accounts",
          target: ["name", "username"],
        },
      ],
      fuzziness: 1,
      prefix: "phrase",
    });

    // Getting the highlights (and more)
    const enrichedResults = results.map((result) => {
      return {
        ...result,
        ...result.record.getMetadata(),
      };
    });

    return enrichedResults;
  }
);

// Code executed in the browser
export default function App() {
  const [term, setTerm] = useState("");
  const debouncedTerm = useDebounce(term, 300);
  const { data: results } = useQuery({
    queryKey: ["search", debouncedTerm],
    queryFn: () => {
      return searchAccount({ term: debouncedTerm });
    },
    enabled: Boolean(debouncedTerm),
    placeholderData: [],
  });

  return (
    <main>
      <form onSubmit={(event) => event.preventDefault()}>
        <label htmlFor="search-field">Search accounts: </label>
        <input
          id="search-field"
          type="search"
          value={term}
          placeholder="Search accounts: name or username"
          onChange={(event) => {
            setTerm(event.target.value);
          }}
        />
      </form>
      <ul>
        {results.map(({ record, highlight }) => {
          return (
            <li key={record.username}>
              <a href={`http://twitter.com/${record.username}`}>
                <img src={record.meta.profile_image_url} aria-hidden={true} />
                <span
                  dangerouslySetInnerHTML={{
                    __html: highlight.name || record.name,
                  }}
                />
                <br />
                @
                <span
                  dangerouslySetInnerHTML={{
                    __html: highlight.username || record.username,
                  }}
                />
              </a>
            </li>
          );
        })}
      </ul>
    </main>
  );
}
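If you would rather not inject the highlight HTML directly, one alternative is to split it into plain and matched pieces and render those yourself. The sketch below is only an illustration: `splitHighlight` is a hypothetical helper, not part of the Xata SDK, and the sample string is made up. It assumes the only markup in a highlight is the <em> wrapper, and it accepts either a single string or an array of strings, since I am not certain which shape you will get back for every column.

```javascript
// Illustration only: a hypothetical helper, not part of the Xata SDK.
// Splits a highlight such as "Espen <em>Olso</em>n" into segments,
// flagging the parts that were wrapped in <em>.
function splitHighlight(highlight) {
  // Assumption: a highlight may be a string or an array of strings
  const html = Array.isArray(highlight) ? highlight.join(" ") : highlight;
  const segments = [];
  const pattern = /<em>(.*?)<\/em>/gs;
  let lastIndex = 0;
  let match;

  while ((match = pattern.exec(html)) !== null) {
    // Plain text before this match
    if (match.index > lastIndex) {
      segments.push({ text: html.slice(lastIndex, match.index), match: false });
    }
    // The text that was wrapped in <em>
    segments.push({ text: match[1], match: true });
    lastIndex = pattern.lastIndex;
  }

  // Plain text after the last match
  if (lastIndex < html.length) {
    segments.push({ text: html.slice(lastIndex), match: false });
  }
  return segments;
}

console.log(splitHighlight("Espen <em>Olso</em>n"));
```

In the React component, each segment could then be rendered as an <em> element when `match` is true and as plain text otherwise, instead of using dangerouslySetInnerHTML.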
Xata even allows you to play around with search—no code needed—using the Search Engine Playground.
Pro tip: Use the "Get Code Snippet" feature to get a head start while coding.
I hope you are excited to dig into fuzzy search with Xata!
If you need any help implementing fuzzy search in your project, join the Xata Discord.