Add Pagefind Search to Hugo
Victor Dorneanu

Victor Dorneanu @dorneanu

About: I work as a Security Architect currently with focus on ETH. I mainly code in Golang, Python and TypeScript. I use (Doom) Emacs for almost everything and ORG mode for generating/publishing content.

Location:
Berlin
Joined:
Dec 3, 2019

Add Pagefind Search to Hugo

Publish Date: Apr 10
0 0

Introduction

Every PKMS/BASB needs a search functionality. Ever since I've created brainfck to host my own collection of thoughts/ideas/resources (aka Zettelkasten) I wanted to be able to actually search within my collection of org-roam based notes. Meanwhile for all my sites I own (this blog, my CV/portfolio, brainfck and defersec) I use hugo. All of them didn't have proper search capabilities. That's why I was looking for a proper way to include search functionalities without any major effort.

Hugo, indeed, has some open-source and commercial search options you can choose from. I have used this fuse.js integration in the past but I wasn't happy with it. It didn't index well, I couldn't find all my content. Of course, I was thinking having Algolia DocSearch do the magic but one has to apply for it. Also, not all of my sites are about technical documentation. So I had to find another alternative. Digging deeper I came across Pagefind.

Here is a screenshot how it currently looks like:

Search function on brainfck.org using Pagefind

Pagefind

Pagefind is a lightweight, static search solution designed specifically for static sites like those built with Hugo. It works by generating a search index during your site's build process, creating a client-side search experience that doesn't require any server infrastructure.

The indexing works by crawling your static HTML files after they're built, extracting content and metadata. Pagefind creates a compressed search index that's both fast and efficient, typically resulting in index sizes of about 1/1000th of your original content size. This means your search functionality remains quick even on larger sites.

When examining a typical Pagefind implementation in a Hugo site, the folder structure looks something like this:

tree public/pagefind
public/pagefind
├── fragment
│   ├── en_02aee83.pf_fragment
...
│   ├── en_2538bf9.pf_fragment
│   ├── en_2550e62.pf_fragment
│   ├── en_2582c6f.pf_fragment
│   ├── en_ffbbc35.pf_index
│   └── en_ffeab69.pf_index
├── pagefind-entry.json
├── pagefind-highlight.js
├── pagefind-modular-ui.css
├── pagefind-modular-ui.js
├── pagefind-ui.css
├── pagefind-ui.js
├── pagefind.en_19e6da436f.pf_meta
├── pagefind.en_1c7f38a66e.pf_meta
├── pagefind.en_1f95755b6e.pf_meta
├── pagefind.en_2a7661ff21.pf_meta
├── pagefind.en_4fdfe13af9.pf_meta
├── pagefind.en_5049ae833f.pf_meta
├── pagefind.en_603dab85b4.pf_meta
├── pagefind.en_6663d2c9c9.pf_meta
├── pagefind.en_7411b4b912.pf_meta
├── pagefind.en_791636c92e.pf_meta
├── pagefind.en_93f1fd4c5c.pf_meta
├── pagefind.en_94bcd0c843.pf_meta
├── pagefind.en_a8061c44f1.pf_meta
├── pagefind.en_a96223e65e.pf_meta
├── pagefind.en_b526d7e391.pf_meta
├── pagefind.en_b6e37679e1.pf_meta
├── pagefind.en_b7a63d2937.pf_meta
├── pagefind.en_c039de83fe.pf_meta
├── pagefind.en_d9c4de17e6.pf_meta
├── pagefind.en_deac3933f1.pf_meta
├── pagefind.en_eb3680ac82.pf_meta
├── pagefind.en_fab96f64ed.pf_meta
├── pagefind.js
├── wasm.en.pagefind
└── wasm.unknown.pagefind

3 directories, 3087 files
Enter fullscreen mode Exit fullscreen mode

Configuration

You also have some configuration options. In my case:

# Pagefind configuration

# Basic options (using Pagefind 1.0 naming)
site: public
output_subdir: pagefind

# Add date-based metadata for sorting
indexing_options:
  # Add date as metadata for all pages
  - metadata_date_field: date
    metadata_date_format: iso

  # Global metadata fields for all pages
  - metadata_fields:
      # Primary metadata sources
      - {
          tag: "meta[property='og:title']",
          as: "title",
          content_attr: "content",
          optional: true,
        }
      - { selector: ".post-title", as: "title", optional: true }
      - { selector: "h1", as: "title", optional: true }

      # Date and other metadata
      - {
          tag: "meta[property='og:type']",
          as: "type",
          content_attr: "content",
          optional: true,
        }
      - {
          tag: "meta[name='date']",
          as: "date",
          content_attr: "content",
          optional: true,
        }
      - {
          tag: "meta[property='article:published_time']",
          as: "published",
          content_attr: "content",
          optional: true,
        }
      - {
          selector: "time",
          as: "date",
          content_attr: "datetime",
          optional: true,
        }

# Added languages
language:
  code: en
  stemming: true

# Search options
search_options:
  ignore_missing_metadata_fields: true
  boost_exact_matches: true
  boost_title: 5.0
Enter fullscreen mode Exit fullscreen mode

This configuration file sets up Pagefind with optimal settings for a personal knowledge base. The key options include:

  • Basic setup: Specifies the source directory (public) and where to output the search files (pagefind subdirectory)
  • Date-based metadata: Adds date fields for each page, enabling chronological sorting of search results
  • Metadata extraction: Configures multiple fallback methods to extract page titles and dates from different HTML elements and meta tags
  • Search optimization: Boosts exact matches and increases the weight of title matches by 5x, ensuring the most relevant results appear first

Hugo integration

Search page

---
title: "Search"
---
Enter fullscreen mode Exit fullscreen mode

Search template

The search page implementation works through several key components:

  • Loading Pagefind scripts: The template loads the necessary CSS and JavaScript files that Pagefind generated during the build process.
  • Search parameter handling: The implementation supports direct linking to search results using URL parameters. When someone visits /search?q=zettelkasten, the page automatically populates the search box with "zettelkasten" and triggers the search.
  • URL updating: As the user types in the search box, the URL is dynamically updated with the current query using the browser's History API (pushState). This creates a seamless experience where users can bookmark or share specific search results.
{{ define "main" }}
<main class="center mv4 content-width ph3">
  <div class="f3 fw6 heading-color heading-font">{{ .Title }}</div>
  <div class="lh-copy mt4">
    <p>Search across all content.</p>

    <!-- Load Pagefind scripts -->
    <link href="/pagefind/pagefind-ui.css" rel="stylesheet" />
    <script src="/pagefind/pagefind-ui.js"></script>

    <!-- Create the search container with your site's styling -->
    <div id="search" class="w-100 mb4"></div>

    <script>
      // Set configuration for pagefind
      window.pagefindConfig = {
        bundlePath: "/pagefind/",
        processTerm: (term) => {
          // Simple processing to remove undefined text from search results
          return term.replace(/undefined/g, "");
        },
      };

      // Function to get URL parameters
      function getUrlParameter(name) {
        name = name.replace(/[\[]/, "\\[").replace(/[\]]/, "\\]");
        const regex = new RegExp("[\\?&]" + name + "=([^&#]*)");
        const results = regex.exec(location.search);
        return results === null ? "" : decodeURIComponent(results[1].replace(/\+/g, " "));
      }

      // Initialize the search UI when the DOM is loaded
      document.addEventListener("DOMContentLoaded", function () {
        // Get the search query from URL parameter 'q' if it exists
        const initialQuery = getUrlParameter("q");

        const searchUI = new PagefindUI({
          element: "#search",
          showSubResults: true,
          showImages: false,
          sort: {
            // Enable sorting by date (newest first) and relevance
            options: [
              { key: "date", label: "Date (Newest First)", order: "desc" },
              { key: "default", label: "Relevance" },
            ],
            // Use date as the default sorting method
            default: "date",
          },
          translations: {
            placeholder: "Search your notes...",
            zero_results: "No results found",
            many_results: "Found {results} results",
            sort_by: "Sort by:",
          },
        });

        // Set up a listener to watch for changes in the search input
        setTimeout(() => {
          // Find the search input after PagefindUI has initialized
          const searchInput = document.querySelector(
            ".pagefind-ui__search-input",
          );
          const searchForm = document.querySelector(".pagefind-ui__form");

          if (searchInput) {
            // Update URL when the search input changes
            searchInput.addEventListener("input", function () {
              updateSearchUrl(this.value);
            });

            // Also capture form submission (when user presses Enter)
            if (searchForm) {
              searchForm.addEventListener("submit", function (e) {
                // Don't prevent default as we want the search to execute
                updateSearchUrl(searchInput.value);
              });
            }
          }

          // Helper function to update the URL with the search query
          function updateSearchUrl(query) {
            const url = new URL(window.location);

            if (query && query.trim() !== "") {
              url.searchParams.set("q", query);
            } else {
              url.searchParams.delete("q");
            }

            window.history.pushState({}, "", url);
          }
        }, 500); // Short delay to ensure PagefindUI has initialized

        // If there's an initial query, set it and trigger the search
        if (initialQuery) {
          // Use a small timeout to ensure PagefindUI is fully initialized
          setTimeout(() => {
            searchUI.triggerSearch(initialQuery);
          }, 100);
        }
      });
    </script>
  </div>
</main>
{{ end }}
Enter fullscreen mode Exit fullscreen mode

CSS

/* Simple pagefind overrides to match site styling */
:root {
  --pagefind-ui-scale: 1;
  --pagefind-ui-primary: #3e5622;
  --pagefind-ui-text: #333;
  --pagefind-ui-background: #fff;
  --pagefind-ui-border: #ddd;
  --pagefind-ui-border-radius: 4px;
}

/* Search input styling */
.pagefind-ui__search-input {
  padding: 0.5rem !important;
  width: 100% !important;
  border: 1px solid #ddd !important;
  border-radius: 0.25rem !important;
  font-size: 1rem !important;
  /* Remove the search icon */
  background-image: none !important;
  padding-right: 16px !important;
  padding-left: 16px !important;
}

/* Adjust clear button position */
.pagefind-ui__search-clear {
  right: 16px !important;
}

/* Result title styling */
.pagefind-ui__result-title {
  font-weight: 600 !important;
}

.pagefind-ui__result-link {
  color: #3e5622 !important;
  text-decoration: none !important;
}

/* Highlight search terms */
.pagefind-ui__result-excerpt mark {
  background-color: rgba(255, 255, 0, 0.4) !important;
  color: inherit !important;
}

/* Load more button styling */
.pagefind-ui__button {
  background-color: #3e5622 !important;
  color: white !important;
  border: none !important;
  border-radius: 0.25rem !important;
  padding: 0.5rem 1rem !important;
  cursor: pointer !important;
}

/* Simple fix for undefined text */
.pagefind-ui__result-excerpt:contains('undefined'),
.pagefind-ui__result-title:contains('undefined') {
  visibility: hidden;
}
Enter fullscreen mode Exit fullscreen mode

Netlify deployment

Deploying your Pagefind-enabled Hugo site on Netlify is straightforward with this configuration:

[build]
  publish = "public"
  command = "hugo && npx pagefind --site public --output-subdir pagefind"

[build.environment]
  HUGO_VERSION = "0.145.0"  # Replace with your current Hugo version
  NODE_VERSION = "18"       # Ensure we have a recent Node version for Pagefind

# Cache control for Pagefind files
[[headers]]
  for = "/pagefind/*"
  [headers.values]
    Cache-Control = "public, max-age=604800"
Enter fullscreen mode Exit fullscreen mode

The configuration does three critical things:

  • Build command: It extends the standard Hugo build process by adding Pagefind indexing as a second step. This ensures your search index is generated after Hugo creates all the HTML files.
  • Environment setup: It specifies the required Hugo and Node.js versions.
  • Cache configuration: It adds cache headers for all Pagefind files, setting a one-week cache period. This improves performance for returning visitors, as browsers won't need to download the search index again on every visit.

For local development this might be also useful:

{
  "name": "brainfck-roam",
  "version": "1.0.0",
  "description": "Your org-roam notes on Hugo",
  "scripts": {
    "build": "hugo && npx pagefind --site public --output-subdir pagefind",
    "dev": "hugo server -b http://127.0.0.1:1315/ --disableFastRender --port 1315 --noHTTPCache --logLevel debug --gc",
    "search-index": "npx pagefind --site public --output-subdir pagefind",
    "full-dev": "npm run build && npm run dev"
  },
  "dependencies": {
    "pagefind": "^1.0.0"
  }
}
Enter fullscreen mode Exit fullscreen mode

The package.json file provides several convenient npm scripts that streamline your workflow:

  • npm run dev
    • Starts a local Hugo server with debug logging and cache disabled.
  • npm run search-index
    • Runs only the Pagefind indexing on your current build output. This is helpful when you want to rebuild just the search index without regenerating the entire site.
  • npm run build
    • Performs the complete production build process—generating the Hugo site and then creating the Pagefind search index afterward.
  • npm run full-dev
    • A comprehensive development command that builds the complete site with search index and then starts the development server. This is ideal when you need to test search functionality locally.

Resources

Comments 0 total

    Add comment