Pattern Matching with Glob: Finding Files by Pattern 5/9
Rez Moss

Publish Date: Jun 28

Glob Function Fundamentals

The filepath.Glob function in Go provides a powerful way to find files and directories using pattern matching. At its core, glob matching allows you to specify file patterns using wildcards and special characters, returning a slice of paths that match your criteria.

package main

import (
    "fmt"
    "path/filepath"
)

func main() {
    matches, err := filepath.Glob("*.txt")
    if err != nil {
        fmt.Printf("Error: %v\n", err)
        return
    }

    for _, match := range matches {
        fmt.Println(match)
    }
}

Relative patterns are resolved against the current working directory, and a pattern may also start from an absolute path. The function returns two values: a slice of matching paths and an error. Error handling is straightforward: the only error Glob returns is filepath.ErrBadPattern, which signals malformed pattern syntax.

// Valid patterns
matches1, _ := filepath.Glob("/home/user/*.log")
matches2, _ := filepath.Glob("../config/*.json")
matches3, _ := filepath.Glob("data/2023/*/report.csv")

// Invalid pattern that returns ErrBadPattern
_, err := filepath.Glob("[invalid")
if errors.Is(err, filepath.ErrBadPattern) {
    fmt.Println("Pattern syntax error")
}

Under the hood, filepath.Glob uses the same pattern matching logic as filepath.Match, but extends it to work with file system hierarchies. While filepath.Match only compares a pattern against a single string, Glob reads directories and applies the pattern one path element at a time, so it can match across multiple levels.
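To make the element-by-element behavior concrete, here is a minimal sketch that calls filepath.Match directly. The helper name matchesPath is an illustrative assumption, not part of the standard library.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matchesPath is a hypothetical helper: it reports whether name matches
// pattern and treats a malformed pattern as "no match".
func matchesPath(pattern, name string) bool {
	ok, err := filepath.Match(pattern, name)
	return err == nil && ok
}

func main() {
	nested := filepath.Join("logs", "app.log")

	// "*" never crosses a path separator, so it stays inside one element.
	fmt.Println(matchesPath("*.log", "app.log")) // true
	fmt.Println(matchesPath("*.log", nested))    // false

	// One wildcard per directory level is needed to reach the nested file.
	fmt.Println(matchesPath(filepath.Join("*", "*.log"), nested)) // true
}
```

Glob applies the same rule while reading directories, which is why a pattern needs one element per directory level it should descend.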

Glob returns only names that exist in the file system at the time of the call, and it reports symbolic links under their own names rather than resolving them. A match is therefore not a guarantee that a later operation will succeed: the file may be removed in the meantime, or the match may be a dangling symlink, so still check the error from whatever you do with each path.

Pattern Syntax Deep Dive

Understanding glob pattern syntax is essential for crafting precise file matching expressions. The pattern language builds on familiar shell globbing conventions but has specific rules and limitations you need to master.

Wildcard Characters and Their Meanings

The asterisk (*) is your most versatile tool, matching any sequence of characters except the path separator. This makes it perfect for matching file names with varying content but predictable structure.

// Match all .go files in current directory
matches, _ := filepath.Glob("*.go")

// Match files starting with "test_"
matches, _ = filepath.Glob("test_*.log")

// Match files with any name but specific extension
matches, _ = filepath.Glob("*.json")

The question mark (?) matches exactly one character, giving you precise control over variable positions in file names.

// Match files like "day1.txt", "day2.txt", but not "day10.txt"
matches, _ := filepath.Glob("day?.txt")

// Match "file_a.dat", "file_b.dat", etc.
matches, _ = filepath.Glob("file_?.dat")

// Combine for more complex patterns
matches, _ = filepath.Glob("backup_????_??.sql")

Character Classes and Ranges

Square brackets define character classes, allowing you to specify sets or ranges of acceptable characters at a position.

// Match "report_" followed by exactly one digit
matches, _ := filepath.Glob("report_[0-9].txt")

// Match specific characters
matches, _ = filepath.Glob("config_[abc].yaml")

// Match uppercase letters
matches, _ = filepath.Glob("LOG_[A-Z].txt")

// Combine ranges and specific characters
matches, _ = filepath.Glob("file_[0-9a-f].hex")

Character classes support negation using the caret (^) as the first character, matching anything except the specified set.

// Match "temp_" followed by one character that is not a digit
matches, _ := filepath.Glob("temp_[^0-9].tmp")

// Match "data_" followed by one non-alphabetic character
matches, _ = filepath.Glob("data_[^a-zA-Z].csv")
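The wildcard and character-class rules can be checked without touching the disk by running filepath.Match over a list of names. This is a small sketch; filterNames and the sample file names are made up for illustration.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// filterNames is an illustrative helper that returns the names matching
// pattern. A malformed pattern is reported through the returned error.
func filterNames(pattern string, names []string) ([]string, error) {
	var out []string
	for _, name := range names {
		ok, err := filepath.Match(pattern, name)
		if err != nil {
			return nil, err
		}
		if ok {
			out = append(out, name)
		}
	}
	return out, nil
}

func main() {
	// Sample names, chosen to show where each pattern draws the line.
	names := []string{"day1.txt", "day10.txt", "dayX.txt", "day_.txt"}

	patterns := []string{
		"day?.txt",      // exactly one character of any kind
		"day[0-9].txt",  // exactly one digit
		"day[^0-9].txt", // exactly one non-digit
		"day*.txt",      // any run of characters, including none
	}
	for _, pattern := range patterns {
		got, err := filterNames(pattern, names)
		fmt.Println(pattern, got, err)
	}
}
```

Note that "day10.txt" is only picked up by the asterisk pattern, because ? and a character class each consume exactly one character.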

Escaping Special Characters

When you need to match literal wildcard characters, Go's glob implementation uses backslash escaping on Unix-like systems. On Windows the backslash is the path separator, so escaping is not supported there.

// On Unix systems, escape with backslash
matches, _ := filepath.Glob("file\\*.txt")  // Matches "file*.txt" literally

// On Windows the backslashes below are read as path separators, not escapes,
// so this pattern does not mean the same thing there
matches, _ = filepath.Glob("backup\\[daily\\].zip")

For cross-platform compatibility, you can wrap a metacharacter in a character class instead of escaping it: [*], [?], and [[] each match the literal character on every platform. Alternatively, restructure your file naming conventions to avoid conflicts with glob metacharacters.
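One way to avoid backslash escapes altogether is to put each metacharacter in its own character class. The sketch below assumes file names that contain no backslashes; literalPattern and matchesLiterally are hypothetical helper names.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// literalPattern is a hypothetical helper that wraps *, ? and [ in a
// character class so they match themselves. It does not handle backslashes,
// which are escapes on Unix and separators on Windows.
func literalPattern(name string) string {
	var b strings.Builder
	for _, r := range name {
		switch r {
		case '*', '?', '[':
			b.WriteString("[" + string(r) + "]")
		default:
			b.WriteRune(r)
		}
	}
	return b.String()
}

// matchesLiterally reports whether candidate is exactly name once name has
// been converted to a literal pattern.
func matchesLiterally(name, candidate string) bool {
	ok, err := filepath.Match(literalPattern(name), candidate)
	return err == nil && ok
}

func main() {
	fmt.Println(literalPattern("file*.txt"))                // file[*].txt
	fmt.Println(matchesLiterally("file*.txt", "file*.txt")) // true
	fmt.Println(matchesLiterally("file*.txt", "fileA.txt")) // false
}
```

The same converted string can be passed to filepath.Glob, for example when a user-supplied file name has to be embedded in a larger pattern.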

The pattern syntax doesn't support regular expression features like quantifiers or alternation. Each pattern element has a specific, limited scope that keeps glob fast and predictable for file system operations.

GlobFS Interface Optimization

The io/fs package introduced the GlobFS interface in Go 1.16, enabling file systems to provide optimized glob implementations. This interface allows custom file systems to implement native pattern matching, potentially offering significant performance improvements over the default traversal-based approach.

type GlobFS interface {
    fs.FS
    Glob(pattern string) ([]string, error)
}

When you use fs.Glob with a file system that implements GlobFS, Go automatically detects and uses the native implementation instead of falling back to manual directory traversal.

import (
    "io/fs"
    "os"
)

func findConfigs(fsys fs.FS) ([]string, error) {
    // This will use native glob if fsys implements GlobFS
    return fs.Glob(fsys, "config/*.yaml")
}

// Using with os.DirFS
fsys := os.DirFS("/etc")
matches, err := fs.Glob(fsys, "nginx/*.conf")
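To see the delegation happen, you can wrap a file system in a type that implements Glob and counts the calls. The following is a sketch under stated assumptions: countingFS, newDemoFS, and the sample file names are invented for illustration, and the in-memory tree comes from the standard testing/fstest package.

```go
package main

import (
	"fmt"
	"io/fs"
	"testing/fstest"
)

// countingFS is a hypothetical wrapper that implements fs.GlobFS and records
// how often its Glob method is called.
type countingFS struct {
	fs.FS
	globCalls int
}

// Glob satisfies fs.GlobFS. A real implementation might consult an index;
// this sketch simply delegates to the wrapped file system.
func (c *countingFS) Glob(pattern string) ([]string, error) {
	c.globCalls++
	return fs.Glob(c.FS, pattern)
}

// newDemoFS builds a small in-memory tree (sample data for illustration).
func newDemoFS() *countingFS {
	return &countingFS{FS: fstest.MapFS{
		"config/app.yaml":   {},
		"config/db.yaml":    {},
		"config/readme.txt": {},
	}}
}

// findConfigs goes through fs.Glob, which detects the GlobFS implementation
// and calls it instead of reading directories itself.
func findConfigs(fsys fs.FS) ([]string, error) {
	return fs.Glob(fsys, "config/*.yaml")
}

func main() {
	fsys := newDemoFS()
	matches, err := findConfigs(fsys)
	fmt.Println(matches, err)
	fmt.Println("Glob method calls:", fsys.globCalls)
}
```

The caller only ever sees fs.Glob, so swapping in a file system with a smarter Glob method needs no changes at the call site.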

Performance Implications

Native glob implementations can dramatically outperform manual traversal, especially on large directory structures. Database-backed file systems, network file systems, and compressed archives can leverage their internal indexing or query capabilities to filter matches without examining every file.

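// Manual traversal: visits every entry under root
func slowGlob(root string) []string {
    var matches []string
    filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
        if err != nil {
            return nil // d may be nil here; skip unreadable entries
        }
        if matched, _ := filepath.Match("*.log", d.Name()); matched {
            matches = append(matches, path)
        }
        return nil
    })
    return matches
}

// Delegates to the file system's own Glob when it implements GlobFS.
// "*" matches a single path element, so this pattern only finds .log
// files exactly one directory below the root; "**" is not a recursive
// wildcard in Go.
func fastGlob(fsys fs.FS) ([]string, error) {
    return fs.Glob(fsys, "*/*.log")
}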

The performance difference becomes more pronounced with complex patterns or when searching deep directory hierarchies. Native implementations can often skip entire directory branches or use metadata indexes to accelerate matching.

Fallback Behavior

When a file system doesn't implement GlobFS, Go automatically falls back to a directory walking implementation. This fallback ensures your code works consistently across different file system types without requiring changes.

func robustGlobSearch(fsys fs.FS, pattern string) ([]string, error) {
    // This works regardless of whether fsys implements GlobFS
    matches, err := fs.Glob(fsys, pattern)
    if err != nil {
        return nil, fmt.Errorf("glob failed: %w", err)
    }

    // Additional filtering can be applied to results
    var filtered []string
    for _, match := range matches {
        if info, err := fs.Stat(fsys, match); err == nil && !info.IsDir() {
            filtered = append(filtered, match)
        }
    }

    return filtered, nil
}

The fallback implementation maintains the same semantic behavior as native implementations, ensuring that switching between file system types doesn't break your application logic. However, you should be aware of potential performance differences when working with large-scale file operations.

Testing with both native and fallback implementations helps ensure your glob patterns work correctly across different deployment scenarios and file system configurations.
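One way to exercise both paths in a test is to hide the optional interfaces of a file system and compare the results. This is a minimal sketch: plainFS, sameGlobResults, and the sample tree are assumptions made for illustration, and fstest.MapFS stands in for a file system that provides its own Glob method.

```go
package main

import (
	"fmt"
	"io/fs"
	"reflect"
	"testing/fstest"
)

// plainFS hides every optional interface of the wrapped file system. Because
// it embeds the fs.FS interface, only Open is promoted, so fs.Glob has to use
// its generic fallback.
type plainFS struct{ fs.FS }

// sameGlobResults is a hypothetical test helper that runs the pattern through
// the file system's own Glob and through the fallback, then compares them.
func sameGlobResults(fsys fs.FS, pattern string) (bool, error) {
	native, err := fs.Glob(fsys, pattern)
	if err != nil {
		return false, err
	}
	fallback, err := fs.Glob(plainFS{fsys}, pattern)
	if err != nil {
		return false, err
	}
	return reflect.DeepEqual(native, fallback), nil
}

// demoFS returns sample data for the comparison.
func demoFS() fs.FS {
	return fstest.MapFS{
		"logs/api/a.log": {},
		"logs/api/b.txt": {},
		"logs/web/c.log": {},
	}
}

func main() {
	same, err := sameGlobResults(demoFS(), "logs/*/*.log")
	fmt.Println(same, err)
}
```

A custom Glob implementation may return matches in a different order than the fallback, so sorting both slices before comparing is a reasonable refinement for real tests.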

Advanced Pattern Techniques

Complex file organization scenarios require sophisticated pattern matching approaches that go beyond basic wildcards. Mastering these advanced techniques enables you to handle intricate directory structures and implement precise filtering logic.

Multi-Level Directory Patterns

While standard glob patterns don't support recursive matching with **, you can achieve multi-level directory traversal by combining glob with directory walking or using strategic pattern construction.

func findNestedConfigs(root string) ([]string, error) {
    var allMatches []string

    // Pattern for immediate subdirectories
    level1, _ := filepath.Glob(filepath.Join(root, "*", "*.conf"))
    allMatches = append(allMatches, level1...)

    // Pattern for second-level subdirectories
    level2, _ := filepath.Glob(filepath.Join(root, "*", "*", "*.conf"))
    allMatches = append(allMatches, level2...)

    // Pattern for third-level subdirectories
    level3, _ := filepath.Glob(filepath.Join(root, "*", "*", "*", "*.conf"))
    allMatches = append(allMatches, level3...)

    return allMatches, nil
}

For truly recursive searching, combine filepath.Match with filepath.WalkDir so that each file name is tested against the pattern as the walk visits it:

func recursiveGlob(root, pattern string) ([]string, error) {
    var matches []string

    err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
        if err != nil {
            return nil // Skip problematic directories
        }

        if !d.IsDir() {
            if matched, _ := filepath.Match(pattern, d.Name()); matched {
                matches = append(matches, path)
            }
        }
        return nil
    })

    return matches, err
}
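The same idea carries over to fs.FS, which also makes the ** limitation easy to observe on an in-memory tree. This is a sketch, not a drop-in utility: walkGlob, shallowGlob, and demoTree are hypothetical names, and the file names are sample data.

```go
package main

import (
	"fmt"
	"io/fs"
	"path"
	"testing/fstest"
)

// walkGlob is an illustrative helper that returns every file under fsys
// whose base name matches pattern, at any depth.
func walkGlob(fsys fs.FS, pattern string) ([]string, error) {
	// Validate once so a bad pattern is reported instead of matching nothing.
	if _, err := path.Match(pattern, ""); err != nil {
		return nil, err
	}
	var matches []string
	err := fs.WalkDir(fsys, ".", func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			return nil
		}
		if ok, _ := path.Match(pattern, d.Name()); ok {
			matches = append(matches, p)
		}
		return nil
	})
	return matches, err
}

// shallowGlob uses "**", which Go treats like a single "*": it matches one
// directory level, not an arbitrary depth.
func shallowGlob(fsys fs.FS) ([]string, error) {
	return fs.Glob(fsys, "**/*.log")
}

// demoTree returns sample data with log files at three depths.
func demoTree() fs.FS {
	return fstest.MapFS{
		"root.log":      {},
		"a/one.log":     {},
		"a/b/two.log":   {},
		"a/b/notes.txt": {},
	}
}

func main() {
	fsys := demoTree()
	shallow, _ := shallowGlob(fsys)
	deep, _ := walkGlob(fsys, "*.log")
	fmt.Println(shallow) // [a/one.log]
	fmt.Println(deep)    // [a/b/two.log a/one.log root.log]
}
```

fs.FS paths always use forward slashes, which is why this version uses path.Match rather than filepath.Match.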

Complex Filtering Scenarios

Real-world applications often require filtering based on multiple criteria. You can chain glob operations or combine them with additional validation logic:

func findRecentLogs(logDir string, days int) ([]string, error) {
    // First, find all potential log files
    candidates, err := filepath.Glob(filepath.Join(logDir, "*.log"))
    if err != nil {
        return nil, err
    }

    // Filter by modification time
    cutoff := time.Now().AddDate(0, 0, -days)
    var recent []string

    for _, candidate := range candidates {
        info, err := os.Stat(candidate)
        if err != nil {
            continue
        }

        if info.ModTime().After(cutoff) {
            recent = append(recent, candidate)
        }
    }

    return recent, nil
}

Pattern composition allows you to build complex selection criteria by combining multiple glob results:

func findDataFiles(baseDir string) (map[string][]string, error) {
    results := make(map[string][]string)

    // Find CSV files
    csvFiles, _ := filepath.Glob(filepath.Join(baseDir, "*.csv"))
    results["csv"] = csvFiles

    // Find JSON files with specific naming
    jsonFiles, _ := filepath.Glob(filepath.Join(baseDir, "data_*.json"))
    results["json"] = jsonFiles

    // Find backup files (multiple extensions)
    backups := make([]string, 0)
    for _, ext := range []string{"bak", "backup", "old"} {
        pattern := filepath.Join(baseDir, "*."+ext)
        matches, _ := filepath.Glob(pattern)
        backups = append(backups, matches...)
    }
    results["backups"] = backups

    return results, nil
}

Combining Glob with Other Operations

Effective file processing often requires combining glob results with sorting, filtering, or transformation operations:

func processLatestBackups(backupDir string, limit int) error {
    // Find all backup files
    backups, err := filepath.Glob(filepath.Join(backupDir, "backup_*.sql"))
    if err != nil {
        return err
    }

    // Stat each file once; skip files that vanished or cannot be read,
    // since a nil FileInfo would panic inside the sort
    type backup struct {
        path    string
        modTime time.Time
    }
    var found []backup
    for _, path := range backups {
        info, err := os.Stat(path)
        if err != nil {
            continue
        }
        found = append(found, backup{path, info.ModTime()})
    }

    // Sort by modification time (newest first)
    sort.Slice(found, func(i, j int) bool {
        return found[i].modTime.After(found[j].modTime)
    })

    // Process only the most recent files
    if limit < 0 {
        limit = 0
    }
    if len(found) > limit {
        found = found[:limit]
    }

    for _, b := range found {
        // processBackupFile is assumed to be defined elsewhere
        if err := processBackupFile(b.path); err != nil {
            return fmt.Errorf("failed to process %s: %w", b.path, err)
        }
    }

    return nil
}

These advanced techniques enable you to build robust file discovery systems that can handle complex organizational schemes and varying requirements across different deployment environments.

Error Resilience

Glob operations interact with the file system, making them susceptible to various runtime conditions. Understanding Go's approach to error handling in glob functions helps you build resilient applications that gracefully handle both pattern validation issues and file system anomalies.

I/O Error Ignoring Behavior

The filepath.Glob function follows a pragmatic approach to I/O errors during directory traversal. When it encounters permission denied errors, temporarily unavailable directories, or other transient file system issues, it silently continues processing rather than terminating the entire operation.

func demonstrateErrorHandling() {
    // This will return matches from accessible directories
    // and silently skip directories with permission issues
    matches, err := filepath.Glob("/var/log/*/*.log")

    // err will only be non-nil for pattern syntax errors
    if err != nil {
        fmt.Printf("Pattern error: %v\n", err)
        return
    }

    // matches contains all accessible files, even if some directories were skipped
    fmt.Printf("Found %d log files\n", len(matches))
}

This behavior ensures that partial file system access doesn't prevent your application from processing available data. However, you won't receive notifications about skipped directories, so consider logging or monitoring when complete directory access is critical:

func auditableGlob(pattern string) ([]string, []error) {
    matches, err := filepath.Glob(pattern)
    if err != nil {
        return nil, []error{err}
    }

    // Glob skips unreadable directories silently, so expand the directory
    // part of the pattern and try to read each candidate ourselves.
    // This checks only the final directory level of the pattern.
    dirs, err := filepath.Glob(filepath.Dir(pattern))
    if err != nil {
        return matches, []error{err}
    }

    var accessErrors []error
    for _, dir := range dirs {
        info, err := os.Stat(dir)
        if err != nil || !info.IsDir() {
            continue // Not a directory Glob would have searched
        }
        if _, err := os.ReadDir(dir); err != nil {
            accessErrors = append(accessErrors, fmt.Errorf("directory access issue: %s: %w", dir, err))
        }
    }

    return matches, accessErrors
}

Pattern Validation

The only error that filepath.Glob explicitly returns is filepath.ErrBadPattern, which occurs when your pattern contains malformed character classes or invalid escape sequences.

func validateAndExecuteGlob(pattern string) ([]string, error) {
    // Test the pattern first if you want explicit validation
    _, err := filepath.Match(pattern, "test")
    if err == filepath.ErrBadPattern {
        return nil, fmt.Errorf("invalid pattern syntax: %s", pattern)
    }

    // Execute the glob operation
    matches, err := filepath.Glob(pattern)
    if err == filepath.ErrBadPattern {
        return nil, fmt.Errorf("pattern validation failed during glob: %w", err)
    }

    return matches, nil
}

Common pattern validation errors include unclosed character classes, incomplete ranges, and platform-specific escape sequence issues. A reversed range such as [z-a] is not reported as an error; it is accepted and simply matches nothing:

func handlePatternErrors() {
    invalidPatterns := []string{
        "[abc",          // Unclosed character class
        "[a-]",          // Range missing its upper bound
        "file\\",        // Trailing escape (an error on Unix; a path separator on Windows)
    }

    for _, pattern := range invalidPatterns {
        _, err := filepath.Glob(pattern)
        if err == filepath.ErrBadPattern {
            fmt.Printf("Invalid pattern detected: %s\n", pattern)
        }
    }
}
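If you are unsure whether a given pattern counts as malformed, a quick check with filepath.Match settles it without touching the file system. The sketch below assumes Go 1.16 or later, where Match reports syntax errors even in the part of the pattern it did not need to compare; isBadPattern is a hypothetical helper name.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
)

// isBadPattern reports whether filepath.Match rejects pattern as malformed.
// Matching against the empty string is enough to trigger validation.
func isBadPattern(pattern string) bool {
	_, err := filepath.Match(pattern, "")
	return errors.Is(err, filepath.ErrBadPattern)
}

func main() {
	patterns := []string{
		"*.go",  // valid
		"[abc",  // unclosed character class
		"[a-]",  // range without an upper bound
		"[]",    // empty character class
		"[z-a]", // reversed range: accepted, but matches nothing
	}
	for _, p := range patterns {
		fmt.Printf("%-8q bad=%v\n", p, isBadPattern(p))
	}
}
```

Running a check like this over patterns from configuration files at startup surfaces syntax mistakes early, before any directory is read.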

For robust applications, implement pattern validation before executing glob operations, especially when patterns come from user input or configuration files:

func safeGlob(userPattern string) ([]string, error) {
    // Sanitize and validate user input
    if strings.Contains(userPattern, "..") {
        return nil, fmt.Errorf("directory traversal not allowed")
    }

    // Test pattern validity
    if _, err := filepath.Match(userPattern, ""); err != nil {
        return nil, fmt.Errorf("invalid pattern format: %w", err)
    }

    // Execute with confidence
    matches, err := filepath.Glob(userPattern)
    if err != nil {
        return nil, fmt.Errorf("glob execution failed: %w", err)
    }

    return matches, nil
}

This error resilience strategy ensures your applications remain stable while providing meaningful feedback when pattern construction issues occur, maintaining a clear distinction between syntax errors and runtime file system conditions.

Practical Examples

Real-world applications benefit from concrete examples that demonstrate glob patterns in common file management scenarios. These examples showcase patterns you'll frequently encounter in system administration, build processes, and application maintenance tasks.

Log File Collection

System administrators often need to collect log files across multiple applications and time periods. Glob patterns excel at identifying files based on naming conventions and directory structures.

func collectSystemLogs(logRoot string) (map[string][]string, error) {
    logCategories := make(map[string][]string)

    // Collect Apache/Nginx access logs
    webLogs, _ := filepath.Glob(filepath.Join(logRoot, "apache2", "access.log*"))
    nginxLogs, _ := filepath.Glob(filepath.Join(logRoot, "nginx", "access.log*"))
    logCategories["web"] = append(webLogs, nginxLogs...)

    // Collect application logs with date patterns
    appLogs, _ := filepath.Glob(filepath.Join(logRoot, "app", "application-????-??-??.log"))
    logCategories["application"] = appLogs

    // Collect error logs from various sources
    errorPatterns := []string{
        filepath.Join(logRoot, "*", "error.log"),
        filepath.Join(logRoot, "*", "*.err"),
        filepath.Join(logRoot, "errors", "*.log"),
    }

    var allErrors []string
    for _, pattern := range errorPatterns {
        matches, _ := filepath.Glob(pattern)
        allErrors = append(allErrors, matches...)
    }
    logCategories["errors"] = allErrors

    return logCategories, nil
}

For log rotation scenarios, you can target specific file generations or time ranges:

func findRecentRotatedLogs(logDir string, service string) ([]string, error) {
    // Match rotated logs: service.log, service.log.1, service.log.2, etc.
    basePattern := filepath.Join(logDir, service+".log")
    rotatedPattern := filepath.Join(logDir, service+".log.[0-9]")
    compressedPattern := filepath.Join(logDir, service+".log.[0-9].gz")

    var allLogs []string

    // Current log file
    if current, err := filepath.Glob(basePattern); err == nil {
        allLogs = append(allLogs, current...)
    }

    // Recent rotated files
    if rotated, err := filepath.Glob(rotatedPattern); err == nil {
        allLogs = append(allLogs, rotated...)
    }

    // Compressed archives
    if compressed, err := filepath.Glob(compressedPattern); err == nil {
        allLogs = append(allLogs, compressed...)
    }

    return allLogs, nil
}
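A character class such as [0-9] matches a single digit, so a generation like service.log.10 would be missed. One possible workaround, sketched here against fs.FS so it can run on sample data, is to glob with a trailing asterisk and keep only numeric suffixes. The names rotatedLogs and demoLogs are hypothetical, and the service name is assumed to contain no glob metacharacters.

```go
package main

import (
	"fmt"
	"io/fs"
	"sort"
	"strconv"
	"strings"
	"testing/fstest"
)

// rotatedLogs is a sketch that lists "<service>.log.N" files for any number
// of digits in N, ordered by generation number.
func rotatedLogs(fsys fs.FS, service string) ([]string, error) {
	prefix := service + ".log."
	candidates, err := fs.Glob(fsys, prefix+"*")
	if err != nil {
		return nil, err
	}

	type generation struct {
		name string
		n    int
	}
	var gens []generation
	for _, name := range candidates {
		// Keep only suffixes that strconv.Atoi accepts as a number.
		n, err := strconv.Atoi(strings.TrimPrefix(name, prefix))
		if err != nil {
			continue // for example "api.log.1.gz"
		}
		gens = append(gens, generation{name, n})
	}
	sort.Slice(gens, func(i, j int) bool { return gens[i].n < gens[j].n })

	out := make([]string, len(gens))
	for i, g := range gens {
		out[i] = g.name
	}
	return out, nil
}

// demoLogs returns sample data resembling a rotated log directory.
func demoLogs() fs.FS {
	return fstest.MapFS{
		"api.log":      {},
		"api.log.1":    {},
		"api.log.2":    {},
		"api.log.10":   {},
		"api.log.1.gz": {},
	}
}

func main() {
	logs, err := rotatedLogs(demoLogs(), "api")
	fmt.Println(logs, err) // [api.log.1 api.log.2 api.log.10] <nil>
}
```

Sorting by the parsed number rather than by name keeps generation 10 after generation 2, which a plain lexical sort would not.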

Configuration File Discovery

Applications often need to locate configuration files across multiple possible locations, following standard directory conventions or deployment-specific layouts.

func discoverConfigFiles(appName string) (map[string]string, error) {
    configs := make(map[string]string)

    // Standard system locations
    systemPaths := []string{
        "/etc/" + appName + "/" + appName + ".conf",
        "/etc/" + appName + ".conf",
        "/usr/local/etc/" + appName + ".conf",
    }

    // User-specific locations
    homeDir, _ := os.UserHomeDir()
    userPaths := []string{
        filepath.Join(homeDir, "."+appName, "config"),
        filepath.Join(homeDir, "."+appName+"rc"),
        filepath.Join(homeDir, ".config", appName, "config.yaml"),
    }

    // Check each location
    allPaths := append(systemPaths, userPaths...)
    for _, path := range allPaths {
        if matches, err := filepath.Glob(path); err == nil && len(matches) > 0 {
            for _, match := range matches {
                if info, err := os.Stat(match); err == nil && !info.IsDir() {
                    // Key by the containing directory's full path; its base
                    // name alone ("etc") would collide across locations
                    configs[filepath.Dir(match)] = match
                }
            }
        }
    }

    return configs, nil
}

Environment-specific configuration discovery handles different deployment scenarios:

func findEnvironmentConfigs(configDir, environment string) ([]string, error) {
    patterns := []string{
        // Environment-specific files
        filepath.Join(configDir, environment, "*.yaml"),
        filepath.Join(configDir, environment, "*.json"),

        // Files with environment suffix
        filepath.Join(configDir, "*-"+environment+".yaml"),
        filepath.Join(configDir, "*_"+environment+".json"),

        // Override files
        filepath.Join(configDir, "override", environment, "*"),
    }

    var allConfigs []string
    for _, pattern := range patterns {
        if matches, err := filepath.Glob(pattern); err == nil {
            allConfigs = append(allConfigs, matches...)
        }
    }

    return allConfigs, nil
}

Build Artifact Cleanup

Build systems generate numerous temporary files and artifacts that require periodic cleanup. Glob patterns help identify and remove these files safely.

func cleanBuildArtifacts(projectRoot string) error {
    // File names to remove at any depth. Glob has no recursive "**"
    // wildcard, so these are matched against base names during a walk.
    namePatterns := []string{
        "*.o", "*.so", "*.dylib", // Compiled objects and libraries
        "*.tmp", ".DS_Store", "Thumbs.db", // Temporary files
    }

    // Directory contents to remove, found with single-level globs
    dirPatterns := []string{
        // Build directories
        filepath.Join(projectRoot, "build", "*"),
        filepath.Join(projectRoot, "dist", "*"),
        filepath.Join(projectRoot, "target", "*"),

        // Cache directories
        filepath.Join(projectRoot, "node_modules", ".cache", "*"),
        filepath.Join(projectRoot, ".pytest_cache", "*"),
    }

    var removedCount int
    remove := func(paths []string) {
        for _, p := range paths {
            if err := os.RemoveAll(p); err == nil {
                removedCount++
            }
        }
    }

    for _, pattern := range dirPatterns {
        if matches, err := filepath.Glob(pattern); err == nil {
            remove(matches)
        }
    }

    // Collect matching files during the walk and delete them afterwards
    var files []string
    filepath.WalkDir(projectRoot, func(path string, d fs.DirEntry, err error) error {
        if err != nil || d.IsDir() {
            return nil // Skip unreadable entries and directories
        }
        for _, pattern := range namePatterns {
            if matched, _ := filepath.Match(pattern, d.Name()); matched {
                files = append(files, path)
                break
            }
        }
        return nil
    })
    remove(files)

    fmt.Printf("Cleaned %d build artifacts\n", removedCount)
    return nil
}

Age-based cleanup combines glob matching with file modification times:

func cleanOldArtifacts(buildDir string, maxAge time.Duration) error {
    artifacts, err := filepath.Glob(filepath.Join(buildDir, "*"))
    if err != nil {
        return err
    }

    cutoff := time.Now().Add(-maxAge)
    var cleaned int

    for _, artifact := range artifacts {
        info, err := os.Stat(artifact)
        if err != nil {
            continue
        }

        if info.ModTime().Before(cutoff) {
            if err := os.RemoveAll(artifact); err == nil {
                cleaned++
            }
        }
    }

    fmt.Printf("Removed %d old artifacts\n", cleaned)
    return nil
}

These practical examples demonstrate how glob patterns integrate into real-world workflows, providing reliable file discovery and management capabilities across diverse application domains.
