File System Walking with WalkDir: Recursive Tree Traversal 4/9
Rez Moss

Rez Moss @rezmoss

About: I’m a Golang & Node.js Developer with 10+ years of experience in cloud and server architecture, specializing in AWS and DevOps

Location:
Canada
Joined:
Apr 19, 2024

File System Walking with WalkDir: Recursive Tree Traversal 4/9

Publish Date: Jun 21
5 0

WalkDir Function Comprehensive Guide

The filepath.WalkDir function represents Go's modern approach to recursive directory traversal, replacing the older filepath.Walk function with improved performance and cleaner interface design. Understanding its mechanics is essential for any developer working with file system operations at scale.

Function Signature and Parameters

func WalkDir(root string, fn WalkDirFunc) error
Enter fullscreen mode Exit fullscreen mode

The function signature deliberately keeps things simple. The root parameter accepts any valid file system path - whether it points to a file or directory. When you pass a file path, WalkDir processes only that single file. Directory paths trigger recursive traversal of the entire subtree.

The fn parameter expects a function matching the WalkDirFunc signature:

type WalkDirFunc func(path string, d DirEntry, err error) error
Enter fullscreen mode Exit fullscreen mode

This callback executes for every file and directory encountered during traversal. The function receives the full path, a DirEntry interface providing file metadata, and any error that occurred while accessing the entry.

Lexical Ordering Guarantees

WalkDir provides deterministic traversal through lexical ordering. Within each directory, entries are processed in sorted order by name. This predictability proves crucial for testing and debugging file system operations.

// Directory structure:
// /project/
//   ├── README.md
//   ├── main.go
//   └── utils/
//       ├── helper.go
//       └── parser.go

// Traversal order:
// 1. /project/README.md
// 2. /project/main.go  
// 3. /project/utils/
// 4. /project/utils/helper.go
// 5. /project/utils/parser.go
Enter fullscreen mode Exit fullscreen mode

The lexical ordering applies only within individual directories. Parent directories are always processed before their children, but sibling directories follow alphabetical order.

Memory Usage Considerations

WalkDir optimizes memory usage through several design decisions. Unlike filepath.Walk, it uses DirEntry instead of FileInfo, avoiding expensive stat system calls until you explicitly request detailed file information.

// Memory-efficient approach
err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
    // Only call Info() when you need detailed metadata
    if strings.HasSuffix(path, ".log") {
        info, err := d.Info()
        if err != nil {
            return err
        }
        // Now you have full FileInfo
        processLogFile(path, info)
    }
    return nil
})
Enter fullscreen mode Exit fullscreen mode

The function processes one directory at a time, reading directory entries incrementally rather than loading entire directory trees into memory. This approach scales well even with deeply nested directory structures containing thousands of files.

For large traversals, be mindful of callback function allocations. Avoid creating unnecessary string concatenations or slice allocations within the callback, as these multiply across thousands of file system entries.

WalkDirFunc Callback Patterns

The WalkDirFunc callback serves as your primary interface for processing file system entries during traversal. Mastering its parameter handling and return value semantics gives you precise control over the walking behavior.

Function Parameters: path, DirEntry, error

Each callback invocation receives three parameters that work together to provide complete context about the current file system entry.

The path parameter contains the full file path from the root. This path uses the operating system's native separator and includes the original root prefix:

filepath.WalkDir("/home/user/projects", func(path string, d fs.DirEntry, err error) error {
    // path examples:
    // "/home/user/projects"
    // "/home/user/projects/main.go"
    // "/home/user/projects/src/utils.go"

    // Extract relative path if needed
    relPath, _ := filepath.Rel("/home/user/projects", path)
    return nil
})
Enter fullscreen mode Exit fullscreen mode

The DirEntry parameter provides efficient access to basic file metadata without requiring expensive system calls. Use its methods to check file types and names:

func processEntry(path string, d fs.DirEntry, err error) error {
    if d.IsDir() {
        fmt.Printf("Directory: %s\n", d.Name())
        return nil
    }

    // Check file type without calling Info()
    if d.Type()&fs.ModeSymlink != 0 {
        fmt.Printf("Symlink: %s\n", path)
        return nil
    }

    // Only call Info() when you need full metadata
    if strings.HasSuffix(d.Name(), ".large") {
        info, err := d.Info()
        if err != nil {
            return err
        }
        if info.Size() > 1024*1024 {
            fmt.Printf("Large file: %s (%d bytes)\n", path, info.Size())
        }
    }

    return nil
}
Enter fullscreen mode Exit fullscreen mode

The error parameter indicates problems accessing the current entry. This error handling happens before your callback logic executes, allowing you to decide whether to continue or abort traversal.

Return Value Meanings and Control Flow

Your callback's return value directly controls traversal behavior. Understanding these return patterns enables sophisticated directory walking logic.

Returning nil continues normal traversal:

func normalProcessing(path string, d fs.DirEntry, err error) error {
    // Process the entry
    processFile(path)

    // Continue traversal
    return nil
}
Enter fullscreen mode Exit fullscreen mode

Returning filepath.SkipDir skips the current directory's contents but continues traversing siblings:

func skipHiddenDirs(path string, d fs.DirEntry, err error) error {
    if d.IsDir() && strings.HasPrefix(d.Name(), ".") {
        return filepath.SkipDir  // Skip .git, .vscode, etc.
    }
    return nil
}
Enter fullscreen mode Exit fullscreen mode

Returning filepath.SkipAll terminates the entire traversal immediately:

func findFirstMatch(target string) filepath.WalkDirFunc {
    return func(path string, d fs.DirEntry, err error) error {
        if d.Name() == target {
            fmt.Printf("Found: %s\n", path)
            return filepath.SkipAll  // Stop searching
        }
        return nil
    }
}
Enter fullscreen mode Exit fullscreen mode

Any other error value stops traversal and propagates up to the WalkDir caller:

func strictProcessing(path string, d fs.DirEntry, err error) error {
    if err != nil {
        return fmt.Errorf("access error for %s: %w", path, err)
    }

    if err := processFile(path); err != nil {
        return fmt.Errorf("processing failed for %s: %w", path, err)
    }

    return nil
}
Enter fullscreen mode Exit fullscreen mode

Error Propagation Strategies

Different applications require different error handling approaches. Consider these common patterns based on your fault tolerance requirements.

The fail-fast approach stops on any error:

func failFast(path string, d fs.DirEntry, err error) error {
    if err != nil {
        return err  // Propagate immediately
    }
    return processEntry(path, d)
}
Enter fullscreen mode Exit fullscreen mode

Advanced Traversal Control

Fine-grained control over directory traversal enables efficient file system operations by avoiding unnecessary work. The key lies in understanding when and how to skip portions of the directory tree based on your specific requirements.

SkipDir vs SkipAll Usage

The distinction between filepath.SkipDir and filepath.SkipAll determines the scope of traversal interruption. Understanding their behavior prevents common mistakes in traversal logic.

SkipDir affects only the current directory when returned for a directory entry:

func skipLargeDirs(path string, d fs.DirEntry, err error) error {
    if d.IsDir() {
        // Check if directory should be skipped
        if d.Name() == "node_modules" || d.Name() == ".git" {
            fmt.Printf("Skipping directory: %s\n", path)
            return filepath.SkipDir
        }
    }

    // Process files normally
    fmt.Printf("Processing: %s\n", path)
    return nil
}

// Directory structure:
// /project/
//   ├── src/file1.go
//   ├── node_modules/  <- Skipped entirely
//   │   └── package/
//   ├── docs/
//   │   └── readme.md  <- Still processed
//   └── main.go

// Output:
// Processing: /project
// Processing: /project/src
// Processing: /project/src/file1.go
// Skipping directory: /project/node_modules
// Processing: /project/docs
// Processing: /project/docs/readme.md
// Processing: /project/main.go
Enter fullscreen mode Exit fullscreen mode

SkipAll terminates the entire traversal regardless of where it's returned:

func findAndStop(target string) filepath.WalkDirFunc {
    return func(path string, d fs.DirEntry, err error) error {
        fmt.Printf("Visiting: %s\n", path)

        if d.Name() == target {
            fmt.Printf("Found target: %s\n", path)
            return filepath.SkipAll  // Stop everything
        }

        return nil
    }
}

// Usage: find first occurrence of "config.json"
err := filepath.WalkDir("/app", findAndStop("config.json"))
// Traversal stops immediately when config.json is found
Enter fullscreen mode Exit fullscreen mode

Important: SkipDir has no effect when returned for file entries. Only directory entries can be skipped.

Conditional Directory Skipping

Complex applications often require dynamic skipping logic based on directory contents, depth, or external conditions. Implement these patterns using closure-captured state.

Depth-based skipping prevents traversal beyond a certain level:

func maxDepthWalker(maxDepth int) filepath.WalkDirFunc {
    rootDepth := -1

    return func(path string, d fs.DirEntry, err error) error {
        if err != nil {
            return err
        }

        // Calculate current depth
        if rootDepth == -1 {
            rootDepth = strings.Count(path, string(filepath.Separator))
        }

        currentDepth := strings.Count(path, string(filepath.Separator)) - rootDepth

        if d.IsDir() && currentDepth >= maxDepth {
            return filepath.SkipDir
        }

        fmt.Printf("Depth %d: %s\n", currentDepth, path)
        return nil
    }
}

// Usage: traverse only 2 levels deep
err := filepath.WalkDir("/project", maxDepthWalker(2))
Enter fullscreen mode Exit fullscreen mode

Content-based skipping examines directory properties before entering:

func skipEmptyOrHidden(path string, d fs.DirEntry, err error) error {
    if err != nil {
        return err
    }

    if d.IsDir() {
        // Skip hidden directories
        if strings.HasPrefix(d.Name(), ".") {
            return filepath.SkipDir
        }

        // Skip empty directories
        entries, err := os.ReadDir(path)
        if err != nil {
            return err  // Can't read, don't skip
        }

        if len(entries) == 0 {
            fmt.Printf("Skipping empty directory: %s\n", path)
            return filepath.SkipDir
        }
    }

    return nil
}
Enter fullscreen mode Exit fullscreen mode

Pattern-based skipping uses matching rules for directory names:

type SkipPattern struct {
    patterns []string
}

func (sp *SkipPattern) shouldSkip(name string) bool {
    for _, pattern := range sp.patterns {
        if matched, _ := filepath.Match(pattern, name); matched {
            return true
        }
    }
    return false
}

func (sp *SkipPattern) walkFunc(path string, d fs.DirEntry, err error) error {
    if err != nil {
        return err
    }

    if d.IsDir() && sp.shouldSkip(d.Name()) {
        return filepath.SkipDir
    }

    // Process entry
    return processEntry(path, d)
}

// Usage
skipper := &SkipPattern{
    patterns: []string{"temp*", "*_backup", "node_modules"},
}
err := filepath.WalkDir("/project", skipper.walkFunc)
Enter fullscreen mode Exit fullscreen mode

Early Termination Patterns

Early termination patterns optimize performance by stopping traversal once specific conditions are met. These patterns are essential for search operations and resource-constrained environments.

The first-match pattern stops after finding the first occurrence:

func findFirst(predicate func(string, fs.DirEntry) bool) filepath.WalkDirFunc {
    found := false

    return func(path string, d fs.DirEntry, err error) error {
        if err != nil || found {
            return err
        }

        if predicate(path, d) {
            found = true
            fmt.Printf("First match: %s\n", path)
            return filepath.SkipAll
        }

        return nil
    }
}

// Find first .go file
predicate := func(path string, d fs.DirEntry) bool {
    return !d.IsDir() && strings.HasSuffix(path, ".go")
}
err := filepath.WalkDir("/src", findFirst(predicate))
Enter fullscreen mode Exit fullscreen mode

The quota-based pattern stops after processing a certain number of entries:

func processWithQuota(quota int) filepath.WalkDirFunc {
    processed := 0

    return func(path string, d fs.DirEntry, err error) error {
        if err != nil {
            return err
        }

        if processed >= quota {
            return filepath.SkipAll
        }

        if !d.IsDir() {
            processed++
            fmt.Printf("Processing %d/%d: %s\n", processed, quota, path)
        }

        return nil
    }
}
Enter fullscreen mode Exit fullscreen mode

The timeout-based pattern stops after a time limit:

func processWithTimeout(timeout time.Duration) filepath.WalkDirFunc {
    start := time.Now()

    return func(path string, d fs.DirEntry, err error) error {
        if err != nil {
            return err
        }

        if time.Since(start) > timeout {
            fmt.Println("Timeout reached, stopping traversal")
            return filepath.SkipAll
        }

        return processEntry(path, d)
    }
}
Enter fullscreen mode Exit fullscreen mode

Error Handling in Tree Walking

File system traversal encounters various error conditions that require different handling strategies. The WalkDir function provides a two-phase error reporting mechanism that gives you fine-grained control over error recovery and propagation.

Two-Phase Error Reporting

WalkDir implements a sophisticated error handling model where errors can occur both during directory reading and individual entry access. Understanding this distinction is crucial for building robust file system tools.

The first phase occurs when WalkDir attempts to read a directory's contents. If this fails, your callback receives the directory path with a non-nil error parameter:

func handleDirectoryErrors(path string, d fs.DirEntry, err error) error {
    if err != nil {
        // This error occurred while reading the directory
        fmt.Printf("Directory read error for %s: %v\n", path, err)

        // Decide whether to continue or abort
        if os.IsPermission(err) {
            fmt.Printf("Skipping due to permissions: %s\n", path)
            return nil  // Continue with siblings
        }

        return err  // Abort on other errors
    }

    // Normal processing for successfully read entries
    return processEntry(path, d)
}
Enter fullscreen mode Exit fullscreen mode

The second phase happens when individual entries within a readable directory have access problems. In this case, the callback receives the entry with its specific error:

func handleEntryErrors(path string, d fs.DirEntry, err error) error {
    if err != nil {
        // This error is specific to this entry
        fmt.Printf("Entry access error for %s: %v\n", path, err)

        // Log and continue with other entries
        logError(path, err)
        return nil
    }

    // Entry is accessible, process normally
    return processEntry(path, d)
}
Enter fullscreen mode Exit fullscreen mode

Here's a comprehensive handler that distinguishes between error types:

type ErrorHandler struct {
    dirErrors   []error
    entryErrors []error
}

func (eh *ErrorHandler) handleBothPhases(path string, d fs.DirEntry, err error) error {
    if err != nil {
        // Determine error phase based on context
        if d == nil {
            // Directory reading failed
            eh.dirErrors = append(eh.dirErrors, 
                fmt.Errorf("directory %s: %w", path, err))

            if os.IsPermission(err) {
                return nil  // Skip but continue
            }
            return err  // Fail on critical errors
        } else {
            // Individual entry error
            eh.entryErrors = append(eh.entryErrors, 
                fmt.Errorf("entry %s: %w", path, err))
            return nil  // Continue with other entries
        }
    }

    return processEntry(path, d)
}
Enter fullscreen mode Exit fullscreen mode

Pre-read vs Post-read Error Handling

The timing of error detection affects your handling strategy. Pre-read errors prevent access to directory contents entirely, while post-read errors affect individual entries after successful directory enumeration.

Pre-read errors typically indicate system-level issues:

func handlePreReadErrors(path string, d fs.DirEntry, err error) error {
    if err != nil && d == nil {
        // Pre-read error: couldn't enumerate directory
        switch {
        case os.IsPermission(err):
            fmt.Printf("Permission denied: %s\n", path)
            auditLog.LogDeniedAccess(path)
            return nil  // Continue traversal

        case os.IsNotExist(err):
            fmt.Printf("Path disappeared: %s\n", path)
            return nil  // Handle race conditions gracefully

        case errors.Is(err, syscall.ELOOP):
            fmt.Printf("Symlink loop detected: %s\n", path)
            return nil  // Skip problematic symlinks

        default:
            return fmt.Errorf("critical directory error: %w", err)
        }
    }

    return processNormally(path, d, err)
}
Enter fullscreen mode Exit fullscreen mode

Post-read errors occur after successful directory reading but indicate problems with specific entries:

func handlePostReadErrors(path string, d fs.DirEntry, err error) error {
    if err != nil && d != nil {
        // Post-read error: entry exists but has problems
        fmt.Printf("Entry problem %s: %v\n", path, err)

        // Try to extract what information we can
        if d.IsDir() {
            fmt.Printf("  Problematic directory, skipping contents\n")
        } else {
            fmt.Printf("  Problematic file, logging for later review\n")
            problemFiles.Add(path, err)
        }

        return nil  // Continue with other entries
    }

    return processNormally(path, d, err)
}
Enter fullscreen mode Exit fullscreen mode

Recovery Strategies

Different applications require different recovery approaches when file system errors occur. Implement recovery strategies based on your application's fault tolerance requirements.

The retry strategy attempts to recover from transient errors:

func withRetry(maxRetries int, backoff time.Duration) filepath.WalkDirFunc {
    return func(path string, d fs.DirEntry, err error) error {
        if err != nil {
            for attempt := 0; attempt < maxRetries; attempt++ {
                fmt.Printf("Retry %d for %s\n", attempt+1, path)
                time.Sleep(backoff * time.Duration(attempt+1))

                // Attempt to re-read the problematic entry
                if d == nil {
                    // Directory read failure, try again
                    entries, retryErr := os.ReadDir(path)
                    if retryErr == nil {
                        fmt.Printf("Retry successful for directory %s\n", path)
                        // Process recovered entries
                        for _, entry := range entries {
                            entryPath := filepath.Join(path, entry.Name())
                            if procErr := processEntry(entryPath, entry); procErr != nil {
                                return procErr
                            }
                        }
                        return filepath.SkipDir  // Already processed
                    }
                } else {
                    // Entry access failure, try to get info again
                    if _, retryErr := d.Info(); retryErr == nil {
                        fmt.Printf("Retry successful for entry %s\n", path)
                        return processEntry(path, d)
                    }
                }
            }

            // All retries failed
            return fmt.Errorf("failed after %d retries: %w", maxRetries, err)
        }

        return processEntry(path, d)
    }
}
Enter fullscreen mode Exit fullscreen mode

The graceful degradation strategy continues operation with reduced functionality:

type GracefulWalker struct {
    processedCount int
    errorCount     int
    maxErrors      int
}

func (gw *GracefulWalker) walkWithDegradation(path string, d fs.DirEntry, err error) error {
    if err != nil {
        gw.errorCount++

        // Log error but continue
        fmt.Printf("Error %d: %s - %v\n", gw.errorCount, path, err)

        // Abort if too many errors
        if gw.errorCount > gw.maxErrors {
            return fmt.Errorf("too many errors (%d), aborting", gw.errorCount)
        }

        return nil  // Continue despite error
    }

    gw.processedCount++
    return processEntry(path, d)
}
Enter fullscreen mode Exit fullscreen mode

The error isolation strategy quarantines problematic areas while continuing elsewhere:

type IsolatingWalker struct {
    quarantinedPaths map[string]error
    mutex           sync.RWMutex
}

func (iw *IsolatingWalker) walkWithIsolation(path string, d fs.DirEntry, err error) error {
    if err != nil {
        iw.mutex.Lock()
        iw.quarantinedPaths[path] = err
        iw.mutex.Unlock()

        fmt.Printf("Quarantined: %s (%v)\n", path, err)

        // Skip this subtree but continue elsewhere
        if d != nil && d.IsDir() {
            return filepath.SkipDir
        }
        return nil
    }

    // Check if we're in a quarantined subtree
    iw.mutex.RLock()
    for quarantined := range iw.quarantinedPaths {
        if strings.HasPrefix(path, quarantined) {
            iw.mutex.RUnlock()
            return filepath.SkipDir
        }
    }
    iw.mutex.RUnlock()

    return processEntry(path, d)
}
Enter fullscreen mode Exit fullscreen mode

The error collection approach gathers all errors for batch reporting:

type ErrorCollector struct {
    errors []error
}

func (ec *ErrorCollector) collectErrors(path string, d fs.DirEntry, err error) error {
    if err != nil {
        ec.errors = append(ec.errors, fmt.Errorf("%s: %w", path, err))
        return nil  // Continue despite error
    }

    if procErr := processEntry(path, d); procErr != nil {
        ec.errors = append(ec.errors, fmt.Errorf("%s: %w", path, procErr))
    }

    return nil
}
Enter fullscreen mode Exit fullscreen mode

The selective error handling approach treats different error types differently:


func selectiveHandling(path string, d fs.DirEntry, err error) error {
    if err != nil {
        // Log permission errors but continue
        if os.IsPermission(err) {
            log.Printf("Permission denied: %s", path)
            return nil
        }

        // Fail on other errors
        return err
    }

    return processEntry(path, d)
}
Enter fullscreen mode Exit fullscreen mode

Symbolic Link Behavior

WalkDir implements specific symbolic link handling policies that differ significantly from traditional file system traversal tools. Understanding these behaviors prevents security vulnerabilities and infinite loops while maintaining predictable traversal characteristics.

Root Symlink Resolution

When the root path passed to WalkDir is itself a symbolic link, the function resolves it before beginning traversal. This resolution applies only to the root path and establishes the actual starting point for the walk operation.

// File system setup:
// /real/project/
//   ├── src/main.go
//   └── docs/readme.md
// /links/current -> /real/project

func demonstrateRootResolution() {
    fmt.Println("Walking symlinked root:")

    err := filepath.WalkDir("/links/current", func(path string, d fs.DirEntry, err error) error {
        if err != nil {
            return err
        }

        fmt.Printf("Path: %s\n", path)
        return nil
    })

    if err != nil {
        log.Fatal(err)
    }
}

// Output:
// Path: /links/current          <- Root symlink kept in paths
// Path: /links/current/src
// Path: /links/current/src/main.go
// Path: /links/current/docs  
// Path: /links/current/docs/readme.md
Enter fullscreen mode Exit fullscreen mode

Notice that while WalkDir resolves the root symlink to determine what to traverse, it preserves the original symlink path in the callback parameters. This behavior maintains path consistency for your application logic while ensuring the traversal reaches the intended content.

The resolution only affects traversal scope, not path reporting:

func analyzeRootSymlink() {
    symlinkRoot := "/links/project-v2"  // -> /real/project-v2

    err := filepath.WalkDir(symlinkRoot, func(path string, d fs.DirEntry, err error) error {
        if err != nil {
            return err
        }

        // All paths maintain the symlink prefix
        if strings.HasPrefix(path, symlinkRoot) {
            fmt.Printf("Consistent path: %s\n", path)
        }

        // But the actual traversal follows the resolved target
        if path == symlinkRoot {
            realPath, _ := filepath.EvalSymlinks(path)
            fmt.Printf("Root %s resolves to %s\n", path, realPath)
        }

        return nil
    })
}
Enter fullscreen mode Exit fullscreen mode

Root symlink resolution has security implications for applications that perform path-based access control:

func secureWalk(allowedRoot string) filepath.WalkDirFunc {
    // Resolve the allowed root to handle symlinks
    resolvedRoot, err := filepath.EvalSymlinks(allowedRoot)
    if err != nil {
        resolvedRoot = allowedRoot  // Fallback to original
    }

    return func(path string, d fs.DirEntry, err error) error {
        if err != nil {
            return err
        }

        // Verify we're still within allowed boundaries
        resolvedPath, _ := filepath.EvalSymlinks(path)
        if !strings.HasPrefix(resolvedPath, resolvedRoot) {
            return fmt.Errorf("path escaped allowed root: %s", path)
        }

        return processEntry(path, d)
    }
}
Enter fullscreen mode Exit fullscreen mode

Directory Symlink Non-Following

Unlike root symlinks, WalkDir does not follow symbolic links to directories encountered during traversal. This policy prevents infinite loops and maintains bounded traversal behavior.

// File system setup:
// /project/
//   ├── src/
//   │   ├── main.go
//   │   └── vendor -> ../../shared/vendor
//   ├── docs/
//   │   └── api -> ../external/api-docs
//   └── backup -> /archive/project-backup

func demonstrateSymlinkNonFollowing() {
    err := filepath.WalkDir("/project", func(path string, d fs.DirEntry, err error) error {
        if err != nil {
            return err
        }

        if d.Type()&fs.ModeSymlink != 0 {
            if d.IsDir() {
                fmt.Printf("Directory symlink (not followed): %s\n", path)
            } else {
                fmt.Printf("File symlink: %s\n", path)
            }
        } else {
            fmt.Printf("Regular entry: %s\n", path)
        }

        return nil
    })
}

// Output:
// Regular entry: /project
// Regular entry: /project/src
// Regular entry: /project/src/main.go
// Directory symlink (not followed): /project/src/vendor
// Regular entry: /project/docs
// Directory symlink (not followed): /project/docs/api
// Directory symlink (not followed): /project/backup
Enter fullscreen mode Exit fullscreen mode

The non-following behavior applies only to directory symlinks. File symlinks are reported but not dereferenced:

func handleSymlinks(path string, d fs.DirEntry, err error) error {
    if err != nil {
        return err
    }

    if d.Type()&fs.ModeSymlink != 0 {
        // Get symlink target for informational purposes
        target, linkErr := os.Readlink(path)
        if linkErr != nil {
            fmt.Printf("Broken symlink: %s\n", path)
            return nil
        }

        if d.IsDir() {
            fmt.Printf("Directory symlink %s -> %s (not traversed)\n", path, target)
            // WalkDir will not descend into this directory
        } else {
            fmt.Printf("File symlink %s -> %s\n", path, target)
            // Process as needed, but content comes from target
        }
    } else {
        fmt.Printf("Regular %s: %s\n", 
            map[bool]string{true: "directory", false: "file"}[d.IsDir()], 
            path)
    }

    return nil
}
Enter fullscreen mode Exit fullscreen mode

This behavior protects against common symlink attack patterns:

// Dangerous symlink structure that WalkDir handles safely:
// /tmp/safe/
//   ├── data/
//   │   └── important.txt
//   ├── loop -> loop          <- Self-referencing symlink
//   ├── escape -> /etc        <- Directory escape attempt
//   └── cycle -> ../safe      <- Parent directory cycle

func safeTraversal() {
    count := 0
    err := filepath.WalkDir("/tmp/safe", func(path string, d fs.DirEntry, err error) error {
        if err != nil {
            return err
        }

        count++
        if count > 1000 {  // Safety check
            return fmt.Errorf("traversal too deep, possible loop")
        }

        fmt.Printf("Safe traversal: %s\n", path)
        return nil
    })

    fmt.Printf("Completed safely with %d entries\n", count)
}
Enter fullscreen mode Exit fullscreen mode

When you need to follow directory symlinks, implement custom logic with cycle detection:

type SymlinkFollower struct {
    visited map[string]bool
    mutex   sync.RWMutex
}

func (sf *SymlinkFollower) followingWalk(root string) error {
    sf.visited = make(map[string]bool)
    return sf.walkWithFollowing(root)
}

func (sf *SymlinkFollower) walkWithFollowing(root string) error {
    return filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
        if err != nil {
            return err
        }

        // Check for directory symlinks
        if d.IsDir() && d.Type()&fs.ModeSymlink != 0 {
            target, err := filepath.EvalSymlinks(path)
            if err != nil {
                return nil  // Skip broken symlinks
            }

            sf.mutex.Lock()
            if sf.visited[target] {
                sf.mutex.Unlock()
                fmt.Printf("Cycle detected, skipping: %s -> %s\n", path, target)
                return nil
            }
            sf.visited[target] = true
            sf.mutex.Unlock()

            // Recursively walk the symlink target
            fmt.Printf("Following directory symlink: %s -> %s\n", path, target)
            return sf.walkWithFollowing(target)
        }

        return processEntry(path, d)
    })
}
Enter fullscreen mode Exit fullscreen mode

Real-World Applications

The practical value of WalkDir emerges through real-world implementations that solve common file system problems. These examples demonstrate how to combine the traversal patterns into production-ready tools.

File Search Implementation

Building efficient file search tools requires combining multiple WalkDir features: pattern matching, early termination, and smart filtering. Here's a comprehensive search implementation:

type FileSearcher struct {
    patterns    []string
    maxResults  int
    maxDepth    int
    skipHidden  bool
    caseSensitive bool
    results     []SearchResult
}

type SearchResult struct {
    Path     string
    Size     int64
    ModTime  time.Time
    IsDir    bool
}

func NewFileSearcher(patterns []string, options ...SearchOption) *FileSearcher {
    fs := &FileSearcher{
        patterns:      patterns,
        maxResults:    100,
        maxDepth:      -1,
        caseSensitive: true,
        results:       make([]SearchResult, 0),
    }

    for _, opt := range options {
        opt(fs)
    }

    return fs
}

func (fs *FileSearcher) Search(root string) ([]SearchResult, error) {
    rootDepth := strings.Count(root, string(filepath.Separator))

    err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
        if err != nil {
            // Log but continue on permission errors
            if os.IsPermission(err) {
                return nil
            }
            return err
        }

        // Check depth limit
        if fs.maxDepth >= 0 {
            currentDepth := strings.Count(path, string(filepath.Separator)) - rootDepth
            if d.IsDir() && currentDepth > fs.maxDepth {
                return filepath.SkipDir
            }
        }

        // Skip hidden files/directories if requested
        if fs.skipHidden && strings.HasPrefix(d.Name(), ".") {
            if d.IsDir() {
                return filepath.SkipDir
            }
            return nil
        }

        // Check if we've hit the result limit
        if len(fs.results) >= fs.maxResults {
            return filepath.SkipAll
        }

        // Apply pattern matching
        if fs.matchesPattern(d.Name()) {
            info, err := d.Info()
            if err != nil {
                return nil  // Skip entries we can't stat
            }

            fs.results = append(fs.results, SearchResult{
                Path:    path,
                Size:    info.Size(),
                ModTime: info.ModTime(),
                IsDir:   d.IsDir(),
            })
        }

        return nil
    })

    return fs.results, err
}

func (fs *FileSearcher) matchesPattern(name string) bool {
    if !fs.caseSensitive {
        name = strings.ToLower(name)
    }

    for _, pattern := range fs.patterns {
        searchPattern := pattern
        if !fs.caseSensitive {
            searchPattern = strings.ToLower(pattern)
        }

        // Support both glob patterns and substring matching
        if matched, _ := filepath.Match(searchPattern, name); matched {
            return true
        }

        if strings.Contains(name, searchPattern) {
            return true
        }
    }

    return false
}

// Usage example
func demonstrateFileSearch() {
    searcher := NewFileSearcher(
        []string{"*.go", "*test*"},
        WithMaxResults(50),
        WithMaxDepth(3),
        WithSkipHidden(true),
    )

    results, err := searcher.Search("/project")
    if err != nil {
        log.Fatal(err)
    }

    for _, result := range results {
        fmt.Printf("Found: %s (%d bytes, %s)\n", 
            result.Path, result.Size, result.ModTime.Format("2006-01-02"))
    }
}
Enter fullscreen mode Exit fullscreen mode

Directory Cleanup Tools

Cleanup tools require careful error handling and confirmation mechanisms to avoid data loss. This implementation provides safe cleanup with rollback capabilities:

type CleanupTool struct {
    dryRun      bool
    rules       []CleanupRule
    deletedSize int64
    deletedCount int
    errors      []error
}

type CleanupRule interface {
    ShouldDelete(path string, d fs.DirEntry, info os.FileInfo) bool
    Description() string
}

type AgeBasedRule struct {
    maxAge time.Duration
    pattern string
}

func (r *AgeBasedRule) ShouldDelete(path string, d fs.DirEntry, info os.FileInfo) bool {
    if d.IsDir() {
        return false  // Don't delete directories by age
    }

    if r.pattern != "" {
        if matched, _ := filepath.Match(r.pattern, d.Name()); !matched {
            return false
        }
    }

    return time.Since(info.ModTime()) > r.maxAge
}

func (r *AgeBasedRule) Description() string {
    return fmt.Sprintf("Files older than %v matching %s", r.maxAge, r.pattern)
}

type SizeBasedRule struct {
    minSize int64
    pattern string
}

func (r *SizeBasedRule) ShouldDelete(path string, d fs.DirEntry, info os.FileInfo) bool {
    if d.IsDir() {
        return false
    }

    if r.pattern != "" {
        if matched, _ := filepath.Match(r.pattern, d.Name()); !matched {
            return false
        }
    }

    return info.Size() > r.minSize
}

func (r *SizeBasedRule) Description() string {
    return fmt.Sprintf("Files larger than %d bytes matching %s", r.minSize, r.pattern)
}

func (ct *CleanupTool) AddRule(rule CleanupRule) {
    ct.rules = append(ct.rules, rule)
}

func (ct *CleanupTool) Clean(root string) error {
    fmt.Printf("Starting cleanup of %s (dry run: %v)\n", root, ct.dryRun)

    for _, rule := range ct.rules {
        fmt.Printf("Rule: %s\n", rule.Description())
    }

    err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
        if err != nil {
            ct.errors = append(ct.errors, fmt.Errorf("access error %s: %w", path, err))
            return nil  // Continue despite errors
        }

        // Skip the root directory
        if path == root {
            return nil
        }

        // Get file info for rule evaluation
        info, err := d.Info()
        if err != nil {
            ct.errors = append(ct.errors, fmt.Errorf("stat error %s: %w", path, err))
            return nil
        }

        // Check if any rule applies
        shouldDelete := false
        var matchedRule CleanupRule

        for _, rule := range ct.rules {
            if rule.ShouldDelete(path, d, info) {
                shouldDelete = true
                matchedRule = rule
                break
            }
        }

        if shouldDelete {
            ct.deletedSize += info.Size()
            ct.deletedCount++

            if ct.dryRun {
                fmt.Printf("Would delete: %s (%d bytes) - %s\n", 
                    path, info.Size(), matchedRule.Description())
            } else {
                if err := os.Remove(path); err != nil {
                    ct.errors = append(ct.errors, fmt.Errorf("delete error %s: %w", path, err))
                } else {
                    fmt.Printf("Deleted: %s (%d bytes)\n", path, info.Size())
                }
            }

            // Skip directory contents if we deleted the directory
            if d.IsDir() {
                return filepath.SkipDir
            }
        }

        return nil
    })

    ct.printSummary()
    return err
}

func (ct *CleanupTool) printSummary() {
    action := "Would delete"
    if !ct.dryRun {
        action = "Deleted"
    }

    fmt.Printf("\nSummary:\n")
    fmt.Printf("%s %d files totaling %d bytes\n", action, ct.deletedCount, ct.deletedSize)

    if len(ct.errors) > 0 {
        fmt.Printf("Encountered %d errors:\n", len(ct.errors))
        for _, err := range ct.errors {
            fmt.Printf("  %v\n", err)
        }
    }
}

// Usage example
func demonstrateCleanup() {
    cleaner := &CleanupTool{dryRun: true}

    // Clean up old temporary files
    cleaner.AddRule(&AgeBasedRule{
        maxAge:  7 * 24 * time.Hour,  // 7 days
        pattern: "*.tmp",
    })

    // Clean up large log files
    cleaner.AddRule(&SizeBasedRule{
        minSize: 100 * 1024 * 1024,  // 100MB
        pattern: "*.log",
    })

    err := cleaner.Clean("/tmp/project")
    if err != nil {
        log.Fatal(err)
    }
}
Enter fullscreen mode Exit fullscreen mode

File System Auditing

Auditing tools analyze file system structure and permissions to identify security issues and compliance violations:

type FileSystemAuditor struct {
    checks       []SecurityCheck
    findings     []SecurityFinding
    totalFiles   int
    totalDirs    int
    totalSize    int64
}

type SecurityCheck interface {
    Check(path string, d fs.DirEntry, info os.FileInfo) *SecurityFinding
    Name() string
}

type SecurityFinding struct {
    Check    string
    Path     string
    Severity string
    Message  string
    Details  map[string]interface{}
}

type PermissionCheck struct {
    maxPermissions os.FileMode
}

func (pc *PermissionCheck) Check(path string, d fs.DirEntry, info os.FileInfo) *SecurityFinding {
    if info.Mode().Perm() > pc.maxPermissions {
        return &SecurityFinding{
            Check:    pc.Name(),
            Path:     path,
            Severity: "Medium",
            Message:  "File has overly permissive permissions",
            Details: map[string]interface{}{
                "current":    info.Mode().Perm().String(),
                "maximum":    pc.maxPermissions.String(),
                "is_dir":     d.IsDir(),
            },
        }
    }
    return nil
}

func (pc *PermissionCheck) Name() string {
    return "Permission Check"
}

type SuidCheck struct{}

func (sc *SuidCheck) Check(path string, d fs.DirEntry, info os.FileInfo) *SecurityFinding {
    if !d.IsDir() && (info.Mode()&os.ModeSetuid != 0 || info.Mode()&os.ModeSetgid != 0) {
        severity := "High"
        if info.Mode()&os.ModeSetuid != 0 {
            severity = "Critical"
        }

        return &SecurityFinding{
            Check:    sc.Name(),
            Path:     path,
            Severity: severity,
            Message:  "File has setuid/setgid permissions",
            Details: map[string]interface{}{
                "setuid": info.Mode()&os.ModeSetuid != 0,
                "setgid": info.Mode()&os.ModeSetgid != 0,
                "mode":   info.Mode().String(),
            },
        }
    }
    return nil
}

func (sc *SuidCheck) Name() string {
    return "SUID/SGID Check"
}

type HiddenFileCheck struct {
    allowedHidden []string
}

func (hfc *HiddenFileCheck) Check(path string, d fs.DirEntry, info os.FileInfo) *SecurityFinding {
    if strings.HasPrefix(d.Name(), ".") {
        // Check if this hidden file is in the allowed list
        for _, allowed := range hfc.allowedHidden {
            if matched, _ := filepath.Match(allowed, d.Name()); matched {
                return nil
            }
        }

        return &SecurityFinding{
            Check:    hfc.Name(),
            Path:     path,
            Severity: "Low",
            Message:  "Unexpected hidden file found",
            Details: map[string]interface{}{
                "name":   d.Name(),
                "is_dir": d.IsDir(),
            },
        }
    }
    return nil
}

func (hfc *HiddenFileCheck) Name() string {
    return "Hidden File Check"
}

func (fsa *FileSystemAuditor) AddCheck(check SecurityCheck) {
    fsa.checks = append(fsa.checks, check)
}

func (fsa *FileSystemAuditor) Audit(root string) error {
    fmt.Printf("Starting security audit of %s\n", root)

    err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
        if err != nil {
            finding := &SecurityFinding{
                Check:    "Access Check",
                Path:     path,
                Severity: "Medium",
                Message:  "Unable to access file system entry",
                Details:  map[string]interface{}{"error": err.Error()},
            }
            fsa.findings = append(fsa.findings, *finding)
            return nil  // Continue audit despite access errors
        }

        // Update statistics
        if d.IsDir() {
            fsa.totalDirs++
        } else {
            fsa.totalFiles++
            if info, err := d.Info(); err == nil {
                fsa.totalSize += info.Size()
            }
        }

        // Run security checks
        info, err := d.Info()
        if err != nil {
            return nil  // Skip checks if we can't get file info
        }

        for _, check := range fsa.checks {
            if finding := check.Check(path, d, info); finding != nil {
                fsa.findings = append(fsa.findings, *finding)
            }
        }

        return nil
    })

    fsa.generateReport()
    return err
}

func (fsa *FileSystemAuditor) generateReport() {
    fmt.Printf("\n=== Security Audit Report ===\n")
    fmt.Printf("Files scanned: %d\n", fsa.totalFiles)
    fmt.Printf("Directories scanned: %d\n", fsa.totalDirs)
    fmt.Printf("Total size: %d bytes\n", fsa.totalSize)
    fmt.Printf("Security findings: %d\n\n", len(fsa.findings))

    // Group findings by severity
    severityCount := make(map[string]int)
    for _, finding := range fsa.findings {
        severityCount[finding.Severity]++
    }

    for severity, count := range severityCount {
        fmt.Printf("%s: %d findings\n", severity, count)
    }

    fmt.Printf("\nDetailed findings:\n")
    for _, finding := range fsa.findings {
        fmt.Printf("[%s] %s: %s\n", finding.Severity, finding.Check, finding.Path)
        fmt.Printf("  %s\n", finding.Message)
        for key, value := range finding.Details {
            fmt.Printf("  %s: %v\n", key, value)
        }
        fmt.Println()
    }
}

// Usage example
func demonstrateAudit() {
    auditor := &FileSystemAuditor{}

    // Add security checks
    auditor.AddCheck(&PermissionCheck{maxPermissions: 0644})
    auditor.AddCheck(&SuidCheck{})
    auditor.AddCheck(&HiddenFileCheck{
        allowedHidden: []string{".git", ".gitignore", ".env"},
    })

    err := auditor.Audit("/var/www")
    if err != nil {
        log.Fatal(err)
    }
}
Enter fullscreen mode Exit fullscreen mode

Comments 0 total

    Add comment