Implementing Concurrency in Shell Scripts
Siddhant Khare

Siddhant Khare @siddhantkcode

About: Engineering @Gitpod | github1s maintainer | OSS @Greenstand @metacallio & @iosf_india | Opinions are personal | Working 996 | prev-Intern @GitHub | Personal blog: https://siddhantkhare2694.medium.com

Location:
Jabalpur, India
Joined:
Nov 20, 2019

Implementing Concurrency in Shell Scripts

Publish Date: May 11 '24
96 8

In this blog, we'll explore various techniques to achieve concurrency in shell scripts. Concurrency allows for executing multiple operations in parallel, which can dramatically reduce the processing time by handling tasks simultaneously. We'll cover several methods, from basic background execution to more sophisticated tools like GNU Parallel.

Basics of Concurrency in Shell Scripts

Concurrency in shell scripting is facilitated by executing multiple processes simultaneously. This parallel processing allows a script to initiate subsequent tasks without waiting for the previous tasks to complete.

1. Background Execution

In shell scripting, you can run commands in the background by appending an ampersand (&) to the end of the command. This tells the shell to start the command in a new background process, thus allowing the script to continue without waiting for the command to finish.

# Run a script in the background
./some_script.sh &
Enter fullscreen mode Exit fullscreen mode

When you execute a command in the background, the shell creates a new process, allowing immediate continuation of the script.

2. Using wait

The wait command is used to pause script execution until all specified background processes have completed. This is essential when your script must not proceed until earlier tasks have finished.

# Start multiple tasks in the background
./task1.sh &
./task2.sh &
./task3.sh &

# Wait for all background processes to finish
wait
echo "All tasks are completed."
Enter fullscreen mode Exit fullscreen mode

3. Managing Processes

To efficiently manage multiple processes, it's useful to store the process IDs (PIDs). You can then explicitly wait for specific processes to complete, which provides finer control over task synchronization.

#!/bin/bash

# Define tasks as functions
process1(){
    sleep 10
}

process2(){
    sleep 20
}

# Start each process in the background
process1 &
pid1=$!
process2 &
pid2=$!

# Wait for each process to complete
wait $pid1
wait $pid2

echo "Process 1 and Process 2 are completed."
Enter fullscreen mode Exit fullscreen mode

4. Parallel Processing with xargs

The xargs command is used to build and execute command lines from standard input. Using the -P option, you can specify the number of processes to run in parallel.

# Run commands in parallel across 4 processes
echo {1..10} | xargs -n1 -P4 ./process_item.sh
Enter fullscreen mode Exit fullscreen mode

In this example, the process_item.sh script is executed in parallel across four processes, efficiently managing up to ten tasks.

5. Using GNU Parallel

GNU Parallel is a powerful tool for running jobs concurrently. It is particularly useful for complex parallel processing scenarios and can manage a large number of simultaneous operations easily.

# Execute multiple scripts simultaneously using GNU Parallel
parallel ./script_{} ::: {1..10}
Enter fullscreen mode Exit fullscreen mode

This command will run script_1 to script_10 simultaneously, leveraging GNU Parallel’s capabilities to handle complex, high-volume parallel tasks.

Comparison of Commands

  • & and wait: This is the simplest method to achieve concurrency but may require manual process management.
  • xargs: Allows simple parallel execution with minimal system load, suitable for tasks that are not interdependent.
  • GNU Parallel: Offers extensive features for complex scenarios, ideal for large-scale data processing.

Summary

Utilizing concurrency in shell scripts can significantly reduce execution time. By applying methods such as background execution, xargs, and GNU Parallel, scripts can achieve high efficiency and manage multiple tasks concurrently. Proper use of these tools enables effective and efficient system operations, enhancing overall productivity.


Stay Connected and Get More Insights

If you found this guide helpful and are dealing with similar challenges, don't hesitate to reach out for personalized consulting at Superpeer. For more tech insights and updates, consider following me on GitHub. Let's innovate together!

Comments 8 total

  • Linus Fernandes
    Linus FernandesMay 11, 2024

    Great post.

  • Danish
    DanishMay 12, 2024

    Indeed a must needed post.

  • Scott Singley
    Scott SingleyMay 12, 2024

    Thank you for the post. I found it very informative and helpful to clarify the different methods.

  • Jason Cravens
    Jason CravensMay 12, 2024

    You can also use the shell variable directly: wait $!

    #!/bin/bash
    
    for i in {5..1}; do echo -n "$i.. "; sleep 1; done & wait $! || { echo "Error."; exit 1; }
    echo; echo "1: Success."; echo
    
    for i in {1..10}; do echo -n "$i.. "; sleep 1; done & wait $! || { echo "Error."; exit 1; }
    echo -e "\n2: Success.\n"
    
    exit 0
    
    Enter fullscreen mode Exit fullscreen mode

    Extra's, just for fun:

    1. || Is the opposite of &&. || runs if command fails.
    2. { ...; ...; } Command grouping. Treats the group as one command.
    3. echo -e The -e enables escape sequences.

    Command #1 uses an echo for new line's.
    Command #2 uses \n.

    Update: The xargs command is looking for textual throughput. It can work for some other data streams (utf-8 compatible), but if it contains any incompatible characters, you'll have a corrupted file. (For example, audio/video streams.)

    mkfifo my_pipe; echo {1..10} > my_pipe & ./process_item.sh
    
    Enter fullscreen mode Exit fullscreen mode

    This achieves the same parallel processing without any data constraints.

    • Joe
      JoeMay 14, 2024

      Nice, but using $! is fragile. If you later insert anything else that starts a process, then $! will change which may produce unexpected results. Also, a side benefit of using a variable is that its assignment will show up in a bash trace so you can more easily see what process(s) to examine if something is going wrong.

      • Jason Cravens
        Jason CravensMay 14, 2024

        That is a great thing to point out. I agree completely. This is intended entirely for in-line use. With this use case, you want the next process to take it's place. Never use $! directly, not if you need to reference it later.

  • Ellis
    EllisMay 14, 2024

    Very interesting/useful post.

    Try $ echo 111 & pwd & echo 222
    It gives the following for me:

    [1] 319
    111
    [2] 320
    222
    /home/xyz
    [1]   Done                    echo 111
    [2]+  Done                    pwd
    
    Enter fullscreen mode Exit fullscreen mode

    It shows one way to start multiple scripts at one go in parallel. In this occurrence both "echo"s seem to have completed first. You may want to add it to your post as one of the options. 👍

  • Code Hunter
    Code HunterJun 10, 2024

    Nice article. heads-up on wait. It can 'hang' your script if the child hangs (like a simple 'read'). Didn't know about gnu parallel. Judging from the manual, it warrants further study.

Add comment