Monday, May 18, 2009

Parallel Bash

With today's multi-core processors, Bash has a problem: by default it evaluates commands sequentially. One can launch processes into the background with the & operator, but that gives little control over how many processes are launched at once. There is, however, a quick workaround:

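function pwait() {
    # Use the first argument as the job limit; default to 12.
    if [ "$#" -eq 0 ]; then
        MAXPROC=12
    else
        MAXPROC=$1
    fi

    # Poll once per second until fewer than MAXPROC background jobs are running.
    # -r restricts the list to running jobs, so finished ones are not counted.
    while [ "$(jobs -rp | wc -l)" -ge "$MAXPROC" ]; do
        sleep 1
    done
}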

The above can be inserted into .bashrc and then used like this:

for i in *; do
    # Quote "$i" so file names containing spaces stay intact.
    dosomething "$i" &
    pwait 10
done

The parameter to pwait gives the maximum number of processes to run in parallel. You can toy around with that value as well as with the sleep time to improve results. It would be better to use wait instead of polling with sleep, but there doesn't seem to be an easy way to accomplish that.
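For readers on a newer Bash: releases from 4.3 onward add wait -n, which blocks until any one background job exits. The following is only a sketch under that assumption, not part of the original workaround. The name pwait_n is made up for illustration, and sleep 0.2 stands in for dosomething.

```shell
#!/usr/bin/env bash
# Hypothetical variant of pwait that blocks with `wait -n` instead of
# polling with sleep. Assumes Bash 4.3 or newer.
pwait_n() {
    local maxproc=${1:-12}   # job limit, defaulting to 12 as in pwait
    # While the number of running background jobs is at the limit,
    # block until any one of them exits.
    while [ "$(jobs -rp | wc -l)" -ge "$maxproc" ]; do
        wait -n
    done
}

# Demo: six short stand-in jobs, at most two running at a time.
for i in 1 2 3 4 5 6; do
    sleep 0.2 &
    pwait_n 2
done
wait   # let the last jobs finish
```

On older Bash versions wait -n is not available, and the sleep-based pwait remains the fallback.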

Edit: dosomething must not be wrapped in ( ), because otherwise jobs won't see it and pwait will not work.
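One way to read that note, sketched here as an assumption about what goes wrong: when the & sits inside the parentheses, the background job belongs to the subshell rather than to the current shell, so jobs has nothing to count. The variable names direct and nested are made up for this demo, and sleep 0.5 stands in for dosomething.

```shell
#!/usr/bin/env bash
# Case 1: job started directly with &; the current shell tracks it.
sleep 0.5 &
direct=$(jobs -rp | wc -l)
wait

# Case 2: the & sits inside ( ), so the job belongs to the subshell
# and the current shell's job table stays empty.
( sleep 0.5 & )
nested=$(jobs -rp | wc -l)

echo "started directly: $direct running job(s) visible"
echo "started inside ( ): $nested running job(s) visible"
```

In the second case pwait would never see a running job, so it would return immediately and nothing would be throttled.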

1 comment:

tange said...

GNU Parallel http://www.gnu.org/software/parallel/ provides a more general solution to running jobs in parallel - even on remote computers.

Watch the basic video at http://www.youtube.com/watch?v=LlXDtd_pRaY