
Parallelizing Batch Jobs for Fun and Profit

I’ve been spending a lot of time lately on boring batch jobs. Lots of slow, boring batch jobs. So I went on a mini crusade to parallelize them and make them go faster.

On Linux, OS X and other Unixes, GNU Parallel is your friend. No, not xargs. Xargs works, but GNU Parallel is built to take a list of stuff and make sure that a set number of processes keep working through it until the list is exhausted… without mixing the output into an unintelligible stream of computer consciousness. So to get a lot done quickly (in this case, prettying up a lot of XML), you can:

$ ls *.xml | parallel -j 15 xmllint --format {} ->
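
For anyone curious what "keep a set number of processes busy until the list is exhausted, without mixing the output" looks like in code, here is a rough Python sketch of the idea. It is not how GNU Parallel works internally; the names `run_one` and `run_all` are made up for illustration, and the demo commands are stand-ins that just print a line.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_one(cmd):
    """Run one command and capture its output so it can be printed in one piece."""
    done = subprocess.run(cmd, capture_output=True, text=True)
    return cmd, done.returncode, done.stdout


def run_all(commands, jobs=15):
    """Keep at most `jobs` commands running until the list is exhausted."""
    results = []
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        futures = [pool.submit(run_one, cmd) for cmd in commands]
        # Results arrive in completion order, one whole job at a time.
        for future in as_completed(futures):
            results.append(future.result())
    return results


if __name__ == "__main__":
    # Stand-in commands (an assumption for the demo): each one prints a line.
    demo = [[sys.executable, "-c", f"print('job {i} done')"] for i in range(4)]
    for cmd, code, out in run_all(demo, jobs=2):
        print(out, end="")
```

Because each job's output is captured and only printed after the job finishes, lines from different jobs never interleave. For the XML case, the command list could be something like `[["xmllint", "--format", name] for name in glob.glob("*.xml")]`, assuming `xmllint` is on the PATH.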

On Windows, PowerShell is your friend… until it isn’t. If calling COM and .NET objects is what you want to do, PowerShell has you covered. But when it comes to something like using a queue to call a Windows executable, PowerShell has different and incompatible parsers, which means you just can’t pass parameters without writing a non-PowerShell app to do it for you. The whole point of a scripting language is to avoid using stuff like C# for automation… so out with PowerShell and in with Python… which leads to my next post:

Parallelizing Windows Batch Jobs with Python.
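
As a small taste of why Python is less painful here, below is a minimal sketch of a queue of jobs where every job is just a list of arguments. This is an illustration under assumptions, not the code from the next post: `worker` and `run_queue` are hypothetical names, and the child process in the demo is Python itself rather than a real Windows executable.

```python
import queue
import subprocess
import sys
import threading


def worker(jobs, results):
    """Pull argument lists off the queue until it is empty."""
    while True:
        try:
            args = jobs.get_nowait()
        except queue.Empty:
            return
        # The list is passed without going through a shell. On Windows,
        # Python quotes it into a command line using the MS C runtime rules.
        done = subprocess.run(args, capture_output=True, text=True)
        results.append((args, done.returncode, done.stdout))


def run_queue(arg_lists, workers=4):
    """Run every argument list using a fixed number of worker threads."""
    jobs = queue.Queue()
    for args in arg_lists:
        jobs.put(args)
    results = []
    threads = [threading.Thread(target=worker, args=(jobs, results))
               for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    # Stand-in child process (an assumption): echoes back its first argument.
    echo = "import sys; print(sys.argv[1])"
    tricky = 'two words & a "quote"'
    for args, code, out in run_queue([[sys.executable, "-c", echo, tricky]]):
        print(out, end="")
```

The point of the sketch is that the parameter with spaces, an ampersand and quotes is handed over as one list element, so there is no second parser to fight with along the way.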
