Archives

Archives / 2007 / May
  • A simple PowerShell script to find and replace using regular expressions in multiple files

    From time to time I need to perform a find-and-replace operation across multiple files using regular expressions. When this happens, I usually turn to Visual Studio's own support for this kind of task; soon, however, I have to give up and blame my favorite IDE for not adhering to the regular expression syntax adopted by the .NET Framework, which I'm used to.
    So today, after my umpteenth unsuccessful attempt with Visual Studio, I resolved to write a simple PowerShell script stub to act as a starting point for this job from now on. No, this is by no means a complete grep-like tool; I would like it to be just a demonstration of how easy, powerful and "clean" PowerShell scripts like this one are. And yes, I know there are plenty of third-party tools that do this kind of thing...

    To get down to the specifics of my problem: I was trying to combine a set of HTML files, which I obtained from a CHM-to-HTM conversion, into a single one. The images inside these documents are just thumbnails wrapped in a hyperlink that lets the user click through to the image at its original size. I therefore want to perform some regular expression substitutions so that the original-size image is embedded directly in the document, the thumbnails are removed, and the header and footer of each individual HTML file are stripped before the files are combined into the target one.
    Since PowerShell is a .NET managed shell, we can naturally use our beloved Regex class to perform our regular expression substitutions, thus adopting the syntax we are accustomed to.
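
    As a rough illustration of the idea, here is a minimal sketch (not the script from this post). The function name Update-FileText, its parameters, the sample markup and the pattern are all assumptions made up for the example; the real CHM-converted files may well be laid out differently, and the header and footer removal is not covered here.

```powershell
# Hypothetical helper: applies one .NET regex substitution to every file
# in a folder that matches a filter. Files are only rewritten if they change.
function Update-FileText {
    param(
        [string] $Path,         # folder containing the files
        [string] $Filter,       # e.g. '*.htm'
        [string] $Pattern,      # .NET regular expression
        [string] $Replacement   # .NET replacement string, e.g. '${name}'
    )

    $options = [System.Text.RegularExpressions.RegexOptions]::IgnoreCase

    Get-ChildItem -Path $Path -Filter $Filter |
        Where-Object { -not $_.PSIsContainer } |
        ForEach-Object {
            # Assumption: the files are UTF-8 (or carry a BOM the framework can detect).
            $text = [System.IO.File]::ReadAllText($_.FullName)
            $updated = [regex]::Replace($text, $Pattern, $Replacement, $options)

            # -cne is the case-sensitive comparison; plain -ne would ignore case.
            if ($updated -cne $text) {
                [System.IO.File]::WriteAllText($_.FullName, $updated)
            }
        }
}

# Assumed markup: a thumbnail <img> wrapped in a link to the full-size image.
$sample = '<p><a href="images/full/diagram.png"><img src="images/thumb/diagram.png" /></a></p>'

# Capture the link target, then match the thumbnail and the closing </a>.
$pattern = '<a\s+href="(?<full>[^"]+)"[^>]*>\s*<img\b[^>]*>\s*</a>'

# Single quotes keep PowerShell from expanding ${full}; the Regex class sees it as-is.
$replacement = '<img src="${full}" />'

# Try the substitution on a single string before touching any file.
$result = [regex]::Replace($sample, $pattern, $replacement)
$result
# → <p><img src="images/full/diagram.png" /></p>
```

    Once the pattern behaves as expected on a sample string, the same pair could be applied to a whole folder with something like: Update-FileText -Path .\html -Filter '*.htm' -Pattern $pattern -Replacement $replacement. Since the files are overwritten in place, working on a copy of the folder seems the safer choice.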