Automated Search and Replace in Word 2007 documents with C#

I worked on an interesting problem last night and thought I'd post the code.  I'm working on a software conversion project which has a new requirements/use case structure, and I had a list of about 700 requirement numbers that each needed to be replaced with a new requirement number, throughout 20 Word documents that averaged 20 pages apiece.

Going through each document and doing 700 "Replace Alls" didn't sound like much fun, and there are lots more documents and requirements coming down the pike that will need this same operation done to them, so I embarked on a VSTO expedition.

I created a console app in Visual Studio to run the code, and the first thing I noticed is that the Office 12 (Office 2007) Primary Interop Assemblies were not registered on my PC.  A quick search came up with this Microsoft download that lets you install these to your GAC with an MSI.

Next, I found a great VB.Net code snippet in a Microsoft forum (it's the second post in the thread, from "Spotty") that gives the basic code needed to do this for a single file.

If you are going to do a lot of interop work, it may be worthwhile to use VB.Net: its support for optional parameters saves a lot of typing, whereas C# makes you pass every optional argument explicitly.  I wanted this in C#, though, so I converted Spotty's VB code.  Both versions are below.

Spotty's original VB.Net code:

Dim word As New Microsoft.Office.Interop.Word.Application
Dim doc As Microsoft.Office.Interop.Word.Document
Try
    doc = word.Documents.Open("c:\test.doc")
    doc.Activate()
    Dim myStoryRange As Microsoft.Office.Interop.Word.Range
    For Each myStoryRange In doc.StoryRanges
        With myStoryRange.Find
            .Text = "findme"
            .Replacement.Text = "findyou"
            .Wrap = Microsoft.Office.Interop.Word.WdFindWrap.wdFindContinue
            .Execute(Replace:=Microsoft.Office.Interop.Word.WdReplace.wdReplaceAll)
        End With
    Next myStoryRange
    doc.SaveAs("c:\test1.doc")
Catch ex As Exception
    MessageBox.Show("Error accessing Word document.")
End Try

My conversion to C#:

(note: add a reference to Microsoft.Office.Interop.Word (version 12) and the using directives below)

using System;
using Word = Microsoft.Office.Interop.Word;

        public static void DoSearchAndReplaceInWord()
        {
            // Create the Word application.  The document starts out null
            // and is assigned once it has actually been opened.
            Word.Application word = new Word.Application();
            Word.Document doc = null;

            // Define an object to pass to the API for missing parameters
            object missing = System.Type.Missing;

            try
            {
                // Everything that goes to the interop must be an object
                object fileName = @"C:\myDocument.doc";

                // Open the Word document.
                // Pass the "missing" object defined above to all optional
                // parameters.  All parameters must be of type object,
                // and passed by reference.
                doc = word.Documents.Open(ref fileName,
                    ref missing, ref missing, ref missing, ref missing,
                    ref missing, ref missing, ref missing, ref missing,
                    ref missing, ref missing, ref missing, ref missing,
                    ref missing, ref missing, ref missing);

                // Activate the document
                doc.Activate();

                // Loop through the StoryRanges (the separate text "stories"
                // in the doc: main text, headers, footers, footnotes, etc.)
                foreach (Word.Range tmpRange in doc.StoryRanges)
                {
                    // Set the text to find and replace
                    tmpRange.Find.Text = "findme";
                    tmpRange.Find.Replacement.Text = "findyou";

                    // Set the Find.Wrap property to continue (so it doesn't
                    // prompt the user or stop when it hits the end of
                    // the story)
                    tmpRange.Find.Wrap = Word.WdFindWrap.wdFindContinue;

                    // Declare an object to pass as a parameter that sets
                    // the Replace parameter to the "wdReplaceAll" enum
                    object replaceAll = Word.WdReplace.wdReplaceAll;

                    // Execute the Find and Replace -- notice that the
                    // 11th parameter is the "replaceAll" enum object
                    tmpRange.Find.Execute(ref missing, ref missing, ref missing,
                        ref missing, ref missing, ref missing, ref missing,
                        ref missing, ref missing, ref missing, ref replaceAll,
                        ref missing, ref missing, ref missing, ref missing);
                }

                // Save the changes
                doc.Save();
            }
            catch (Exception ex)
            {
                // Report the failure instead of silently swallowing it
                Console.Error.WriteLine("Error processing Word document: " + ex.Message);
            }
            finally
            {
                // Close the doc (if it was opened) and exit the app, whether
                // or not an error occurred.  Changes were already saved above,
                // so tell Word not to prompt about saving on the way out.
                if (doc != null)
                {
                    object doNotSave = Word.WdSaveOptions.wdDoNotSaveChanges;
                    doc.Close(ref doNotSave, ref missing, ref missing);
                }
                word.Quit(ref missing, ref missing, ref missing);
            }
        }
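
One caveat with the foreach loop over StoryRanges: as far as I know, it only hands back the first range of each story type, so text in linked stories (for example, headers and footers that differ from section to section, or linked text boxes) can be missed.  The usual workaround is to follow each range's NextStoryRange property until it returns null.  Here is a rough sketch of that idea.  The StoryWalker class and ReplaceInAllStories method are names I made up for illustration, and I have not tested this against every kind of document:

```csharp
using Word = Microsoft.Office.Interop.Word;

public static class StoryWalker
{
    // Hypothetical helper: runs one "Replace All" over every story in the
    // document, following NextStoryRange to reach linked stories too.
    public static void ReplaceInAllStories(Word.Document doc,
        string findText, string replaceText)
    {
        object missing = System.Type.Missing;
        object replaceAll = Word.WdReplace.wdReplaceAll;

        foreach (Word.Range story in doc.StoryRanges)
        {
            Word.Range range = story;
            while (range != null)
            {
                Word.Find find = range.Find;
                find.Text = findText;
                find.Replacement.Text = replaceText;
                find.Wrap = Word.WdFindWrap.wdFindContinue;

                // Replace is the 11th of Execute's 15 parameters
                find.Execute(ref missing, ref missing, ref missing,
                    ref missing, ref missing, ref missing, ref missing,
                    ref missing, ref missing, ref missing, ref replaceAll,
                    ref missing, ref missing, ref missing, ref missing);

                // Move on to the next linked story of the same type, if any
                range = range.NextStoryRange;
            }
        }
    }
}
```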

After this was up and running, setting up the data reader and looping through the directory to operate on all files was pretty straightforward -- the biggest tricks were declaring the "missing" object variable for Type.Missing, and adding the code to close the doc and exit the application.
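
To give an idea of what the batch version can look like, here is a minimal sketch.  It is not the exact code I ran: the BatchReplacer class, the ReplaceInFolder method, and the dictionary of old-to-new requirement numbers are assumptions for illustration (load the dictionary from your data reader or wherever your list lives).  It starts Word once, opens each file in the folder, applies every replacement, saves, and always closes the document and quits Word:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Word = Microsoft.Office.Interop.Word;

public static class BatchReplacer
{
    // Hypothetical helper: applies every old -> new pair in "replacements"
    // to every .doc file directly inside "folder".
    public static void ReplaceInFolder(string folder,
        IDictionary<string, string> replacements)
    {
        object missing = Type.Missing;
        object replaceAll = Word.WdReplace.wdReplaceAll;
        object doNotSave = Word.WdSaveOptions.wdDoNotSaveChanges;

        Word.Application word = new Word.Application();
        try
        {
            foreach (string path in Directory.GetFiles(folder, "*.doc"))
            {
                // Skip the "~$" lock files Word leaves next to open documents
                if (Path.GetFileName(path).StartsWith("~$")) continue;

                object fileName = path;
                Word.Document doc = word.Documents.Open(ref fileName,
                    ref missing, ref missing, ref missing, ref missing,
                    ref missing, ref missing, ref missing, ref missing,
                    ref missing, ref missing, ref missing, ref missing,
                    ref missing, ref missing, ref missing);
                try
                {
                    foreach (KeyValuePair<string, string> pair in replacements)
                    {
                        foreach (Word.Range range in doc.StoryRanges)
                        {
                            Word.Find find = range.Find;
                            find.Text = pair.Key;
                            find.Replacement.Text = pair.Value;
                            find.Wrap = Word.WdFindWrap.wdFindContinue;
                            find.Execute(ref missing, ref missing, ref missing,
                                ref missing, ref missing, ref missing, ref missing,
                                ref missing, ref missing, ref missing, ref replaceAll,
                                ref missing, ref missing, ref missing, ref missing);
                        }
                    }
                    doc.Save();
                }
                finally
                {
                    // Already saved on success; never prompt on the way out
                    doc.Close(ref doNotSave, ref missing, ref missing);
                }
            }
        }
        finally
        {
            word.Quit(ref missing, ref missing, ref missing);
        }
    }
}
```

One thing to watch for with a long list: if a new requirement number is also one of the old numbers, or one old number is a prefix of another, the order of the replacements matters, so it is worth checking the list for overlaps before running it.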

If you set up a VSTO project, you get the "missing" object declared as a global variable, so you don't need to declare it.  But for stand-alone Word interop, I think this is pretty clean.
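
As a side note on the optional-parameter pain: if you are on C# 4.0 or later (newer than what I used for the code above), the compiler supports named and optional arguments and lets you leave off "ref" when calling COM interop methods, so the "missing" object is not needed at all.  A sketch of what the Execute call can look like under that assumption (the NamedArgumentExample class and ReplaceAll method are just illustrative names):

```csharp
using Word = Microsoft.Office.Interop.Word;

public static class NamedArgumentExample
{
    // Assumes C# 4.0 or later: only the arguments we care about are
    // passed, by name, and the rest default to Type.Missing.
    public static void ReplaceAll(Word.Range range,
        string findText, string replaceText)
    {
        range.Find.Execute(
            FindText: findText,
            ReplaceWith: replaceText,
            Wrap: Word.WdFindWrap.wdFindContinue,
            Replace: Word.WdReplace.wdReplaceAll);
    }
}
```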

1 Comment

  • KG/Koen -

    I haven't come across this error -- if you want to send me a Word doc that is causing the code to throw this error, I can see if I can recreate the error and figure out what is happening.

    email to gstarbuck (at) hotmail.com
