SSIS - Skipping rows and stripping subtotals in Excel
I was working on an SSIS project to dump data from an Excel spreadsheet into a database table. However, the Excel file happened to be a beautifully formatted, colorful presentation of charts and graphs, subtotal rows and all sorts of interesting images. The actual data I needed to dump didn't even start until about 15 rows down and 6 columns across. My mind started wandering toward some custom component that would skip rows, strip subtotal rows and basically rewrite the entire spreadsheet. But SSIS had better plans.
Thank goodness for the OpenRowset property on the Excel Source component (it is a custom property, which you can set in the component's Advanced Editor). This property allows you to specify the range to be read by the source in the format Sheet1$B15:Z2000, that is, the sheet name, a dollar sign, and then the top-left and bottom-right cells of the range. By specifying a range, I was able to ignore all the titles and other content spread all over the sheet and concentrate just on my data range. I thought skipping rows and ignoring headers / titles was going to add considerable hours to this project. I was wrong.
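To make the range format concrete, here is a minimal Python sketch that breaks a value like Sheet1$B15:Z2000 into its sheet name and cell bounds. It is only an illustration of how to read the string, not anything SSIS runs; the function names parse_excel_range and col_to_index are hypothetical, and the sketch assumes uppercase column letters with no absolute-reference markers.

```python
import re


def col_to_index(letters):
    """Convert Excel column letters to a 1-based index (A=1, Z=26, AA=27)."""
    n = 0
    for ch in letters:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def parse_excel_range(spec):
    """Split a 'Sheet$TopLeft:BottomRight' string into sheet name and bounds.

    Hypothetical helper for illustration only; assumes uppercase column
    letters, e.g. 'Sheet1$B15:Z2000'.
    """
    m = re.fullmatch(r"(.+)\$([A-Z]+)(\d+):([A-Z]+)(\d+)", spec)
    if m is None:
        raise ValueError(f"not a sheet range: {spec!r}")
    sheet, first_col, first_row, last_col, last_row = m.groups()
    return {
        "sheet": sheet,
        "first_col": col_to_index(first_col),
        "first_row": int(first_row),
        "last_col": col_to_index(last_col),
        "last_row": int(last_row),
    }


bounds = parse_excel_range("Sheet1$B15:Z2000")
print(bounds)
```

Read this way, the example range starts at column 2, row 15 and ends at column 26, row 2000, so everything above row 15 and to the left of column B never reaches the data flow.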
The next challenge was getting rid of the subtotal rows. Subtotal rows were distinguished by a cell with the word "RESULT" and an empty cell next to it. I added a Conditional Split using the ISNULL(expression) function to separate "real data rows" from "subtotal rows". After this, the import into an OLE DB Destination was straightforward.
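The routing logic of that split can be sketched outside SSIS in a few lines of Python. This is a rough model of the idea under my own assumptions, not the actual package: the column names "label" and "detail" are hypothetical stand-ins for the "RESULT" cell and its empty neighbor, and None plays the role of a NULL cell.

```python
def split_rows(rows, label_col="label", detail_col="detail"):
    """Separate subtotal rows from real data rows.

    A row counts as a subtotal when its label cell holds "RESULT" and the
    neighboring cell is empty (None here, standing in for NULL).
    Column names are hypothetical placeholders.
    """
    data_rows, subtotal_rows = [], []
    for row in rows:
        is_subtotal = (
            row.get(label_col) == "RESULT" and row.get(detail_col) is None
        )
        if is_subtotal:
            subtotal_rows.append(row)
        else:
            data_rows.append(row)
    return data_rows, subtotal_rows


sample = [
    {"label": "North", "detail": "Widgets", "amount": 120},
    {"label": "North", "detail": "Gadgets", "amount": 80},
    {"label": "RESULT", "detail": None, "amount": 200},
]
data_rows, subtotal_rows = split_rows(sample)
print(len(data_rows), len(subtotal_rows))
```

Only the rows in data_rows would continue on to the destination; the subtotal output is simply left unconnected.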