More HTML and text parsing woes

I mentioned last week how I absolutely hate regular expressions, and that rewriting the POP Forums text parsing engine is going to be the death of me (especially to get it to work with FreeTextBox, "forum code" and various mixtures of IE and Mozilla-generated HTML).

I've managed to come up with an OK algorithm that makes sure tags are closed and don't overlap. If they do overlap, they get rewritten so they're properly nested. The challenge is getting block elements right, namely <p> and <blockquote>, so that they appear as they should. So far, I'm sucking at that part pretty badly.
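
To make the close-and-nest idea concrete, here's a minimal sketch in Python. It is not the actual POP Forums code; the function name `balance_tags`, the `VOID` set, and the stack-based approach are assumptions made purely for illustration. It keeps a stack of open tags, closes and reopens anything that would otherwise overlap, drops stray closing tags, and closes whatever is still open at the end. It does nothing special for block elements like <p> and <blockquote>, which is exactly the part that's still hard.

```python
import re

# Matches an opening or closing tag: (slash, name, rest-of-tag).
TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)([^>]*)>")

# Tags that never take a closing tag (illustrative, not a complete list).
VOID = {"br", "hr", "img"}


def balance_tags(html):
    """Return html with every tag closed and overlaps turned into nesting."""
    out = []
    stack = []  # (name, original opening tag text) for tags currently open
    pos = 0
    for m in TAG.finditer(html):
        out.append(html[pos:m.start()])  # text between tags passes through
        pos = m.end()
        closing = m.group(1) == "/"
        name = m.group(2).lower()
        if not closing:
            out.append(m.group(0))
            # Void and self-closing tags are emitted but never tracked.
            if name not in VOID and not m.group(3).endswith("/"):
                stack.append((name, m.group(0)))
            continue
        if name not in [n for n, _ in stack]:
            continue  # stray closing tag with no matching open: drop it
        # Close everything opened after the target, innermost first.
        reopen = []
        while stack[-1][0] != name:
            inner = stack.pop()
            out.append("</%s>" % inner[0])
            reopen.append(inner)
        stack.pop()
        out.append("</%s>" % name)
        # Reopen the interrupted tags in their original order.
        for inner in reversed(reopen):
            stack.append(inner)
            out.append(inner[1])
    out.append(html[pos:])
    # Anything still open at the end gets closed, innermost first.
    while stack:
        out.append("</%s>" % stack.pop()[0])
    return "".join(out)


print(balance_tags("<b>bold <i>both</b> italic</i>"))
# prints: <b>bold <i>both</i></b><i> italic</i>
```

One known wart in this sketch: if the overlapping tag has no content left after the reopen, you end up with an empty pair like <i></i>, which a cleanup pass would have to strip.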

Somewhere just out of reach of my brain, I know I have a fresh solution that will cure cancer, but sitting in front of two monitors for hours on end is starting to take its toll on my attention span. That, and I'm supposed to be finishing the book revisions.

I have to keep reminding myself that these are good problems to have. I didn't have to get up early, drive in rush hour or answer to The Man today.
