You could parse the fragment as an HTML document using the DOM loadHTML function, then write it back out again as HTML. The parser repairs well-formedness errors such as unclosed or misnested tags as it builds the tree, so the output is well-formed.
This is not a completely trivial process, as you need to think about:
- Encodings
- Adding / removing headers / footers
- How whitespace is handled
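
As a rough illustration of the round trip and the three points above, here is a minimal sketch, assuming PHP's DOMDocument and UTF-8 input. The function name repairHtmlFragment and the wrapper div are my own hypothetical choices, and the `<?xml encoding="UTF-8">` prefix is a commonly used workaround for the encoding question rather than an official API.

```php
<?php
// Hypothetical helper (name is illustrative): repair an HTML fragment by
// parsing it with DOMDocument and serialising it again.
function repairHtmlFragment(string $fragment): string
{
    $doc = new DOMDocument();

    // Broken markup triggers libxml warnings; collect them quietly
    // instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    // Encodings: the XML declaration prefix is a widely used workaround to
    // make the parser read the input as UTF-8 (assumption: input is UTF-8).
    // Headers / footers: loadHTML adds <html> and <body> around the
    // fragment, so wrap it in a known <div> that can be located afterwards.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $fragment . '</div>');

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The wrapper is the first <div> in document order.
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    if ($wrapper === null) {
        return '';
    }

    // Serialise only the wrapper's children, which drops the added
    // doctype, <html>, <body> and the wrapper itself.
    // Whitespace: text nodes come back as the parser left them; trim or
    // normalise afterwards if your use case needs it.
    $html = '';
    foreach ($wrapper->childNodes as $child) {
        $html .= $doc->saveHTML($child);
    }

    return $html;
}

// Usage: the unclosed <b> and <p> should come back with closing tags.
echo repairHtmlFragment('<p>Hello <b>world'), "\n";
```

Wrapping the fragment in a div also keeps the parser from moving leading bare text into a paragraph of its own, which it may otherwise do when the fragment starts with text.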
Mark