It might be worth looking at highlight_string() or highlight_file() to see whether the HTML markup they generate gives you sufficient handles (e.g. ID or class attributes) to filter on via the DOM extension. (No guarantees, just an off-the-top-of-my-head thought.)
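
For what it's worth, here is a rough sketch of how that could look. As far as I know, highlight_string() marks tokens with inline style="color: ..." attributes taken from the highlight.* ini settings rather than with IDs or classes, so the sketch matches on the colour. Filtering out comments is only my assumption about what you want to drop, and strip_highlighted_comments() is a made-up helper name, not anything built in.

```php
<?php
// Hypothetical helper: returns the highlighted HTML with comment spans removed.
// Assumes "comments" are what needs filtering; swap the ini key for other token types.
function strip_highlighted_comments(string $source): string
{
    // Colour PHP uses for comments; fall back to the documented default.
    $color = ini_get('highlight.comment');
    if ($color === false || $color === '') {
        $color = '#FF8000';
    }
    $color = strtolower($color);

    // Second argument true: return the markup instead of printing it.
    $html = highlight_string($source, true);

    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // keep parser warnings quiet
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Collect matching spans first, then remove them, so the node list
    // is not modified while it is being walked.
    $xpath = new DOMXPath($doc);
    $matches = [];
    foreach ($xpath->query('//span[@style]') as $span) {
        if (strpos(strtolower($span->getAttribute('style')), $color) !== false) {
            $matches[] = $span;
        }
    }
    foreach ($matches as $span) {
        $span->parentNode->removeChild($span);
    }

    // Return only the <code> element that wraps the highlighted source.
    $code = $doc->getElementsByTagName('code')->item(0);
    return $code === null ? '' : $doc->saveHTML($code);
}
```

Note that the result is still HTML, and the exact wrapper markup and whitespace encoding differ between PHP versions, so turning it back into plain source would need extra care. The input also has to contain an opening <?php tag, otherwise everything is treated as inline HTML rather than PHP tokens.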