Well, since you know the plain text comes first, followed by the HTML version, why not just parse the return value of getBody and keep only the part from the beginning up to (but not including) the <html> tag?
e.g. if getBody returns:
this is plain text
<html>
...
this is plain text
...
</html>
Grab everything before the opening <html> tag and drop the rest.
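
Something like this is what I mean. It's only a rough sketch in Python, since I don't know what language you're calling getBody from. It assumes the body is already in a single string, and `plain_text_part` is just a name I made up for illustration. It looks for the first `<html` tag (ignoring case, and allowing attributes on the tag) and keeps whatever comes before it. If there is no `<html>` tag at all, it returns the whole body.

```python
import re


def plain_text_part(body: str) -> str:
    """Return the text before the first <html> tag (hypothetical helper).

    Assumes the plain-text version always precedes the HTML version.
    If no <html> tag is found, the whole body is returned.
    """
    # Match "<html" case-insensitively; \b avoids matching e.g. "<htmlfoo".
    match = re.search(r"<html\b", body, flags=re.IGNORECASE)
    if match is None:
        return body.strip()
    # Slice from the start of the body up to where the tag begins.
    return body[:match.start()].strip()


if __name__ == "__main__":
    body = "this is plain text\n<html>\n...\nthis is plain text\n...\n</html>"
    print(plain_text_part(body))
```

One caveat: if the plain-text part itself could ever contain the literal text `<html`, this would cut it short, so it only holds up as long as your "plain text first, then HTML" assumption does.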