Word files are a proprietary format. You can try to decipher the format yourself, or you can pay Micro$oft a fee to license it. You're in for a long haul either way.
The alternative being suggested is best -- or better, at least. Save the articles as HTML, or if that seems too easy, as RTF (Rich Text Format). Micro$oft supports RTF extremely well, and with some effort you can write your own RTF reader.
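If you go the HTML route, getting the text back out takes only a few lines. Here is a minimal sketch in Python using the standard library's html.parser. The class name ArticleTextExtractor, the helper extract_text, and the sample markup are all made up for illustration, and real Word-exported HTML will be a lot messier than this.

```python
from html.parser import HTMLParser


class ArticleTextExtractor(HTMLParser):
    """Collect visible text, skipping <style> and <script> content."""

    SKIPPED_TAGS = {"style", "script"}

    def __init__(self):
        # convert_charrefs=True turns entities like &amp; into plain characters
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    """Return the visible text of an HTML string, one chunk per line."""
    parser = ArticleTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# Hypothetical sample standing in for an article saved as HTML
sample = (
    "<html><head><style>p { color: red; }</style></head>"
    "<body><h1>Title</h1><p>First &amp; second.</p></body></html>"
)
print(extract_text(sample))
```

Expect to add more filtering for the inline styles and Office-specific markup that Word tends to put in its HTML, but that is still far less work than decoding the binary format.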
IMO you will get nowhere trying to parse Word files on your own. The same content saved on different computers can produce very different files, because Word includes info about the generating computer: available fonts, style sheets (even if not used), etc.
Be smart and learn from people who have bloodied their heads by beating them against the Word brick wall.