Hey Everyone,
I'm trying to parse an HTML file using preg_match to find the author of a given article in my archive. The title field for the article has the title and author, separated by the word "by":
<title>Confused by Terror by Mark Tooley</title>
Normally this would give me the phrase containing the author, because I know the phrase will start with "by" and end in "</title>":
"/ by [ .\w]*<\/title>/i"
That would give me "by Mark Tooley</title>" and I could just use strip_tags to get rid of the HTML tag and delete the first 3 characters to get rid of "by".
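For reference, here's a minimal sketch of that approach. The sample title is hypothetical (one with only a single "by"), and the trim() call is an assumption added to drop the leading space that the pattern also matches:

```php
<?php
// Hypothetical title with only one "by" in it.
$html = '<title>Some Article by Mark Tooley</title>';

// " by ", then any run of spaces, dots, or word characters, then "</title>".
$pattern = '/ by [ .\w]*<\/title>/i';

$author = null;
if (preg_match($pattern, $html, $matches)) {
    // $matches[0] is " by Mark Tooley</title>" (note the leading space).
    $phrase = trim(strip_tags($matches[0])); // "by Mark Tooley"
    $author = substr($phrase, 3);            // drop "by " (3 characters)
}

echo $author, "\n"; // prints "Mark Tooley"
```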
However, in this case, my regex will give me:
"by Terror by Mark Tooley</title>"
because the word "by" is also in the title. I can't limit the author's name to 2 words, because I've seen some authors with names of 4 or more words. How can I set up a regular expression that will insist that the phrase start with "by", end with "</title>", AND not have the word "by" in the middle?
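One direction that might work, as a sketch rather than something tested against the whole archive: a negative lookahead, (?! by ), checked before each character of the name, so the match can't run across another " by ". This assumes "by" always has a space on each side, and the capture group is only there for convenience:

```php
<?php
$html = '<title>Confused by Terror by Mark Tooley</title>';

// Original pattern, for comparison: matches from the first " by ".
$original = '/ by [ .\w]*<\/title>/i';

// Sketch: each name character is consumed only if " by " does not
// start at that position, so the match must begin at the last " by ".
$tempered = '/ by ((?:(?! by )[ .\w])*)<\/title>/i';

preg_match($original, $html, $before);
$oldPhrase = trim(strip_tags($before[0]));

$author = null;
if (preg_match($tempered, $html, $after)) {
    $author = $after[1]; // group 1 holds just the name
}

echo $oldPhrase, "\n"; // prints "by Terror by Mark Tooley"
echo $author, "\n";    // prints "Mark Tooley"
```

Because the name is captured in group 1, strip_tags and the 3-character trim wouldn't be needed with this version.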