Greetings, community!
I've built a simple script that parses HTML content stored in a plain text file.
Each HTML file contains a single item that I want to grab, always delimited by specific HTML code. The problem is that those delimiters may change. When that happens, I want to be warned, and I don't want my script to loop forever, which is what it does right now.
What I'm doing is quite simple. I open the file, read the content line by line, and try to match a regex pattern against each line to find the delimiter_start of the item.
If it is found, I loop through the remaining lines, fetching and manipulating the HTML content, until I find the delimiter_end.
So here are bits and pieces.
$f = fopen($file_path, "r");
while (!feof($f))
{
    /* Get file content, one line at a time */
    $line = fgets($f, 1024);
    /* First HTML line we will seek */
    if (preg_match("/$item_delimiter_start/", $line))
    {
        /* Read until this line */
        while (!preg_match("/$item_delimiter_end/", $line))
        {
            // HTML manipulation, stripping tags, fetching content, etc.
        }
        // content manipulation goes here...
    }
}
I think there is a logic problem in my script, but I need help figuring out where, and why.
Ideally, the script should stop if delimiter_start cannot be found, and warn me if delimiter_end does not match any of the remaining lines.
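To make the desired behavior concrete, here is a minimal sketch of what I think the loop should do. The function name extract_item, its parameters, and the use of exceptions for the warnings are assumptions made for illustration, not part of my existing script. It also assumes that each delimiter sits on its own line and that the patterns are passed with their regex delimiters included.

```php
<?php
/**
 * Hypothetical helper (name and signature are assumptions):
 * returns the lines found between the start and end delimiter lines.
 */
function extract_item(string $file_path, string $start_pattern, string $end_pattern): array
{
    $f = fopen($file_path, "r");
    if ($f === false) {
        throw new RuntimeException("Cannot open $file_path");
    }

    $found_start = false;
    $found_end = false;
    $item_lines = [];

    // fgets() returns false at end of file, so this single loop
    // always terminates, whether or not the delimiters match.
    while (($line = fgets($f)) !== false) {
        if (!$found_start) {
            // Still looking for the opening delimiter.
            if (preg_match($start_pattern, $line)) {
                $found_start = true;
            }
            continue;
        }
        if (preg_match($end_pattern, $line)) {
            $found_end = true;
            break;
        }
        // A line between the delimiters: collect it for later manipulation.
        $item_lines[] = rtrim($line, "\r\n");
    }
    fclose($f);

    if (!$found_start) {
        throw new RuntimeException("Start delimiter not found in $file_path");
    }
    if (!$found_end) {
        throw new RuntimeException("End delimiter not found in $file_path");
    }
    return $item_lines;
}
```

The idea is that every iteration reads a new line, so the end of the file bounds the loop, and the two flags tell me afterwards which delimiter (if any) failed to match.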
Any pointers would be appreciated!
Thanks in advance.