Here are a couple of examples. The concept is rather simple; the devil is in the details.
This example lists the names of all the files (skipping subdirectories) in the current directory:
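<?php
// Open the current directory; opendir() returns false on failure.
if ($handle = opendir('.')) {
    // readdir() returns false when there are no more entries, so compare
    // strictly: a file named "0" would otherwise end the loop early.
    while (false !== ($file = readdir($handle))) {
        // Skip ".", ".." and any subdirectories.
        if (!is_dir($file)) {
            echo "$file<br>";
        }
    }
    closedir($handle);
}
?>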
This example pulls in a remote page and extracts everything between the <title> and </title> tags. The code for your project would instead match on whatever unique text surrounds your client information:
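<?php
// Assumes allow_url_fopen is enabled so fopen() can read a URL.
$file = fopen("http://www.yahoo.com/", "r");
if ($file === false) {
    die("Could not open the page.");
}
$title = "";
while (!feof($file)) {
    $line = fgets($file, 1024);
    // eregi() was removed in PHP 7; preg_match() with the "i" flag is the
    // case-insensitive replacement. This only matches a title on one line.
    if ($line !== false && preg_match("#<title>(.*)</title>#i", $line, $out)) {
        $title = $out[1];
        break;
    }
}
fclose($file);
echo $title;
?>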
What I think you need to do first is something along the lines of the second example, except using local files. Point it at one page explicitly and adjust the pattern until it matches the unique text in your documents and pulls the data out properly. After you get it working, put it inside the 'while' loop of the first example, add a couple of lines of code to insert into the database, and voilà! Miller time!
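To make that concrete, here is a rough sketch of the two examples glued together. Treat it as a starting point only: the function names extract_title() and import_titles(), the "pages" table and its columns, and the in-memory SQLite database are all made up for illustration (it assumes the pdo_sqlite extension is available), and the <title> pattern is just a stand-in for whatever unique text surrounds your client information.

```php
<?php
// Hypothetical helper: return the text between <title> and </title>,
// or null if the pattern does not match. Swap in your own pattern.
function extract_title(string $html): ?string
{
    // i = case-insensitive, s = let "." span newlines,
    // .*? = stop at the first closing tag.
    if (preg_match('#<title>(.*?)</title>#is', $html, $out)) {
        return trim($out[1]);
    }
    return null;
}

// Hypothetical helper: scan $dir and insert one row per file that matches.
// Returns the number of rows inserted.
function import_titles(string $dir, PDO $db): int
{
    // Assumed table layout: pages(filename, title).
    $insert = $db->prepare('INSERT INTO pages (filename, title) VALUES (?, ?)');
    $count = 0;
    if ($handle = opendir($dir)) {
        while (false !== ($file = readdir($handle))) {
            $path = $dir . '/' . $file;
            if (is_dir($path)) {
                continue; // skips ".", ".." and subdirectories
            }
            $html = file_get_contents($path);
            if ($html === false) {
                continue; // unreadable file, move on
            }
            $title = extract_title($html);
            if ($title !== null) {
                $insert->execute([$file, $title]);
                $count++;
            }
        }
        closedir($handle);
    }
    return $count;
}

// Assumed setup: a throwaway in-memory SQLite database.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE pages (filename TEXT, title TEXT)');
echo import_titles('.', $db), " file(s) imported\n";
?>
```

Keeping the matching in its own function means you can get the pattern right against one page first, then drop it into the directory loop without touching the database code. For a real project you would change the PDO connection string to point at your own database.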
Of course, there might be a more elegant solution. If so, I am sure somebody will mention it 😃
(disclaimer: I do NOT drink Miller)