My approach would be this:
Index your static content, DB content, and Perl content all separately. Then in your btree index put a type on the document id. So you have document 1111 and its type is 3, which means you look in some table for id 1111, which correlates to a static page named your_faq.html that you have indexed. Type 2 could be DB content, so you just do a join on the DB with the info you have indexed. Perl content works the same way.
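A rough sketch of that type lookup, with plain arrays standing in for the tables. Only type 3 (static) and type 2 (DB) come from the description above; type 1 for Perl, the constant names, the sample values, and resolve_hit() are all made up for illustration:

```php
<?php
// Type codes: 3 = static and 2 = DB content follow the description above;
// 1 = Perl is an assumption made for this sketch.
const TYPE_PERL = 1;
const TYPE_DB = 2;
const TYPE_STATIC = 3;

// Stand-ins for the per-type lookup tables (in practice these would be DB tables).
// All values here are hypothetical sample data.
$lookup = [
    TYPE_STATIC => [1111 => 'your_faq.html'],
    TYPE_DB     => [1111 => 'articles row 1111'],
    TYPE_PERL   => [1111 => '/cgi-bin/page.pl?id=1111'],
];

// Resolve an index hit (document id + type) to whatever it points at,
// or null if that id is unknown for that type.
function resolve_hit(array $lookup, int $id, int $type): ?string
{
    return $lookup[$type][$id] ?? null;
}

// An index hit as the search index might hand it back.
$hit = ['id' => 1111, 'type' => TYPE_STATIC];
echo resolve_hit($lookup, $hit['id'], $hit['type']), "\n";
```

The same id can exist under several types without clashing, because the type picks the table before the id is looked up.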
In order to index static pages just do something like this:
<pre>
<?php
// List everything in the document root.
$static_files = glob('/path/to/htdocs/*');
foreach ($static_files as $file) {
    if (!is_file($file)) {
        continue; // skip subdirectories
    }
    $html = file_get_contents($file); // read it in
    if ($html === false) {
        continue; // skip unreadable files
    }
    // grab between <title></title> and throw it in the DB
    // throw the name in the DB and give it an id
    // throw the first 50 words in the db to be indexed
}
?>
</pre>
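The commented steps in that loop (grab the title, grab the first 50 words) could be filled in roughly like this. summarize_page() is a hypothetical helper, the regexes are a quick approximation rather than a real HTML parser, and leaving the title out of the word list is my own choice:

```php
<?php
// Hypothetical helper: pull the <title> text and the first $limit body words out of a page.
function summarize_page(string $html, int $limit = 50): array
{
    $title = '';
    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $m)) {
        $title = trim(html_entity_decode($m[1]));
    }
    // Drop script, style, and title elements entirely, then turn remaining tags into spaces.
    $text = preg_replace('/<(script|style|title)\b[^>]*>.*?<\/\1>/is', ' ', $html);
    $text = preg_replace('/<[^>]+>/', ' ', $text);
    // Split on whitespace and keep only the first $limit words.
    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    return [
        'title' => $title,
        'words' => implode(' ', array_slice($words, 0, $limit)),
    ];
}

// Sample page, invented for this example.
$page = '<html><head><title>Your FAQ</title></head>'
      . '<body><p>How do I search the site?</p></body></html>';
$summary = summarize_page($page, 3);
echo $summary['title'], "\n";
echo $summary['words'], "\n";
```

Inside the loop you would call summarize_page($html) and insert the file name, title, and words into whatever table you are indexing from.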
Then you index the static pages from that table. I have been thinking about doing this on my company's site (TONS of static content, shitloads of DB content, etc.)
--Joe
http://www.miester.org/Joe/Stump/Joe_Stump_resume.html