I've got 15 directories, each containing between 100 and 1000 subdirs, each containing between 100 and 1000 .html files. All the .html files were machine generated with poor code:
<HTML><HEAD><TITLE>AVOCADO DIP</TITLE></HEAD>
<!--#INCLUDE VIRTUAL="includes/recads.inc"-->
<FONT SIZE=5>AVOCADO DIP<FONT SIZE=3><br>
<hr>
3 avocados coarsely mashed<br>
1 tsp. salt<br>
1 Tbsp. lemon or lime juice<br>
1 tsp. Worcestershire sauce<br>
1 clove garlic, mashed<br>
2 medium tomatoes, peeled,<br>
seeded and chopped<br>
<br>Combine all ingredients; cover and chill.<br>
<!--#INCLUDE VIRTUAL="includes/recfoot.inc"-->
</HTML>
Now, I want to read all the files in all the subdirs and be able to write one (1) master CSV with the following format:
title,ingredients,instructions,category
title is the text between <title> and </title>, ingredients runs from the <hr> to the two consecutive <br> tags, and instructions is the text after that. category should be the folder name (numeric) with a marker to indicate the subfolder (e.g. 12|368). I've got code in place to read and display all the file names. How would I start approaching this? It's been a long time since I've had to do automated file reading at this scale.
Any help would be appreciated.
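
One possible starting point, sketched in Python (the post doesn't say what language the existing file-listing code is in, so this is just an illustration). It walks the tree, pulls the three fields out of each file with regular expressions, and streams one CSV row per file so nothing large is held in memory. The names clean, parse_recipe and build_csv are made up for this sketch. It also assumes the files are Latin-1 encoded, that every file follows the sample layout above, that the lines of a field can be joined with "; ", and that the category is the path relative to the top folder with "|" between the parts.

```python
import csv
import os
import re
from html import unescape

TITLE_RE = re.compile(r"<title>(.*?)</title>", re.IGNORECASE | re.DOTALL)
HR_RE = re.compile(r"<hr\s*/?>", re.IGNORECASE)
DOUBLE_BR_RE = re.compile(r"<br\s*/?>\s*<br\s*/?>", re.IGNORECASE)
BR_RE = re.compile(r"<br\s*/?>", re.IGNORECASE)
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def clean(fragment):
    """Strip comments and tags, then join the non-empty lines with '; '."""
    fragment = COMMENT_RE.sub("", fragment)
    fragment = BR_RE.sub("\n", fragment)
    fragment = TAG_RE.sub("", fragment)
    lines = [" ".join(line.split()) for line in unescape(fragment).splitlines()]
    return "; ".join(line for line in lines if line)


def parse_recipe(html):
    """Return (title, ingredients, instructions) for one page."""
    match = TITLE_RE.search(html)
    title = clean(match.group(1)) if match else ""
    # Everything after the first <hr> is treated as the recipe body.
    parts = HR_RE.split(html, maxsplit=1)
    body = parts[1] if len(parts) > 1 else ""
    # The first pair of consecutive <br> tags ends the ingredients.
    pieces = DOUBLE_BR_RE.split(body, maxsplit=1)
    ingredients = clean(pieces[0])
    instructions = clean(pieces[1]) if len(pieces) > 1 else ""
    return title, ingredients, instructions


def build_csv(root, out_path):
    """Walk root and write one CSV row per .html file found."""
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["title", "ingredients", "instructions", "category"])
        for dirpath, dirnames, filenames in os.walk(root):
            dirnames.sort()  # visit folders in a stable order
            for name in sorted(filenames):
                if not name.lower().endswith(".html"):
                    continue
                # e.g. root/12/368 becomes "12|368"
                rel = os.path.relpath(dirpath, root)
                category = "|".join(rel.split(os.sep))
                # Assumption: Latin-1, which never fails to decode.
                path = os.path.join(dirpath, name)
                with open(path, encoding="latin-1") as f:
                    row = parse_recipe(f.read())
                writer.writerow([*row, category])
```

Calling build_csv("recipes", "master.csv") would then produce the master file, where "recipes" stands in for whatever folder holds the 15 directories. The csv module takes care of quoting the commas inside ingredients. Pages that don't match the sample layout will come out with empty fields rather than raising an error, so it is probably worth spot-checking the output for blank columns.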