I'm currently working on some code that parses HTML content from a page and writes it to a downloadable .csv file. I've used this guide as a reference, along with the tool found here. Here is my code that parses the HTML:
<?php
include "simple_html_dom.php";
$html = file_get_html('http://siteurl.com');
header('Content-type: application/ms-excel');
header('Content-Disposition: attachment; filename=sample.csv');
$fp = fopen("php://output", "w");
foreach ($html->find('tr') as $element) {
    $td = array();
    foreach ($element->find('th') as $row) {
        $td[] = $row->plaintext;
    }
    fputcsv($fp, $td);
    $td = array();
    foreach ($element->find('td') as $row) {
        $td[] = $row->plaintext;
    }
    fputcsv($fp, $td);
}
fclose($fp);
?>
Everything is working perfectly except for one thing: it adds two blank rows below the rows with table header cells, and an additional blank row after each row with normal table cells. Here is a screenshot of the resulting Excel file:
[ATTACH]4921[/ATTACH]
How can I prevent it from doing this?
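A likely cause, judging from the loop above: every tr triggers two fputcsv() calls, one for its th cells and one for its td cells, and one of those arrays is empty for any given row. fputcsv() given an empty array appears to write just a line ending, which would show up as a blank row. Below is a minimal sketch of one way around that, assuming the rows have already been collected into arrays. The helper name write_row_if_not_empty and the $rows stand-in data are hypothetical and not part of simple_html_dom.

```php
<?php
// Hypothetical helper: write a CSV row only when it has at least one cell.
function write_row_if_not_empty($fp, array $cells): bool
{
    if (count($cells) === 0) {
        return false; // skip: an empty array would otherwise produce a blank line
    }
    // Separator, enclosure and escape are passed explicitly (these are the usual defaults).
    return fputcsv($fp, $cells, ',', '"', '\\') !== false;
}

// Stand-in data (assumption): each entry mimics one <tr> with its <th> and <td> texts.
$rows = [
    ['th' => ['Name', 'Date'], 'td' => []],
    ['th' => [], 'td' => ['Alice', '07/31/2013 17:00']],
];

// php://memory is used here so the result can be inspected; the real script
// would keep writing to php://output.
$fp = fopen('php://memory', 'w+');
foreach ($rows as $row) {
    write_row_if_not_empty($fp, $row['th']);
    write_row_if_not_empty($fp, $row['td']);
}
rewind($fp);
$csv = stream_get_contents($fp);
fclose($fp);

echo $csv;
```

In the original loop, this would amount to guarding each fputcsv($fp, $td) call with a check that $td is not empty.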
One additional question about formatting: how can I keep the date inside the CSV file from being shown in military (24-hour) time? I want it in 12-hour format as soon as the file is downloaded. In other words, the source HTML file has the date and time like this:
July 31, 2013 5:00 pm
but its output is this:
07/31/2013 17:00
I want it to show up with AM or PM (I'm okay with the '/'s).
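For the date, one option is to format the value explicitly before it is written to the CSV. The following is a minimal sketch using PHP's built-in DateTime class, assuming the cell text always matches the "July 31, 2013 5:00 pm" pattern shown above. The function name to_twelve_hour is hypothetical. If the spreadsheet application is the one converting the text into a date on open, it may still apply its own display format, which this sketch does not address.

```php
<?php
// Hypothetical helper: convert "July 31, 2013 5:00 pm" into "07/31/2013 5:00 PM".
function to_twelve_hour(string $text): string
{
    // F = full month name, j = day, Y = year, g = 12-hour hour, i = minutes, a = am/pm
    $date = DateTime::createFromFormat('F j, Y g:i a', trim($text));
    if ($date === false) {
        return $text; // leave anything that does not match the assumed pattern untouched
    }
    return $date->format('m/d/Y g:i A');
}

$formatted = to_twelve_hour('July 31, 2013 5:00 pm');
echo $formatted, "\n";
```

Each date cell could be passed through a function like this before being added to the $td array.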
Thanks for your help!