01 Ernie 20090331 19:30 20090331 00:00
First problem is that the ending date is wrong (unless Ernie managed to finish before he started).
h4x0rmx wrote:at the same time I'm reading the file and parsing it, I have to store the parsed data on the DB
Or you could read all the data at once, fix it, then write it all to the database.
h4x0rmx wrote:without timing out the script.
500 records shouldn't take too long (I just tried the below code for 1600 records and it ran in under two seconds). Besides, even if it did take a long time it's possible to adjust timeout whenever necessary.
You don't say what the ID relates to. Name? Shift? What I am going to assume is that two records with the same ID belong to the same shift by the same person.
I reckon this could all be done in the database using a temporary table for scratch space.
This does the first part (the shift differential stuff depends on information not given). All it really does is fix those broken end dates and joins the bits that were split over midnight. I've got it writing a new csv file for the sake of having something to look at. Also, as it happens, it changes the timestamp into something that would be easier to insert into a database's time type.
/* All standard disclaimers about forum-posted example code apply */
// Read the CSV
$csv = fopen('junk.csv','rb');
$lines = array();
while(!feof($csv))
{
$lines[] = fgetcsv($csv,0,"\t");
}
fclose($csv);
array_pop($lines);
// Process
foreach($lines as $line)
{
// $junk because there are _two_ field delimiters between starttime and enddate
list($id,$name,$startdate,$starttime,$junk,$enddate,$endtime) = $line;
// Fix dates and join.
$start = strtotime("$startdate $starttime");
$end = strtotime("$enddate $endtime");
// the ending dates are wrong at midnight.
if($end<$start) $end = strtotime('+1 day', $end);
$start = date('Y-m-d H:i', $start);
$end = date('Y-m-d H:i', $end);
echo "$id\t$name\t$start\t$end\n";
if(isset($records[$id][$name]))
{
// This record has been split. Find when the existing part started and ended, and see how
// the new piece relates.
$found_start = $records[$id][$name]['start'];
$found_end = $records[$id][$name]['end'];
if($found_end == $start)
{
$records[$id][$name]['end'] = $end;
}
elseif($found_start == $end)
{
$records[$id][$name]['start'] = $start;
}
}
else
{
$records[$id][$name] = array('start'=>$start, 'end'=>$end);
}
}
ksort($records);
// Write a fixed CSV
$csv = fopen('fixed.csv','wb');
foreach($records as $id=>$name_record)
{
foreach($name_record as $name=>$times)
{
$start = $times['start'];
$end = $times['end'];
fputcsv($csv, array($id, $name, $start, $end), "\t");
}
}
Using a temporary table in the db, and SQL to do further processing could take some of the load off the application.