Hi All,
I have a user table that I need to delete all records that are duplicates (based on the EmailAddress field). I have my site coded to not allow dupes (by query first then either update or insert), but I also have a page for employees to import records from a CSV with fgetscv that sometimes inserts records where that email address exists.
I tried about a dozen different approaches, and none worked.
Here is a sample of what I tried and end up with it timing out (The table has 1.8 million records).
$sql = "SELECT MIN(maxID) AS totalcount FROM max8 GROUP BY EmailAddress";
$result = mysql_query($sql) or die (mysql_error());
$num_rows = mysql_num_rows($result);
$unique_ids = array();
for($i=0;$i<$num_rows;$i++) {
$unique_ids[] = mysql_result($result,$i,0);
}
mysql_free_result($result);
$id_list = implode(',',$unique_ids);
$sql = "DELETE FROM max8 WHERE maxID not in($id_list)";
$query = mysql_query($sql) or die (mysql_error());
mysql_close($conn);
echo 'All duplicates where deleted';
I also noticed some addressed this by making a temp table, then exporting a unique select query, then comparing and deleting, but that seems like a round-about way.
My end goal it to make a process page, that I can run and either delete all the dupes (or if it is a timeout issue) delete them state by state.
Thoughts?
Regards,
Don