next id for INSERTS on a MASTER-MASTER balanced system

redna379

Hello.
This post is about getting an ID for INSERT statements in PHP pages.
I am running an app I wrote that may well have many concurrent users, so the risk of errors due to key duplication or wrong IDs is high.

The main problem is naturally related to inserts across multiple linked tables, i.e.:
1) insert a row for a company, identified by company_id, into the "companies" table
2) insert various products for that company into the "products" table (correctly filling a field called "company_id")
3) insert some config values into the "configs" table (correctly filling two fields called "company_id" and "product_id")

If several users do this at the same moment, some records may not be saved because of duplicate keys, especially in step 3.
This is just a simple example; my real ones sometimes link many more tables, and the inserts may need some seconds to fully complete.

For various reasons I don't want to use MySQL auto-increment for the index, so initially I simply did this to get a unique ID, using my previous example:

$nextid=mysql_result(mysql_query('SELECT max(id) FROM `companies`'),0,0);
$nextid++;
INSERT COMPANY (using $nextid as id)
$nextid_prod=mysql_result(mysql_query('SELECT max(id) FROM `products`'),0,0);
foreach products {
	$nextid_prod++;
	INSERT PRODUCT (using $nextid_prod as id and $nextid as company_id)
	$nextid_conf=mysql_result(mysql_query('SELECT max(id) FROM `configs`'),0,0);
	foreach configs {
		$nextid_conf++;
		INSERT CONFIG (using $nextid_conf as id, $nextid_prod as product_id and $nextid as company_id)
	}
}

This may be good enough in some cases, but not under heavy usage.
If an operation takes a bit more time, another concurrent one may get the same ID, and both "users" will try to insert a row with the very same ID (which is obviously bad).

So now I am trying an auxiliary table to get an ID before almost every INSERT statement of my app. Here it is:

function db_next_id($table) {
	// Lock before reading, so two clients cannot read the same value.
	// The target table is locked too: while LOCK TABLES is held, this
	// session may only access the tables it has locked.
	mysql_query('LOCK TABLES `id_locker` WRITE, `'.$table.'` READ');
	$query='SELECT value FROM `id_locker` WHERE `table`="'.$table.'"';
	$result=mysql_query($query);
	if (mysql_num_rows($result)>0) {
		// Counter row exists: bump it.
		$nextid=mysql_result($result,0,'value');
		$nextid++;
		$upquery='UPDATE `id_locker` SET `value`="'.$nextid.'" WHERE `table`="'.$table.'"';
		mysql_query($upquery);
	} else {
		// First call for this table: seed the counter from the current maximum.
		$nextid=mysql_result(mysql_query('SELECT max(id) FROM `'.$table.'`'),0,0);
		if ($nextid>0) { $nextid++; } else { $nextid=1; }
		$upquery='INSERT INTO `id_locker` (`table`,`value`) VALUES ("'.$table.'","'.$nextid.'")';
		mysql_query($upquery);
	}
	mysql_query('UNLOCK TABLES');
	return $nextid;
}

As you can see, "SELECT max(id)" is still used, but only once, to create the wanted record if it does not exist already.
In a way, this is just a simple method to simulate an auto-increment, but one that is controlled by the app.
It doesn't matter whether a record with the returned ID is really inserted or not: the point is to create an ID that's reserved for the script that asked for it.
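For comparison, the same kind of reservation can probably be done without LOCK TABLES, by letting a single UPDATE increment the counter and hand the new value back through MySQL's LAST_INSERT_ID(expr). The following is only a sketch under assumptions: the function name db_next_id_atomic is made up for illustration, $db is assumed to be an open mysqli connection, `value` is assumed to be an integer column, and the counter row for the table is assumed to exist already.

```php
<?php
/**
 * Reserve the next ID for $table from the `id_locker` counter table.
 *
 * Hypothetical helper: $db is assumed to be an open mysqli connection and
 * the row for $table is assumed to exist (this sketch does not create it).
 */
function db_next_id_atomic($db, $table)
{
    // One UPDATE both increments the counter and remembers the new value
    // for this connection, so no other client can slip in between.
    $stmt = $db->prepare(
        'UPDATE `id_locker` SET `value` = LAST_INSERT_ID(`value` + 1) WHERE `table` = ?'
    );
    $stmt->bind_param('s', $table);
    $stmt->execute();
    $updated = $stmt->affected_rows;
    $stmt->close();

    if ($updated !== 1) {
        throw new RuntimeException("No id_locker row for table '$table'");
    }

    // LAST_INSERT_ID() is tracked per connection, so this reads our own value.
    $row = $db->query('SELECT LAST_INSERT_ID()')->fetch_row();
    return (int) $row[0];
}
```

This only covers a single MySQL server. It does nothing by itself about two masters handing out the same value.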

A side question I have is, of course, whether you think this is a good approach for a heavy production LAMP app.

To me it all sounds OK and plain; in fact my real question is different. I built a cluster to run all of this, which currently uses two Apache servers and only one MySQL server.
I'd like to have at least two MySQL servers as well, and I have already successfully created a MASTER-MASTER replication (both MySQL servers are master and slave of each other), as well as an HAProxy-based MySQL load balancer to distribute requests between them.
The point I couldn't find info on is about LOCK TABLES and SLAVES.

The function I posted seems fine on a single server, but with replication slaves, is the lock propagated? Or may it still happen that a client connected to the "other" machine risks obtaining the same $nextid?

Or, as I can still change anything if I want, would you have some better suggestions for a robust "db_next_id" function that would work over a balanced MASTER-MASTER system?

Thank you in advance for any help

laserlight

A solution lies in mysql_insert_id(). However, I suggest that you use the PDO extension or the MySQLi extension instead of the legacy MySQL extension. They have the same functionality in this area, but provide other useful functionality such as support for prepared statements.

redna379

mysql_insert_id() is not really an option for me, unfortunately.
As I mentioned, for various reasons I don't want to use MySQL auto-increment for the index, but that's precisely what mysql_insert_id() requires (it "Retrieves the ID generated for an AUTO_INCREMENT column by the previous INSERT query").

I will look at MySQLi, even if that will take long... I would have to revise around 20 MB of PHP scripts, so I'll let you imagine the amount of code!
But thanks for the hint; I was so used to the legacy functions that I never looked at it.

Apart from that, and also because it may be useful for me in the future in any case: do you know if and how LOCK TABLES behaves on a MySQL system with slave replication?

laserlight

redna379 wrote:
mysql_insert_id() is not really an option for me, unfortunately.
As I mentioned, for various reasons I don't want to use MySQL auto-increment for the index, but that's precisely what mysql_insert_id() requires (it "Retrieves the ID generated for an AUTO_INCREMENT column by the previous INSERT query").

Ah, I did not catch that part. Why do you not want to use auto increment?

Have you considered the use of transactions with stored procedures instead of an auxiliary table? I have never tried such a thing myself, but it may be a cleaner and possibly more efficient solution.
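To make the transaction idea a little more concrete, here is a rough sketch of running a batch of queued statements as one unit from PHP. It shows only the transaction part, not a stored procedure. The function name run_queued_queries is invented for illustration, $db is assumed to be an open mysqli connection, and the tables are assumed to be InnoDB, since MyISAM tables do not support transactions.

```php
<?php
/**
 * Run every queued SQL statement inside one transaction.
 *
 * Hypothetical helper: $db is assumed to be an open mysqli connection
 * to InnoDB tables. Returns true if everything was committed, false if
 * a statement failed and the whole batch was rolled back.
 */
function run_queued_queries($db, array $queries)
{
    $db->begin_transaction();
    try {
        foreach ($queries as $sql) {
            // query() returns false on failure unless mysqli is configured
            // to throw exceptions; both cases end up in the catch block.
            if ($db->query($sql) === false) {
                throw new RuntimeException('Query failed: ' . $sql);
            }
        }
        $db->commit();
        return true;
    } catch (Exception $e) {
        // Undo every statement of this batch that already ran.
        $db->rollback();
        return false;
    }
}
```

With something like this, either every statement in the batch is applied or none of them is, so a failure halfway through should not leave half-inserted rows behind.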

redna379

I don't want auto-increment for 2 main reasons:

1) While auto-increment is really nice for records where the ID is not needed immediately, it may cause issues when performing actions over many tables linked together by their IDs.
For example, there are parts where I save all the queries I need to perform into an array, and they are executed only if all the checks I do in between pass.
In that case auto-increment is useless, as no query has been executed yet to tell me which ID will be used... but other queries in the array still need to refer to that ID.
So I need to know what the ID will be before actually executing the queries.

2) In my app there are also tons of deletes, not only inserts.
Up to now, I have a DB with 54 tables and 1,422,638 records total.
Auto-increment always... increases, while for my needs it's better to eventually "reuse" an ID that is no longer there (this is why I used the SELECT MAX).
An average operation of the tool usually adds or removes around 90 records each time, so I would really waste an incredible amount of space between IDs using auto-increment.

I will also look at transactions now, thanks for the advice: I have only been on this issue since yesterday, because I realized the potential "concurrent" issue just 2 days ago (I know, my fault... but I'm still human 😛).
Every word and hint can surely help, so I will keep you informed here if I find out something useful.

Unfortunately there are not many references about PHP+MySQL where MySQL is a balanced Master-Master setup, which is the main cause of the issue itself.
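One idea that does come up for master-master setups is to give each server its own interleaved series of IDs, so the two masters can never produce the same value even without a shared lock. As far as I know, a table lock only exists on the server where it was taken and is not passed on to the other master. MySQL applies this interleaving idea to AUTO_INCREMENT columns through the auto_increment_increment and auto_increment_offset system variables. The sketch below does the same arithmetic in the app. The function name next_interleaved_id is invented for illustration, and it assumes each server keeps its own counter and knows its own offset.

```php
<?php
/**
 * Smallest ID greater than $currentMax that belongs to this server's series:
 * $offset, $offset + $increment, $offset + 2 * $increment, ...
 *
 * Hypothetical helper: $increment is the number of masters and $offset is
 * this server's position (1 .. $increment).
 */
function next_interleaved_id($currentMax, $increment, $offset)
{
    if ($increment < 1 || $offset < 1 || $offset > $increment) {
        throw new InvalidArgumentException('offset must be between 1 and increment');
    }
    if ($currentMax < $offset) {
        // Nothing from this series has been used yet.
        return $offset;
    }
    // Count how many steps of the series are already at or below $currentMax.
    $steps = intdiv($currentMax - $offset, $increment) + 1;
    return $offset + $steps * $increment;
}

// Two masters: server A uses offset 1 (odd IDs), server B uses offset 2 (even IDs).
// next_interleaved_id(8, 2, 1) gives 9, next_interleaved_id(8, 2, 2) gives 10.
```

The trade-off is that this leaves gaps by design, which goes against the wish to reuse freed IDs.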

redna379

Small update:

As I am now looking into MySQLi and found out about common issues in migrating from MySQL, I wrote a quick tutorial on the solution I adopted for myself.
It's here:
Converting from MySQL to MySQLi