I have an issue that isn't specific to any one part of our system, so I will describe the general setup and the problem we're having, in the hope that someone has a good suggestion for dealing with it.
We have an application which serves up a user interface to our customers. It is connected to a MySQL database. We run the application from two separate server locations. DNS has a record for each location and hands out both for lookups. As a result, some clients may pick a location and stay on it for the length of their session, but other clients (browsers or otherwise) could potentially bounce from server to server with each request.
To deal with the issue of location movement on the fly, we use database storage for the PHP session data. We (currently) use PEAR::HTTP_Session2 with PEAR::MDB2 to hook this up (using the same database connection the application already uses).
Each location's web server also has its own MySQL server (on a separate machine) to which it connects. The database servers in the two locations use circular MySQL replication to keep each other updated in real time. So, if a user lands on Server A and starts a session, then clicks a link and ends up requesting from Server B, everything should be seamless, as the session data is consistent.
From time to time, however, replication breaks when a replicated query tries to create a session that already exists. The only explanation I can come up with is latency: the link between our locations may occasionally see a latency spike. My guess is that a user clicks through pages fast enough to hit Server B before the session newly created on Server A has replicated. Server B then creates the session row itself and, shortly afterwards, tries to replicate the same "INSERT INTO sessiondata ..." query from Server A, which fails because the row already exists.
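To make the suspected sequence concrete, here is a minimal simulation of it. It is only a sketch: two in-memory SQLite databases stand in for the two MySQL servers, "replication" is just the same statement executed again on the other side, and the column names (id, data, expiry) and the helper make_server are assumptions for illustration, not taken from our actual schema.

```python
import sqlite3


def make_server():
    """Create an in-memory database standing in for one location's MySQL server."""
    db = sqlite3.connect(":memory:")
    # Assumed schema: only the table name "sessiondata" comes from the real setup.
    db.execute(
        "CREATE TABLE sessiondata (id TEXT PRIMARY KEY, data TEXT, expiry INTEGER)"
    )
    return db


server_a = make_server()
server_b = make_server()

insert_sql = "INSERT INTO sessiondata (id, data, expiry) VALUES (?, ?, ?)"
session = ("abc123", "", 1700000000)

# 1. The first request lands on Server A, which creates the session.
server_a.execute(insert_sql, session)

# 2. The next request lands on Server B before A's INSERT has replicated.
#    B finds no row for the session ID and creates one itself.
row = server_b.execute(
    "SELECT id FROM sessiondata WHERE id = ?", (session[0],)
).fetchone()
assert row is None
server_b.execute(insert_sql, session)

# 3. A's INSERT finally arrives at B and is replayed as-is.
try:
    server_b.execute(insert_sql, session)
    replication_error = None
except sqlite3.IntegrityError as exc:
    replication_error = exc

print("replication error:", replication_error)
```

In this sketch step 3 fails with a primary key violation. On MySQL the equivalent failure is a duplicate-entry error (1062), which stops the replication thread until someone intervenes.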
Taking it as given that we cannot guarantee the absence of latency between locations, and that the application must keep running from both locations as it does now, I need a way to prevent this failure from happening.
As I see it, if my suspicion about the cause is correct, the fix isn't going to be perfectly seamless, because we will still have a user arriving at a server with what they believe is an active session but which does not yet exist there. Instead of replication breaking, though, I need the friendliest solution I can end up with.
I suspect that my best bet may be to extend HTTP_Session2 and override the methods that deal with creation and update of sessions and add some handling in there, like using REPLACE queries or something (modifying only my extended classes and not the core library).
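For illustration, this is roughly the behaviour I have in mind for the overridden write method. It is a sketch under assumptions, not the HTTP_Session2 code: SQLite's INSERT OR REPLACE stands in for MySQL's REPLACE (MySQL also offers INSERT ... ON DUPLICATE KEY UPDATE), and write_session, make_server and the column names are hypothetical.

```python
import sqlite3


def make_server():
    """Create an in-memory database standing in for one location's MySQL server."""
    db = sqlite3.connect(":memory:")
    # Assumed schema: only the table name "sessiondata" comes from the real setup.
    db.execute(
        "CREATE TABLE sessiondata (id TEXT PRIMARY KEY, data TEXT, expiry INTEGER)"
    )
    return db


def write_session(db, session_id, data, expiry):
    """Create or overwrite a session row, so replaying the statement is harmless."""
    db.execute(
        "INSERT OR REPLACE INTO sessiondata (id, data, expiry) VALUES (?, ?, ?)",
        (session_id, data, expiry),
    )


server_b = make_server()

# Session created locally on Server B because A's row has not arrived yet.
write_session(server_b, "abc123", "", 1700000000)

# The same statement arrives later via replication from Server A.
write_session(server_b, "abc123", "", 1700000000)

rows = server_b.execute("SELECT id, data, expiry FROM sessiondata").fetchall()
print(rows)
```

The replayed statement no longer raises an error and the table still holds a single row. The trade-off, as far as I can tell, is that the last statement to be applied wins: anything written to the session on Server B before Server A's statement arrives would be overwritten by it.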
I am very open to, and would appreciate, any suggestions.