DB clustering/replication scheme where each node handles reads & writes?

sneakyimp

I'm working with a guy who has a Joomla 2.5 site with a lot of customized code. Thankfully, most of this customized code takes care to use the Joomla database access layer (JDatabase). His site is currently growing rapidly and it's looking like we'll need to eventually move to a replicated/clustered database to handle the traffic.

Having just read the MySQL documentation's Using Replication for Scale-Out page, it would appear that I'm going to need to make some modifications to the code to use the replication scheme described therein for this reason:

MySQL Docs wrote:
If the part of your code that is responsible for database access has been properly abstracted/modularized, converting it to run with a replicated setup should be very smooth and easy. Change the implementation of your database access to send all writes to the master, and to send reads to either the master or a slave. If your code does not have this level of abstraction, setting up a replicated system gives you the opportunity and motivation to clean it up.

(I'm so glad for this "opportunity" :p)

While Joomla 2.5 was smart enough to build in a few methods to distinguish inserts and updates from queries (e.g., insertObject and updateObject), the code I'm dealing with has many instances where developers neglected to use these distinct methods and instead inserted or updated records by manually constructing SQL:

$sql = "INSERT INTO some_table (col1, col2, col3, col4, col5) VALUES (1, 2, 3, 4, 5)";
$jdbo->setQuery($sql);
$jdbo->execute();

So I'm wondering a few things:
Q1: Is there a MySQL replication/clustering scheme where the client doesn't need to route reads to a slave and writes to the master?
I.e., is there some other type of replication/clustering (hopefully free) where I can just leave all those manual SQL inserts/updates/deletes alone and just let Joomla speak to whichever database it is using without worrying about reads vs. writes?

Q2: Is it feasible to reliably sniff a query to determine if it should go to the master?
I.e., might it be possible to modify the execute method of the JDatabase object so that it uses pattern matching or parsing somehow to distinguish which queries should go to the master and which to a slave? While it seems simple enough to check a query to see if the first word is INSERT or UPDATE or DELETE, I expect there might be some INSERT...SELECT queries or even more bizarre selects with inserts and joins or something that might require a more nuanced approach. Any thoughts about an effective pattern-matching scheme would be much appreciated.

Q3: Must there always be a one-to-one relationship between slave db servers and application servers?
The diagram in the page I linked above has one application server for each db slave. I can imagine a scenario where one application server is enough but the database is getting worked to death so it might be helpful to have one application server connect to one of a few slaves. Is that right? If so, would the db slave be chosen at random or can someone recommend a more enlightened scheme (e.g., based on load average or something) which doesn't hamper performance?

johanafm

sneakyimp;11021721 wrote:
(I'm so glad for this "opportunity" :p)

Yay for you! \o/

sneakyimp;11021721 wrote:
The code I'm dealing with has many instances where developers neglected to use these distinct methods and instead inserted or updated records by manually constructing SQL:

Well, isn't that exactly the kind of code opportunity you're glad for? In other words: rewrite it! And should it for some reason make sense to have direct SQL code, you'd rewrite it by routing it through someInstance->queryRead() or someInstance->queryWrite() depending on read/write query type, so that the distinction becomes clear. But do use cooler names than queryRead and queryWrite.
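
A minimal sketch of that routing, assuming a hypothetical SplitDb wrapper (the class name and the callable "connections" are illustrative stand-ins, not part of Joomla or JDatabase). Real code would hold two actual connections; the point is only that each caller states whether it is reading or writing, so nothing has to be guessed from the SQL:

```php
<?php
// Hypothetical wrapper: callers declare intent, so no query sniffing is needed.
final class SplitDb
{
    /** @var callable executes a statement on the master */
    private $master;
    /** @var callable executes a statement on a slave */
    private $slave;

    public function __construct(callable $master, callable $slave)
    {
        $this->master = $master;
        $this->slave = $slave;
    }

    // Reads may go to a slave (or to the master, if you prefer).
    public function queryRead(string $sql)
    {
        return ($this->slave)($sql);
    }

    // Writes always go to the master.
    public function queryWrite(string $sql)
    {
        return ($this->master)($sql);
    }
}

// Stand-in "connections" that only report where the query was sent.
$db = new SplitDb(
    function (string $sql): string { return 'master: ' . $sql; },
    function (string $sql): string { return 'slave: ' . $sql; }
);
```

Hand-built SQL can stay as it is; only the call site changes from a generic execute to one of the two methods.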

sneakyimp;11021721 wrote:
So I'm wondering a few things:
Q1: Is there a MySQL replication/clustering scheme where the client doesn't need to route reads to a slave and writes to the master?

Not as far as I know, but I use stuff that others administrate... But personally I'd rather ask this question at a mysql forum.

sneakyimp;11021721 wrote:
I.e., is there some other type of replication/clustering (hopefully free) where I can just leave all those manual SQL inserts/updates/deletes alone and just let Joomla speak to whichever database it is using without worrying about reads vs. writes?

Perhaps there is - Have you checked PostgreSQL? They do have middleware solutions (as in all queries pass the middleware - writes are sent to all dbs, reads to one). Short explanation of options here (PostgreSQL Docs).

sneakyimp;11021721 wrote:
Q2: Is it feasible to reliably sniff a query to determine if it should go to the master?

Yes. If a statement starts with DESCRIBE, SHOW, SELECT (others?) - then it's a read. If not, it's a write. However, do note that this should be sniffed ONCE and NOW, not later and always. You did start clustering stuff for performance issues after all... In other words, use shell, your text editor or a php script to search for all occurrences of those words, sift out the queries and then see:

sneakyimp;11021721 wrote:
(I'm so glad for this "opportunity" :p)

sneakyimp;11021721 wrote:
INSERT...SELECT queries or even more bizarre selects with inserts and joins or something that might require a more nuanced approach

Doesn't matter. INSERT ... SELECT is an insert statement. Yes, some of the inserted data is provided by means of a select, which in turn could come from wherever, but
1. It might just be one select query collecting data (from whichever server) and then issuing the insert (to the master)
2. An actual INSERT ... SELECT ... which could be split up to the above
3. An actual INSERT ... SELECT which, most likely for good reasons, is left as is and sent to the master. It is after all a write statement. And moving data from slave to master seems slower than handling it within master. But whether this is noticed might come down to network quality for all I know. And perhaps there is some obscure case where it would be faster selecting from slave to insert on master, but in that case I'm certain you'll pull your hair because it's already too slow and then realize that's how you have to do it, rather than starting in that end of the weird tree.

sneakyimp;11021721 wrote:
Any thoughts about an effective pattern-matching scheme would be much appreciated.

#describe|select|show#i

I'm guessing you had something more elaborate and fancy in mind, but in my opinion you have to search for these keywords only (and possibly others - go check the reserved words page of the docs to dig up all read-related keywords). Should you mix them with anything else, you may miss out on stuff for the simple reason that you might have code such as

$q = "SELECT ";
/* bla bla bla */
$q .= "stuff FROM tbl";

sneakyimp;11021721 wrote:
Q3: Must there always be a one-to-one relationship between slave db servers and application servers?

No. Your code will create the db connections (new PDO, new mysqli...) which means your code determines where each connection goes. You could have master + 2 slaves with 3 web servers, where one issues reads to the master. Or master + slave with 3 web servers that send all writes to master but all reads to slave. Or 2 web servers which send everything to master, but if/when master fails - former slave is now master (= failover only).

sneakyimp;11021721 wrote:
I can imagine a scenario where one application server is enough but the database is getting worked to death so it might be helpful to have one application server connect to one of a few slaves. Is that right?

Everything is possible. It will depend on the entire ecosystem: use cases, application code, server configuration, caching.

For example
1. Turning off query cache MAY improve performance (too high number of queries for decent cache hit ratio) => save time by never writing to query cache.
2. If application server runs smooth, but db doesn't: are you utilizing memcached to handle rarely changed data (e.g. rows memcached as serialized objects). Are you utilizing memcached to handle possibly more frequently updated data, but which also exist on every single page? One memcached server for several web servers will most likely work fine (direct access to RAM).
3. If application server runs smooth, are you using some kind of page caching techniques that allow you to store both entire pages and parts of pages as html for direct passthrough to client without needing any php parsing and thus will access no databases? That is, pages or parts thereof which can be considered static (within certain limits). Smarty's abilities in this regard are one of the reasons I like it, and it handles cache in two stages. The first is called pre-compiled templates (the template is turned into php code and looks butt ugly) - this usually only needs to be done if you change the template file (as in modify presentation code). The second stage is caching the output of a template, which is done as (almost) plain html. There are some constructs which allow for dynamic updates of the cache.
4. Nothing beats a db to death like inefficient queries as requests and data increase. In the same way a terribly slow O(log n) search will always beat a very fast O(n²) search for n greater than some threshold. Make sure everything is indexed (but only if needed). Make sure you do processing once, before write, if possible, rather than always on read.

Do note that if you use any kind of file caching on load balanced web servers (such as that built into smarty) you may have to set a script to listen for UDP multicasts for files to clear so that one server can issue cache invalidation on all servers.

Derokorian

sneakyimp;11021721 wrote:
Q1: Is there a MySQL replication/clustering scheme where the client doesn't need to route reads to a slave and writes to the master?
I.e., is there some other type of replication/clustering (hopefully free) where I can just leave all those manual SQL inserts/updates/deletes alone and just let Joomla speak to whichever database it is using without worrying about reads vs. writes?

I think the main response I have to this is that if you were to write an insert to the same table on 2 different slaves at the same time, you may have a collision if you have an auto incrementing primary key. Which is why you have to write to the master, so that the primary keys replicate properly when the replication occurs. Otherwise you need to have a system that can look for collisions and update not just the collision itself, but all FK references that may have been produced before collisions began being resolved. Otherwise something that should be linked to Adam's auto parts, instead gets linked to Megan's Message. Hopefully that makes sense, and helps clear up why you need to write to the master.
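
The collision can be illustrated with a toy model; this is only a sketch, with a hypothetical nextIds helper standing in for an auto-increment counter rather than a real database. Two servers that last saw id 100 and each accept two inserts on their own hand out the same keys. The second half shows the interleaving idea behind MySQL's auto_increment_increment and auto_increment_offset settings, where each server steps by the number of writers from a different starting point:

```php
<?php
// Toy model of an auto-increment counter; not a real database.
// Returns the next $count ids after $lastId, advancing by $step each time.
function nextIds(int $lastId, int $count, int $step = 1): array
{
    $ids = [];
    for ($i = 1; $i <= $count; $i++) {
        $ids[] = $lastId + $i * $step;
    }
    return $ids;
}

// Both servers last saw id 100 and each takes two inserts independently.
$serverA = nextIds(100, 2);
$serverB = nextIds(100, 2);
$collisions = array_values(array_intersect($serverA, $serverB));

// Interleaved sequences: step 2, different starting points, no overlap.
$strideA = nextIds(99, 2, 2);  // odd ids
$strideB = nextIds(100, 2, 2); // even ids
$strideCollisions = array_values(array_intersect($strideA, $strideB));
```

Interleaving only avoids duplicate keys; it does nothing about two servers changing the same existing row, which is where the harder conflict handling comes in.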

On a side note: all my systems use 2 connections, one for CRUD which uses a user with read only permissions; and another with full privileges to CRUD. Following this helps prevent hijacking a step further and also makes upgrading to replication if/when needed much easier.

Weedpacket

Just thought I'd mention that johanafm pointed to the docs for an old version of PostgreSQL. It's there on the linked page, but I thought it wouldn't hurt to explicitly link to the current version (for the benefit of spiders):
PostgreSQL. High Availability, Load Balancing, and Replication.

johanafm

Just stumbled over this when looking for other things: http://dev.mysql.com/downloads/cluster/

With a distributed, multi-master architecture …

so it would seem you can indeed get multi-master with MySQL.

sneakyimp

johanafm, thanks for your detailed response!

johanafm;11021881 wrote:
But do use cooler names than queryRead and queryWrite.

Like what? Those function names look pretty good to me.

johanafm;11021881 wrote:
Not as far as I know, but I use stuff that others administrate... But personally I'd rather ask this question at a mysql forum.

I'll see what I can get over at mysql.com. I'm hoping to refine the question a bit

johanafm;11021881 wrote:
Perhaps there is - Have you checked PostgreSQL? They do have middleware solutions (as in all queries pass the middleware - writes are sent to all dbs, reads to one). Short explanation of options here (PostgreSQL Docs).

That sounds suspiciously similar to the query-sniffing database layer I was talking about...

I'll be checking PostGres for more detail.

johanafm;11021881 wrote:
Yes. If a statement starts with DESCRIBE, SHOW, SELECT (others?) - then it's a read. If not, it's a write. However, do note that this should be sniffed ONCE and NOW, not later and always. You did start clustering stuff for performance issues after all...

Let me first say that you make a good point and yes I agree that fixing the code is the right way to go, but the cost of development to revisit all the code we have will be substantial. I fully realize that a little efficiency can be very helpful to solve scaling problems when the scale is large, but things are sufficiently small now that throwing in a little extra hardware certainly seems cheaper. Also, I can only imagine that a really nice preg_match statement wouldn't be much worse than a layer of middleware.

johanafm;11021881 wrote:
Doesn't matter. INSERT ... SELECT is an insert statement.

I appreciate your thoughtful input on these query-related questions despite the fact that you have suggested fixing the code itself. I was kind of hoping for some magical suggestion for a good article on turning the mysql query descriptions into some sort of awesome preg pattern. I can think of some things where it might get tricky like:

#INSERT query comment here might confuse a query-sniffer
INSERT INTO mytable blah blah blah

or perhaps:

SELECT * FROM mytable WHERE somecol="INSERT"

or (and I don't know if this is possible) some query that starts with SELECT but contains deep inside it an insert/update/delete query? If that's not ever possible, I'd be glad to know it for certain.

The idea of writing elaborate logic to break a single large query down and route its constituent queries to different servers sounds like a real chore and would likely be more than I am ready for. I feel reasonably happy assuming that any INSERT...SELECT queries could be happily sent to the server under the assumption that it is in fact the real data and any differences on a particular slave will eventually be overwritten by what happens on the master.

johanafm;11021881 wrote:
I could at least improve on that by starting with a ^\s to make sure
I'm guessing you had something more elaborate and fancy in mind, but in my opinion you have to search for these keywords only (and possibly others - go check the reserved words page of the docs to dig up all read-related keywords). Should you mix them with anything else, you may miss out on stuff for the simple reason that you might have code such as
$q = "SELECT ";
/* bla bla bla */
$q .= "stuff FROM tbl";

This begins to suggest how tricky it'll be digging through tons of code (yes the right thing to do but *BARF*). Note that in my query-sniffing scheme, the sniffing would be done in my data abstraction layer immediately before the query is applied. Also, might it be easier to sniff for write-related keywords starting the query than read-related keywords? I can think of a few: INSERT/UPDATE/DELETE/ALTER/DROP/TRUNCATE. Maybe some others.
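
A rough sketch of such a runtime sniffer, assuming a hypothetical classifyQuery function called from the abstraction layer just before the query runs. It skips leading whitespace and leading comments, then looks only at the first keyword, so a quoted "INSERT" later in a SELECT is never examined. It whitelists the read keywords and treats everything else as a write, on the assumption that sending an unrecognized statement to the master is the safe mistake to make:

```php
<?php
// Returns 'read' or 'write' based on the first keyword of the statement.
// Anything not clearly a read is classified as a write (i.e. master).
function classifyQuery(string $sql): string
{
    // Strip leading whitespace and leading comments (#..., -- ..., /* ... */).
    $pattern = '~^(?:\s+|#[^\n]*(?:\n|$)|--[ \t][^\n]*(?:\n|$)|/\*.*?\*/)+~s';
    $stripped = (string) preg_replace($pattern, '', $sql);

    // The first run of letters is taken as the statement keyword.
    if (!preg_match('~^[a-z]+~i', $stripped, $m)) {
        return 'write'; // nothing recognizable: be safe, use the master
    }

    $readKeywords = ['SELECT', 'SHOW', 'DESCRIBE'];
    return in_array(strtoupper($m[0]), $readKeywords, true) ? 'read' : 'write';
}
```

This does not catch every case: a SELECT ... FOR UPDATE or a SELECT that calls a stored function with side effects would still be classified as a read, so those would need their own handling.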

johanafm;11021881 wrote:
No. Your code will create the db connections (new PDO, new mysqli...) which means your code determines where each connection goes.

Wondering if anyone knows of common schemes? AFAIK, one's PHP logic is not privy to the current load average on all of the database servers and so would be unable to make sure it used a db server that is not busy. Would a random algorithm be sufficient? Additionally, there are probably session-related concerns. I expect my application would probably want to delegate local session-related actions to the slaves -- to enhance the read/write ratio in order to better utilize the advantage of the master/slave architecture.

johanafm;11021881 wrote:
For example
1. Turning off query cache MAY improve performance (too high number of queries for decent cache hit ratio) => save time by never writing to query cache.
2. If application server runs smooth, but db doesn't: are you utilizing memcached to handle rarely changed data (e.g. rows memcached as serialized objects). Are you utilizing memcached to handle possibly more frequently updated data, but which also exist on every single page? One memcached server for several web servers will most likely work fine (direct access to RAM).
3. If application server runs smooth, are you using some kind of page caching techniques that allow you to store both entire pages and parts of pages as html for direct passthrough to client without needing any php parsing and thus will access no databases? That is, pages or parts thereof which can be considered static (within certain limits). Smarty's abilities in this regard are one of the reasons I like it, and it handles cache in two stages. The first is called pre-compiled templates (the template is turned into php code and looks butt ugly) - this usually only needs to be done if you change the template file (as in modify presentation code). The second stage is caching the output of a template, which is done as (almost) plain html. There are some constructs which allow for dynamic updates of the cache.
4. Nothing beats a db to death like inefficient queries as requests and data increase. In the same way a terribly slow O(log n) search will always beat a very fast O(n²) search for n greater than some threshold. Make sure everything is indexed (but only if needed). Make sure you do processing once, before write, if possible, rather than always on read.

Good advice -- some of which I had not seen before. Thank you.

johanafm;11021881 wrote:
Do note that if you use any kind of file caching on load balanced web servers (such as that built into smarty) you may have to set a script to listen for UDP multicasts for files to clear so that one server can issue cache invalidation on all servers.

Another good tip, thank you again.

sneakyimp

Weedpacket;11021919 wrote:
Just thought I'd mention that johanafm pointed to the docs for an old version of PostgreSQL. It's there on the linked page, but I thought it wouldn't hurt to explicitly link to the current version (for the benefit of spiders):
PostgreSQL. High Availability, Load Balancing, and Replication.

Thanks for that!

sneakyimp

Derokorian;11021889 wrote:
I think the main response I have to this is that if you were to write an insert to the same table on 2 different slaves at the same time, you may have a collision if you have an auto incrementing primary key. Which is why you have to write to the master, so that the primary keys replicate properly when the replication occurs. Otherwise you need to have a system that can look for collisions and update not just the collision itself, but all FK references that may have been produced before collisions began being resolved. Otherwise something that should be linked to Adam's auto parts, instead gets linked to Megan's Message. Hopefully that makes sense, and helps clear up why you need to write to the master.

My thinking was running more or less identically to this, but I have learned not to underestimate the ingenuity of really serious developers. So far I've seen references to MASTER-MASTER configurations and other craziness. Also johanafm mentioned layers of middleware and such. Thank god for specialization. My limited experience with multi-threaded programming has taught me to fear concurrency, deadlock, race conditions, etc. The idea of writing a multi-machine, multithreaded beast that ensures data consistency by numerous masters makes me want to wet my pants.

Derokorian;11021889 wrote:
On a side note: all my systems use 2 connections, one for CRUD which uses a user with read only permissions; and another with full privileges to CRUD. Following this helps prevent hijacking a step further and also makes upgrading to replication if/when needed much easier.

I like this notion -- I'm always interested in ways of enhancing security. Any thoughts on session handling in this context?

sneakyimp

johanafm;11021925 wrote:
Just stumbled over this when looking for other things: http://dev.mysql.com/downloads/cluster/

so it would seem you can indeed get multi-master with MySQL.

I had seen that and assumed (wrongly) that it was an expensive product that one must buy. I'll be looking into it more. Thanks again.

johanafm

sneakyimp;11021935 wrote:
I have learned not to underestimate the ingenuity of really serious developers. So far I've seen references to MASTER-MASTER configurations and other craziness.
The idea of writing a multi-machine, multithreaded beast that ensures data consistency by numerous masters makes me want to wet my pants.

The reason you might go for a master-slave setup is that it is indeed a lot simpler to implement, and simpler locking mechanisms - or being able to skip them in some circumstances - is more efficient. With master-slave, you can have either synchronous or asynchronous updates. I.e. either you allow (the risk of) short term stale data to be fetched from the slave before it has been updated from master, or you do not. But either way, implementing the locks on the master is pretty much the same as implementing the locks in a stand-alone db server.
Implementing db-level locking across 2+ db masters wouldn't be too hard either. Just lock em all! But, that has serious implications for the rate of concurrency which will quickly approach 0 or near enough. So you still need to be able to implement row level locking, but across several servers. Iirc Oracle was the first to solve this problem many many years back, and I'd guess most dbms have this ability today. But, there is still overhead for handling these locks, so if you can choose a way to avoid them, all the better.
And indeed, pants-wetting is allowed and possibly even appropriate! There are good reasons relational dbms didn't come with clustered masters using row level locks to begin with. I'm fairly certain I'd have a hard time understanding how it works, even if someone with the know-how took the time to explain the techniques involved to me. Hell, I might not even understand it. But, that doesn't mean we can't use it 🙂

sneakyimp;11021935 wrote:
I like this notion -- I'm always interested in ways of enhancing security. Any thoughts on session handling in this context?

Well, this is pretty much what you'd set up your db abstraction layer to do. If you're dealing with writes (CrUD), you'd be using the connection to the master, with privs for create, update and delete. If it's reads (cRud), you'd use a connection with read privs and a connection to one of the slaves (or the master).

johanafm

sneakyimp;11021931 wrote:
Let me first say that you make a good point and yes I agree that fixing the code is the right way to go, but the cost of development to revisit all the code we have will be substantial. I fully realize that a little efficiency can be very helpful to solve scaling problems when the scale is large, but things are sufficiently small now that throwing in a little extra hardware certainly seems cheaper. Also, I can only imagine that a really nice preg_match statement wouldn't be much worse than a layer of middleware.

I appreciate your thoughtful input on these query-related questions despite the fact that you have suggested fixing the code itself. I was kind of hoping for some magical suggestion for a good article on turning the mysql query descriptions into some sort of awesome preg pattern. I can think of some things where it might get tricky like:

Like you stated later on in your reply, if you don't go for fixing the code now, but rather sniff at runtime, you would indeed do the check in middleware / database abstraction layer, once you have the final query as one string. And checking that shouldn't be too hard. The only statements I can think of that do not involve writes start with: DESCRIBE, SHOW or SELECT. SELECT will most likely make up 95-100% of all reads. But, what is important to note here is that issuing reads to master is perfectly valid, possibly even desirable. You could have a 1 master, 1 slave setup. And something as low as a 3:1 ratio between reads and writes. Sending all reads to the slave would push 75% of the queries to the slave, which may not be desirable if you want to balance things.
Thus, you could issue all queries to the master by default (similar to a one-server architecture).
Then you do a simple match on the first word, and if it's SELECT you send it to "some server" according to some balancing scheme.
Should there be a couple of describes or shows, they'd go to the master, but that should be no problem. Especially not if you have 3 slaves handling all (found) select queries.

This leaves two things to handle. First off, how to find the read queries. Assuming most queries are uppercase and contain no leading WS I'd go with the simplest possible way

if (substr($query, 0, 6) == 'SELECT')

I only have assumptions to go by, while you could write a script which grabs all lines containing case insensitive SELECT to get an idea on how it looks before you choose how to do it. But, if my assumption is correct, it doesn't really matter if we miss 10% due to " SELECT", "select" etc. Those 10% of all selects would go straight to the master. The remaining 90% would go someplace else (depending on setup and number of servers). If you have a 5:1 ratio of reads:writes, you'd have another 17% writes going to the master. Well, you get my line of reasoning.
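
The arithmetic behind that reasoning can be written out as a small sketch. The masterShare function is a hypothetical helper, and the 10% miss rate and 5:1 ratio are the assumed figures from the paragraph above, not measurements. The master receives every write plus whichever reads the simple check fails to recognize:

```php
<?php
// Fraction of all queries that land on the master when every write goes there
// and a fraction $missRate of the reads is not recognized by the SELECT check.
function masterShare(float $reads, float $writes, float $missRate): float
{
    return ($writes + $reads * $missRate) / ($reads + $writes);
}

// 5:1 reads to writes, 10% of selects missed:
// writes are 1/6 of the traffic (about 17%), missed reads are 0.5/6 (about 8%),
// so the master handles (1 + 0.5) / 6 of all queries.
$share = masterShare(5, 1, 0.10);
```

With these numbers the master ends up with a quarter of the traffic, which is the same share a single slave would get in a one-master, three-slave setup, so a crude check can be good enough.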

I'm also assuming that all queries, whether they use the intended built-in methods or not, eventually pass a database abstraction layer. If they do, it's very easy to implement a crude query logging for a week or two. Just keep track of the number of "SELECT"s, other " selECT"s and any other query (which will be assumed to be writes). This way, you'd get a good estimate on read:write ratio and how exact your pattern matching has to be.
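
A sketch of that crude tally, assuming a hypothetical tallyQuery function that the abstraction layer calls with every final query string. The three bucket names are made up for illustration, and a real version would persist the counts to a file or table rather than a local array:

```php
<?php
// Counts queries in three buckets: exact uppercase "SELECT" at position 0,
// any other SELECT (leading whitespace or different case), and everything else.
function tallyQuery(array $counts, string $sql): array
{
    if (substr($sql, 0, 6) === 'SELECT') {
        $key = 'exact_select';
    } elseif (preg_match('~^\s*select\b~i', $sql)) {
        $key = 'other_select';
    } else {
        $key = 'other'; // assumed to be writes
    }
    $counts[$key] = ($counts[$key] ?? 0) + 1;
    return $counts;
}

$counts = [];
foreach (['SELECT 1', ' select 2', 'UPDATE t SET a = 1', 'SELECT 3'] as $sql) {
    $counts = tallyQuery($counts, $sql);
}
```

Comparing exact_select against other_select shows how much a plain substr check would miss, and the two select buckets together against other give the read:write ratio.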

Next up is selecting a scheme to balance queries. I'm guessing, that if you use PostgreSQL's middleware thingy (don't remember the name but it was in the docs), it will probably handle this for you. If you have a simple master-slave setup without middleware, you might have to deal with balancing yourself. But the topic ought to be similar to how you balance other things, which should provide enough articles on the topic. Examples on how to do it:
- sticky: one web server -> reads from one slave
- round robin: your database abstraction layer sends every next query to the next slave in line.
- random: randomly select server
- application implemented: use asynchronous updates, send certain queries to master (so that they never get stale data), while all other queries uses one of the above schemes
- decision based on current loads: Not sure if it's possible to get an estimate on current load from a db through the ordinary api, or if it's possible through db monitoring software. Or how costly it would be. But perhaps it's possible to check the servers at the beginning of an HTTP request and then issue all db queries to the db server with the lowest load for that HTTP request.
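
Three of the schemes in the list above (round robin, random and sticky) can be sketched as follows. SlavePicker is a hypothetical class and the host names used with it are placeholders; the load-based option is left out because it depends on monitoring that this sketch does not assume:

```php
<?php
// Hypothetical picker over a list of slave host names.
final class SlavePicker
{
    /** @var string[] */
    private $slaves;
    /** @var int index of the slave the next round-robin call returns */
    private $next = 0;

    public function __construct(array $slaves)
    {
        if ($slaves === []) {
            throw new InvalidArgumentException('need at least one slave');
        }
        $this->slaves = array_values($slaves);
    }

    // Round robin: each call returns the next slave in line.
    public function roundRobin(): string
    {
        $host = $this->slaves[$this->next];
        $this->next = ($this->next + 1) % count($this->slaves);
        return $host;
    }

    // Random: uniform choice on every call.
    public function random(): string
    {
        return $this->slaves[random_int(0, count($this->slaves) - 1)];
    }

    // Sticky: the same key (e.g. a web server name) always maps to the same slave.
    public function sticky(string $key): string
    {
        $hash = crc32($key) & 0x7fffffff; // keep the value non-negative
        return $this->slaves[$hash % count($this->slaves)];
    }
}
```

Since each PHP request normally starts with fresh state, the round-robin position here only rotates within a single request unless it is kept somewhere shared; random or sticky selection needs no shared state at all.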

And should you wish to deal with code rewriting over time, you could also implement logging in the database abstraction layer using debug backtraces to collect data on what classes, files etc most calls come from. Or what queries are the most common. And then rewrite those to use queryRead() and queryWrite() methods respectively. Oh and regarding those names, I was simply super tired, easily distracted by myself and kept losing track of the questions at hand. Which gave me the idea that they deserved some names with better marketing value, like "Web 2.0" and "AJAX" 😉
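
A sketch of the backtrace logging idea, assuming hypothetical recordCaller and loggedExecute functions; in practice the call to debug_backtrace would sit inside the abstraction layer's own execute method. Frame 0 of the trace holds the file and line from which the current function was called, which is enough to count call sites:

```php
<?php
// Adds one hit for the call site described by the first frame of $trace.
function recordCaller(array $log, array $trace): array
{
    $frame = $trace[0] ?? [];
    $site = ($frame['file'] ?? 'unknown') . ':' . ($frame['line'] ?? 0);
    $log[$site] = ($log[$site] ?? 0) + 1;
    return $log;
}

// Stand-in for the abstraction layer's execute method.
function loggedExecute(string $sql, array &$log): void
{
    // Frame 0 describes where loggedExecute() was called from.
    $log = recordCaller($log, debug_backtrace(DEBUG_BACKTRACE_IGNORE_ARGS, 1));
    // ... hand $sql to the real connection here ...
}

$log = [];
for ($i = 0; $i < 2; $i++) {
    loggedExecute('SELECT ' . $i, $log); // same call site both times
}
arsort($log); // busiest call sites first
```

Sorting the tally after a week or two of traffic gives a short list of the files worth rewriting first.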