I’m developing a web application and would like to make it as scalable as possible before having to go to multiple servers etc. I’ve got a few questions I’m hoping I can get some feedback on before coding:
InnoDB vs. MyISAM
I don’t need transactions, but I will have people reading and updating separate records of a few main tables frequently. There’s also a small chance that someone would be trying to read a record at the same time someone else is trying to update that same record. It’s not vital that the reader get the most up to date record, but I don’t want to have data corrupted in this situation. My main concern is high performance at scale.
My understanding is that with MyISAM, only one query can be executed against a table at any time, which is not the case with InnoDB?
If you’re solely logging data with very few reads, are you better off with MyISAM?
Would it be more efficient to have a Boolean flag for a yes/no value that you can filter in a WHERE clause, or have a separate table that has only the x_id value for the “yes” records to query with WHERE EXISTS or WHERE NOT EXISTS? Does it change depending on how likely a row is to be yes or no?
One part of the application will be to have a table that is: tag_id & tag. Some lookups will be giving a textual tag in the url that will need to lookup the id, and some will be looking up the tag for an id elsewhere in the database. These will both be very frequent occurnces. Would it be better to have two separate tables with different primary keys, or to have a tag_id > tag table with a 3 or 4 character index on the tag field? In either case, there will be a lot more reads than writes, but A LOT of reads, which storage engine would be preferable?
If speed is much more important than storage space, should the index be only on n characters of tag, or just on the entire field, which will probably be varchar(50)?
Is a primary key automatically an index, or is there more you could do to a primary key to improve performance?
Likewise, if I have a compound key with 2 or 3 fields, is there anything else to do for maximum speed?
After doing some searches on things like “best practices for scalable databases” I get a lot of results for data warehouse-type applications, are there any good resources for more basic and relatively plain English best practices?
Thanks in advance