MySQL INNODB Wishlist
Kind of interesting, but I don't pay much attention to MySQL any more. Wonder how many PostgreSQL already has?
PostgreSQL has page compression, to help save space there. and, since you've got functional indexes, it's possible to build something like this.
1. Packed indexes. In many cases then difference in Performace with MyISAM and Innodb is huge for read only workload it is caused by the fact Innodb does not have prefix compression for its indexes, as MyISAM has which in certain cases makes them 10 times + larger. Heikki plans to have gzip like page compression but it would only give up to 2 times space compression.
This is not a problem for PostgreSQL. Inserts are already buffered, but the manner in which roll backs are performed makes them very cheap and fast.
2. Insert buffer for delete. For Insert operation Innodb uses nice tecnique called Insert Buffer which speeds up inserts for non-unique indexes a lot. However deletes are not handled so they can be 10 times+ slower, which is especially bad for rollback of failed inserts which may take too long to complete
Again, not a problem for postgresql. Functional indexes together with clustering give you all this and more.
3. Index build by sort. This is in ToDo but so important I’ll still list it. It would allow data load to be much faster as well as ALTER TABLE (expecially together with ability to create index without table rebuilding)
Not really an equivalent I can think of in pgsql.
4. Buffer pool improvements. I would like to see couple of things here. To have kind of middle point insertion so full table scan do not wipe it off as well as different priorities for different kind of data, for example flushing of undo pages might be avoided if possible as we know they are going to be purged.
Given the current maximum size of an individual text field at ~ 1 billion characters, this is not a problem in postgresql. When you've only got one table type, you can't just say "use the other table type for that"... It's interesting that they reference a postgresql feature here.
5. Long Rows. One ugly thing Innodb currently does is storing first 700 bytes or something like that from any blob/string in the page. Which gives you the nice surprise of row being unable to fit to the page if many columns are long strings. I would like to see it at least doing it for indexed columns only or even better have option similar to PostgreSQL to specify where this column should be stored - with the row in the page or outside of it. The other item with blobs - currently if I’m not mistaken blobs are stored each in its own “segment” which means having many blobs in the same row is going to be slow. For medium sized blobs it would be much better to merge them and store on the same segment and ie only store blobs over 1M in their own segments.
Well, since every row should have some kind of primary key, and making a compound unique key using that key and the other non-unique key isn't so hard, I really don't think this is worth putting a lot of effort into.
6. Clustering. I love Innodb’s ability to cluster data by primary key it is so helpful in many cases, but in other cases you would like to cluster data by index which is not unique - for example cluster data by customer or by timestamp. Creating artificial second column is ugly and inconvenient. Having independent CLUSTER=key option would be great, it also would help to avoid penalty of very long primary key - you can simply select not to cluster by it instead of creating fake primary key and promoting real one to unique.
Had 'em since 8.0 came out.
7. Tablespaces. To call “innodb_file_per_table” option tablespaces is a joke. It is better than nothing but needs to be improved allowing many to many map of objects to the tablespaces. Especially being able to map different indexes to different tablespaces is important. This could be helpful in certain workloads. Yes you could simply use RAID and forget about tablespaces by spreading the load, but you often can do it better manually
[quote] 8. I/O On Linux you’re currently limited to innodb_file_io_threads=4. Some people do not know it but setting it to different values does not have any effect In certain cases multiple threads for some tasks would be helpful, especially for flushing dirty buffers. I would actually like to see it separated to different options - for example it does not make much sense to have more than one log flush thread. The other big lacking (especially as patch for it existed for years) is Async IO support in Linux. It should be very helpful with direct IO as well as with innodb_file_per_table=1 allowing to flush all of modified tablespaces in the parallel.[quote]
PostgreSQL is not built around a threaded model, but multi-process. Not sure how much about it and MySQL's innodb would be common.
Given the small difference gain in going to block sizes under 8k, I have a hard time getting excited over this idea. It just doesn't seem worth th effort of development for what I'm willing to be is a very small gain, given modern hard drive and raid controller's read ahead buffering and what not.
9. Multiple page sizes. One size does not fit all and for certain workloads 16K is too much while for some others too little. Being able to specify page size per object or at least per tablespace would be great. This will complicate buffer management a bit but Innodb is already to deal with 2 page sizes by supporting compressed pages so it might be not that bad. Also other guys are able to handle it - even MyISAM has 1K-4KB key blocks so it is well possible
If you've been reading the pgsql-hackers list the last year, you'll see Tom Lane has gotten nearly linearly scaling for postgresql on heavy loads on up to 32 cpus. It's quite exciting. Add in what's been done with eh bizgres postgresql folks, and postgresql is light years ahead of innodb right now. But Heiki's a good coder, who knows what he'll do.
10. Scalability It is actually bug but it is so bad I will list it. Basically Innodb has problems scaling both with large number of active threads and with large number of CPUs (in this bug you can see it is too bad even with 4 concurrent queries). The problem seems to be at least partially due to global mutexes in many places so it is not easy to fix. But I guess Heikki will have to do it otherwise with raise of MultiCores Innodb will become unusable for more and more types of workloads
PostgreSQL, because your data matters.
And what Oracle will let him do .
Originally Posted by Sxooter
I thought I recently read about yet another table type for mysql...
That other type was probably NDB, which is a weird mix of clustering and in memory database. Useful for certain small, high speed applications with little or no coupling between threads.
PostgreSQL, because your data matters.
I think it was solid solidDB:
solidDB Storage Engine for MySQL brings forth the robust, proven OLTP capabilities that Solid Information Technology has been providing and enhancing for more than 14 years. Key features of the solidDB Storage Engine include multi-version concurrency control, full transactional integrity with robust ACID support, online backup capabilities, fast performance for mixed workloads and robust security with database roles. A prototype of the solidDB Storage Engine for MySQL is currently available for download and community testing at http://www.solidtech.com/solidDBforMySQL/download
Finnish Revenge as mySQL Gets Solid Support
Users Browsing this Thread
There are currently 1 users browsing this thread. (0 members and 1 guests)