I have a PHP script that I run directly (not through Apache) that creates a very large database table: three small columns, but ultimately billions of rows. The script generates the data and inserts it into the table. It runs on a Core 2 Duo Mac mini and manages about 7,700 inserts per second. That's pretty quick.
Here's where it gets complicated, though. Indexes slow inserting down too much, so I can't build them until the table is done. This script may end up running for weeks before the table is fully built. I need a way to track my progress so I can pick up where I left off if I need to reboot the server, or if it crashes, or if the power goes out, etc. But I can't look up my "last insert" by querying a table that may have 500 million rows and no indexes (yet).
So I created a second table with one row and one column. Every time I add a row to the big table, I update the one row in this table with the same primary key string. This works well and gives me a very reliable record of where I am in building the big table. But here's the problem: adding one UPDATE for every INSERT drops my insertion rate into the big table from about 7,700 inserts per second down to 1,188, even though the table receiving all the UPDATEs has only one row. It just slows things WAY down.
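To make the pattern concrete, here is a minimal sketch of what I'm doing, written in Python against an in-memory SQLite database as a stand-in (the real script is PHP talking to the actual database server, and the table and column names here are made up):

```python
import sqlite3

# In-memory SQLite stand-in for the real setup; table/column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big (k TEXT PRIMARY KEY, a INTEGER, b INTEGER)")
conn.execute("CREATE TABLE progress (last_key TEXT)")
conn.execute("INSERT INTO progress VALUES ('')")  # the single checkpoint row

def insert_with_checkpoint(key, a, b):
    # One INSERT into the big table plus one UPDATE of the single-row
    # checkpoint table -- the extra UPDATE is what tanks the insert rate.
    conn.execute("INSERT INTO big VALUES (?, ?, ?)", (key, a, b))
    conn.execute("UPDATE progress SET last_key = ?", (key,))
    conn.commit()

for i in range(100):
    insert_with_checkpoint(f"row-{i:06d}", i, i * 2)

# After a crash and restart, this one-row read tells me where to resume.
last = conn.execute("SELECT last_key FROM progress").fetchone()[0]
print(last)
```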
So I'm wondering if anyone has ideas for a way to 'keep my place' without slowing the inserts down so much? The value I need to save is just a large integer or a small string; either is fine. Any thoughts? Remember, what I need most is reliability: even if the server crashes hard, I want my last value stored somewhere so I can restart from it.