You wrote:
Question is: how efficient are CLOBs at storing a) data under 4001
characters and b) data over 4000 characters? As I understand it CLOBs get
stored inline with the row if the insert string is under 4001 characters
and elsewhere if they are over. But: if I always use bind variables will
they always get stored elsewhere or does Oracle still determine storage
itself?
Basically I don't want any performance or storage issues to negate the
benefits of using a CLOB. The BBS may grow to hundreds of thousands of
messages (98% of which will likely be less than 4000 characters). I hope
I can just insert everything the same way and have the CLOB basically work
like a VARCHAR2(4000+).
Since you are familiar with CLOBs, I assume you know Oracle will manage the
CLOBs completely transparent to your application. From your application's
perspective, the CLOB is pretty much just a large VARCHAR2(). (For info
on the quirks of dealing with CLOBs, see the Oracle documentation.)
Regardless of how you pass the data (examples: static SQL, bind variables,
smoke signals) to the RDBMS, internally it manages the data how it feels
is best.
When the RDBMS decides to store a CLOB outside of the main record body, it
transparently leaves a pointer in the record to the location of the CLOB.
Any time a record is split for any reason, you run into the potential for
additional I/O. The trade off here is that by moving large CLOBs outside
the main records, it makes accessing the non-CLOB fields faster by packing
them tighter in the Oracle disk blocks. If you don't reference the CLOB
column often, then performance should actually be better because of this.
If you frequently access the CLOB column, the RDBMS will spend more time
and I/O following pointers to retrive the CLOBs and transparently
rejoining them with their record bodies. Since around 98% of your messages
will be relatively small, I don't think you will run into much of a
performance hit.
Also, remember that Oracle caches frequently access data in the Oracle SGA.
If a group of messages/records are frequently being accessed, then they will
be is coming from memory rather than disk. That works to your advantage,
even for records with large CLOBs.
-- Michael
Darkstreak Consulting
www.darkstreak.com