PG: Query failed: ERROR: catalog is missing 1 attribute(s) for relid 456086

csn · May 22, 2003

Maybe I spoke too soon ;P. Everything was working fine yesterday, but today PG was failing with:

Warning: pg_query() [function.pg-query]: Query failed: ERROR: catalog is missing 1 attribute(s) for relid 456086 . in /var/www/html/lib.php on line 200

VACUUM gave the same error, but VACUUM FULL appears to have fixed whatever the problem was. Only one of the PG databases was giving this error. Any idea what the problem was/is? I'm using PG 7.3.2 and run the vacuumdb script nightly.

Sxooter · May 22, 2003

What query generated this error?

csn · May 22, 2003

It was a SELECT * from a view. The view is something like

SELECT *
FROM A
LEFT JOIN B
ON A.B_ID = B.ID

Sxooter · May 22, 2003

Is it possible to get a memory test run on that box?

www.memtest86.com

csn · May 23, 2003

Hmm, I don't have physical access to the server. Looks like a partition can be used instead of a boot disk for memtest? Is that possible/easy/advisable?

Got a response from Tom Lane:

CSN writes:

ERROR: catalog is missing 1 attribute(s) for relid
456086 . in /var/www/html/lib.php on line 200

Ugh. This indicates corruption in the system catalogs --- specifically,
a missing pg_attribute row. These rows will normally be picked up with an
indexscan, so it's possible that pg_attribute itself is fine and the
trouble is corruption in pg_attribute_relid_attnum_index.

VACUUM gave the same error, but VACUUM FULL appears to
have fixed whatever the problem was.

Sounds like blind luck to me. I'd have expected that you might escape
alive by reindexing pg_attribute, otherwise you have a big problem.

The only way that an error like that would go away by itself is if it
wasn't really there at all --- that is, the apparent corruption only
existed in in-memory copies of pg_attribute pages, and not on disk at
all. I'd strongly recommend running some hardware diagnostics
(memtest86, badblocks, etc). You might not be so lucky with the next
dropped bit.
		regards, tom lane

Maybe I'll try badblocks. At any rate, is this something that's bound to happen again? And even fubar everything? I've never encountered memory-related errors before (that I know of).

Sxooter · May 23, 2003

Yep, it will happen again if you have bad memory. No database can make up for bad memory, and postgresql chews up memory like mad to buffer everything it can, and it trusts the machine to have all good memory.

You should be able to get your hosting provider to test it for you. It's in their best interest as well as their customer's best interest to make sure their machines are solid and reliable.

If you don't have direct access, a good way to test for bad memory is to compile postgresql with a -j 4 switch or larger. Run top in one window, and run the make with a larger and larger number to -j until you're almost swapping out memory. If you have enough memory you may not be able to use it all directly, but there will be plenty of disc buffer to use it and get corrupted. Either way, look for sig 11s when compiling. If you get sig-11s from gcc, in random places while compiling, then you have a broken machine underneath you.

csn · Dec 14, 2004

Hey Sxooter-

What does the "-j" switch do? I couldn't find it in gcc's (version 3.2.2) help or man page.

Sxooter · Dec 14, 2004

-j tells the compiler to run multiple threads, which uses more memory, which, in turn, makes machines with bad ram fail more often.

Usually...

PG: Query failed: ERROR: catalog is missing 1 attribute(s) for relid 456086

Ccsn

SSxooter

Ccsn

SSxooter

Ccsn

SSxooter

Ccsn

SSxooter