I've started working on a project where there is a fairly large table (about 82,000,000 rows) that I think is very bloated. One of the fields is defined as:
consistency character varying NOT NULL DEFAULT 'Y'::character varying
It's used as a boolean; the values should always be either 'Y' or 'N'.
Note: there is no check constraint, etc.
I'm trying to come up with reasons to justify changing this field. Here is what I have:
- It's being used as a boolean, so make it that. Explicit is better than implicit.
- It will protect against coding errors, because right now anything that can be converted to text will go blindly in there.
Here are my questions:
- What about size/storage? The db is UTF-8, so I think there really isn't much of a savings in that regard. It should be 1 byte for a boolean, but also 1 byte for a 'Y' in UTF-8 (at least that's what I get when I check the length in Python). Is there any other storage overhead here that would be saved?
- Query performance? Will Postgres get any performance gains for a where clause of "=TRUE" vs. "='Y'"?
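For reference, this is roughly the length check I mean in the storage question. It's only a minimal sketch, and utf8_len is a helper name made up for this post. It measures the encoded payload on the Python side only, so it would not show any per-value overhead Postgres might add when storing a character varying, which is part of what I'm asking about.

```python
# Byte length of the two flag values when encoded as UTF-8.
# This measures only the encoded payload in Python; it says nothing
# about how Postgres stores a character varying value on disk.
def utf8_len(value: str) -> int:
    """Return the number of bytes in the UTF-8 encoding of value."""
    return len(value.encode("utf-8"))


for flag in ("Y", "N"):
    # Both flags are plain ASCII, so each encodes to a single byte.
    print(flag, utf8_len(flag))
```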