This is probably not something that you can work around unless you want to use a CLOB instead of a VARCHAR2.

In Oracle, when you declare a column, the default is to use byte-length semantics. So a VARCHAR2(100), for example, allocates 100 bytes of storage. If you're using a single-byte character set like ISO 8859-1, every character requires 1 byte of storage, so this also allocates space for 100 characters. But if you are using a multi-byte character set like UTF-8, each character can require between 1 and 4 bytes of storage. Depending on the data, therefore, a VARCHAR2(100) may only be able to store 25 characters of data (English characters generally require 1 byte, European characters generally require 2 bytes, and Asian characters generally require 3 bytes).
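
To make the per-character byte cost concrete, here is a small sketch in Python rather than Oracle. It assumes Python's `utf-8` codec is a reasonable stand-in for a UTF-8 database character set such as AL32UTF8; the helper name `utf8_len` and the sample characters are my own choices for illustration.

```python
# Byte cost of individual characters when encoded as UTF-8.
# Assumption: Python's "utf-8" codec stands in for the database character set.
samples = {
    "a": "ASCII letter",
    "\u00f6": "Latin small letter o with diaeresis",
    "\u4e2d": "CJK ideograph",
    "\U0001F600": "emoji outside the Basic Multilingual Plane",
}


def utf8_len(text: str) -> int:
    """Return the number of bytes `text` occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


for ch, label in samples.items():
    print(f"U+{ord(ch):04X} ({label}): {utf8_len(ch)} byte(s)")

# Worst case for a 100-byte column: every character needs 4 bytes.
print(100 // 4)
```

The four samples come out at 1, 2, 3, and 4 bytes respectively, which is where the "as few as 25 characters in 100 bytes" figure comes from.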

You can tell Oracle to use character-length semantics, which is normally what I'd suggest when moving from an ISO-8859-1 database to a UTF-8 database. If you declare a column VARCHAR2(100 CHAR), Oracle will allocate space for 100 characters regardless of whether that ends up being 100 bytes or 400 bytes. You can also set the NLS_LENGTH_SEMANTICS parameter to CHAR to change the default (for new DDL) so that a VARCHAR2(100) allocates 100 characters of storage rather than 100 bytes.

Unfortunately for you, though, the limit on the size of an Oracle VARCHAR2 (in the context of the SQL engine rather than the PL/SQL engine) is 4000 bytes. So even if you declare a column VARCHAR2(4000 CHAR), you're still limited to inserting 4000 bytes of data, which may be as few as 1000 characters. For example, in a database using the AL32UTF8 character set, I can declare a column VARCHAR2(4000 CHAR), but padding with a character that requires 2 bytes of storage shows that I can't really insert 4000 characters of data:

SQL> create table foo (
  2    col1 varchar2(4000 char)
  3  );

Table created.

SQL> insert into foo values( rpad( 'abcde', 4000, unistr('\00f6') ) );

1 row created.

SQL> ed
Wrote file afiedt.buf

  1* insert into foo values( rpad( 'abcde', 6000, unistr('\00f6') ) )
SQL> /

1 row created.

SQL> select length(col1), lengthb(col1)
  2    from foo;

LENGTH(COL1) LENGTHB(COL1)
------------ -------------
        2003          4000
        2003          4000

If you need to store 4000 characters of UTF-8 data, you need a data type that can hold up to 16000 bytes (4000 characters at up to 4 bytes each), which means moving to a CLOB.
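
If you want to check on the application side whether a given string would fit before it ever reaches the database, something along these lines might help. It is only a sketch in Python: the 4000-byte figure is the SQL-engine limit discussed above, the function names are hypothetical, and it assumes the database character set encodes text the same way Python's `utf-8` codec does.

```python
# Sketch: will a string fit in a VARCHAR2 column, measured in UTF-8 bytes?
# Assumption: the database character set matches Python's "utf-8" codec.
VARCHAR2_SQL_LIMIT_BYTES = 4000  # SQL-engine limit cited in the answer


def fits_in_varchar2(text: str, limit: int = VARCHAR2_SQL_LIMIT_BYTES) -> bool:
    """True if the UTF-8 encoding of `text` is at most `limit` bytes."""
    return len(text.encode("utf-8")) <= limit


def max_chars_that_fit(text: str, limit: int = VARCHAR2_SQL_LIMIT_BYTES) -> int:
    """Count the leading characters of `text` that fit within `limit` bytes."""
    used = 0
    count = 0
    for ch in text:
        size = len(ch.encode("utf-8"))
        if used + size > limit:
            break
        used += size
        count += 1
    return count


ascii_text = "a" * 4000         # 1 byte per character
umlaut_text = "\u00f6" * 4000   # 2 bytes per character

print(fits_in_varchar2(ascii_text))
print(fits_in_varchar2(umlaut_text))
print(max_chars_that_fit(umlaut_text))

# Worst case for 4000 characters: 4 bytes each.
print(4000 * 4)
```

With 4000 two-byte characters only the first 2000 fit, and the worst case of 4 bytes per character gives the 16000-byte figure that pushes you to a CLOB.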

Answered 2011-03-08T09:46:21.687