
I have added notrim to the rowdata column in my external table, as suggested by Alex (this is a continuation of this question).

But now the end-of-line character (CR-LF) is also being appended to the end of the rowdata column.

I don't want to use substr() or translate(), since the file size is around 1GB.

My external table creation process :

'CREATE TABLE ' || rec.ext_table_name || ' (ROW_DATA VARCHAR2(4000)) ORGANIZATION EXTERNAL ' ||
     '(TYPE ORACLE_LOADER DEFAULT DIRECTORY ' || rec.dir_name || ' ACCESS ' || 'PARAMETERS (RECORDS ' ||
     'DELIMITED by NEWLINE NOBADFILE NODISCARDFILE ' ||
     'FIELDS REJECT ROWS WITH ALL NULL FIELDS (ROW_DATA POSITION(1:4000) char)) LOCATION (' || l_quote ||
     'temp.txt' || l_quote || ')) REJECT LIMIT UNLIMITED'

Is there any other parameter I can add to remove the end-of-line character? Thanks.

EDIT 1:

My file :

Some first line with spaces at end
Some second line with spaces at end

My Ext table :

Some first line with spaces at end    <EOL>
Some second line with spaces at end   <EOL>

To be clearer, I will explain in Java terms (when I assign the column values to strings, they look something like this):

without notrim :

rowdata[1]="Some first line with spaces at end";
rowdata[2]="Some second line with spaces at end";

with notrim:

rowdata[1]="Some first line with spaces at end    \n";
rowdata[2]="Some second line with spaces at end   \n";

what I want it to be :

rowdata[1]="Some first line with spaces at end    ";
rowdata[2]="Some second line with spaces at end   ";

The delimiter is also part of rowdata, since notrim is specified.

EDIT2:

Line-Endings : CRLF

Platform:

Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
PL/SQL Release 12.1.0.1.0 - Production
"CORE 12.1.0.1.0 Production"
TNS for Solaris: Version 12.1.0.1.0 - Production
NLSRTL Version 12.1.0.1.0 - Production

SELECT DUMP(ROW_DATA,1016) FROM EXT_TABLE WHERE ROWNUM = 1;

Typ=1 Len=616 CharacterSet=AL32UTF8: 41,30,30,30,30,30,30,30,30,30,30,31,30,30,30,30,37,36,36,36,44,30,30,30,30,31,32,35,30,38,31,36,32,35,30,38,31,36,31,33,34,37,30,39,44,42,20,41,30,36,31,30,30,30,30,30,30,30,30,30,30,30,30,32,30,30,4d,59,52,20,32,5a,20,30,31,36,30,30,30,31,32,31,32,33,34,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,52,49,42,46,50,58,30,30,30,31,30,30,30,30,30,30,30,30,31,30,36,32,38,30,31,30,32,30,30,47,20,20,20,20,53,20,20,30,30,30,30,30,30,30,30,30,30,30,20,20,20,20,20,20,20,4e,39,32,37,32,20,20,20,20,20,20,30,30,30,30,30,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,30,30,39,39,38,54,45,53,54,52,52,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,54,45,53,54,4f,50,44,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,54,45,53,54,54,52,41,4e,53,49,44,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,54,45,53,54,52,52,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,54,45,53,54,4f,50,44,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,54,45,53,54,54,52,41,4e,53,49,44,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,d

Len should be 615


1 Answer


Your file's line endings are CRLF (suggesting the file was created on Windows?), but your database is running on Solaris. As the documentation says:

If DELIMITED BY NEWLINE is specified, then the actual value used is platform-specific. On UNIX platforms, NEWLINE is assumed to be "\n". On Windows operating systems, NEWLINE is assumed to be "\r\n".

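As your database platform is Unix, it only uses LF (\n) as the record delimiter, so the CR stays at the end of each record (the trailing d in your dump). You can either change the line endings in the file, or change the delimited by clause to look for the Windows line ending: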

...
records delimited by "\r\n" nobadfile ...
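
Before changing the delimiter, it may help to confirm which line endings a file really has. This is a minimal shell sketch, assuming a POSIX shell with printf, tr and wc; the file name sample_crlf.txt and its contents are hypothetical stand-ins for temp.txt:

```shell
# Build a small stand-in file with CRLF line endings (hypothetical sample data).
# printf reuses the format for each argument, padding each value to 20 characters.
printf '%-20s\r\n' 'Line1sometext' 'Line2sometext' > sample_crlf.txt

# Count carriage returns: delete every byte except CR, then count what is left.
cr_count=$(tr -cd '\r' < sample_crlf.txt | wc -c | tr -d ' ')
# Count line feeds the same way.
lf_count=$(tr -cd '\n' < sample_crlf.txt | wc -c | tr -d ' ')

echo "CR=$cr_count LF=$lf_count"
```

If the CR count matches the LF count, every line most likely ends in CRLF; a CR count of zero means plain LF endings.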

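If you might get files with either type of line ending and can't control that, you could add a preprocessor step to strip any CRs that do exist. Create an executable script file, called say remove_cr, in the same directory as the file or (as Oracle recommends) in a different Oracle-accessible directory, containing: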

/usr/bin/sed -e "s/\\r$//" $1

You can then add a call to it in the external table definition, and keep the newline terminator:

...
records delimited by newline nobadfile nodiscardfile
preprocessor 'remove_cr'
...

Make sure you read the security warnings in the documentation, though.
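
The script can also be exercised outside the database before wiring it into the table definition. The following is only a sketch: it assumes a sed that understands \r (GNU sed does; other implementations may not), calls sed from the PATH rather than /usr/bin/sed, and uses the hypothetical names demo_crlf.txt and remove_cr_demo.sh:

```shell
# Stand-in data file with CRLF endings and trailing spaces that must survive.
printf '%-20s\r\n' 'Line1sometext' 'Line2sometext' > demo_crlf.txt

# Same idea as remove_cr: strip one trailing CR per line and write to stdout.
# The data file path arrives as the first argument ($1).
cat > remove_cr_demo.sh <<'EOF'
#!/bin/sh
sed -e 's/\r$//' "$1"
EOF

# Run the script against the data file and count the CRs left in its output.
remaining_cr=$(sh remove_cr_demo.sh demo_crlf.txt | tr -cd '\r' | wc -c | tr -d ' ')
# Capture the first output line; trailing spaces are kept, only the newline is dropped.
first_line=$(sh remove_cr_demo.sh demo_crlf.txt | sed -n '1p')

echo "CRs remaining: $remaining_cr"
echo "<$first_line>"
```

The angle brackets make the preserved trailing spaces visible, in the same way as the query in the demo.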

Demo with a temp.txt file that has CRLF line endings:

create table t42_ext (
  row_data varchar2(4000)
)
organization external
(
  type oracle_loader default directory d42 access parameters
  (
    records delimited by newline nobadfile nodiscardfile
    preprocessor 'remove_cr'
    fields reject rows with all null fields
    (
      row_data position(1:4000) char notrim
    )
  )
  location ('temp.txt')
)
reject limit unlimited;

select '<'|| row_data ||'>' from t42_ext;

'<'||ROW_DATA||'>'                                                             
--------------------------------------------------------------------------------
<Line1sometext       >                                                          
<Line2sometext       >                                                          
<Line3sometext       >                                                          
answered 2016-08-30T09:51:49.727