I'm having trouble parsing CSV from a text file and was wondering whether you could help. So far I have the following.

The CSV file (DATA.txt) looks like this. It always has 15 fields, all separated by commas. Not all fields are mandatory, so some will be filled in and some will be blank.

Seattle,Lastname,Firstname,DOB,SEX,etc,etc
Seattle,Lastname,Firstname,DOB,,etc,etc
Portland,Lastname,Firstname,DOB,SEX,,,etc
Portland,Lastname,Firstname,DOB,SEX,etc,etc

Here is my REXX code:

SOURCEFILE = "C:\DATA\DATA.TXT"
IF A=2 THEN DO COUNTER=1 TO LINES(SOURCEFILE)
    PARSE VALUE LINEIN(SOURCEFILE) WITH CITY "," LAST_NAME "," FIRST_NAME "," MOM_NAME "," MIDDLE_NAME "," DAD_NAME "," DOB "," etc "," etc "," etc "," etc "," SEX "," etc "," etc
    CALL SETCURSOR 4,23
    CALL CREATEDATA
END

CREATEDATA:
CALL TYPE CITY
CALL PRESS TAB
CALL TYPE LAST_NAME
CALL PRESS TAB
CALL TYPE DATE(U)
CALL PRESS TAB
CALL TYPE FIRST_NAME
CALL PRESS TAB
CALL PRESS ENTER
RETURN

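I'm not sure whether I should be using ARG or VAR when parsing, or whether I wrote the first two lines correctly. I do know that my CREATEDATA function works, because it types "CITY", but not the parsed value. Any help would be appreciated. Thanks!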

2 Answers

One question: what is the purpose of the if A=2 then in

IF A=2 THEN DO COUNTER=1 TO LINES(SOURCEFILE)

If A is not 2, the loop is bypassed. I suspect your program should be:

SOURCEFILE = "C:\DATA\DATA.TXT"
DO COUNTER=1 TO LINES(SOURCEFILE) 
    PARSE VALUE LINEIN(SOURCEFILE) WITH CITY "," LAST_NAME "," FIRST_NAME "," MOM_NAME "," MIDDLE_NAME "," DAD_NAME "," DOB "," etc "," etc "," etc "," etc "," SEX "," etc "," etc
    CALL SETCURSOR 4,23
    CALL CREATEDATA
END

RETURN   /* prevent the fall through to createdata */

CREATEDATA:




---------------------------

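The parse statement has the following basic format

parse [source] [parse control]

where [source] is one of

arg - the arguments of the procedure call
pull - data pulled from the stack
var - the data comes from the value of a variable
value ... with - the data is supplied inline as an expression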
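As a rough illustration of three of these sources, here is a minimal sketch that parses one comma-separated record. The sample record, the shortened template, and the subroutine name SHOW are made up for the example and are not the asker's 15-field layout.

```rexx
/* Hypothetical sample record: the fifth field (SEX) is left blank */
record = "Seattle,Lastname,Firstname,DOB,,etc,etc"

/* PARSE VAR: the source is the value of a variable */
parse var record city "," last_name "," first_name "," dob "," sex "," rest
say "city=["city"] sex=["sex"] rest=["rest"]"

/* PARSE VALUE ... WITH: the source is an inline expression */
parse value "Portland,Smith" with city "," last_name
say "city=["city"] last_name=["last_name"]"

/* PARSE ARG: the source is the argument string passed by CALL */
call show record
exit

show:
  /* the period is a placeholder that discards the rest of the string */
  parse arg city "," last_name "," .
  say "city=["city"] last_name=["last_name"]"
  return
```

A blank field simply yields an empty string for its variable, so the empty SEX field above leaves sex empty rather than shifting the later fields.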

So your parse could look like

   linein = LINEIN(SOURCEFILE)
   PARSE var linein CITY "," LAST_NAME "," FIRST_NAME "," MOM_NAME "," MIDDLE_NAME "," DAD_NAME "," DOB "," etc "," etc "," etc "," etc "," SEX "," etc "," etc

or

    DO COUNTER=1 TO LINES(SOURCEFILE) 
        CALL SETCURSOR 4,23
        CALL CREATEDATA LINEIN(SOURCEFILE)
    END

    RETURN   /* prevent the fall through to createdata */

    CREATEDATA:
    parse arg CITY "," LAST_NAME "," FIRST_NAME "," MOM_NAME "," MIDDLE_NAME "," DAD_NAME "," DOB "," etc "," etc "," etc "," etc "," SEX "," etc "," etc

Finally, as Ross says, you should try to avoid lines(sourcefile), because it can involve reading the whole file.
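One way to read the file without counting its lines first is to ask only whether any unread line remains. This is a sketch, assuming an interpreter whose LINES function accepts the "N" option (ooRexx and Regina do; other interpreters may differ). The shortened template and the COUNT variable are illustrative only.

```rexx
/* Path taken from the question */
sourcefile = "C:\DATA\DATA.TXT"

count = 0
/* The "N" option asks only whether unread lines remain; it does not count them */
do while lines(sourcefile, "N") > 0
  record = linein(sourcefile)
  /* shortened template for illustration; REST keeps the remaining fields */
  parse var record city "," last_name "," first_name "," rest
  count = count + 1
  say count":" city "/" last_name "/" first_name
end

/* Close the stream once the loop is done */
call stream sourcefile, "C", "CLOSE"
```

Each record is read exactly once, so the file is not scanned a second time just to find the loop limit.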

answered 2013-04-11T00:26:10.993

A few comments:

1) Lines(SourceFile) on a Windows system may involve reading the whole file to count the CR-LF sequences. Your Parse value LineIn(SourceFile) loop then reads it all over again. The typical Rexx way to do this is:

Address SYSTEM 'TYPE' SourceFile with output stem Lines.
Do Counter = 1 to Lines.0
    Parse var Lines.Counter ...
End
Drop Lines.

At least, that holds as long as the file is not so large that keeping it in an array takes up too much memory.

2) You fall through into CreateData at the end of the loop, which is why you see "CITY". You need a Return or Exit instruction after the End.

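3) Given #2, it is apparent that the Parse is never executed, because City is uninitialized (the value of an uninitialized variable in Rexx is its name in upper case). The loop is conditional on A=2, which must not be true.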
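The uninitialized-variable behaviour in point 3 can be seen in a small sketch. The SIGNAL ON NOVALUE trap at the end is an optional addition, not something the original program uses, and the NOVALUE handler text is made up for the example.

```rexx
/* CITY has never been assigned, so it evaluates to its own name */
say city                 /* displays CITY */
say symbol("city")       /* displays LIT: a symbol with no value */

city = "Seattle"
say city                 /* displays Seattle */
say symbol("city")       /* displays VAR: an assigned variable */

/* Optional safety net: trap any later use of an unassigned variable */
signal on novalue
say last_name            /* LAST_NAME is unassigned: control jumps to NOVALUE */
exit

novalue:
  say "Unassigned variable" condition("D") "used on line" sigl
  exit 1
```

With the trap enabled, a skipped Parse shows up as an immediate error instead of the literal text "CITY" being typed into the form.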

answered 2013-04-10T21:34:56.483