csv - COBOL .csv File IO into Table Not Working

Question

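I am trying to learn COBOL because I had heard of it and thought it would be fun to take a look at. I came across Micro Focus COBOL (I am not sure whether that is relevant to this post), and since I like to write in Visual Studio, that was enough incentive to try to learn it.

I have been reading a lot about it and trying to follow the documentation and examples. So far I have user input and output to the console working, so I decided to try file IO. That went fine when I was reading just one "record" at a time, though I realize "record" may be the wrong jargon. Although I have been programming for a while, I am an extreme newbie with COBOL.

I have a C++ program I wrote earlier that takes a .csv file, parses it, and then sorts the data by whatever column the user wants. I figured it would not be too hard to do the same in COBOL. Apparently I misjudged that.

I have a file, edited on Windows with Notepad++, called test.csv, which contains: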

4001942600,140,4
4001942700,141,3
4001944000,142,2

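This data comes from the US census, and its column headers are GEOID, SUMLEV, and STATE. I removed the header row because at the time I could not figure out how to read it in before reading the rest of the data. Anyway...

In Visual Studio 2015 on Windows 7 Pro 64-bit, using Micro Focus and step debugging, I can see in-record containing the first row of data. The unstring works fine for that pass, but the next time the program "loops" I can step through, look at in-record, and see that it contains the new data. However, when I expand the watch elements, the watch display looks like this: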

        REC-COUNTER 002 PIC 9(3) 
+       IN-RECORD   {Length = 42} : "40019427004001942700             000      "    GROUP
-       GEOID   {Length = 3}    PIC 9(10) 
        GEOID(1)    4001942700  PIC 9(10) 
        GEOID(2)    4001942700  PIC 9(10) 
        GEOID(3)    <Illegal data in numeric field> PIC 9(10) 

-       SUMLEV  {Length = 3}    PIC 9(3) 
        SUMLEV(1)   <Illegal data in numeric field> PIC 9(3) 
        SUMLEV(2)   000 PIC 9(3) 
        SUMLEV(3)   <Illegal data in numeric field> PIC 9(3) 

-       STATE   {Length = 3}    PIC X
        STATE(1)        PIC X
        STATE(2)        PIC X
        STATE(3)        PIC X

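So I am not sure why I can see the correct data before the second unstring operation, yet after the unstring happens, incorrect data is stored in the "table". What is also interesting is that if I continue a third time, the correct data is stored in the "table".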

         identification division.
         program-id.endat.
         environment division.
         input-output section.
         file-control.
             select in-file assign to "C:/Users/Shittin Kitten/Google Drive/Embry-Riddle/Spring 2017/CS332/group_project/cobol1/cobol1/test.csv"
                organization is line sequential.
         data division.     
         file section.
         fd in-file.  
         01 in-record.
             05 record-table.
                 10 geoid     occurs 3 times        pic 9(10).
                 10 sumlev   occurs 3 times       pic 9(3).
                 10 state       occurs 3 times       pic X(1).
         working-storage section.
    01 switches.
     05 eof-switch pic X value "N".
  *  declaring a local variable for counting
    01 rec-counter pic 9(3).
  *  Defining constants for new line and carriage return. \n \r DNE in cobol!
    78 NL  value X"0A".
    78 CR  value X"0D".
    78 TAB value X"09".

  ******** Start of Program ******
   000-main.
     open input in-file.
       perform 
       perform 200-process-records
         until eof-switch = "Y".
       close in-file;
     stop run.
  *********** End of Program ************

  ******** Start of Paragraph  2 *********
   200-process-records.
       read in-file into in-record
         at end move "Y" to eof-switch
         not at end compute rec-counter = rec-counter + 1;
       end-read.
       Unstring in-record delimited by "," into 
           geoid in record-table(rec-counter), 
           sumlev in record-table(rec-counter), 
           state in record-table(rec-counter).

     display "GEOID  " & TAB &">> " & TAB & geoid of record-table(rec-counter).
     display "SUMLEV  >> " & TAB & sumlev of record-table(rec-counter).
     display "STATE  "  & TAB &">> " & TAB & state of record-table(rec-counter) & NL.
  ************* End of Paragraph 2  **************

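I am confused about why I can actually see the data after the read operation, yet it is not stored in the table. I have also tried changing the table declarations to pic 9(some length), and the result changes, but I cannot pin down what I am missing here.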

score 1 · Accepted Answer

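OK, I figured it out. While step debugging again and hovering over record-table, I noticed that there were 26 spaces after the last data field. Earlier this evening I had tried to change this data "on the fly", because Visual Studio normally allows that. I made the change but did not verify that it took. Normally I do not have to, but apparently it did not take. I should have known better, because the icon shown to the left of record-table was a little closed padlock.

I normally program in C, C++, and C#, so when I see the little padlock it usually has something to do with scope and visibility. Not knowing COBOL well enough, I overlooked this detail.

So I decided to first run "unstring in-record delimited by spaces into temp-string." and then: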

   Unstring temp-string delimited by "," into 
       geoid in record-table(rec-counter), 
       sumlev in record-table(rec-counter), 
       state in record-table(rec-counter).

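The result was properly formatted data, at least as far as I can tell, stored into the table and printed to the console.

I have since read that the unstring "function" can take multiple "operators", so I might try to combine these two unstring operations into one.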
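A minimal sketch of that combination, assuming a standalone test program rather than the file-based one above. UNSTRING accepts several delimiters joined with OR, so "," can split the fields while ALL SPACES absorbs the trailing padding. The program name and the ws- data names are hypothetical, and the source line is kept in separate storage from the receiving fields.

```cobol
       identification division.
       program-id. onepass.
       data division.
       working-storage section.
      *    A line as READ would deliver it: data, then space padding.
       01  ws-line    pic x(42) value "4001942700,141,3".
      *    Receiving fields, separate from the source line.
       01  ws-geoid   pic 9(10).
       01  ws-sumlev  pic 9(3).
       01  ws-state   pic x.
       procedure division.
      *    "," ends the first two fields; the run of trailing
      *    spaces ends the third, so no padding is carried along.
           unstring ws-line delimited by "," or all spaces
               into ws-geoid ws-sumlev ws-state
           end-unstring
           display ws-geoid " " ws-sumlev " " ws-state
           stop run.
```

Because the third field now stops at the first space, the padding never reaches a numeric receiving field, which is what the intermediate temp-string step was working around.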

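Cheers!

**** Update ****

I have read Mr. Woodger's reply below, and I would like to ask for a bit more help with this. I have also read a similar post, "COBOL read/store in table", but it is above my level at the moment.

That post is pretty much what I am trying to do, but I do not understand some of the things Mr. Woodger is trying to explain. The code below is a bit more refined, and I have put some questions in it as comments. I would very much appreciate some help, or an offline conversation if that is possible.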

`identification division.
  * I do not know what 'endat' is
         program-id.endat. 
         environment division.
         input-output section.
   file-control.
  * assign a file path to in-file
             select in-file assign to "C:/Users/Shittin Kitten/Google Drive/Embry-Riddle/Spring 2017/CS332/group_project/cobol1/cobol1/test.csv"
  *  Is line sequential what I need here?  I think it is
                organization is line sequential.
  *  Is the data division similar to typedef in C?
         data division.
  *  Does the file section belong to data division?
         file section.
  * Am I doing this correctly?  Should this be below?
         fd in-file.  
  * I believe I am defining a structure at this point
   01 in-record.
      05 record-table.
                 10 geoid     occurs 3 times        pic A(10).
                 10 sumlev   occurs 3 times       pic A(3).
                 10 state       occurs 3 times       pic A(1).
  * To me the working-storage section is similar to ADA declarative section
  *  is this a correct analogy?
         working-storage section.
  * Is this where in-record should go?  Is in-record a representative name?
    01 eof-switch pic X value "N".
    01 rec-counter pic 9(1).
  *  I don't know if I need these 
    78 NL  value X"0A".
    78 TAB value X"09".
    01 sort-col pic 9(1).
  ********************************* Start of Program ****************************
        *Now the procedure division, this is a lot like ada to me
         procedure division.
  * Open the file
     perform 100-initialize.
  *  Read data
       perform 200-process-records
  *  loop until eof
         until eof-switch = "Y".
  *  ask user to sort by a column    
     display "Would which column would you like to bubble sort? " & TAB.
  *  get user input
     accept sort-col.
  * close file
     perform 300-terminate.
  * End program
   stop run.
  ********************************* End of Program ****************************

  ******************************** Start of Paragraph 1  ************************
     100-initialize.
       open input in-file.
  *   Performing a read, what is the difference in this read and the next one
  *   paragraph 200?  Why do I do this here instead of just opening the file?
       read in-file 
         at end
           move "Y" to eof-switch
         not at end
  *       Should I do this addition here? Also why a semicolon?
           add 1 to rec-counter;
       end-read.
  *    Should I not be unstringing here?
       Unstring in-record delimited by "," into geoid of record-table, 
                       sumlev of record-table, state of record-table.
  ******************************** End of Paragraph 1  ************************

  ********************************* Start of Paragraph  2 **********************
   200-process-records.

       read in-file into in-record
         at end move "Y" to eof-switch
         not at end add 1 to rec-counter;
       end-read.

  *   Should in-record be something else?  I think so but don't know how to
  *   declare and use it
       Unstring in-record delimited by ","  into 
           geoid in record-table(rec-counter), 
           sumlev in record-table(rec-counter), 
           state in record-table(rec-counter).

  *  These lines seem to give the printed format that I want
     display "GEOID  " & TAB &">> " & TAB & geoid of record-table(rec-counter).
     display "SUMLEV  >> " & TAB & sumlev of record-table(rec-counter).
     display "STATE  "  & TAB &">> " & TAB & state of record-table(rec-counter) & NL.

  ********************************* End of Paragraph 2  ************************    

  ********************************* Start of Paragraph 3  ************************
   300-terminate.
     display "number of records >>>> " rec-counter;
     close in-file;
  **************************** End of Paragraph 3  *****************************

`

score 1 · Accepted Answer

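I think there are some things you have not grasped yet, and which you need to.

In the DATA DIVISION there are a number of SECTIONs, each of which has a specific purpose.

The FILE SECTION is where you define the data structures that represent data on files (input, output, or input-output). Each file has an FD, and subordinate to an FD will be one or more 01-level structures, which can be extremely simple or very complex.

Some of the exact behaviour depends on the particular compiler's implementation, but you should treat things this way, for your own "minimal surprise" and for the sake of anyone who later has to amend your programs: for an input file, do not change the data after a READ unless you are going to update the record (if you are using a keyed READ, perhaps). You can regard the "input area" as a "window" onto your data file. On the next READ, the window points to a different position. Or you can regard it as "the next record arrives, obliterating what was there previously". You have put the "result" of your UNSTRING into the record area. That result will certainly disappear on the next READ. You may also (if the window model holds for your compiler, and depending on the mechanism it uses for IO) be squashing the data that "follows" as well.

Your result should be in WORKING-STORAGE, where it will remain undisturbed by new records being read.

READ filename INTO data-description is an implicit MOVE of the data from the record area to the data-description. If, as you have specified, the data-description is the record area itself, the result is "undefined". If you only want the data in the record area, a plain READ filename is all you need.

Your original UNSTRING has a similar problem. You have source and target fields that reference the same storage. That is "undefined", and not the result you want. This is why the unnecessary extra UNSTRING "worked".

You have a redundant inline PERFORM. You process "something" after end-of-file. You make things more convoluted by using unnecessary "punctuation" in the PROCEDURE DIVISION (which you have apparently omitted from what you pasted). Try using ADD instead of COMPUTE there. Look at the use of FILE STATUS and of 88-level condition-names.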
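The points above can be sketched as one small program. This is only an illustration under stated assumptions, not a definitive version: the program name, the relative path "test.csv", the 80-byte record length, the in-file-status field, and the t- prefixed table names are all hypothetical. The record area is a plain PIC X buffer, the table lives in WORKING-STORAGE, the READ has no INTO, and end-of-file is detected through FILE STATUS with 88-level condition-names.

```cobol
       identification division.
       program-id. csvtable.
       environment division.
       input-output section.
       file-control.
      *    "test.csv" is an assumed path, for illustration only.
           select in-file assign to "test.csv"
               organization is line sequential
               file status is in-file-status.
       data division.
       file section.
       fd  in-file.
      *    The record area: a window onto the current line only.
       01  in-record               pic x(80).
       working-storage section.
       01  in-file-status          pic xx.
           88  in-file-ok          value "00".
           88  in-file-eof         value "10".
       01  rec-counter             pic 9(3) value zero.
      *    Results are kept here, undisturbed by later READs.
       01  record-table.
           05  table-entry         occurs 3 times.
               10  t-geoid         pic x(10).
               10  t-sumlev        pic x(3).
               10  t-state         pic x.
       procedure division.
           open input in-file
           if not in-file-ok
               display "open failed, status " in-file-status
               stop run
           end-if
      *    Priming read; each later read is the last step of the
      *    loop, so nothing is processed after end-of-file.
           read in-file
           perform until not in-file-ok or rec-counter >= 3
               add 1 to rec-counter
      *        Source (record area) and targets (table) do not
      *        share storage.
               unstring in-record delimited by ","
                   into t-geoid (rec-counter)
                        t-sumlev (rec-counter)
                        t-state (rec-counter)
               end-unstring
               read in-file
           end-perform
           if not in-file-eof and rec-counter < 3
               display "read stopped, status " in-file-status
           end-if
           close in-file
           display "number of records >>>> " rec-counter
           stop run.
```

The table fields are alphanumeric here so that the space padding after the last field is simply truncated. With numeric receiving fields the padding would need to be excluded, for example by adding ALL SPACES as a second delimiter.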

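You do not need a "new line" for DISPLAY, because you get one for free unless you use NO ADVANCING.

You do not need to "concatenate" in the DISPLAY either, because you get that for free as well.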
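A short sketch of both DISPLAY points, assuming hypothetical ws- fields loaded with the first row of the sample data. Operands are simply listed one after another, and each DISPLAY ends the line unless NO ADVANCING is given.

```cobol
       identification division.
       program-id. showrow.
       data division.
       working-storage section.
      *    Hypothetical fields holding one row of the sample data.
       01  ws-geoid   pic x(10) value "4001942600".
       01  ws-sumlev  pic x(3)  value "140".
       procedure division.
      *    Operands follow each other; no "&" is needed, and the
      *    line ends automatically.
           display "GEOID  >> " ws-geoid
           display "SUMLEV >> " ws-sumlev
      *    NO ADVANCING suppresses the automatic line end, so the
      *    next DISPLAY continues on the same line.
           display "STATE  >> " with no advancing
           display "4"
           stop run.
```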

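DISPLAY and its cousin ACCEPT are verbs (only intrinsic functions are functions in COBOL, unless your compiler supports user-defined functions), and they vary from compiler to compiler. If your compiler supports a SCREEN SECTION in the DATA DIVISION, you can format and process user input in "screens". If you were to use IBM's Enterprise COBOL, you would have only a very basic DISPLAY/ACCEPT.

You "declare a local variable". Do you? In what way? It is local to the program.

You can pick up a lot of tips by looking at the COBOL questions here from the past few years.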
