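I have a huge text file (500K lines) in which some records are split across multiple lines. I am trying to rejoin the split records so that each one appears on a single line. When a record is split, there is an empty line before the start of the next line. Right now I loop through every line, test whether it begins with the string "AAAA|" to decide whether it starts a new record, and otherwise concatenate it with the previous line. This takes a long time, and I am wondering whether there is a better way to do it. Note that some records are split into more than two lines, and every new record starts with "AAAA|".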
Input file:
AAAA|XXXX|YYYY|ZZZZ|532920-1*TYCO ELECTRONICS AMP#HDR4-2B-320-PSH2-A*CECO COMPONENT EQUIPMENT CO INC#
AAAA|XXXX|2342342|ADFADFS|A80386DX-33*INTEL CORP#
AAAA|SDFASF|234232322|saddfwq|ER412D-5A*TELEDYNE COMPONENTS#M39016/15-088L*QPL-39016#JMACD-5XL*HI-G INC#914S72101-10L*DRI RELAYS INC#M39016/15-082L*QPL-39016#3SBS1412A2*TYCO ELECTRONICS
CORP#
AAAA|XXXXXXX|5675423|XVASD|N74F132D-T*NXP SEMICONDUCTORS#74F132SC*FAIRCHILD SEMICONDUCTOR CORP#N74F132D*NXP SEMICONDUCTORS#MC74F132D*FREESCALE SEMICONDUCTOR INC#N74F132D,602*NXP SEMICONDU
CTORS#
AAAA|SDFASFSAS|23422|DFGAQWEWE|3SBS1411A2*TYCO ELECTRONICS CORP#914S70301-10L*DRI RELAYS INC#M39016/15-081L*QPL-39016#ER412D-26A*TELEDYNE COMPONENTS#JMACD-26XL*HI-G INC#M39016/15-087L*QPL
-39016#
AAAA|SFRQ3|34543534|NSGBSSDF|3SBS1223A2*TYCO ELECTRONICS CORP#914S60301-10L*DRI RELAYS INC#M39016/15-039L*QPL-39016#914S60301-09L*DRI RELAYS INC#M39016/15-051L*QPL-39016#ER412D-18A/S*TE
LEDYNE COMPONENTS#JMAPD-18XL*HI-G INC#
AAAA|ALSKFJ|1SFAE|ASLKFJSLKSAD|11163-164J*PHILIPS COMPONENTS#SEE_DRAWING_11163-164J*ROHM CO LTD#CF1/4L_164J*KOA SPEER ELECTRONICS INC#SEE_DRAWING_11163-164J*PHILIPS COMPONENTS#CF1/4L
U164J*KOA SPEER ELECTRONICS INC#CF1/4-160K-5%*KOA SPEER ELECTRONICS INC#11163-164J*ROHM CO LTD#131-00164-0053*HONEYWELL CROSS REFERENCE#CF1/4CT52A164J*KOA SPEER ELECTRONICS INC#CF1/4CT52R164J*KOA SPEE
R ELECTRONICS INC#||
AAAA|ASDFAA|1ASFSDAS|ASDFSA|MF 55 D 4323 F*KOA SPEER ELECTRONICS INC#2322156X4324*BC COMPONENTS INC#MF1/4DLT52R4323F*KOA SPEER ELECTRONICS INC#2322 156 X 4324*BC COMPONENTS INC#SFR55432K0
1%*BC COMPONENTS INC#CCF-55 4323 F*VISHAY DALE#CCF-554323F*VISHAY DALE#MF1/4DL_4323F*KOA SPEER ELECTRONICS INC#RN55D4323F*MILITARY SPECIFICATIONS#SFR55 432K0 1%*BC COMPONENTS INC#MF55D4323F*KOA SPEER
ELECTRONICS INC#||
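
For reference, the approach described above (start a new record at every line beginning with "AAAA|" and append everything else to the current record) can be sketched in Python as a single streaming pass. This is only a minimal sketch under some assumptions: the function name `join_split_records` is made up for illustration, blank lines are treated as pure separators and dropped, and fragments are joined with no separator, on the assumption that a split can fall in the middle of a word (as in "SEMICONDU" / "CTORS#") and that any space at the break is still present at the end of the preceding line in the real file.

```python
from typing import Iterable, Iterator

# Marker that begins every new record, as stated in the question.
RECORD_PREFIX = "AAAA|"


def join_split_records(lines: Iterable[str],
                       prefix: str = RECORD_PREFIX) -> Iterator[str]:
    """Yield one string per record, rejoining records split over several lines.

    Hypothetical helper: a line starting with `prefix` opens a new record,
    any other non-blank line is a continuation of the current record.
    """
    parts = []  # fragments of the record currently being assembled
    for raw in lines:
        # Strip only the line terminator so trailing spaces survive the join.
        line = raw.rstrip("\r\n")
        if not line:
            continue  # blank separator line between fragments: ignore it
        if line.startswith(prefix) and parts:
            # A new record begins, so the previous one is complete.
            yield "".join(parts)
            parts = []
        parts.append(line)
    if parts:
        # Flush the final record once the input is exhausted.
        yield "".join(parts)
```

Since a file object is itself an iterable of lines, passing the open file to `join_split_records` and writing each yielded record followed by a newline processes the file in one pass without loading it into memory. Collecting fragments in a list and joining once per record also avoids repeated string concatenation, which may be where the time is going in a line-by-line loop.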