As with most parsing problems, I try to build a grammar that best describes the elements of the input format.
In this case, we have nouns:
[comma ending value-chars qmark quoted-chars value header row]
Some verbs:
[row-feed emit-value]
And the operational nouns:
[current chunk current-row width]
I suppose I could break it down a little more, but this is enough to work with. First, the foundation:
comma: ","
ending: "^/"
qmark: {"}
value-chars: complement charset reduce [qmark comma ending]
quoted-chars: complement charset reduce [qmark]
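As a quick check of the foundation, here is a minimal sketch of how the value-chars set behaves. It assumes Rebol 2 (the code in this answer relies on parse/all), and the sample strings are made up for illustration:

```rebol
comma: ","
ending: "^/"
qmark: {"}
value-chars: complement charset reduce [qmark comma ending]

; Made-up inputs: an unquoted value may contain anything except
; a quote, a comma, or a newline.
probe parse/all "plain value" [some value-chars]    ; true
probe parse/all "two,values" [some value-chars]     ; false: matching stops at the comma
```

Because the set is built with complement, everything that is not one of the three special characters counts as part of a value, spaces included.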
Now the value structure. Quoted values are built up from chunks of valid characters or doubled quotes as we find them:
current: chunk: none
quoted-value: [
    qmark (current: copy "")
    any [
        copy chunk some quoted-chars (append current chunk)
        |
        qmark qmark (append current qmark)
    ]
    qmark
]
value: [
    copy current some value-chars
    | quoted-value
]
emit-value: [
    (
        delimiter: comma
        append current-row current
    )
]
emit-none: [
    (
        delimiter: comma
        append current-row none
    )
]
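To see the quoted-value rule on its own, the sketch below repeats just the definitions it needs and runs it against a made-up field. It assumes Rebol 2; the input string is hypothetical:

```rebol
qmark: {"}
quoted-chars: complement charset reduce [qmark]
current: chunk: none

quoted-value: [
    qmark (current: copy "")
    any [
        copy chunk some quoted-chars (append current chunk)
        |
        qmark qmark (append current qmark)   ; a doubled quote becomes one quote
    ]
    qmark
]

; Made-up field: "say ""hi""" (outer quotes delimit, doubled quotes escape)
probe parse/all {"say ""hi"""} quoted-value   ; true
print current                                  ; say "hi"
```

Since quoted-chars excludes only the quote mark, a quoted value may also contain commas and line endings.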
Note that delimiter is set to ending at the beginning of each row, then changed to comma as soon as we pass a value. Thus, an input row is defined as [ending value any [comma value]].
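The switching delimiter works because parse looks up a rule word each time it reaches it, so a paren expression can change what the word matches on the next pass. A minimal sketch of the idea, assuming Rebol 2; the names digit, item, values, text and matched? are hypothetical and not part of the parser:

```rebol
digit: charset "0123456789"
values: copy []
text: none
delimiter: "^/"    ; a row starts by expecting a line ending
item: [copy text some digit (append values text)]

; The made-up input starts with a newline because, in the grammar here,
; a row begins at the line ending left behind by the previous row.
matched?: parse/all "^/1,2,3" [
    3 [delimiter item (delimiter: ",")]
]
probe matched?    ; true
probe values      ; ["1" "2" "3"]
```

The same rule body therefore covers both the first field of a row and every later field, which is what lets the row rule below repeat a single pattern width times.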
All that remains is to define the document structure:
current-row: none
row-feed: [
    (
        delimiter: ending
        append/only out current-row: copy []
    )
]
width: none
header: [
    (out: copy [])
    row-feed any [
        value comma
        emit-value
    ]
    value body: ending :body
    emit-value
    (width: length? current-row)
]
row: [
    row-feed width [
        delimiter [
            value emit-value
            | emit-none
        ]
    ]
]
if parse/all stream [header some row opt ending][out]
Wrap it all up in use to keep those words local, and you have:
REBOL [
    Title: "CSV Parser"
    Date: 19-Nov-2012
    Author: "Christopher Ross-Gill"
]
parse-csv: use [
    comma ending delimiter value-chars qmark quoted-chars
    value quoted-value header row
    row-feed emit-value emit-none
    out current current-row width
][
    comma: ","
    ending: "^/"
    qmark: {"}
    value-chars: complement charset reduce [qmark comma ending]
    quoted-chars: complement charset reduce [qmark]
    current: none
    quoted-value: use [chunk][
        [
            qmark (current: copy "")
            any [
                copy chunk some quoted-chars (append current chunk)
                |
                qmark qmark (append current qmark)
            ]
            qmark
        ]
    ]
    value: [
        copy current some value-chars
        | quoted-value
    ]
    current-row: none
    row-feed: [
        (
            delimiter: ending
            append/only out current-row: copy []
        )
    ]
    emit-value: [
        (
            delimiter: comma
            append current-row current
        )
    ]
    emit-none: [
        (
            delimiter: comma
            append current-row none
        )
    ]
    width: none
    header: [
        (out: copy [])
        row-feed any [
            value comma
            emit-value
        ]
        value body: ending :body
        emit-value
        (width: length? current-row)
    ]
    row: [
        opt ending end break
        |
        row-feed width [
            delimiter [
                value emit-value
                | emit-none
            ]
        ]
    ]
    func [stream [string!]][
        if parse/all stream [header some row][out]
    ]
]
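A short usage sketch, assuming the script above has been loaded in Rebol 2 so that parse-csv is defined. The sample data and the names sample, result and record are made up for illustration:

```rebol
; Hypothetical input: a header row, a row with an escaped quote,
; and a row whose last field is empty.
sample: {name,note^/alice,"likes ""quotes"""^/bob,^/}

result: parse-csv sample

either result [
    ; On success the result is a block of rows, header first.
    foreach record result [probe record]
][
    ; if returns none when parse/all does not match the whole input.
    print "input did not match the grammar"
]
```

Going by the rules above, a missing field in a data row is recorded as none by emit-none, and every data row is read as exactly width fields, where width is taken from the header.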