The recommended way to import text is to edit the file and save it as a Scheme file that defines a variable:
(define data "the text in
mydata.scm here")
Then call:
(load "mydata.scm")
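Often, though, a data file cannot be edited and saved as a Scheme file. Newlines inside the string are handled automatically, but double quotes are not, and that causes problems when the file is loaded.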
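To illustrate the quoting problem, here is a small sketch with made-up text. A bare double quote inside the text would end the string literal early, so each one has to be escaped by hand as \" before the file can be loaded, while the line break needs no change:

```scheme
; Made-up sample text (an assumption for illustration only).
; Each embedded double quote is written as \" ; the newline is left as-is.
(define data "She said \"hello\"
and left.")

(display data)
(newline)
```

For a large data file with many quotes, this hand editing is the step that becomes impractical.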
Some implementation-specific techniques are:
;Chicken
(use utils)
(read-all "mydata.txt")
;Racket
(file->string "mydata.txt")
A more portable function is:
;works in chicken-csi and Racket
(define (readlines filename)
  (call-with-input-file filename
    (lambda (p)
      (let loop ((line (read-line p))
                 (result '()))
        (if (eof-object? line)
            (reverse result)
            (loop (read-line p) (cons line result)))))))
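Because read-line requires an extra file, running an executable compiled with chicken-csc gives an error.
The most portable way to read a file is this function: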
;works in Chicken, Racket, SISC
;Read a file to a list of chars
(define (file->char_list path)
  (call-with-input-file path
    (lambda (input-port)
      (let loop ((x (read-char input-port)))
        (cond
          ((eof-object? x) '())
          (else (cons x (loop (read-char input-port)))))))))
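This function is reasonably fast and portable across implementations. All that remains is to convert the char_list into a string.
The simplest way is: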
;may not work if there is limit on arguments
(apply string (file->char_list "mydata.txt"))
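The problem is that some implementations limit the number of arguments that can be passed to a function. A list of 2049 characters does not work in Chicken.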
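One possible way around the argument limit, offered as a sketch rather than something tested on the implementations named here: the standard R5RS procedure list->string takes the whole list as a single argument, so apply is not involved. In the example, make_char_list is a hypothetical helper that stands in for the output of file->char_list.

```scheme
; Hypothetical helper (not part of the original text): build a list
; containing n copies of the character c.
(define (make_char_list n c)
  (let loop ((i 0) (acc '()))
    (if (= i n)
        acc
        (loop (+ i 1) (cons c acc)))))

; list->string receives one argument (the list), however long it is.
(define big (list->string (make_char_list 5000 #\x)))

(display (string-length big)) ; prints 5000
(newline)
```

With a real file, the call would be (list->string (file->char_list "mydata.txt")), assuming file->char_list is defined as above.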
Another way is:
;works in Chicken, Racket
(foldr (lambda (x y) (string-append (string x) y)) "" (file->char_list "mydata.txt"))
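There are two problems. First, foldr is not universally available (SISC lacks it), although it can be defined. Second, because each character is appended one at a time, this method is very slow.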
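Where foldr is missing, a minimal right fold could be defined along these lines. This is a sketch, not the definition used by any particular implementation, and it is only needed where foldr is not already provided:

```scheme
; Minimal right fold: combines the elements of lis from right to left,
; starting from init.
(define (foldr f init lis)
  (if (null? lis)
      init
      (f (car lis) (foldr f init (cdr lis)))))

; Same use as above, on a short literal list instead of a file.
(display (foldr (lambda (x y) (string-append (string x) y))
                ""
                (list #\a #\b #\c))) ; prints abc
(newline)
```

Defining foldr removes the portability problem but not the speed problem: every step still copies the string built so far.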
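I wrote the next two functions to split the char list into nested lists until the lowest level no longer exceeds the maximum argument count in Chicken. The third function walks the nested char list and returns a string using string and string-append: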
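(define (cleave_at n a)
  (cond
    ; test n first so that (cleave_at 0 '()) still returns two parts
    ((zero? n) (list '() a))
    ; list shorter than n: everything is already in the first part
    ((null? a) (list '() '()))
    (else
      (let ((x (cleave_at (- n 1) (cdr a))))
        (cons (cons (car a) (car x)) (cdr x))))))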
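(define (cleave_binary_nest n a)
  (cond
    ((equal? n (length a)) (list a))
    (else
      (let ((x (cleave_at (quotient (length a) 2) a)))
        (cond
          ; the second half is never shorter than the first, so test that one;
          ; testing only the first half can leave n + 1 elements unsplit
          ((> (length (cadr x)) n)
           (map (lambda (y) (cleave_binary_nest n y)) x))
          (else x))))))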
(define (binary_nest_char->string a)
  (cond
    ((null? a) "")
    ((char? (car a)) (apply string a))
    (else (string-append
            (binary_nest_char->string (car a))
            (binary_nest_char->string (cdr a))))))
The functions are called like this:
;Works in Racket, Chicken, SISC
;faster than foldr method (3x faster interpreted Chicken) (30x faster compiled Chicken) (125x faster Racket gui)
(binary_nest_char->string (cleave_binary_nest 2048 (file->char_list "mydata.txt")))
To keep only alphabetic characters and spaces, there are two more functions:
(define (alphaspace? x)
  (cond
    ((and (char-ci>=? x #\a) (char-ci<=? x #\z)) #t)
    ((equal? x #\space) #t)
    (else #f)))
(define (filter pred lis)
  ; if lis is empty
  (if (null? lis)
      ; return an empty list
      '()
      ; otherwise, if the predicate is true on the first element
      (if (pred (car lis))
          ; return the first element consed onto the
          ; result of calling filter on the rest of lis
          (cons (car lis) (filter pred (cdr lis)))
          ; otherwise (if the predicate was false) just
          ; return the result of filtering the rest of lis
          (filter pred (cdr lis)))))
(define data (file->char_list "mydata.txt"))
(define data_alphaspace (filter alphaspace? data))
(define result (binary_nest_char->string (cleave_binary_nest 2048 data_alphaspace)))
This works in Racket, Chicken (interpreted and compiled), and SISC (Java). Each of these implementations should also work on Linux, Mac (OS X), and Windows.