我使用 Attoparsec 编写了一个简单的文本 STL(标准镶嵌库)解析器。STL 包含一组方面。每个面包含一个法线和三角形的顶点。典型的 STL 文件可能很大(~100 Mb 或更多)。
STL 文件格式为
solid
facet normal nx ny nz
outer loop
vertex x1 y1 z1
vertex x2 y2 z2
vertex x3 y3 z3
endfacet
endsolid
解析器代码在这里。完整的实现可以在STLReader找到
-- | Point represents a vertex of the triangle
data Point a = Point !a !a !a deriving Show
-- | Vector for a given class
data Vector a = Vector !a !a !a deriving Show
-- | Parse coordinate triplet.
coordinates :: (Fractional a) => Text -> (a -> a -> a -> b) -> Parser b
coordinates s f = do
skipSpace
string s
!x <- coordinate
!y <- coordinate
!z <- coordinate
return $! f x y z
where
coordinate = skipWhile isHorizontalSpace *> fmap realToFrac double
{-# INLINE coordinates #-}
type RawFacet a = (Vector a, Point a, Point a, Point a)
-- | Parse a facet. The facet comprises of a normal, and three vertices
facet :: Fractional a => Parser (RawFacet a)
facet = (,,,) <$> beginFacet
<* (skipSpace *> "outer loop")
<*> vertexPoint
<*> vertexPoint
<*> vertexPoint
<* (skipSpace <* "endloop" <* endFacet )
<?> "facet"
where
beginFacet = skipSpace <* "facet" *> coordinates "normal" Vector
endFacet = skipSpace <* string "endfacet"
vertexPoint = coordinates "vertex" Point
{-# INLINE facet #-}
rawFacets :: Fractional a => Parser [RawFacet a]
rawFacets = beginSolid *> many' facet <* endSolid
where
solidName = option "default" (skipWhile isHorizontalSpace *> fmap T.pack (many1 $ satisfy isAlphaNum) )
beginSolid = skipSpace <* "solid" *> solidName <?> "start solid"
endSolid = skipSpace <* "endsolid" <?> "end solid"
-- | Read text STL file. STL extensions for color etc. are not supported in this version.
readTextSTL :: Fractional a => FilePath -> IO (Either String [RawFacet a])
readTextSTL path = liftM (Al.eitherResult . Al.parse rawFacets) (TIO.readFile path)
main :: IO Int
main = do
(path:_) <- getArgs
putStrLn $ "Parsing STL file: " ++ path
s <- readTextSTL path
putStrLn "Parsing complete"
case s of
Left error -> putStrLn error
Right s -> putStrLn $ "Num facets : " ++ show (length s)
return 0
我使用Meshlab提供的“c”解析器对该代码进行了基准测试。当我使用 69Mb 扫描进行测试时,meshlab 在大约 13 秒内完成了这项工作,而使用 attoparsec 则需要 22 秒。因此,尽管 attoparsec 使我能够比 Parsec 更快地解析(使用 parsec 大约需要 36 秒),但我还有很长的路要走。
我怎样才能进一步改进这个解析器?