ParseUtils provide useful functions that can be used in Parser. Detailed documentation about Parsec library, that is used in this module, can be found at: http://research.microsoft.com/users/daan/parsec.html
> module ParseUtils where
>
> import qualified Text.ParserCombinators.Parsec as P
> import qualified Text.ParserCombinators.Parsec.Token as P
> import qualified Text.ParserCombinators.Parsec.Expr as P
> import qualified Text.ParserCombinators.Parsec.Language as P
> import Text.ParserCombinators.Parsec ((<|>))
>
> liplStyle = P.haskellStyle {
> P.commentLine = "#"
> , P.identLetter = P.alphaNum <|> P.oneOf "_'-"
> }
liplStyle is a LanguageDef. It is based on haskellStyle but defines line comment to start with #. And, identifers can include - (in Haskell, identifiers can't include -). Since liplStyle inherites most of the settings from haskellStyle, nested block comments ({- -}) are already defined.
> lexer = P.makeTokenParser liplStyle
When a parser (CharParser, GenParser...etc) is built from above lexer (which is a TokenParser), the parser will get proper tokens: line comments starting with # are discarded (so are nested block comments inside {- -}), identifiers are tokenized so that they can contain alphaNum or one of _'-...etc.
> ws = P.whiteSpace lexer
> mustSpaces = P.skipMany1 P.space >> ws
ws parses zero or more white spaces. musteSpace parses one or more white spaces.
> identStart = P.identStart liplStyle
> identLetter = P.identLetter liplStyle
> opLetter = P.opLetter liplStyle
identStart parses 1 character that can start an identifier (alphabet character). identLetter parses 1 character that can be in an identifier (alphaNum, one of _-'). opLetter parses 1 character that can be in an operator (:!+./<=>?...etc)
> parseHeadBody headChar bodyChar = do
> h <- headChar
> b <- P.many bodyChar
> return (h : b)
Given 2 character parsers, parseHeadBody parses a string that starts with one of headChar and continues with one of bodyChar.
> nat = P.many1 P.digit
>
> lbracket = P.char '[' >> P.spaces
> rbracket = P.spaces >> P.char ']'
>
> comma = (P.spaces >> P.char ',' >> P.spaces)
>
> lparen = P.char '(' >> P.spaces
> rparen = P.spaces >> P.char ')'
>
> lbrace = P.char '{' >> P.spaces
> rbrace = P.spaces >> P.char '}'
nat parses 1 or more digits. lbracket, rbracket, comma, lparen, rparen, lbrace, and rbrace parses [],(){} respectively with optional white spaces around them.