blancas.kern.core
The core Kern library.
Kern is a library of parser combinators for Clojure. It is useful for
implementing recursive-descent parsers based on predictive LL(1) grammars
with on-demand, unlimited look-ahead LL(*).
The main inspiration for Kern comes from Parsec, a Haskell library written
by Daan Leijen, as well as work by Graham Hutton, Erik Meijer, and William Burge.
The name Kern is a token of appreciation to Brian Kernighan (now at Princeton)
for his work on programming languages.
Daan Leijen
Parsec, a fast combinator parser, 2001
http://legacy.cs.uu.nl/daan/download/parsec/parsec.pdf
Graham Hutton and Erik Meijer
Monadic Parser Combinators, 1996
http://eprints.nottingham.ac.uk/237/1/monparsing.pdf
William H. Burge
Recursive Programming Techniques
Addison-Wesley, 1975
*tab-width*
dynamic
The number of columns to advance for a tab character.
By default, a tab takes four columns.
->PError
(->PError pos msgs)
Positional factory function for class blancas.kern.core.PError.
->PMessage
(->PMessage type text)
Positional factory function for class blancas.kern.core.PMessage.
->PPosition
(->PPosition src line col)
Positional factory function for class blancas.kern.core.PPosition.
->PState
(->PState input pos value ok empty user error)
Positional factory function for class blancas.kern.core.PState.
<$>
(<$> f p)
Parses p; if successful, it applies f to the value parsed by p.
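A minimal sketch of <$>, assuming Kern is on the classpath and that dec-num yields a number; the name next-num is hypothetical and used only for illustration.

```clojure
(use 'blancas.kern.core)

;; Parse a decimal integer, then apply inc to the parsed value.
(def next-num (<$> inc dec-num))

(value next-num "41")   ;; => 42
```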
<*>
(<*> p & more)
Applies one or more parsers; collects the results in a
vector, including nil values. If any parser fails, it
stops immediately and fails.
<+>
(<+> p & more)
Applies one or more parsers stopping at the first failure.
Flattens the result and converts it to a string.
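The difference between <*> and <+> can be sketched as follows, assuming Kern is on the classpath; abc-vec and abc-str are hypothetical names.

```clojure
(use 'blancas.kern.core)

;; <*> collects the individual results in a vector.
(def abc-vec (<*> (sym* \a) (sym* \b) (sym* \c)))

;; <+> applies the same parsers but joins the results into a string.
(def abc-str (<+> (sym* \a) (sym* \b) (sym* \c)))

(value abc-vec "abc")          ;; => [\a \b \c]
(value abc-str "abc")          ;; => "abc"
(:ok (parse abc-vec "abx"))    ;; => false, the third parser fails
```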
<:>
(<:> p)
Parses p; on failure it pretends it did not consume any input.
<<
(<< p q)
(<< p q & more)
Parses p followed by q; keeps p, skips q. If more parsers are
given, it keeps the first result and skips the rest.
<?>
(<?> p msg)
If parser p fails consuming no input, it replaces any Expecting
errors with a single Expecting with message msg. This helps to
produce more abstract and accurate error messages.
<|>
(<|> p q)
(<|> p q & more)
Tries p; if it fails without consuming any input, it tries q.
With more parsers, it will stop and succeed if a parser succeeds;
it will stop and fail if a parser fails consuming input; or it
will try the next one if a parser fails without consuming input.
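A sketch of how <|> interacts with consumed input, and how <:> restores the choice, assuming Kern is on the classpath; ab and ac are hypothetical parsers.

```clojure
(use 'blancas.kern.core)

(def ab (<*> (sym* \a) (sym* \b)))
(def ac (<*> (sym* \a) (sym* \c)))

;; ab consumes \a before failing on \c, so <|> never tries ac.
(:ok (parse (<|> ab ac) "ac"))       ;; => false

;; Wrapping ab in <:> makes its failure look empty, so ac gets a turn.
(value (<|> (<:> ab) ac) "ac")       ;; => [\a \c]
```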
>>
(>> p q)
(>> p q & more)
Parses p followed by q; skips p, keeps q. If more parsers are
given, it skips all but the last and keeps the result of the last.
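The two sequencing operators, << and >>, differ only in which result they keep. A minimal sketch, assuming Kern is on the classpath and that dec-num yields a number; tagged and terminated are hypothetical names.

```clojure
(use 'blancas.kern.core)

;; >> skips the leading # and keeps the number.
(def tagged (>> (sym* \#) dec-num))

;; << keeps the number and skips the trailing semicolon.
(def terminated (<< dec-num (sym* \;)))

(value tagged "#42")       ;; => 42
(value terminated "42;")   ;; => 42
```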
>>=
(>>= p f)
Binds parser p to function f, which gets p's value and returns
a new parser. Function f must take a single parameter. The
argument it receives is the value parsed by p, not p's return
value, which is a parser state record.
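Because the function passed to >>= builds the next parser from the value just parsed, the second step can depend on the first. A sketch, assuming Kern is on the classpath; tripled is a hypothetical name.

```clojure
(use 'blancas.kern.core)

;; Read any character, then require that same character twice more.
(def tripled
  (>>= any-char
       (fn [c] (times 2 (sym* c)))))

(value tripled "xxx")         ;; => [\x \x]
(:ok (parse tripled "xyy"))   ;; => false, \y is not \x
```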
alpha-num
Parses a letter or digit.
any-char
Succeeds with any character.
between
(between delim p)
(between open close p)
Applies open, p, close; returns the value of p.
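Both arities of between can be sketched like this, assuming Kern is on the classpath and that dec-num yields a number; parens and quoted are hypothetical names.

```clojure
(use 'blancas.kern.core)

;; Three-argument form: distinct open and close delimiters.
(def parens (between (sym* \() (sym* \)) dec-num))

;; Two-argument form: the same delimiter on both sides.
(def quoted (between (sym* \') dec-num))

(value parens "(42)")   ;; => 42
(value quoted "'42'")   ;; => 42
```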
bind
macro
(bind [& bindings] & body)
Expands into nested >>= forms and a function body. The pattern:
(>>= p1 (fn [v1]
(>>= p2 (fn [v2]
...
(return (f v1 v2 ...))))))
can be written more conveniently as:
(bind [v1 p1 v2 p2 ...] (return (f v1 v2 ...)))
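A concrete instance of the pattern above, assuming Kern is on the classpath and that dec-num yields a number; pair is a hypothetical parser for two integers separated by a comma.

```clojure
(use 'blancas.kern.core)

;; Bind each intermediate result to a name; _ discards the comma.
(def pair
  (bind [x dec-num
         _ (sym* \,)
         y dec-num]
    (return [x y])))

(value pair "3,4")   ;; => [3 4]
```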
char-seq
(char-seq rdr)
Returns characters from rdr as a lazy sequence.
rdr must be a java.io.Reader.
clear-empty
(clear-empty s)
Sets the parser state as not empty. Needed in compound parsers
where optional parsers at the end may leave an incorrect :empty
state for the parser as a whole.
dec-num
Parses a decimal integer delimited by any character that
is not a decimal digit.
def-
macro
(def- name & more)
Same as def, yielding a private def.
defn*
macro
(defn* name & more)
Same as def, yielding a dynamic def.
end-by
(end-by sep p)
Parses p zero or more times, separated and ended by applications
of sep; returns the results of p in a vector.
end-by1
(end-by1 sep p)
Parses p one or more times, separated and ended by applications
of sep; returns the results of p in a vector.
eof
Succeeds on end of input.
expect
(expect p msg)
Applies parser p; if it fails (regardless of input consumed)
it replaces any expecting errors with expecting msg. This is
similar to <?> but works even if some input was consumed.
expecting
(expecting msg s)
f->s
(f->s f)
(f->s f e)
Gets a character sequence from a file-like object.
fail
(fail msg)
Fails without consuming any input, having a single error
record with the passed message msg.
failed-empty?
(failed-empty? s)
Tests if s failed without consuming any input.
field*
(field* cs)
Parses an unquoted text field terminated by any character in cs.
float-num
Parses a simple fractional number without an exponent.
It is delimited by any character that is not a decimal
digit. It cannot start with a period; the first period
found must be followed by at least one digit.
fwd
macro
(fwd p)
Delays the evaluation of a parser that was forward-declared with
(declare) but has not been defined yet. For use in (def)s of no-arg
parsers, since the parser expression evaluates immediately.
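A sketch of fwd in a self-referential definition, assuming Kern is on the classpath and that dec-num yields a number; nested is a hypothetical parser for an integer wrapped in any number of parentheses.

```clojure
(use 'blancas.kern.core)

(declare nested)

;; Without fwd, the reference to nested would be evaluated while the
;; def is still being built; fwd defers the lookup until parse time.
(def nested
  (<|> dec-num
       (between (sym* \() (sym* \)) (fwd nested))))

(value nested "((7))")   ;; => 7
```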
get-position
(get-position s)
Gets the position in the input stream of a parser state.
get-state
(get-state s)
Gets the user state from the parser state record.
hex-digit
Parses a hexadecimal digit.
hex-num
Parses a hex integer delimited by any character that
is not a hex digit.
look-ahead
(look-ahead p)
Applies p and returns the result; it consumes no input.
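Since look-ahead leaves the input untouched, a following parser sees the same character again. A minimal sketch, assuming Kern is on the classpath; seen-twice is a hypothetical name.

```clojure
(use 'blancas.kern.core)

;; The first element comes from look-ahead, the second from actually
;; consuming the character.
(def seen-twice (<*> (look-ahead any-char) any-char))

(value seen-twice "xy")   ;; => [\x \x]
```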
lower
Parses a lower-case letter.
many
(many p)
Parses p zero or more times; returns the result(s) in a
vector. It stops when p fails, but this parser succeeds.
many-till
(many-till p end)
Parses zero or more p while trying end, until end succeeds.
Returns the results in a vector.
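A sketch of many-till collecting characters up to a terminator, assuming Kern is on the classpath; until-bang is a hypothetical name.

```clojure
(use 'blancas.kern.core)

;; Collect any characters until the first exclamation mark.
(def until-bang (many-till any-char (sym* \!)))

(value until-bang "hi!rest")   ;; => [\h \i]
```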
many0
(many0 p)
Like (many) but it won't set the state to :empty. Use instead of
(many) if it comes last to avoid overriding non-empty parsing.
many1
(many1 p)
Parses p one or more times and returns the result(s) in a
vector. It stops when p fails, but this parser succeeds.
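The difference between many and many1 shows up when p never matches. A sketch, assuming Kern is on the classpath; as* and as+ are hypothetical names.

```clojure
(use 'blancas.kern.core)

(def as* (many  (sym* \a)))
(def as+ (many1 (sym* \a)))

(value as* "aab")        ;; => [\a \a]
(value as* "b")          ;; zero matches still succeed, with an empty result
(:ok (parse as* "b"))    ;; => true
(:ok (parse as+ "b"))    ;; => false, at least one match is required
```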
map->PError
(map->PError m)
Factory function for class blancas.kern.core.PError, taking a map of keywords to field values.
map->PMessage
(map->PMessage m)
Factory function for class blancas.kern.core.PMessage, taking a map of keywords to field values.
map->PPosition
(map->PPosition m)
Factory function for class blancas.kern.core.PPosition, taking a map of keywords to field values.
map->PState
(map->PState m)
Factory function for class blancas.kern.core.PState, taking a map of keywords to field values.
mark
Succeeds with a punctuation mark.
member?
(member? x coll)
Tests if x is a member of coll.
modify-state
(modify-state f & more)
Modifies the user state with the result of f, which takes the old
user state plus any additional arguments.
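A sketch of threading user state through a parse, assuming Kern is on the classpath. It treats get-state as a parser in its own right, which is how its signature (get-state s) reads, and passes the initial user state through the us argument of value; count-as is a hypothetical name.

```clojure
(use 'blancas.kern.core)

;; Increment the user state once per \a, then yield the final count.
(def count-as
  (>> (many (>> (sym* \a) (modify-state inc)))
      get-state))

;; src is nil; the user state starts at 0.
(value count-as "aaa" nil 0)
```

With three matches, the user state should end at 3.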
none-of*
(none-of* cs)
Succeeds if the next character is not in the supplied string.
not-followed-by
(not-followed-by p)
Succeeds only if p fails; consumes no input.
oct-num
Parses an octal integer delimited by any character that
is not an octal digit.
one-of*
(one-of* cs)
Succeeds if the next character is in the supplied string.
option
(option x p)
Applies p; if it fails without consuming input, it returns a
parser state record with the :value x as default.
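A minimal sketch of option supplying a default, assuming Kern is on the classpath and that dec-num yields a number; num-or-zero is a hypothetical name.

```clojure
(use 'blancas.kern.core)

;; Fall back to 0 when no number is present.
(def num-or-zero (option 0 dec-num))

(value num-or-zero "7")   ;; => 7
(value num-or-zero "x")   ;; => 0, dec-num fails without consuming input
```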
optional
(optional p)
Succeeds if p succeeds or if p fails without consuming input.
parse
(parse p cs)
(parse p cs src)
(parse p cs src us)
Parses a character sequence; takes an optional label and a user
state initial value, which default to nil. Returns a PState record.
cs A seqable object; parse calls (seq) on this value.
src Identifies the source of the text, e.g. a filename.
us Initializes a field that is maintained by client code.
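A sketch of inspecting the record returned by parse, assuming Kern is on the classpath and that dec-num yields a number. The field names come from the ->PState and ->PPosition signatures; the source label "demo.txt" is made up for the example.

```clojure
(use 'blancas.kern.core)

(def st (parse dec-num "42 apples" "demo.txt"))

(:ok st)              ;; => true
(:value st)           ;; => 42
(:src (:pos st))      ;; => "demo.txt"
(apply str (:input st))   ; the input not yet consumed, as a string
```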
parse-data
(parse-data p cs)
(parse-data p cs src)
(parse-data p cs src us)
Works like (parse) but with error diagnostics disabled for
better performance. It's intended for data that can be
assumed to be correct or whose diagnosis can be postponed.
parse-data-file
(parse-data-file p f)
(parse-data-file p f en)
(parse-data-file p f en us)
Works like (parse-file) but with error diagnostics disabled for
better performance. It's intended for data files that can be
assumed to be correct or whose diagnosis can be postponed.
parse-file
(parse-file p f)
(parse-file p f en)
(parse-file p f en us)
Parses a file; takes an optional encoding and user state,
which default to utf-8 and nil. Returns a PState record.
predict
(predict p)
Applies p; if it succeeds it consumes no input.
print-error
(print-error s)
Prints error messages in a PState record.
put-state
(put-state u)
Puts u as the new value for user state in the parser state record.
reply
(reply v s)
Makes s succeed with value v.
return
(return v)
Succeeds without consuming any input. Any carried errors
are removed.
run
(run p cs)
(run p cs src)
(run p cs src us)
For testing parsers, e.g. at the REPL. Calls (parse) on the
arguments and prints the result. If p succeeds it prints the
parsed value; if it fails it prints any error messages.
run*
(run* p cs)
(run* p cs src)
(run* p cs src us)
For testing parsers, e.g. at the REPL. Works like (run) but
on success it pretty-prints the resulting parser state.
runf
(runf p f)
(runf p f en)
(runf p f en us)
For testing, e.g. at the REPL, with input from files.
Prints the results.
runf*
(runf* p f)
(runf* p f en)
(runf* p f en us)
For testing, e.g. at the REPL, with input from files.
Pretty-prints the results.
satisfy
(satisfy pred)
Succeeds if the next character satisfies the predicate pred,
in which case it advances the position of the input stream. It
may fail on an unexpected end of input.
search
(search p)
Applies a parser p, traversing the input as necessary,
until it succeeds or it reaches the end of input.
sep-by
(sep-by sep p)
Parses p zero or more times while parsing sep in between;
collects the results of p in a vector.
sep-by1
(sep-by1 sep p)
Parses p one or more times while parsing sep in between;
collects the results of p in a vector.
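A sketch of sep-by and sep-by1 on a comma-separated list, assuming Kern is on the classpath and that dec-num yields a number; csv and csv1 are hypothetical names.

```clojure
(use 'blancas.kern.core)

(def csv  (sep-by  (sym* \,) dec-num))
(def csv1 (sep-by1 (sym* \,) dec-num))

(value csv "1,2,3")       ;; => [1 2 3]
(value csv "")            ;; zero items is fine for sep-by
(:ok (parse csv1 ""))     ;; => false, sep-by1 needs at least one item
```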
sep-end-by
(sep-end-by sep p)
Parses p zero or more times, separated and optionally ended by sep;
collects the results in a vector.
sep-end-by1
(sep-end-by1 sep p)
Parses p one or more times, separated and optionally ended by sep;
collects the results in a vector.
set-position
(set-position pos)
Sets the position in the input stream of a parser state.
skip
(skip p)
(skip p q)
(skip p q & more)
Applies one or more parsers and skips the result. That is, it
returns a parser state record with a :value nil.
skip-many
(skip-many p)
Parses p zero or more times and skips the results. This is
like skip but it can apply p zero, one, or many times.
skip-many1
(skip-many1 p)
Parses p one or more times and skips the results.
skip-ws
(skip-ws p)
Skips whitespace before parsing p.
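A minimal sketch of skip-ws, assuming Kern is on the classpath and that dec-num yields a number; padded-num is a hypothetical name.

```clojure
(use 'blancas.kern.core)

;; Leading blanks are discarded; only the number is kept.
(def padded-num (skip-ws dec-num))

(value padded-num "   42")   ;; => 42
```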
space
Parses the space character.
split
Splits a string on whitespace.
split-on
(split-on cs)
Splits a string on one of the given characters and whitespace.
Removes empty strings from the result.
sym*
(sym* x)
Parses a single symbol x (a character).
sym-
(sym- x)
Parses a single symbol x (a character); not case-sensitive.
tab
Parses the tab character.
times
(times n p)
Applies p n times; collects the results in a vector.
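A sketch of times reading a fixed-width prefix, assuming Kern is on the classpath; first-three is a hypothetical name.

```clojure
(use 'blancas.kern.core)

;; Exactly three characters; anything after them is left unread.
(def first-three (times 3 any-char))

(value first-three "abcd")         ;; => [\a \b \c]
(:ok (parse first-three "ab"))     ;; => false, input ends too soon
```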
token*
(token* xs)
(token* xs & more)
Parses a specific string, not necessarily delimited. If more
than one are given it will try each choice in turn.
token-
(token- xs)
(token- xs & more)
Parses a specific string, not necessarily delimited; not
case-sensitive. If more than one are given it will try
each choice in turn.
unexpected
(unexpected msg s)
Sets s as failed because of an unexpected reason.
upper
Parses an upper-case letter.
value
(value p cs)
(value p cs src)
(value p cs src us)
Calls (parse) on the arguments and returns the actual parsed
value, not the PState record.
white-space
Parses a whitespace character.
word*
(word* letter cs)
(word* letter cs & more)
Parses a specific string, delimited by letter. If more than
one are given it will try each choice in turn.
word-
(word- letter cs)
(word- letter cs & more)
Parses a specific string, delimited by letter; not case-sensitive.
If more than one are given it will try each choice in turn.