I recently began to look at implementing the search part of the SCIM 2.0 Protocol (RFC-7644).
The RFC describes the query language using something called an ABNF. Here’s what the ABNF for the SCIM query language looks like.
SCIM filters MUST conform to the following ABNF [RFC5234] rules as
specified below:
FILTER = attrExp / logExp / valuePath / *1"not" "(" FILTER ")"
valuePath = attrPath "[" valFilter "]"
; FILTER uses sub-attributes of a parent attrPath
valFilter = attrExp / logExp / *1"not" "(" valFilter ")"
attrExp = (attrPath SP "pr") /
(attrPath SP compareOp SP compValue)
logExp = FILTER SP ("and" / "or") SP FILTER
compValue = false / null / true / number / string
; rules from JSON (RFC 7159)
compareOp = "eq" / "ne" / "co" /
"sw" / "ew" /
"gt" / "lt" /
"ge" / "le"
attrPath = [URI ":"] ATTRNAME *1subAttr
; SCIM attribute name
; URI is SCIM "schema" URI
ATTRNAME = ALPHA *(nameChar)
nameChar = "-" / "_" / DIGIT / ALPHA
subAttr = "." ATTRNAME
; a sub-attribute of a complex attribute
Figure 1: ABNF Specification of SCIM Filters
From RFC-7644
ABNF stands for Augmented Backus-Naur Form. It is a meta language used
to describe the grammer of a language.
Ruby seems to be lacking support for the SCIM query language so I wanted to see if there was a way that I could help fill that void. To accomplish this goal, I needed to learn the ABNF meta-language.
ABNF is described in RFC-5234.
As an aside, drop this shell script in your $PATH to make
reading RFCs much easier.
モ cat ~/bin/rfc
#!/bin/sh
mkdir -p ~/.rfc
rsync -avz --delete ftp.rfc-editor.org::rfcs-text-only ~/.rfc
less ~/.rfc/rfc"$1".txt
ABNF is broken down into two categories:
- Rule Definitions
- Operators
Rule Definitions
Let’s take a look at a single rule.
FILTER = attrExp / logExp / valuePath / *1"not" "(" FILTER ")"
The rule above has a name of FILTER. This name is case insensitive.
It then refers to 3 other rules, then itself.
.i.e.
Match the rule named `attrExp` or the rule named `logExp` or the rule
named `valuePath` or (an optional literal `not` followed by literal `(`
followed by rule named `FILTER` followed by literal `)`.
Expressed as ruby using Parslet rule definitions:
# FILTER = attrExp / logExp / valuePath / *1"not" "(" FILTER ")"
rule(:filter) do
(attribute_expression | logical_expression | value_path) |
(not_op? >> lparen >> filter >> rparen)
end
Operators
In the last section I showed a few examples of operators. In this section, let’s go over each operator in more detail.
Concatenation
The following is an example of the Concatenation operator.
"(" FILTER ")"
It says that a literal ( must be followed by a match for rule FILTER
then followed by a literal ).
This can be expressed in ruby via Parslet as:
(str('not').maybe >> str('(') >> filter >> str(')'))
Ruby’s shovel (») operator is used as the ABNF concatenation operator.
Alternatives
The following is an example of the Alternatives operator.
attrExp / logExp / valuePath
The alternatives operator allows a rule to be matched by supplying
alternative rules that can be matched. In the example above, this rule
is matched when either attrExp or logExp or valuePath rules are
matched.
Expressed via Parslet as:
(attribute_expression | logical_expression | value_path)
Ruby logical or operator is overloaded to support the ABNF
alternative operator.
Sequence Group
Elements enclosed in parentheses are treated as a single element, whose contents are strictly ordered. - RFC-5234
The following example uses a sequence group.
logExp = FILTER SP ("and" / "or") SP FILTER
It says that a FILTER rule must be followed by a literal space
character ` , then can be followed by a liternal and or a literal
or` then followed by a space and another filter.
Expressed as ruby using Parslet:
rule(:logical_expression) do
filter >> space >> (and_op | or_op) >> space >> filter
end
Variable Repetition
The operator “*” preceding an element indicates repetition - RFC-5234
For example:
*1"not"
str('not').repeat(0, 1)
The above example states at most 1 literal not.
ALPHA *(nameChar)
name_character.repeat(0, nil)
The above example says at 0 or more nameChar can be present.
1*2(nameChar)
name_character.repeat(1, 2)
The above example states that at least 1 nameChar but not more than 2
nameChars must be present.
Specific Repetition
element is equivalent to \* element
Optional Sequence
[foo bar] is equivalent to *1(foo bar).
attrPath = [URI ":"] ATTRNAME *1subAttr
(uri >> colon).repeat(0, 1) >> attribute_name >> sub_attribute.repeat(0, 1)
I hope this helps as a brief overview of how to map ABNF to Parslet.
The accompanying ruby source code can be found here.