Wednesday, November 3, 2021

Map Compasses (Part 2): Compass Design Language & Parsing

Welcome back to my blog series on implementing procedurally-generated map compasses!  Last time I looked at a variety of map compasses and saw how they broke down into combinations of pointers and circles and wrote a simple code framework.  This time I'm going to talk about the high level design of the code, and start creating a language to describe compasses.

I intend to organize the map compass code into two stages:
(That's some high-quality software engineering, folks!)

In the first step, I will generate the compass.  By this I mean deciding what pointers and circles to draw, how to size and order them, what size lines to use and so on.  The output of this step will be a description of the compass, i.e., something like “A 50 pixel radius, 5 pixel wide circle of alternating black and white segments, then 60 pixel radius black and white intercardinal pointers, then 75 pixel radius black and white cardinal pointers, and then labels on the cardinal points":


Of course, I won't use English for this description but rather “Compass Description Language (CDL)," a structured language much like my Map Border Description Language (MBDL).  As with MBDL, one use of the CDL will be to easily save and reuse compass designs I particularly like.  Another reason to use CDL is that it will let me implement some complex compass designs that would otherwise be difficult.  So my code will look like this:
In the second step I will draw the compass.  This involves parsing the CDL produced by the generator, and translating it into commands to render the compass onto the screen.  In a little bit more detail, it will look like this:
The Generate step will be built around a grammar written in a procedural generation tool called RiTa. That grammar will create a compass description in the Compass Description Language (CDL).  The CDL description will then be parsed by a tool called Nearley into an intermediate representation that is more “machine friendly."  The intermediate representation will then be executed by a Javascript execution engine.  The output of that will be the SVG commands to draw the compass.

This seems convoluted but using a rules grammar for procedural generation is common, and the pattern of “parse to an intermediate representation and execute that" is a typical pattern for building a language interpreter.  And because I'll be using a couple of parsing and grammar tools, less code will be required than you might think from looking at that diagram.  This architecture will also enable me to design, develop and test each function separately.

While I'm using SVG for Dragons Abound, other people use a variety of different graphic display technologies.  So I'll try to isolate the code that actually does the rendering in SVG, so that other users can easily swap in code to render in Canvas or whatever.

I'll actually implement middle-out.  I'll start by defining (a subset of) the Compass Description Language, and then I'll build the code to draw that.  This has a couple of advantages.  First, it gets to a point where I'm actually seeing things on the computer screen quickly, and that's always nice.  It helps keep me motivated to see progress.  Also, if something doesn't look right, I can notice that and correct it early.  Second, creating the CDL and trying out different things helps explore and understand how a compass is drawn and what works and what doesn't.  This will be useful when I get to the building the procedural generation.

Rather than tackle the whole CDL at once, I'm going to do it in parts.  Because the compass will be drawn back-to-front, I'm going to start by looking at the circular ornaments that are behind the pointers.  Here's an example compass with an interesting background ornamentation:
So how would I describe the background ornamentation on this compass?  I'll work my way outside in, and I'll take my starting point as the tips of the cardinal pointers.  So starting there and working in, first there's some empty space, then there's a circle, more empty space, a scale, more empty space, a circular arrangement of dots, more empty space, another circle, more empty space and then a filled black circle.  If I translate that into something a bit more formal, it might look like this:
SPACE
CIRCLE
SPACE
SCALE
SPACE
DOTS
SPACE
CIRCLE
SPACE
FILLED-CIRCLE
However, I note that the space between the last two circles is about twice as big as the other spaces.  I could put the SPACE command in there twice, but it seems more general to add a parameter to the space command indicating how many pixels of space to leave:
SPACE(5)
CIRCLE
SPACE(5)
SCALE
SPACE(5)
DOTS
SPACE(5)
CIRCLE
SPACE(10)
FILLED-CIRCLE
Likewise, I can add parameters for the line width on the circles and the scales.
SPACE(5)
CIRCLE(1)
SPACE(5)
SCALE(8)
SPACE(5)
DOTS
SPACE(5)
CIRCLE(1)
SPACE(10)
FILLED-CIRCLE
Circles and filled circles only differ in whether they are filled, so I can make that a parameter as well:
SPACE(5)
CIRCLE(1, None)
SPACE(5)
SCALE(8)
SPACE(5)
DOTS
SPACE(5)
CIRCLE(1, None)
SPACE(10)
CIRCLE(0, Black)
Adding the fill color makes me realize that I've been assuming that all the lines in the compass will be drawn in black (and indeed they almost always are) but it might make sense to add that as a parameter as well, so that I at least have the option to use a different color:
SPACE(5)
CIRCLE(1, Black, None)
SPACE(5)
SCALE(8)
SPACE(5)
DOTS
SPACE(5)
CIRCLE(1, Black, None)
SPACE(10)
CIRCLE(0, Black, Black)
I'll return later to scales and dots, but you start to see the structure of CDL for these kinds of ornaments.  In particular, there are two CDL commands that look pretty complete:
  • SPACE(n) -- Move inward n pixels before drawing the next element.
  • CIRCLE(lineWidth, lineColor, fillColor) -- Draw a circle with a line that is lineWidth pixels wide with color lineColor, and fill the circle with color fillColor.
That seems like enough specification to implement these two commands, so I'll start with this as the core of CDL.  (Of course I might be wrong about this and have to update these commands later.)

Now I need to be able to take a compass described in CDL and turn it into the actions to draw the compass on the screen.  This is similar to getting from the written representation of a computer program (e.g., in Javascript) to actually executing the program.  The first step is to parse the language -- to translate it from the original text directly to an actual compass, or into some intermediate representation that makes it easier to translate into a compass.

When I worked on procedural generation of map borders, I created and parsed a Map Border Description Language (MBDL) and I'll do the same thing with CDL.  Both MBDL and CDL are examples of what is called a context-free grammar.  There are any number of Javascript parsing tools for parsing context-free grammars.  I settled on using Nearley.js, which seems to be a mature and (more importantly) well-documented tool.

I'm not going to provide a primer on context-free grammars, but the syntax of Nearley is pretty straightforward and you should be able to get the idea without too much trouble.  A Nearley grammar consists of a number of rules.  Each rule has a symbol on the left side, an arrow, and then right side of the rule, which can be a sequence of symbols or non-symbols, and different options separated by the pipe “|" (or) operator:
compass -> Element:+
Element -> spaceElement | circleElement
Each rule says that the left side can be replaced by any of the options on the right side.  So the first rule here says that a compass is one or more elements.  (The :+ sign after Element is the EBNF shorthand for “one or more of the preceding symbol.")  The second rule says that a background element can be a space or a circle.

By the way, if  you want to play around with this (or any other) grammar in Nearley there's an online playground here.   You can type in a grammar and test cases to see what matches and what doesn't.

As I indicated above, a space element has a number as a parameter:
@builtin “number.ne"
spaceElement -> “SPACE" “(" decimal “)"
There are a couple of new things here.  First, there are some strings in this new rule.  Strings are the non-symbols or terminals of the grammar, so they represent something that appears in exactly that way in the language.  In this case, the rule says that a spaceElement consists of the literal string “SPACE" followed by the literal string “(" followed by a decimal followed by the literal string “)".  The other new element is decimal.  Nearley has a couple of common elements built-in -- “decimal" being one of them -- so I can use that to recognize the numeric width in the space element.  Decimal is exactly what you think -- a decimal number.

So the rule above would recognize these examples as spaceElements:
SPACE(1)
SPACE(11.5)
SPACE(-45) 
However, it would not recognize the following examples:
MOVE(1)
space(1)
SPACE( 1 ) 
It's good that it doesn't recognize MOVE(1) as a spaceElement, but the other two failures are a little annoying.  It would be nice to have a little more flexibility in formatting.  To begin with, in Nearly you can add an “i" to the end of any string in a rule to indicate that it is case insensitive:
spaceElement -> “SPACE"i “(" decimal “)"
The new rule will recognize “SPACE," “space," and “Space" as well as other variants.  Ignoring whitespace is a little more work.  First let me define a rule to recognize whitespace:
WS -> [ \t\n\v\f]:*
This rule uses a number of new features.  The square brackets indicate a character set, and matches any character in the set.  For example, “[abc]" would match any single a, b or c character.  In this case, the character set matches spaces, tabs (\t), newline (\n), vertical tabs (\v) and form feed (\f) characters.  Finally, the “:*" means to match “zero or more" of the preceding expression.  So put all together, this means “zero or more of any white space character."  So this will match a space, two spaces, two spaces and a tab, and so on.

Now I can put this whitespace symbol anywhere that I want to be able to ignore white space.  For example:
spaceElement -> “SPACE"i WS “(" WS decimal WS “)" WS
Now my spaceElement will ignore whitespace between the various parts of the element.  (All the WS elements can make it difficult to read a rule, so I may leave it out for clarity in the blog posts.)

Here's the rule for the circle element:
@builtin “string.ne"
circleElement -> “CIRCLE" “(" decimal “," dqstring “," dqstring “)"
The new element here (dqstring) is another Nearley builtin, and it matches a string enclosed in double quotes.  So a circleElement has a number and two strings as its parameters.

So our full grammar looks like this:
@builtin “number.ne"
@builtin “string.ne"
Compass -> Element:+
Element -> spaceElement | circleElement
WS -> [\s]:*
spaceElement -> “SPACE" WS “(" WS decimal WS “)" WS
circleElement -> “CIRCLE" WS “(" WS decimal WS “," WS dqstring WS “," WS dqstring WS “)" WS
This grammar should recognize the following compassBG description:
SPACE(5)
CIRCLE(1, "Black", "None")
SPACE(5)
CIRCLE(1, "Black", "None")
SPACE(10)
CIRCLE(0, "Black", "Black")
Using Nearley is a little more complicated than most Javascript libraries because you must pre-compile your Nearley grammar before using it.  To do this, I put the above grammar into a file named “cdl.ne" and compiled it with “nearleyc cdl.ne -o cdl.js".  This creates a Javascript file that defines the grammar and can be included in your code with
import CDL from './cdl.js';
I've included the compiled file in the repository for this project, but if you want to compile your own Nearley grammars (or modify this one), you'll need to install the Nearley compiler on your own machine.  (If you're content to use the compiled files I provide then you don't need to bother.)  You also have to include the Nearly library in your program. I've put this in the Libraries folder of this project.  The easiest way to include it is to add this line to the HTML for the web page:
<script src="Libraries/nearley.js"></script>
To use the parser, you create a new Parser object based upon the grammar, and then feed it a string:
function parseCDL(text, debug=false) {
    // Create a Parser object from our grammar.
    const parser = new Nearley.Parser(Nearley.Grammar.fromCompiled(CDL));
    // Parse text
    const result = parser.feed(text);
    if (debug) {
          console.log('Parse of '+text+' is '+parser.results[0]);
    };
    return parser.results[0];
};
feed() returns an array of results because the grammar might be ambiguous, resulting in multiple ways to parse an input.  However, the CDL grammar is not ambiguous so I'm only concerned with the first result.

A reminder that if you want to follow along from home, you need to download the Part 2 branch from the Procedural Map Compasses repository on Github and get the test web page up and open the console.  Part 1 of this series has more detailed instructions, but the short version (on Windows) is to double-click mongoose-free-6.9.exe to start up a web server in the Part 2 directory, and then select the test.html link.  You can also try out this code on the web at https://dragonsabound-part2.netlify.app/test.html although you'll only be able to run the code, not modify it.

Either way, once you have the web page open you can hit the Test button on the web page, open the browser console, and you should see something like this:
Ignoring the actual result for the moment, you can see that the parse has succeeded and found all the elements of the compass description.

If you edit compass.js and change the string being passed to parseCDL into something illegal (here I've added 'XX' at the front of the string), you'll see what happens when the parse fails:


You get an error that tells you the character where the parse failed and what the parse was expecting to see instead.

To go back now to the actual result returned by the parser, if you look at it in the debugger you'll see it's an array with two items corresponding to the “spaceElement" and the “circleElement" from the input.  And each of these items are themselves arrays with items corresponding to the parts of the space or circle element.  This result corresponds to the parse tree of the input.  I won't actually be using this (for reasons that will become clear) so we can safely ignore it.

Now that we have a working grammar for the simple CDL subset, next time I'll tackle how to turn that into a drawing.

Suggestions to Explore

  • Right now CDL requires commands to be in upper case.  How would you modify the grammar to allow lower case and mixed case commands?

  • The SPACE command really moves inward and outward from the center.  It might make more sense to call it the MOVE command, or at least permit MOVE as an alias.  What's the best way to modify the grammar to allow both SPACE and MOVE as names for this command?

  • The CIRCLE command requires a fill color even if the color is “none" for no fill.  Add a new rule for a version of the CIRCLE command that only takes two arguments.  Can Nearley now parse both versions of the CIRCLE command?  How does it tell them apart?

No comments:

Post a Comment