Long Flights Are Good For Something

One thing long flights are good for is addressing old issues that you’ve always meant to fix but never really got around to. So when I found myself on a three-and-a-half-hour flight yesterday, I decided it was time to tackle an old parser syntax issue that has been nagging at me for some time.

My goal with the parser has always been to make it as robust as possible, even if much of the functionality isn’t necessarily used in Vespers — I’d hate to have to add functionality after the fact, like for a future game. So the idea was to develop the parser to provide at least the typical performance one might expect from a TADS or Inform game. Most of the early months of development were spent on engineering the parser, and it was quite the battle. Hard to say for sure, but I’d say I got about 85% of the way there.

One of the missing pieces was being able to parse a command using the syntax VERB CREATURE NOUN. For instance, SHOW CECILIA THE KEYS.

The easier form of the syntax is VERB NOUN ‘TO’ CREATURE, as in SHOW THE KEYS TO CECILIA. The preposition (TO) helpfully separates the two tokens in the command. The way I implemented the parser, after first parsing and removing the verb, I scan the remainder of the command until I hit either a preposition or the end of the command; the words collected from that scan make up a token. In the example above, THE KEYS is the token that comes before the preposition TO, and everything after the preposition (CECILIA) is the next token. Then I figure out which objects the two tokens refer to.
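For readers who want to see the scan spelled out, here is a minimal sketch in Python (not the parser's own script language, and not the actual Vespers code). The function name `split_command` and the short preposition list are assumptions made up for illustration; dropping THE up front mirrors what the parser does to every command.

```python
# Illustrative word lists; the real parser's lists are assumptions here.
ARTICLES = {"the"}
PREPOSITIONS = {"to", "with", "on", "in", "at"}

def split_command(command):
    """Return (verb, tokens); each token is a list of words.

    The first word is taken as the verb. The rest is cut into tokens
    wherever a preposition appears, and once more at the end.
    """
    words = [w for w in command.lower().split() if w not in ARTICLES]
    if not words:
        return "", []
    verb, rest = words[0], words[1:]
    tokens, current = [], []
    for word in rest:
        if word in PREPOSITIONS:
            tokens.append(current)   # a preposition closes the current token
            current = []
        else:
            current.append(word)
    tokens.append(current)           # the end of the command closes the last one
    return verb, tokens
```

With this sketch, `split_command("SHOW THE KEYS TO CECILIA")` gives `("show", [["keys"], ["cecilia"]])`, while `split_command("SHOW CECILIA THE KEYS")` gives `("show", [["cecilia", "keys"]])`: with no preposition to cut on, both nouns land in one token.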

The problem occurs when there is no preposition, but I’m expecting two consecutive nouns. So if the command is SHOW CECILIA THE KEYS, the parser would grab the entire phrase CECILIA THE KEYS as a single token. Actually, I automatically remove all occurrences of the definite article THE, so the token would really be CECILIA KEYS, which would be handled the same way as something like GOLD KEYS or SHINY KEYS. So the parser would look for an object called CECILIA KEYS, and then proceed to throw up.

Fixing that to allow the alternative syntax was something I always knew was important and would have to be done, although there aren’t many game verbs that require this syntax. Off the top of my head, I can think of SHOW, GIVE, and FEED, although I know there are more. On top of that, it doesn’t really come up much in Vespers, except for a couple of times. But it’s a good example of a syntax that people often use, and in a text game the parser should really be able to handle it gracefully since, without it, the error message is unlikely to be helpful (“You don’t see any such thing.” Say what?). Then, naturally, people will point fingers and say “See? This is why we don’t have text games anymore.”

I knew it was going to be a challenge, mostly because the parser code is a large, unwieldy beast, and on top of that, I haven’t really looked at it closely for a good long time. Enter the three-and-a-half-hour flight, and we have a golden opportunity to reorient, refamiliarize, and figure this thing out.

As it turns out, the whole thing needed a grand total of three lines of code.

The way I handled it before, if I was trying to match a NOUN in the syntax and I had a token consisting of a bunch of words from the command to match it to, I would find the closest match and count how many words in the command were matched. If there were leftover words that weren’t part of the match, that meant something was wrong and the syntax shouldn’t match — so I aborted that syntax and moved on to the next one. So in the case of the command SHOW CECILIA THE KEYS, the token would be CECILIA KEYS, and I’d be able to match CECILIA with the word KEYS left over. The leftover word would cause the syntax to fail, and we’d move on to try the next syntax for the verb.
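The old all-or-nothing behavior can be sketched roughly like this in Python (again, not the parser's own script language). The vocabulary table, the function names, and the choice to match words from the front of the token are all assumptions for illustration; the real matching code is certainly more involved.

```python
# Hypothetical vocabulary: object name -> words that may refer to it.
VOCABULARY = {
    "cecilia": {"cecilia"},
    "keys": {"keys", "gold", "shiny"},
}

def match_noun(token):
    """Return (object_name, words_matched) for the best match.

    Words are matched from the front of the token (an assumption of
    this sketch); the object that accounts for the most words wins.
    """
    best_name, best_count = None, 0
    for name, known_words in VOCABULARY.items():
        count = 0
        for word in token:
            if word not in known_words:
                break
            count += 1
        if count > best_count:
            best_name, best_count = name, count
    return best_name, best_count

def strict_match(token):
    """Old behavior: any leftover word rejects the whole syntax."""
    name, matched = match_noun(token)
    if name is None or matched < len(token):
        return None
    return name
```

Here `strict_match(["gold", "keys"])` returns `"keys"`, but `strict_match(["cecilia", "keys"])` returns `None`: CECILIA matches one word, KEYS is left over, and the syntax is abandoned.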

Since I already had the code in place to count the number of words in the token that were matched, all I needed to do was reset the number of words in the token to the number of words that were matched. So in the CECILIA KEYS example, the number of words in the token is 2, but the number of words matched is only 1. So I tell the parser that the correct number of words in the token is actually 1 (not 2), and the remainder of the token is kept for the next round of token matching. Add in a quick check to see if the next syntax word is a NOUN or HELD object, and that’s all I needed to add.

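// Peek at the next word in the syntax pattern.
%nextSynWord = %syntaxWord[%currentSynWordNum+1];
// Another object follows: discount the matched words so the leftover ones go to the next round.
if ((%nextSynWord $= "noun") || (%nextSynWord $= "held")) {
    %nouns = %nouns - %numMatchWords;
}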
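To show the idea end to end, here is a self-contained Python sketch of the leftover-word rule. It is not the game's code: `match_syntax`, `match_prefix`, the vocabulary table, and the simplified syntax lists (slot names only, prepositions already used to split the tokens) are assumptions for illustration, and the sketch does not check that a creature slot really gets a creature.

```python
# Hypothetical vocabulary: object name -> words that may refer to it.
VOCABULARY = {
    "cecilia": {"cecilia"},
    "keys": {"keys", "gold", "shiny"},
}

def match_prefix(words):
    """Return (object_name, words_matched), matching from the front."""
    best_name, best_count = None, 0
    for name, known_words in VOCABULARY.items():
        count = 0
        for word in words:
            if word not in known_words:
                break
            count += 1
        if count > best_count:
            best_name, best_count = name, count
    return best_name, best_count

def match_syntax(syntax, tokens):
    """Match each syntax slot to an object; return the names or None."""
    tokens = [list(t) for t in tokens]   # work on a copy
    results = []
    i = 0                                # index of the token being consumed
    for pos, slot in enumerate(syntax):
        if i >= len(tokens):
            return None                  # ran out of tokens
        name, matched = match_prefix(tokens[i])
        if name is None:
            return None
        leftover = tokens[i][matched:]
        if leftover:
            next_slot = syntax[pos + 1] if pos + 1 < len(syntax) else None
            if next_slot not in ("noun", "held"):
                return None              # leftover words, nothing to take them
            tokens[i] = leftover         # keep the remainder for the next slot
        else:
            i += 1
        results.append(name)
    return results if i == len(tokens) else None
```

With these assumptions, `match_syntax(["creature", "held"], [["cecilia", "keys"]])` returns `["cecilia", "keys"]`, so SHOW CECILIA THE KEYS resolves to two objects, while `match_syntax(["noun"], [["cecilia", "keys"]])` still returns `None` because no second object is expected.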

That left plenty of time to watch “Fantastic Mr. Fox”, too. Nice flick.

This entry was posted in interactive fiction, text in games, Vespers.

One Comment

  1. Posted March 23, 2010 at 11:27 AM

    Excellent article and a very elegant solution too. Mind you, I do believe that any steps parsers take towards natural language are hugely important… Unfortunately, good writing can’t solve everything.
