Custom Search

Monday, August 10, 2015

Unicode input with X11

Unicode!

Whether you like it or not, Unicode is here. Modern programming languages allow it to be used in variable names and for operator symbols, older ones are adding it as extensions, and inputting it is getting easier all the time.

And frankly, I think most of us would rather read x ≠ 23 instead of x /= 23 or x != 23 or even x =/= 23 . Or how about x ∈ A instead of element(x, A)?

So here's how I set up my X11 keyboard to allow me to input the more popular programming symbols - at least for Haskell - directly from the keyboard, without having to use some editor-specific magic.

Dead keys warning

Those of you already using non-ASCII characters - for instance, the various accented latin characters popular in Europe - may be using dead keys to get to those characters. These changes may either make the dead keys stop working, or the dead keys could prevent them from working, depending on your keymap.

If you aren't sure, you can check for them by running the command xmodmap -pke | grep dead. If you get output, you're using dead keys. You'll want to save that for later!

But onwards...

Setup

For any of this to work, your environment and applications must be set up for 8-bit input and Unicode output.

First, you need to be using a locale ending in UTF-8 or similar to get Unicode output. Normally, you can fix it by just setting the LANG environment variable. The format is a two-letter language code in lower case, an underscore, a two-letter country code in upper case, a period, and then the character encoding. So I use en_US.UTF-8.

The command locale will tell you your current settings. If it doesn't use a Unicode character map like UTF_8, use locale -a to get a list of locales you can use.

If you want this to work in a terminal emulator or text editor, you should check the documentation to make sure they are set up properly for 8 bit input and Unicode output. Modern applications should adopt properly once you've set the environment variable.

You'll also need a font that has the appropriate glyphs. There are a number of good choices available. Personally, I use the DejaVu font family.

Get all that installed, restart your editor or terminal emulator or IDE, and set the font to the one you've chosen for this.

Mode switch

You can configure a Mode switch key as another shift key, allowing you to enter two (or more) more characters for each key, with and without the shift key also being held down.

It's normal for the AltGraph key to already be assigned to this, if you have one. The xkeycaps tool can help you find out, or possibly find out if you have a key assigned for it. Start it, select your keyboard type, and select Ok. Then hover over the various shift/window/bucky keys, and the KeySym line will display Mode_switch for the key you want to use.

If not, I'll tell you how to add one. If so, you can use it as is, or follow the instructions below to change it.

Assign Mode_switch

If you couldn't find the mode_switch key, or want to use one other than the one you found, you need to create an xmodmap file to change things.

Find the keycode

First, we need to find the keycode for the key you want to use for this shift key. Keycodes aren't normally a good thing to use in an xmodmap file, because they are very specific to your keyboard and system. However, given that you may be picking some non-standard key to start with, the alternative - a keysym - isn't likely to be any more portable, so we might as well do it the simpler way.

Again, you can use the xkeycaps tool to find the key. Start it, and then press the key you're interested in. It should highlight on the keyboard image in xkeycaps. Pressing it will then display the keycode and keysym info you need.

Alternatively, the xev command can be used. Run xev -event keyboard, and a new window will open. Press the key, and you'll get a couple of blocks of text, one for the KeyPress event and one for the KeyRelease event, including the keycode and keysym values. This happens for the various shift keys as well.

Assign and activate it

Now that we've found the keycode for the key we want to use, we need to tell X11 about it. The xmodmap command will do that. If you are already using it, then just add them to your file, probably near the top. If not, the normal name is .xmodmap.

First, just set up the key to send the Mode_switch value with the line. I chose keycode 64, so I used:

keycode 64 = Mode_switch Mode_switch

To activate it, you need to assign a modifier to it. Run the command xmodmap -pm and find a line that has no keys assigned. For me, that was the one starting mod5. Any of the lines starting mod should be usable except for mod1, which some applications assume is the Alt key. So now add the lines (again, I chose mod5; if you chose something different modify the lines):

clear mod5
add mod5 = Mode_switch

The clear mod5 probably isn't needed, but I'm being paranoid about things.

While we're at it, let's turn on a couple keys with extra symbols:

! Contains as a member () and supserset of ()
keysym period = period greater U220B U2283

! element of () and subset of ()
keysym comma = comma less U2208 U2282

As the comments say, this makes the ,< and .> keys also send ∈⊂ and ∋⊃. The format for these lines is the keysym command, then the keysym, then an = followed by four more keysyms. The U2280 "keysym" is a Unicode character for the symbol we want. The first keysym is the unmodified key, then the shifted key, then the mode key, and finally the shifted mode key.

So once this is added to your .xmodmap file, just run xmodmap .xmodmap (adjusting if you used a different file) to install it. You should now be able to enter the '∈', '∋', '⊂' and '⊃' characters in it by holding down your Mode key and typing ',', '.', '<', and '>', in order.

If that doesn't work, start up xkeycaps again, and check that the key you want for Mode_switch has that symbol assigned, and that the < and > keys have all four characters assigned, as that's the most likely source of problems.

And now lots of characters

Ok, now that we can enter unicode characters, lets add some!

Here's a list of what I'm using. The only thing I'm not particularly happy with is the . Using mode-u for U makes sense because it's both the first character of union and looks like the symbol, but neither mode-shift-u nor mode-i for intersection seems right.

! not sign (¬)
keysym grave = grave asciitilde U00AC U00AC

! division sign (÷)
keysym slash = slash question U00F7 U00F7

! dot operator () & ring operator ()
keysym 8 = 8 asterisk U22C5 U2218

! Contains as a member () and supserset of ()
keysym period = period greater U220B U2283

! element of () and subset of ()
keysym comma = comma less U2208 U2282

! increment ()
keysym minus = minus underscore U2206 U2206

! Greek alpha (α) (Α)
keysym a = a A U03B1 U391

! Greek beta (β) (Β)
keysym b = b B U03B2 U0392

! Greek delta (δ) (Δ)
keysym d = d D U03B4 U0394

! Greek epsilon (ε) (Ε)
keysym e = e E U03B5 U0395

! Greek gamma (γ) (Γ)
keysym g = g G U03B3 U0393

! An alternative for intersection (), greek small letter iota (Λ)
keysym i = i I U2229 U03B9

! Greek lambda (λ) (Λ)
keysym l = l L U03BB U039B

! micro symbol (µ), greek capital mu (Μ)
keysym m = m M U00B5 U039C

! Greek omega (ω) (Ω)
keysym o = o O U03C9 U03A9

! pi (π) and unary product operator ()
keysym p = p P U03C0 U220F

! Rationals ()
keysym q = q Q U211A U211A

! Greek rho (ρ) & rationals ()
keysym r = r R U03C1 U211D

! Greek sigma (σ) and unary sum operator ()
keysym s = s S U03C3 U2211

! Nil/undefined/bottom (), greek captilal tau (Τ)
keysym t = t T U22A5 U03A4

! Union () & intersection () (latter needs to be better)
keysym u = u U U222A U2229

! Square root symbol ()
keysym v = v V U221A U221A

! Greek small letter zeta (ζ), Integers ()
keysym z = z Z U03B6 U2124

! Thumbs up and down, and a couple of surprises
keysym Up = Up NoSymbol U1F44D NoSymbol
keysym Down = Down NoSymbol U1F44E NoSymbol
keysym Left = Left NoSymbol U1F595 NoSymbol
keysym Right = Right NoSymbol U1F594 NoSymbol

Fix any dead keys

If you use dead keys and install this as is, the dead keys will either stop working, or the mode switch won't work for them. If the assignment here works, they will no longer be dead keys, and will stop working. If the keysym I used wasn't used, then the assignment didn't happen, so the mode switch won't work.

In either case, you can fix this by changing the ASCII symbol name to the appropriate dead_key name in the assignment. If the dead keys quit working, just change the ones on the right side of the = sign to those. If the mode key isn't working, change them both.

The only dead keys I think might cause problems - and they keysyms I used above - are dead_tilde, which should replace asciitilde, dead_grave, which should replace grave, and possibly dead_cdilla, which should replace comma.

Run it automatically

Now, you need to set things up to do the xmodmap when you start X. Given that every distribution seems to set things up differently, I'm not going to cover that. If you have a way to run programs when you start X, that's probably the place to put it. But first, try simply creating .xmodmap in your home directory, as many systems checked for that and used it by default, and some still do that.

Multi_key

But we're not done yet. You can use a slightly newer facility: Compose. This lets you assign the symbol Multi_key to a key, and pressing that key will cause multiple keyboard characters to be collected and translated into a single key.

Supported on your system?

First, check to see if it's supported at all. You're looking for the Compose file for your locale. This will be someplace like /usr/share/X11/locale, though that will vary from distribution to distribution. In it, you'll find directories whose name matches your locale. For me, that's en_US.UTF_8. If there's a Compose file there, then you're set. You may want to look that over as well, as you'll now be able to use all the things it defines as well.

Add the Multi_key

This is a little bit easier to set up. First, see if you already have one by running xmodmap -pk | grep Multi. Again, if you have one, you can use it. If not, or if you want to use a different one, here's how to change it.

First, find a key you want to use. Since I'm using a very small keyboard without a lot of extra keys, I chose to put it on the same key as the Mode_switch key, only shifted. So now the keycode I use is:

keycode 64 = Mode_switch Multi_key

You can do that yourself, or set up another key by setting all four keysyms to Multi_key.

The downside of setting using the Mode_switch key that way is that the order I press the two in matters! If I hit the Mode_switch key first, then I get that fourth character for the key. If I hit the Shift key first, then I just entered the Multi_key key, and need to release the Mode_switch key to start entering more keystrokes.

Test it

The Compose facility is a library, and any application that is linked with it will work. Exactly what happens if the glyph isn't available will depend on the application. But once you've set up your xmodmap file to enable the Multi_key and run xmodmap to install it, you can restart those applications to use them.

So restart your editor/IDE/terminal emulator, and try typing one of the sequences from the Compose file we found earlier. Mine has <o> <o> in it as the degree symbol, °, so try that one. If that doesn't work something like a' should get you an accented a. If that doesn't work - back to xkeycaps, to make sure your Multi_key key is properly enabled.

And a bunch more symbols!

Now you can install a bunch of math/programming symbols. These go into ~/.XCompose. Once you've set that file up, it will be loaded by applications that support it when they start.

Here's the set I use:

# Get system defaults. Might be useful things there as well.
include "%L"

# Note: There are a few cases where a two-character sequence is a
# prefix of a three-character sequence. The two-character sequence
# gets a <space> suffix to disambiguate them.

# Logical operators for programming languages that support them.
<Multi_key> <equal> <equal>               : "≡" U2261 # IDENTICAL TO
<Multi_key> <equal> <slash>               : "≢" U2262 # NOT IDENTICAL TO
<Multi_key> <equal> <U00AC>               : "≢" U2262 # NOT IDENTICAL TO
<Multi_key> <equal> <exclam>              : "≢" U2262 # NOT IDENTICAL TO
<Multi_key> <equal> <asciitilde>          : "≢" U2262 # NOT IDENTICAL TO
<Multi_key> <slash> <equal>               : "≠" U2260 # NOT EQUAL TO
<Multi_key> <U00AC> <equal>               : "≠" U2260 # NOT EQUAL TO
<Multi_key> <exclam> <equal>              : "≠" U2260 # NOT EQUAL TO
<Multi_key> <asciitilde> <equal>          : "≠" U2260 # NOT EQUAL TO
<Multi_key> <ampersand> <ampersand>       : "∧" U2227 # LOGICAL AND
<Multi_key> <bar> <bar> <space>           : "∨" U2228 # LOGICAL OR
<Multi_key> <less> <equal>                : "≤" U2264 # LESS-THAN OR EQUAL TO
<Multi_key> <greater> <equal>             : "≥" U2265 # GREATER-THAN OR EQUAL TO
<Multi_key> <slash> <less>                : "≮" U226E # NOT LESS-THAN
<Multi_key> <U00AC> <less>                : "≮" U226E # NOT LESS-THAN
<Multi_key> <exclam> <less>               : "≮" U226E # NOT LESS-THAN
<Multi_key> <asciitilde> <less>           : "≮" U226E # NOT LESS-THAN
<Multi_key> <slash> <greater>             : "≯" U226F # NOT GREATER-THAN
<Multi_key> <U00AC> <greater>             : "≯" U226F # NOT GREATER-THAN
<Multi_key> <exclam> <greater>            : "≯" U226F # NOT GREATER-THAN
<Multi_key> <asciitilde> <greater>        : "≯" U226F # NOT GREATER-THAN
<Multi_key> <greater> <greater> <space>   : "≫" U226B # MUCH GREATER-THAN
<Multi_key> <less> <less> <space>         : "≪" U226A # MUCH LESS-THAN
<Multi_key> <less> <less> <less>          : "⋘" U22D8 # VERY MUCH LESS-THAN
<Multi_key> <greater> <greater> <greater> : "⋙" U22D9 # VERY MUCH GREATER-THAN

# Symbols for set theory, using mode-shifted multikey goodies
<Multi_key> <slash> <U2208>               : "∉" U2209 # NOT AN ELEMENT OF
<Multi_key> <U00AC> <U2208>               : "∉" U2209 # NOT AN ELEMENT OF
<Multi_key> <exclam> <U2208>              : "∉" U2209 # NOT AN ELEMENT OF
<Multi_key> <asciitilde> <U2208>          : "∉" U2209 # NOT AN ELEMENT OF
<Multi_key> <slash> <U220B>               : "∌" U220C # DOES NOT CONTAIN AS MEMBER
<Multi_key> <U00AC> <U220B>               : "∌" U220C # DOES NOT CONTAIN AS MEMBER
<Multi_key> <exclam> <U220B>              : "∌" U220C # DOES NOT CONTAIN AS MEMBER
<Multi_key> <asciitilde> <U220B>          : "∌" U220C # DOES NOT CONTAIN AS MEMBER
<Multi_key> <U2282> <equal>               : "⊆" U2286 # SUBSET OF OR EQUAL TO
<Multi_key> <U2283> <equal>               : "⊇" U2287 # SUPERSET OF OR EQUAL TO
<Multi_key> <slash> <U2282> <space>       : "⊄" U2284 # NOT A SUBSET OF
<Multi_key> <U00AC> <U2282> <space>       : "⊄" U2284 # NOT A SUBSET OF
<Multi_key> <exclaim> <U2282> <space>     : "⊄" U2284 # NOT A SUBSET OF
<Multi_key> <asciitilde> <U2282> <space>  : "⊄" U2284 # NOT A SUBSET OF
<Multi_key> <slash> <U2283> <space>       : "⊅" U2285 # NOT A SUPERSET OF
<Multi_key> <U00AC> <U2283> <space>       : "⊅" U2285 # NOT A SUPERSET OF
<Multi_key> <exclaim> <U2283> <space>     : "⊅" U2285 # NOT A SUPERSET OF
<Multi_key> <asciitilde> <U2283> <space>  : "⊅" U2285 # NOT A SUPERSET OF
<Multi_key> <slash> <U2282> <equal>       : "⊈" U2288 # NEITHER A SUBSET OF NOR EQUAL TO
<Multi_key> <U00AC> <U2282> <equal>       : "⊈" U2288 # NEITHER A SUBSET OF NOR EQUAL TO
<Multi_key> <exclam> <U2282> <equal>      : "⊈" U2288 # NEITHER A SUBSET OF NOR EQUAL TO
<Multi_key> <asciitilde> <U2282> <equal>  : "⊈" U2288 # NEITHER A SUBSET OF NOR EQUAL TO
<Multi_key> <slash> <U2283> <equal>       : "⊉" U2289 # NEITHER A SUPERSET OF NOR EQUAL TO
<Multi_key> <U00AC> <U2283> <equal>       : "⊉" U2289 # NEITHER A SUPERSET OF NOR EQUAL TO
<Multi_key> <exclam> <U2283> <equal>      : "⊉" U2289 # NEITHER A SUPERSET OF NOR EQUAL TO
<Multi_key> <asciitilde> <U2283> <equal>  : "⊉" U2289 # NEITHER A SUPERSET OF NOR EQUAL TO

# More symbols.
<Multi_key> <braceleft> <braceright>      : "∅" U2205 # EMPTY SET
<Multi_key> <slash> <slash>               : "∖" U2216 # SET MINUS
<Multi_key> <less> <minus> <space>        : "←" U2190 # LEFTWARDS ARROW
<Multi_key> <minus> <greater>             : "→" U2192 # RIGHTWARDS ARROW
<Multi_key> <minus> <less>                : "⤙" U2919 # LEFTWARDS ARROW-TAIL
<Multi_key> <greater> <minus>             : "⤚" U291A # RIGHTWARDS ARROW-TAIL
<Multi_key> <equal> <less>                : "⇐" U21D0 # LEFTWARDS DOUBLE ARROW
<Multi_key> <equal> <greater>             : "⇒" U21D2 # RIGHTWARDS DOUBLE ARROW
<Multi_key> <colon> <colon>               : "∷" U2237 # PROPORTION
<Multi_key> <plus> <plus> <space>         : "⧺" U29FA # DOUBLE PLUS
<Multi_key> <less> <bar>                  : "⊲" U22B2 # NORMAL SUBGROUP OF
<Multi_key> <bar> <greater>               : "⊳" U22B3 # CONTAINS AS NORMAL SUBGROUP
<Multi_key> <greater> <less>              : "⋈" U22C8 # BOWTIE
<Multi_key> <less> <asterisk> <greater>   : "⊛" U229B # CIRCLED ASTERISK OPERATOR
<Multi_key> <less> <plus> <greater>       : "⊕" U2295 # CIRCLED PLUS
<Multi_key> <less> <minus> <greater>      : "⊖" U2296 # CIRCLED MINUS
<Multi_key> <asterisk> <asterisk> <asterisk> : "⁂" U2040 # ASTERISM
<Multi_key> <plus> <plus> <plus>          : "⧻" U29FB # TRIPLE PLUS
<Multi_key> <bar> <bar> <bar>             : "⧻" U2AF4 # TRIPLE VERTICAL BAR BINARY RELATIONSHIP

I've included some really obscure symbols, because the languages I'm working with use them. As the comments note, there are some character sequences - usually a pair of the same character - that are prefixes of some three-character sequence - usually three of that same sequence. For those to work The exception is that (<less> <minus> <space>) is a prefix of  (<less> <minus <greater>). I'm not happy about this space, but it's the best alternative I've found so far. If you've got a better solution, let me know.

Friday, April 17, 2015

Functional Mazes

Long time, no blog!

Wow, it's been a long time since I posted here. Well, I've been involved with rc stuff and 3d printing things.

The latter is what led to this post. I ran into a 3d printed maze that was generated programmatically. The authors comment that trying to do this in OpenSCAD was impossible because it was a functional language.

I had to disagree. First, that OpenSCAD is a functional language. It might have functional inspiration, and the developers try to stay with those principles, but as a functional language, it sucks. So much so that I wrote a Haskell library to generate OpenSCAD from a proper functional language, with static type checking and much better error messages.

Second, I disagree that a maze generator is impossible in a functional language. There was a time in the past - long enough ago that I was working in BASIC, C and PostScript - that I wrote maze generators. So I decided to write one in Haskell. The generator is purely functional, though the State monad is used to hide some of the plumbing.

The algorithm

The algorithm I use is a bog-standard recursive walk of the cells in the maze. We start with walls between all the cells, and pick a random cell. Then take each of it's four neighbors in random order. If a neighbor has been visited, we skip it. Otherwise, remove the wall between those two cells, and recursively walk from that neighbor as well.

You can find this algorithm described on rosettacode. This will include implementations in a number of languages, including a more idiomatic Haskell version.

Representation

A maze is just an array of cells that either have halls or walls between each cell. Each cell needs two walls, one for each direction, and boolean to note that it's been visited. Which gives us the basic data types for a maze:

data Wall = Wall | Hall deriving Eq
data Cell = Cell { x, y :: Wall, visited :: Bool }
type Board = Array (Int, Int) Cell

If you're wondering why only two walls and not four, it's because each wall is shared by two cells. So each cell will have the wall with the larger coordinate in each direction. In particular, the x wall is the wall that runs in the y direction with the largest x coordinate, and vice versa.

Data.Array

To work with the array, we need a few tools from Data.Array:

import Data.Array (Array, array, bounds, (//), (!))

We need the Array type for our type declarations, which we've already seen. The array function is used to create the initial array. bounds gets the bounds of the array so we don't have to pass those around. // is used to create a new array with a list of changes to an array, and ! is used to reference a Cell in the array.

Generating the board

So our initial maze needs to have every cell with both walls, and initially not visited. That's just a Cell wall wall false.

But our maze is going to have a border around it, for a number of reasons. We can make the walk easier by creating border cells as visited. So the walk won't need to do bounds checking. And the border on the low coordinate sides will have the wall on that side, and no others. That will simplify drawing the maze. And as a final touch, we'll put entry and exit doors in the maze at the diagonally opposite corners on the axis.

So we have a function makeCell that looks convoluted, but each part is straightforward. For a Board of width w and height h, we make the cell at i, j with:

makeCell i j = Cell (if j == 0 then Hall else Wall)
                    (if i == 0 || i == 1 && j == h || i == w && j == 0
                     then Hall else Wall)
               $ i == 0 || i > w || j == 0 || j > h

This creates a Cell with an x Wall except for the first element in the j direction, and a y Wall except for the first element in the i direction, as otherwise we'd try and draw those. We also create Hall's for 1, h and w, 0, which will be the entry and exit for the maze.

Finally, if either x or y is 0 or x is greater than w or y greater than h, then mark these border Cell's as Visited, so we won't visit them during the walk. All other cells haven't been visited yet.

So now we can create the array with array and makeCell:

makeBoard w h = array ((0, 0), (w+1, h+1))
                      [((i, j), makeCell i j) | i <- [0..w+1], j <- [0..h+1]]

array takes list of pairs of indices and values and converts it to an array whose bounds are given as the first argument. In this case, bounds are (0, 0) and (w+1, h+1). That makes the border the indices that have 0 for either x or y, and w+1 for x and h+1 for y. A list comprehension generates the indices, and we call makeCell on them to create the Cell for each index.

The walk

We start with helper functions to remove each wall from a cell. Well, since this is a functional language, we can't actually remove the wall, so instead well have functions that return a cell with the appropriate wall removed. And one to return a visited cell.

clearY, clearX, visit :: Cell -> Cell
clearY cell = cell {y = Hall}
clearX cell = cell {x = Hall}
visit cell = cell {visited = True}

A step is represented by a tuple indicating the motion in the x and ydirections, so we'll want a list of all possible steps, allSteps:

 allSteps = [(0, 1), (0, -1), (1, 0), (-1, 0)]

The body of the walkMaze function is straightforward. Just pick a random cell in the maze, then call the internal helper walkCell for allSteps, that board position and our original Board:

i <- state $ randomR (1, (fst . snd $ bounds origBoard) - 1)
j <- state $ randomR (1, (snd . snd $ bounds origBoard) - 1)
walkCell (i, j) allSteps origBoard

walkCell implements the "in every direction from each cell" part of the algorithm by calling itself recursively, removing a random move from the list of moves it was passed on each recursion, stopping when there are no more moves. It uses doStep to walk the Board after that step:

walkCell _ [] b = return b
walkCell start steps board = do
  step <- (steps !!) <$> (state . randomR) (0, length steps - 1)
  walkCell start (delete step steps)
    =<< doStep start step (board // [(start, visit $ board ! start)])

doStep just calls walkCell on allSteps and the cell it steps to, after removing the wall between the Cell it's stepping from and the new Celll. The last bit is the hard part, requiring examining the move in detail:

doStep from@(i, j) (dX, dY) board
  | visited neighbor = return board
  | dY > 0 = walkCell' $ board // [(from, clearY cell)]
  | dY < 0 = walkCell' $ board // [(new, clearY neighbor)]
  | dX > 0 = walkCell' $ board // [(from, clearX cell)]
  | dX < 0 = walkCell' $ board // [(new, clearX neighbor)]
  where cell = board ! from
        new = (i + dX, j + dY)
        neighbor = board ! new
        walkCell' = walkCell new allSteps 

So we can put all that together to get:

walkMaze :: Board -> State StdGen Board
walkMaze origBoard = let
  clearY cell = cell {y = Hall}
  clearX cell = cell {x = Hall}
  visit cell = cell {visited = True}

  allSteps = [(0, 1), (0, -1), (1, 0), (-1, 0)]

  walkCell _ [] b = return b
  walkCell start steps board = do
    step <- (steps !!) <$> (state . randomR) (0, length steps - 1)
    walkCell start (delete step steps)
      =<< doStep start step (board // [(start, visit $ board ! start)])

  doStep from@(i, j) (dX, dY) board
    | visited neighbor = return board
    | dY > 0 = walkCell' $ board // [(from, clearY cell)]
    | dY < 0 = walkCell' $ board // [(new, clearY neighbor)]
    | dX > 0 = walkCell' $ board // [(from, clearX cell)]
    | dX < 0 = walkCell' $ board // [(new, clearX neighbor)]
    where cell = board ! from
          new = (i + dX, j + dY)
          neighbor = board ! new
          walkCell' = walkCell new allSteps 
  in do
    i <- state $ randomR (1, (fst . snd $ bounds origBoard) - 1)
    j <- state $ randomR (1, (snd . snd $ bounds origBoard) - 1)
    walkCell (i, j) allSteps origBoard

The key to doing this in a functional language is generating a new Board for the various recursive calls, rather than mutating a Board and just using recursion to keep track of the progress of the walk.

This is liable to create a lot of extra state in each recursion. I haven't made any attempts to minimize that, which you would want to do in a solution for production use. Idiomatic Haskell would use the State monad for the Board to hide the extra plumbing, as is done with the random number generator.

Displaying the board

While the walk above is the meat of this blog entry, I find the display code interesting, so will cover that as well.

It would be nice to be able to plug in various different types of output to display the maze, so that we can debug with ASCII to a terminal or a Diagram before adding code to generate OpenSCAD code. So we'll use a Board drawing function that takes functions that generate the walls and pastes them together. The type for the function is:

drawBoard :: (Board -> Int -> Int -> a)    -- make X-direction cell walls
             -> (Board -> Int -> Int -> a) -- make Y-direction cell walls
             -> ([a] -> b)                 -- combine [walls] into a row
             -> ([b] -> IO ())             -- Draw the board from [rows]
             -> Board                      -- Board to draw
             -> IO ()

As you can see, it takes two functions that create Wall's, one in each direction. Then a function to combine a list of walls into a row, and finally one that takes a list of rows and outputs the final maze. For a larger program, it might be worthwhile to use a Render data type to hold those for functions, but for a simple demo, it's just extra formula.

The wall drawing functions get the Board and indices, as the indices may be needed to calculate where the wall needs to go. However, we are also going to generate the rows by generating the walls for the Cell's in order of increasing x, then do the same to put the rows together in order of increasing y.

So the actual drawBoard code is:

drawBoard makeX makeY makeRow makeMaze board =
  makeMaze . concat $ [firstWall]:[drawCells j | j <- [1 .. height]]
  where height = (snd . snd $ bounds board) - 1
        width = (fst . snd $ bounds board) - 1
        firstWall = makeRow [makeX board i 0 | i <- [0 .. width]]
        drawCells j = [makeRow [makeY  board i j | i <- [0 .. width]],
                       makeRow [makeX board i j | i <- [0 .. width]]]

This builds firstWall, which is the x direction walls for the 0'th y row. We don't bother making the y direction walls for that row, since they aren't part of the maze proper. That firstWall is wrapped in a list and consed onto the list output by drawCells, which outputs a list consisting of a row x walls and a row of the y walls for the Cell's in that y direction. We draw the 0 Cells in each row to generate the y direction Wall that forms the boundary of the maze. There are no x direction Walls in those Cells, but either makeRow or DrawX will be responsible for dealing with any other artifacts that these cells might generate.

That result is passed to concat to turn it into a list of rows instead of a list of lists of rows, which are passed to makeMaze to output the maze.

Drawing in ASCII

For ASCII output, we only need two extra functions:

charX, charY :: Board -> Int -> Int -> String
charX board i j = if y (board ! (i, j)) == Wall then "---+" else "   +"
charY board i j = if x (board ! (i, j)) == Wall then "   |" else "    "

An x Wall is a horizontal line of dashes, and a y wall is a vertical bar. Hall's are just blank spaces, except for a + at an intersection. Note that an x Wall is the y element of a Cell, as the Cell element is named for the direction you are facing, but the Wall rendering is named for the direction the wall runs.

makeRow is simply drop 3 . concat, to paste the strings together and then remove the extra Hall's drawCells creates for the 0 cells in each row. makeMaze is just putStr. unlines.

At this point, if you load the module (available via the fossil repository link on the right) into ghci, you can print square grids. Just use :main 16 8 to print a 16 by 8 maze. Or on a Unix system, you should be able to do ./maze.hs 16 8 to generate a maze from the shell.

Graphical output

That works, but it's not very pretty. So let's do a little graphics. Since I'm not much of a graphics designer, it still won't be very pretty.

Support routines

This is a bit more complicated, so let's start with a couple of support routines.

diaSpace is used to create a spacer. It takes an R2, which is a direction, and a Double indicating how long it is. It outputs a Diagram B R2, which is something we can draw. Given that it's a spacer, it won't draw anything when drawn.

diaSpace :: R2 -> Double -> Diagram B R2
diaSpace unit size = phantom (fromOffsets [unit # scale size] :: Diagram B R2)

diaCell does all the work. It needs to know which Wall to check in a Cell, which direction to draw in, any spacer needed, and the cell size. Plus the board and the cell's index:

diaCell :: (Cell -> Wall) -> R2 -> Diagram B R2 -> Double -> Board -> Int -> Int
           -> Diagram B R2
diaCell side unit space cellSize board i j =
  space ||| make (side (board ! (i, j)))
  where make Wall = strokeT (fromOffsets [unit # scale cellSize])
        make Hall = diaSpace unit cellSize

diaCell returns the space in front of the result of calling the internal function make on the Cells' Wall. make is simple - it uses diaSpace to return a blank space for a Hall, and the Diagram primitives to create a line of length cellSize in the given direction.

Drawing cells.

Given diaCell, the two routines for drawing walls are simple:

diaX, diaY :: Double -> Board -> Int -> Int -> Diagram B R2
diaX = diaCell y unitX mempty
diaY cellSize = diaCell x unitY (diaSpace unitX cellSize) cellSize

The type of diaX and diaY match the types needed by drawBoard. diaX is just diaCell with the y Wall selector as it's first argument, the x direction and an empty spacer, as the wall spans the entire length of the Cell. diaY needs the cellSize argument as well, since the spacer it passes to diaCell is a cellSize spacer created by diaSpace.

Drawing the board

The row creator for drawBoard is simply the Diagram function hcat, which accepts a list of diagrams and puts them together horizontally in a new diagram.

The board creator is almost that simply, but is actually long enough to get it's own function:

diaBoard :: Double -> [Diagram B R2] -> IO ()
diaBoard ww rows =
  renderCairo "maze.png" Absolute $ vcat rows # centerXY # pad 1.1 # lwO ww

As with the row creator, the bulk of the work is done by the Diagram function vcat, which stacks the diagrams up vertically instead of horizontally. That image is then centered by centerXY, padded by pad 1.1, and the line weight is set to the wall width with lw0 ww. That diagram is passed to renderCairo along with some extra arguments so that it creates an appropriately scaled output in the file maze.png.

Seeing the result

The version of maze.hs in the fossil repository has the Diagram (and OpenSCAD) drawing code commented out. Once you install the diagrams package and the diagrams-cairo package, you can change that. Look for three places where a line starts with {- Comment out. The first two will need to be moved down to the next blank line. The last one will need to be moved down to beneath diaBoard. You can now run this in ghci as :main 16 8 40 2, or as ./maze.hs 16 8 40 2. The two new arguments are the size of the cell and the width of the walls to draw. The old ASCII invocations will still work as well.

After running it with 4 arguments, the file maze.png will be created in the current directory, and you can display that.

Expanding this to display images from the command line, or to embed it in an app for solving mazes, is left as an exercise for the reader. In which case, it ought to be made pretty as well.

Printing in 3d

The inspiration was a 3d-printed maze, so let's do that. This is very similar to the Diagrams code, so the commentary will be a bit shorter.

To show what using an encapsulating data type would look like, this uses the SCADCell data type, consisting of the side selector, a routine to construct the appropriate wall, and a Vector3d to move the wall to the appropriate place in the cell:

data SCADCell = SCADCell (Cell->Wall)                           -- Wall extractor
                         (Double -> Double -> Double -> Model3d) -- Wall drawing
                         Vector3d                                -- translation

This is the first argument to scadCell. scadCell just creates a the appropriate wall and base, or a 0-sized block if this is a border Cell. It also need the cell size, wall dimensions and base depth to create those models. scadX and scadY just call scadCell with the appropriate SCADCell.

scadX, scadY :: Double -> Double -> Double -> Double -> Board -> Int -> Int
                -> Model3d
scadX cs = scadCell (SCADCell y (flip box) (0, cs, 0)) cs
scadY cs = scadCell (SCADCell x box (cs, 0, 0)) cs

scadCell :: SCADCell -> Double -> Double -> Double -> Double ->
            Board -> Int -> Int -> Model3d
scadCell (SCADCell side box' move) cs ww wh bd board i j =
  make (side $ board ! (i, j))
  # translate (cs * fromIntegral (i - 1), cs * fromIntegral (j - 1), 0)
  where make Wall = box' ww (cs + ww) (bd + wh) # translate move <> base
        make Hall = base
        base = if i == 0 || j == 0 then box 0 0 0
               else box (cs + ww) (cs + ww) bd

Again, there's a function in the library that does exactly what we want for turning the output of the cell drawing routines into a row. So we just use union for this. That same function also serves to join the rows into a board, so we just need to compose it with draw in order to print the maze. However, this prints the maze "upside down" compared to the previous two rendering engines, so we use mirror to fix that as well. No real need, but it feels like the right thing.

You'll need to install version 0.2.1.1 or later of my Haskell OpenSCAD library from hackage and uncomment the appropriate code segments to use it. You can then run it as either :main or ./main.hs, using arguments like 16 8 20 2 4 10. That's the same four arguments as the Diagram version, with the depth of the base and the height of the walls added.

To see the results, you'll also need the OpenSCAD application. That can generate an STL file, and getting it to a 3d printer from there is up to you.

Main

Just for completeness, a brief look at the main routine that ties it together. This is really just a kludge to test the others, but it does the job.

The outline is to get the arguments, map them to integers. Sorry, no fractional sizes here. Then convert those to floats for the things that need them. Switch on the length of the argument list to either raise a usage error or create a drawBoard' function that's just the drawBoard invoked with the functions appropriate to the type of output we want.

Then get a random number generator, and run mazeWalk using it on a board of the appropriate size, which we will use the newly created drawBoard' to output.

main :: IO ()
main = do
  args <- map read <$> getArgs
  let floats = map fromIntegral args
      drawBoard' =
        case length args of
          2 -> drawBoard charX charY (drop 3 . concat) (putStr . unlines)

          4 -> drawBoard (diaX cs) (diaY cs) hcat (diaBoard ww)
               where [_, _, cs, ww] = floats

          6 -> drawBoard (scadX cs ww wh bd) (scadY cs ww wh bd) union
                         (draw . mirror (0, 1, 0) . union)
               where ([_, _, cs, ww, bd, wh]) = floats
{- Comment out drawing argument handling
-}
          _ -> error "Width Height [CellSize WallWidth | CellSize WallWidth WallHeight BaseDepth]"
  gen <- newStdGen
  drawBoard' $ evalState (walkMaze $ makeBoard (head args) (args !! 1)) gen