Custom Search

Saturday, December 28, 2013

An arduino anemometer

Why make an anemometer?

There are lots of anemometers on the market, with a lot of different features. What they don't have is a configurable display. By coupling a good wind sensor with an android board and a cheap display, the display can be configured to do things that simply aren't possible with any of the off the shelf systems. See my rc blog for more details about this.

Friday, November 15, 2013

Threads vs. processes for parallel processing

Introduction

Concurrency is a hot topic these days, largely because multi-core systems are the norm rather than the exception, and have been for a while. The thing is, multi-core isn't as much of a win for concurrent operations as it is for parallel processing. While the difference is subtle, it's also crucial for deciding what tools you want to use.

Some definitions

So, what is concurrent operations, parallel processing, and the tools I'm talking about? Or more importantly, given that they are used in a number of different ways in the industry, how am I using them?

Parallel processing

Parallel processing means carrying out multiple computations at the same time. Waiting for external devices is not carrying out a computation, so doing computing while IO is going on isn't as parallel processing - at least not in this document.

Concurrent operations

Concurrent operations means doing multiple things at the same time. Unlike parallel processing, concurrent operations includes doing IO. In fact, doing nothing but IO can qualify. Parallel processing is a form of concurrent operations, but I'm going to exclude parallel processing from the definition of concurrent operations in this document as a matter of convenience.

Thread of execution

A thread of execution is a series of instructions that are to be executed by the CPU in order. The operating system may interrupt the thread and allow other threads of execution to operate between instructions in a thread. This really isn't a term used in the industry, but I wanted a term that covered both threads and processes. I'm going to use it to avoid saying threads and processes or thread or process all the time.

Process

A process is a thread of execution plus an address space. The address space is not shared with other processes. Unix was a revolutionary step forward for processes. Before Unix, processed tended to be expensive and painful to create. Users usually got one process, and running a program would involve the command processor (which was running in the users process) loading the executable into the users address space and then calling the executable as a subroutine. It was possible to do concurrent operations in a process, and even have multiple threads of execution active, but it involved executable images specifically built for the purpose, since loading a standard executable would destroy the one that was running. By making processes easy and cheap to create from standard binaries, Unix allowed the command processor to run in it's own process, meaning that concurrent operations could be done by simply having the command processor not wait for a program to finish before prompting the user for the next command.

Thread

A thread is a popular - and very effective - tool for doing concurrent operations. A thread is just a therad of execution that shares its address space with other threads. Threads work very well for concurrent IO, since you can just code the IO normally, and then sync with the main process when it's done. For parallel processing, not so well. I'm going to discuss those problems next.

The problems with threads

Using threads creates a couple of problems that using processes avoids.

Shared everything

Actually, there's only one problem with threads, but it creates a number of issues. The problem is that all your data is shared implicitly. The immediate affect of this is that you have to figure out which of your data objects might be modified in different threads, and do something to make sure that doesn't happen concurrently.

In my experience, the biggest source of concurrency problems in threaded programs is programmers who didn't realize that some object they were modifying could be modified by another thread. I have seen the claim - from people I trust and respect - that a properly designed programming language (meaning not C++ or Java) and programming practices can avoid these problems. I haven't used that language, and find it hard to believe that masses of programmers will all follow proper practices. Maybe if a single entity controlled the language and everyone learned from their examples - which may well be the case for this language.

Elsewhere, functional programming languages that make most data immutable are catching on as a solution, allowing either the language runtime or type system to catch such errors.

Once you've identified those places, you've got to arrange to provide the appropriate interlocks. Language constructs that do this automatically are another hot research topic.

Fragility

If your program does something completely unexpected, then you may well have no idea what state it's in. The question becomes - how do you deal with this situation? If you are using threads for concurrency, then you have to worry about what the rogue thread may have done to data being processed by another thread, even if it they were logically unconnected - one of the consequences of implicitly sharing everything. To be reasonably sure that everything that might have been corrupted is discarded, you have to discard all the threads in the process.

If you using processes for concurrency, discarding that process provides the same level of assurance, but that's only one execution threads worth of work.

So using threads makes your programs fragile.

No distributed processing

If you're dealing with any significant amount of data, you're going to want to have multiple machines working on it. If your application is built using processes for concurrent operations and sockets for data sharing, this is trivial. All you have to do is add some kind of rendezvous mechanism.

If you're using threads, on the other hand - well, you pretty much have to start over to get any communications between the execution threads at all.

Sharing data between processes

Given that processes don't share an address space, the obvious issue is how do you get data between them. I've actually had proponents of threads claim that you couldn't share data between processes without serializing it (though to be fair, they were probably trolling). While that is one solution - and pretty much required for large problems - there are alternatives, each with their own downside.

The important thing to realize is that any non-trivial program will have multiple types of information to be shared, each of which will have it's own requirements. Using threads provides a single solution you have to use, even if it doesn't fit the requirements for the data at hand. If you're used to that model, you'll need to get past it and think about how you're actually using your data.

Explicit sharing

You can always share memory explicitly between processes. This provides the same level of communications as threads, without the nasty implicit sharing of everything. The downside is that I know of no language that provides language-level support for insuring that everything you need to share winds up in the shared memory segment. This means you are in practice limited to machine primitives composed with arrays and records. In my experience this hasn't been a problem, as most of the cases where you need this level of communications use such data.

Serial communications

You can also serialize the data and write it outside of your process. This is required if you want to distribute the processing of your data to more than one machine. The downside is that - well, compared to shared memory, it's expensive. You have to serialize and deserialize the data, and copy it between the processes in serialized form. If you have lots of communications going on (not the same thing as lots of data - passing one byte back and forth a million times is as bad as passing a megabyte back and forth once), this won't work very well. Unless, of course, you more data than will fit in the address space, in which case you pretty much have to do things this way.

Disk communications

An interesting variant on the idea of serial communications is storing the data on disk. This has the advantage of preventing data loss.

You store each chunk of data in a file in a queue directory. Each worker process scans the queue directory, trying to get an exclusive lock on each file until it succeeds in doing so. It processes the data therein, writing the output to a locked file in a second queue directory. When it's done, it can unlock the output file and then remove the input file. There's no chance of data loss, though you could wind up with duplicated data. Adding a work directory between the two queues reduces that problem, but then requires code to clean up the work directory at startup, moving input files back to the input queue and removing output files.

Inheritance

If you have a data structure small enough to fit in memory that all the processes need read access to, and any writes are not shared between processed, inheriting the data by creating it in the parent and forking to pass it to children works very well.

Configuration data for the application often fits this mold. Having all the configuration data handled by the parent - rather than having it in distinct files and the parent passing along the file name - means the parent can verify things between processes, like that the output of one process is correctly connected to the input of the next. Better yet, it can provide intelligent defaults for some values, and share configuration information between process, which has resulted in reducing configuration information by 99% in at least one case.

The downside is that the communications is one-directional. If you need to communicate a value back to the parent, you'll have to find another method to do so.

Exit status

This is another one-directional channel, and is very narrow - basically a single byte. While it doesn't seem very useful, it's used all the time in Unix systems. In every conditional statement in a shell script, the condition being tested is the exit status of a child process (unless it's a builtin, anyway). Any application that does process-based concurrency needs to deal with these, if for no other reason than to check that the child process actually finished properly.

A real-world example

I was involved in refactoring a credit card payment processing system from a single monolithic, threaded program into a distributed application.

Being distributed was part of the requirements, as one of the goals of the refactoring was to decrypt the financially sensitive data on machines that weren't exposed to the network. So serial communications was used to between hosts on the periphery of the network and the internal hosts with the crypto hardware.

The configuration information on each host was created by the parent process and inherited by the workers.

Each machine had a state table for all open connections to it. This was explicitly shared between the processes, so that the parent could correctly deal with child process failing, so that 1) no transaction was reported as completed until it was actually completed, and 2) no transaction was processed twice. Further, it was shared via a file mmaped to disk, so that these same guarantees could be made if the entire machine failed.

Miscibility

OK, so why not just mix and match threading and processes? Well, there's one last little issue with everything being implicitly shared. In order to fork safely, you need to know things about the state of all the locks - both external and internal - in your program. Getting that information becomes much more difficult if you have locks being used in multiple threads - and when do you use threads without some form of locking?

While not every case where a lock is held by a thread that isn't created in the child and hence won't be freed in the child is a disaster, it is in a lot of them. If an internal lock is locked by another thread when you fork and you need it, you're hosed because you'll never be able to get it. External locks you may or may not be able to get, even if you shouldn't be able to if the parent has them, and freeing them may or may not cause them to be freed for everyone, even if the parent is still using them.

Mixing threads and forks arbitrarily can be done, but - well, it's not at all clear it's less trouble than simply staying with threads for everything. On the other hand, if you limit each process to one or the other - parents never create threads but use forks to create workers, which use threads if they need them, but never fork - then you'll be OK.

Except, of course, if you're building something like a CI system, where what the workers do is fork off processes to build things. In that case, workers should avoid using threads.

Sunday, October 6, 2013

Dynamic typing is doomed

Introduction

Ok, the title is a bit sensationalist. I don't think dynamic typing is really dead. Different tasks call for different language characterstics, and there will always be a place for dynamically typed languages. However, I believe the decades-long debate about whether dynamic typing or static typing is the better choice for general-purpose computing is pretty much over, and dynamic typing lost.

The arguments

The core argument for static typing is that the compiler will catch a lot of errors for you, which saves you time. While a proper test suite will catch all or most of them in any case, and using static typing doesn't eliminate the need for such a suite, taking the test step out of the edit-load-test cycle is as much a win as taking the compile step out of the edit-compile-load-test cycle.

The core argument for dynamic typing is that providing the compiler with type information is a waste of time. Having to type in the type name as many as three times in some cases interrupts and slows development.

Both of these statements are prima facei true, and the question is which provides the larger savings. I personally spent two decades using dynamically typed languages because - well I hate typing.

The fallacy

The problem is that you only have to provide the compiler with lots of type information for languages using primitive type systems. Since most of the popular statically typed languages do that, people don't realize that this is a fallacy.

If you instead use a language with a modern type system - or even a more conventional one and a modern type engine - you can avoid providing type information most of the time. Instead, the compiler can derive the type of variables based on the operations they participate in and the constants also used in those operations, allowing for variable use that is similar to dynamically typed languages.

It's not perfect, because some operations can work on multiple types, and the compiler may not have enough information to pick an ambiguous type. But those cases are the exception, not the rule.

The irony

What's ironic here is that a statically typed language can use the type information to cut down on the typing needed in some cases. All languages provide default behaviors for types composed from more primitive types. Dynamically typed languages do this when comparing container types, since the comparison of those is done by comparing the contained elements in an order determined by the container. Statically typed languages can also do this for creation and for union types, because they have the information required to do so where dynamically typed languages don't.

So, while the argument against static typing is that the developer has to write more code, the reality is that a language with a modern type system can derive the type information, and use it to cut down on the amount of code the developer has to write!

An example

I realized this interesting facet of static typing while working on project euler problem #54

Given that Python values cutting out boilerplate, I wonder if there's some way to leverage this as completely as Haskell does. The enum types aren't a problem, and a tuple for a card should also work mostly the same. The real trick is the union type used to represent a hand. In Python I'd create another enum for the types of the hands, which would be about the same length as the union type declaration in Haskell. However, where in Haskell I'm now done, Python would seem to require creating a wrapper type to provide the comparison function.

A comparison

So, let's go through the types involved in both languages.

Note that I'm using the Enum class from as-yet-unreleased Python 3.4. There is no Enum class in earlier versions of Python, though there are a number of third party modules available that are similar to this. Since this was chosen for Python 3.4, it's presumably the best of breed and will become the defacto standard.

The code I wrote for previously mentioned project euler problem - tweaked for interactive use - is available in a School of Haskell version of this paper. The python types haven't been put into a program, and have only been tested informally.

Suit

Haskell provides a simple declaration for the suit of a card:

data Suit = Spades | Hearts | Diamonds | Clubs deriving (Show, Eq)

A suit is either Spades, Hearts, Diamonds or Clubs. The last bit - deriving Eq - tells the compiler to automatically create the code so that instances of a Suit can be compared for equality. It doesn't provide an ordering (that would be Ord as well as Eq), because most poker games - and the problem statement - don't rank suits. This type - and all the rest - derive from Show so we can just print them for debugging.

Python is equally succinct:

Suit = Enum('Suit', 'Spades Hearts Diamonds Clubs')

This creates the class Suit as a subclass of Enum with the four suits as members. The comparison behaviors come from the Enum class, not the Suit type, so there's no need to specify something like a deriving clause.

The oddity of using strings and having to specify the name twice (once to name the created class, and once as a variable to bind it to) comes from using the functional shorthand for Enum. While the bound variable and the class name don't have to be the same, it's idiomatic to do so, and doing otherwise could result in some confusion in tracebacks and other error messages. A class statement could be used, but that would require binding the members to values by hand.

Rank

The Haskell for the rank of a card is only slightly longer:

data Rank = Two | Three | Four | Five | Six | Seven | Eight | Nine | Ten
     | Jack | Queen | King | Ace deriving (Show, Eq, Ord, Enum)

For rank, we want ordered comparison, and a sequence to the entries in the list, so we add Ord and Enum to the deriving clause to get those respective behaviors.

Python is also slightly longer:

Rank = IntEnum('Rank', 'Two Three Four Five Six Seven Eight Nine Ten '
                    'Jack Queen King Ace')

While the Enum class is already iterable, it doesn't have an ordering, so we use the subclass IntEnum which adds that.

Card

The representation for a card in Haskell is as obvious as the Suit and Rank types:

data Card = Card Rank Suit deriving Show

instance Eq Card where
    Card r1 _ == Card r2 _ = r1 == r2

instance Ord Card where
    Card r1 _ <= Card r2 _ = r1 <= r2

Here Haskell gets the duplicate name. The first Card is the name of the type, the second the name of a constructor for instances of that type. Again, idiomatic usage is that they be the same if there's only one constructor, but it's not required.

This is the first type where we can't automatically derive the behavior we want. That's because the default would include the suit component in the comparison, whereas we want to ignore it. So we need to provide two instance definitions - one for Eq and one for Ord - that provide the desired behavior.

Python has a similar problem:

@functools.total_ordering
class Card(object):
    def __init__(self, rank, suit):
           self.rank = rank
           self.suit = suit
    def __eq__(self, other):
          return self.rank == other.rank
    def __lt__(self, other):
          return self.rank < other.rank

Haskell's default for comparisons uses the features of a Card. In a dynamic language, what features do an don't exist can change during execution. This makes them a poor choice for use in default behaviors. So the default behavior for comparison of objects in Python is to compare their identity. That is an implementation-defined value that is only guaranteed to be unique to this object during it's existence, and not change during the life of the object.

This means that by default, an object is only equal to itself. Still not what we want, so we provide an __eq__ method to get the proper value.

The default for ordering, on the other hand, is useless. You can't depend on two objects having the same order relationship between runs. So any time we want to have an ordering on instances of a class, we have to write at least one method, in this case __lt__. The @functools.total_ordering decorator then takes care of providing the rest of the comparison methods for a total ordering, which is what we want.

Knowing the features also means that a Card constructor must always accept a Rank and a Suit and return a Card so a default constructor can be created by the language. Being able to attach features dynamically is normally leveraged in a creation method in dynamic languages, so we have to write the __init__ method to do that. To be fair, at least one dynamic language allows the creation of instances of such types which bind the values automatically, but that does seem to be the exception rather than the rule for dynamically typed languages.

Hand

So, we're ready for the actual hand type. In Haskell, that's only a little bit more complicated than the preceding:

data Hand = HighCard [Card]
          | PairOf Rank [Card]
          | TwoPair Rank Rank [Card]
          | ThreeOf Rank [Card]
          | Straight [Card]
          | Flush [Card]
          | FullHouse Rank Rank [Card]
          | FourOf Rank [Card]
          | StraightFlush [Card]
            deriving (Show, Eq, Ord)
          -- RoyalFlush isn't listed; it's just an Ace-high StraightFlush

Here, we can see why Haskell has both constructors and type names. The type is Hand, but the constructors are the types of poker hands: FullHouse, Flush, etc. Hand, like most of the previous Haskell types, derives the Eq and Ord functions for the type, just like it does for any other container type. This does what we want - Hand types listed first will be less than hand types listed later, so a Flush is greater than a Straight. Two Hands of the same type will be compared by the provided Ranks, if any, and then the lists of Cards in the hand will be compared if the Ranks are equal. The only requirement for correct behavior is that the Hands be created correctly.

In Python, this is a bit harder - there is no union type for Python. So we're going to have to provide our own discriminator. That's:

Hand = IntEnum('Hand', 'HighCard PairOf TwoPair ThreeOf Straight Flush FullHouse FourOf StraightFlush')

And now there's lots of choices, but no obviously good ones. Writing a single class similar to Card, except you then have to decide how to handle the differences between initializing the different Hand types. You could default the two Rank elements so you can leave them off when not needed, but this exposes the internal implementation in the API. One way to deal with different initializations would be a subclass for each hand type, which might be idiomatic, but would be a bit long.

You can avoid having to write custom comparisons by using a tuple that starts with a Hand entry and comparing those. In that case, a function for each hand type that returns the appropriate tuple does the job, and provides a clean API, but it's a bit repetitive:

def HighCard(cards):
    return Hand.HighCard, cards

def PairOf(rank, cards):
    return Hand.PairOf, rank, cards

def TwoPair(hiRank, loRank, cards):
    return Hand.TwoPair, hiRank, loRank, cards

def ThreeOf(rank, cards):
    return Hand.ThreeOf, rank, cards

def Straight(cards):
    return Hand.Straight, cards

def Flush(cards):
    return Hand.Flush, cards

def FullHouse(overRank, pairRank, cards):
    return Hand.FullHouse, overRank, pairRank, cards

def FourOf(rank, cards):
    return Hand.FourOf, rank, cards

def StraightFlush(cards):
    return Hand.StraightFlush, cards

You could drop the Hand enum and manage the values yourself, but that still leaves this version twice as long as the Haskell version.

Conclusion

As I stated at the outset, the statically typed code has less boilerplate. But it's really a minimal difference. Even in the worst case, with no actual code beyond the type declaration and creation, the dynamically typed code is only about twice as long as the statically typed code. In this case, exposing some of the implementation details to the user could make them shorter in both languages, and much closer to the same length. Adding real code - for instance, to categorize a hand as the proper type then invoke the appropriate constructor - will make that less significant.

The real difference was that - when I got the union type - the statically typed language still had an one obvious way to do things, that was succinct and provided an API I was happy with. With a dynamically typed language, there is no union type, since they don't make sense. After all, your variable can hold all the types, you just have to write the code that deals with the differences between them. But - well, you have to write that code. Which means figuring out the best way to deal with it. You also have to decide what the API is going to look like, and the examine the tradeoffs in it from using a simpler implementation. That took longer than writing all the rest of the code put together.

So, given that my statically typed language has slightly less typing and provides - in this case - an obvious way to do something that presents a problem in the dynamic language, it would seem that dynamic languages no longer have an advantage to offset having to wait for tests to run before you catch type errors. So they will eventually - and given the speed at which this industry adopts new languages, I probably won't live to see the day - dissappear. Or maybe both will disappear before then.

Sunday, July 14, 2013

Slowing down even more..

I know I haven't been exactly a blogging machine - it's a hobby, not a job, so I do it when I feel it - but things are going to get even worse, for two reasons.


  1. I've launched a new blog about my remote control doings. If you're at all interested in remote control craft - mostly aircraft, but I could talk about hovercraft, cars, balls, etc. - go check it out.
  2. My new contract with FP Complete involves writing what amounts to programming blog entries on a regular basis.  So any ideas for a programming blog first get evaluated for it, as that is a job! Their School of Haskell  is an amazing tool for writing blogs using Haskell, and their IDE product currently in beta test is an equally amazing tool for building and deploying web sites in Haskell.

Saturday, June 1, 2013

iOS as Bloatware

I've just bought my fourth Android phone, but only the second that wasn't an AOSP device. The two non-AOSP devices had bloatware problems that reminded me of the things that caused me to switch from the iPhone to Android phones.

For those not familiar with the term - and IIUC, iPhones don't have bloatware in the traditional sense - bloatware is software added to an Android system by either the manufacturer or the network provider that you can't delete. In that, it resembles the OS software - except it wasn't designed by the same group that did the OS, and seldom integrates well.

Droid 3

My first non-AOSP phone was a Droid 3. I bought it as part of changing carriers. The bloatware on it was worse than useless - it actually broke basic functionality.

The least annoying bloatware was the launcher provided by Motorola. It was slow and clunky - on a high end, dual-core phone! At least I could replace it by installing a third party launcher, which mean that it just took up room in the system ROM.

The annoying one was the task killer. Task killers on Android are a bad idea. I've never used them. People running broken software on early versions of Android may have benefited from them, but that wasn't true of Gingerbread, and Gingerbread had been out for most of a year by the time the Droid 3 came out. As far as I can tell, the only thing this ever did was pop up to interrupt my music or ebook reader when I used them for a long time.

Like most bloatware, it couldn't be removed, and couldn't be turned off. So it occupied room in both ROM and RAM and wasted CPU cycles.

But the music player was even worse. It was a modified version of Google's music player, with the bluetooth and headset controls hardwired to go to it. However, it predated Google's cloud-based music service. Installing the update from Google left me with two music players, with the same name and icon. Confusing. Worse yet, once I got the right player working, if I used a bluetooth or headset "pause" button on it, the wrong one would start playing.

When I asked Verizon for help, they couldn't do anything and suggested I talk to Motorola. Motorola's response was "That's the way it's supposed to be."

As you can expect, I rooted this phone ASAP, just so I could turn those things off. Even so, the bluetooth and headset music controls never worked properly on that phone. I was more than glad to get rid of it, and will be avoiding Motorola phones unless they do an AOSP phone.

HTC One

This is my third HTC Android phone, all of which have been Ones (the others being a G1 and a Nexus One). The bloatware on it is called "Sense". Sense actually gets good reviews - people seem to like it. I can see why, but it's a waste of space for me.

For instance, it has a clock, calendar and weather widget built into the launcher, car dock and lock screens. OK, cool. I normally don't use a clock widget (there's a clock in the notifications bar), but getting the other two in a built-in widget would make it worthwhile.

Except ... I fly RC aircraft as a hobby. Wind speed is crucial for this - if it's to high, I can't fly! I won't use a weather app that doesn't provide this. And the one provided with Sense doesn't. You can't change the app used by - or even started by - the widget, and you can't remove the widget.

There is an odd thing about this. Newer versions of Android have a feature that let the user disable bloatware - if whoever installed it lets you. The weather app on the HTC One can be disabled, and I did so. Now the widget just complains about not being able to get the weather. Seems like they could do better than that.

Similarly, the car dock application has a music player widget built into it. But the music player can't access Google's cloud music, so there's nothing on it I want to hear. I can't remove this, so the car dock needs to be replaced.

The other interesting part of Sense is "Blinkfeed", which HTC has been advertising heavily. It's a great idea - the latest news from all your news sources available on the lock screen! Except - well, no RSS feed. No email. Those are my primary news feeds. It's either various syndicated news sources or a small set of social networks - not including the one I use (Google Plus).

Fortunately, this is easy to fix. Install a third part launcher (again). Then install Executive Assistant. I've been using EA since the Nexus One. Much like Blinkfeed, it provides the latest information I want - except it actually provides the ones I want! Set it to act as a lock screen, and disable the stock lock screen in the Security settings, and it works better than ever.

iOS

So, how does iOS fit into this discussion? After all, Apple maintains strict control over the software on it, and it doesn't (or didn't when I was using it) have bloatware from the manufacturer or the cell provider.

It fits somewhere between the Droid 3 and the HTC One. Do recall that I haven't owned an iPhone in a while, so some of these practices may have changed. Updates are certainly appreciated.

Apple provides a smooth, integrated experience with it's devices. This is great - if it's the experience you want. If not, and it's part of the experience that Apple is particularly proud of, then you can't replace it - so it's indistinguishable from bloatware.

Unlike the Droid 3, none of the Apple software breaks third party software - assuming it's available. Unlike the HTC One, many of the parts can't be fixed by changing settings.

Apple doesn't have Widgets at all, so you can't configure the home screen display to provide information you want that apple doesn't provide. You can't replace the home screen software, so you can't fix things that way. Or even fix a home screen that doesn't provide the features you want. If it's not what you want and you can't replace it, it's bloatware, no matter who it comes from.

One of the things quietly taking over the Android world are "trace" keyboards. These let you enter words into a virtual keyboard without ever taking your finger off the keyboard. You make the same motions you'd make to type out the letters, but leave your finger on the keyboard. The software looks at where your finger changed directions, and then uses a dictionary to guess what word you are entering. Yes, it's not 100% accurate, but it's not really any worse than the iOS keyboard with auto-correction turned on. And it's both easier and faster than trying to type in the letters. They are now so popular that all the general-purpose keyboards I have available provide this functionality.

In fact, the ability to use third party keyboards is a major win in and of itself. Try opening an ssh session to another system and using a modern shell. Characters that are common in that environment range from hard to impossible to use. So I normally install a third party app that provides a keyboard designed for this situation.

Google has gone out of it's way to make the Android system pluggable, allowing users (and, unfortunately, manufacturers and cell providers) to plug in replacements for most functionality. After experiencing this, Apple's approach of locking users into their own software is indistinguishable from bloatware. Right now, none of my devices are rooted, though I've thought about it in order to install Titanium Backup for backup services, because between the built-in functionality and third party replacements, they do almost exactly what I want. My iPhones, on the other hand, never lasted long without being jail-broken. In fact, I'd usually delay OS upgrades until after the jailbreak was available, simply because they never did enough out of the box.

So iOS comes with software I don't like, don't want to use, and can't get rid of. Not as good as the HTC One, in that I have to jailbreak the phone in order to install an alternative. Better than the Droid 3, in that it's not fundamentally broken as is, and generally doesn't break the alternatives.

Sunday, March 17, 2013

Idiomatic C and Arduino

The few of you who read this blog regularly will be wondering about "C". Not Python, not some new-fangled functional language, but good old-fashioned C, the workhorse language of the last 30 years or so. That's because I'm talking Arduino, and C - sorta kinda - is the language of choice for that platform.

For those not familiar with it, Arduino is an open source hardware platform built around the ATMega micro-controller CPUs. The boards - whose design is covered by a Creative Commons Attribution Share-Alike license - provide the micro-controller and support circuitry to make using external devices easy, and have a standardized form factor for adding expansion hardware. There is also an IDE for writing code, for the C/C++ framework to run on the controller. The boards - or clones of them - are available for as little as $15 shipped

These are 8 bit CPUs with memory sizes (flash, eeprom, and ram) measured in K, or maybe 10s of K. There is no way the run time system of a modern language will run on them, though there is a forth system or two available, as well as some custom interpreters. You can use a modern language to control the hardware if you run it on the desktop with some kind of serial link between the devices. But normally you cross-compile a small language - like C or a DSL for hardware systems like Atom - for the thing. C isn't much of a reach - these systems aren't much less powerful than the machines C was developed on, though just enough so that a C compiler probably won't fit.

The project

As mentioned, the Arduino boards are designed to make it easy to use external hardware, and the standard examples start with LEDs. They're cheap, easy to hook up, and hard to screw up.

So, you give a geek a bunch of LEDs that can be controlled by a computer, and the first thing they're going to do is build a binary clock, unless they have a grid of LEDs, in which case they'll play life on them. Because that is, after all cool - unless you're not a geek, in which case it's geeky.

Once I noticed that there were wearable Arduino platforms available, I figured a binary wrist watch was a natural. So I built a prototype:

I'm not going to discuss the code here, but it has been added to the blog repository. What I want to discuss are the implementations of the binary clocks I found.

The code

None of the implementations felt like idiomatic C to me. Possibly the problem is with me - I learned C long enough ago that =+ was a valid operator, and this code is more C++ than C. Possibly it's the way hardware people (which is where Arduino's roots are) approach C. Possibly it's the result of learning programming with relatively modern languages instead of things like C and it's predecessors. But we're going to look at one function: display_value(uint8_t *, uint16_t)

display_value accepts a pointer to an array of pin numbers (Arduino output devices, which for this purpose can be set to either 0 or 1 to turn an LED off or on), and a 16 bit unsigned value to display on those LEDs. A common variation is to use a starting pin number and assume the pins are assigned in order, but this doesn't work well with lots of LEDs, as some pins have special capabilities, and wiring constraints may cause you to want to use pins in strange orders. The array of pin numbers allows them to be assigned to whatever pins are convenient.

One common version of the function (usually just coded in line) looks like so:
void
display_value(uint8_t *pins, uint16_t value) {

  for (int8_t i = 0; i < 8; i++) {
    digitalWrite(pins[i], value & (1 << i)) ;
  }
}

The digitalWrite function sets an output pin to the given value. This version is bad on the ATmega processors because they don't have multi-bit shift/rotate operations - you have to do them one bit at a time. So the 1 << i turns into an inner loop. The Arduino framework includes bit twiddling functions that would presumable avoid this, but they don't seem to be in common use, and aren't recommended in the language documentation about >> and <<. In fact, that the processors needs a loop to do variable multi-bit shifts isn't mentioned, either.

The second form avoids that, by using a mask variable that it modifies with each pass through the loop:
void
display_value(uint8_t *pins, uint16_t value) {

  for (uint8_t bit = 0x8, i = 0; bit; i++, bit >>= 1) {
    digitalWrite(pins[i], value & bit) ;
  }
}

This version doesn't have the inner loop for the shift, and when disassembled takes about half the instructions of the first version, and a bit less stack space.

The version I wrote as "natural" C changes the interface slightly, using a terminal zero in the array for the last element, then doing pointer arithmetic instead of an explicit index:
void
display_value(uint8_t *pins, uint16_t value) {

  for (; *pins; pins++, value >>= 1) {
    digitalWrite(*pins, value & 1) ;
  }
}

When disassembled, this one is slightly shorter and uses slightly less stack than the second version. However, if it were open coded, it would require another variable, or two, depending on whether or not value could be destroyed.

As I said, this feels natural to me. I learned to do array processing in C with pointer arithmetic. I can see that that might no longer be fashionable - after all, it is error-prone. However, that it allows me to avoid introducing two new variables seems to more than make up for that, as those introduce there own errors.

What I'm wondering is if there's some good reason to avoid the third version for Arduino projects, or modern C in general. Disassembling it didn't turn up any. Anyone else have an opinion?