Monday, May 14, 2007

Acceptable Electronic Voting

H.R. 811, the Voter Confidence and Increased Accessibility Act, would mandate a "paper trail" for electronic voting machines and provide for public access to the source code for programs that control voting machines. I am withholding endorsement of H.R. 811 because I want to be sure that the wording of the Act doesn't cause more problems than it solves, but on the surface the steps look positive.

This is a fairly long treatment of the subject for a blog post, but is far from a complete one. First I will give some background on information systems and security, followed by some specific principles that affect electronic voting. I'll give what should be non-controversial, or at least apolitical, policy statements, and then combine all of that together to show what an acceptable voting system would look like.


To see why the things the bill addresses are important, we need to explore the basics of information security, as it applies to electronic voting. My goal is to introduce the topic to people who understand voting from a political or legal perspective, or that of a citizen, but who may have very little exposure to the technology at issue.

Electronic voting is an information system, a collection of processes arranged to transform, transmit, or store data. Information systems should be robust. A robust system is one which operates correctly and efficiently under a wide range of conditions, even under conditions for which it was not specifically designed. In particular, a robust system resists attempts to make it operate incorrectly.

In the subfield of information security, experts acknowledge several principles that help achieve robust and "secure" operation. "Secure" is in quotes there because it must be defined for each system. It has been noted that security is an emotion(*): it is an attribute of people, not of systems. Still, the feeling of security is engendered by some practices and endangered by others, and those practices can usually be analyzed without regard to why a certain result is desirable. A secure system is one whose designers, implementers, and users feel confident in its protection of their assets, within acceptable margins of risk. As in any risk analysis, the likelihood of a particular attack or failure must be balanced against the value of a given asset. It is meaningless to label a system secure without specifying the expected level and type of risk, and the assets to be protected against those risks.

While there is no perfect system, there are practices and principles that lead to secure operation. We try to anticipate problems and design to eliminate them, or at least to mitigate them. I'm going to get to the voting part soon, I promise.

The Principles

The hallmarks of secure operation (essentially the classic design principles of Saltzer and Schroeder) are generally recognizable to anyone familiar with the concepts:
  1. Economy of Mechanism - Keep things simple. Simpler processes are easier to understand and generally more robust.
  2. Fail Safe Design - Erroneous input should result in the least harmful action.
  3. Open Design - The reliability of the system should not depend on keeping its workings hidden.
  4. Complete Access Control (Mediation) - Access to assets should be allowed only to those authorized to access those assets.
  5. Least Privilege - Access to assets should be given only as required.
  6. Separation of Privilege - Access to assets should be based on multiple independent criteria.
  7. Least Common Mechanism - Shared means of operations should be minimized.
  8. Psychological Acceptability - If the perceived inconvenience associated with system safeguards is higher than the perceived value they allow, users will tend either to circumvent the safeguards or to bypass the system altogether, and use something less effective but more accessible.
There are some principles to bear in mind when creating correct, robust systems:
  1. Input should be validated before it is used.
  2. Efficiency: when possible, the resources (typically time and space) used by a process should not grow faster than the size of the input.
  3. Special cases signal that a design can be improved.
  4. Hope is the enemy of "is".
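To make the first of those principles concrete, here is a small Python sketch of validating input before it is used. Everything in it, from the function name to the rules it enforces, is my own illustration and not taken from any real voting system:

```python
def validate_selection(contest_options, selections, max_choices=1):
    """Check one contest's selections before they are used anywhere else.

    Returns (ok, reason). Rejects overvotes, duplicate choices, and
    choices that are not on the ballot -- illustrative rules, not a spec.
    """
    if len(selections) > max_choices:
        return False, "overvote: too many choices"
    if len(set(selections)) != len(selections):
        return False, "duplicate choice"
    for choice in selections:
        if choice not in contest_options:
            return False, "unknown choice: %r" % (choice,)
    return True, "ok"
```

The point is that bad input is rejected at the edge of the system, before it can make anything downstream behave incorrectly, which is also Fail Safe Design in action.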
OK, Now the Voting Part

There are two fundamental resources in voting: the physical ballot, and the information it carries, the votes. The ballot is important as a physical record of the intention of the voter, but the information on the ballot is far more important to the process. A ballot may contain several votes, one per contest (with exceptions such as multiple-choice board races, ballot initiatives, etc.).

Voting must be done in secret, maintaining confidentiality.

Voting must be done with assured information integrity, so that no one can alter data or exert influence over the process itself in order to alter the outcome.

Voting must work, or be available. It is unacceptable for process failures to keep voters waiting in line longer than it takes them to cast their ballots. Preliminary results must be known soon after the polls close in the last polling place (e.g., Hawaii).

Counting all votes should be a feature of any system, but there are several ways in which votes could fail to be counted properly:
  • Individual ballots could be mangled, rejected, or lost by the voter or by the system
  • Blocs of ballots could be mangled, rejected, or lost by the system
  • Individual votes or blocs of votes could go uncounted, or be counted multiple times, by the system
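A hypothetical reconciliation check makes those failures detectable: compare the ballot identifiers issued at a precinct against the identifiers actually counted. This is only a sketch, and the names are mine:

```python
from collections import Counter

def reconcile(issued_ids, counted_ids):
    """Return (missing, duplicated) ballot IDs.

    'missing' ballots were issued but never counted; 'duplicated'
    ballots were counted more than once. Illustrative audit helper only.
    """
    counts = Counter(counted_ids)
    missing = [b for b in issued_ids if counts[b] == 0]
    duplicated = sorted(b for b, n in counts.items() if n > 1)
    return missing, duplicated
```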
Let's take the general principles in order, produce some policy statements, and then wrap those policy statements into a high-level outline of a system.
  1. Fail Safe Design - All legal votes should be counted. It should be difficult to present erroneous input. It should be impossible for a selection made for one ballot choice to be recorded for another. Input on one ballot choice should not affect other choices.

  2. Complete Access Control - Only the voter should know what choices he made in the voting booth. The system should allow authorized voters to vote one time per election.

  3. Least Privilege - Individuals should be given only the access to ballots their role requires. For instance, those counting votes do not need to know which election they are counting.

  4. Separation of Privilege - Access to ballots, vote tallies, and control data should be based on multiple independent criteria.

  5. Least Common Mechanism - Votes and ballots should be separated as soon as possible. That is, the transmission of votes must not rely on shipping physical ballots.

  6. Economy of Mechanism - We should use the simplest system satisfying all of the requirements.

  7. Open Design - A standard for voting machines should be produced, so that a machine from any manufacturer could be put through exact, reproducible tests. Security should not be used to justify hiding the operation of the system. The overall system must be documented clearly and simply enough for anyone to understand.

  8. Psychological Acceptability - The voting process should change as little as possible from the voter's perspective. The voting process should also be understandable to all voters, or at least should present no obstacle between the voter and voting. Safeguards should not appear to prospective legitimate voters to be more trouble than they are worth.
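As a toy illustration of the Complete Access Control policy above (authorized voters vote exactly once), here is a hypothetical poll-book model in Python; the class and method names are inventions for this post:

```python
class PollBook:
    """Authorized voters may check in exactly once per election.

    Deliberately simplified: no identification step, no provisional
    ballots, and the record of *who* voted never touches the record
    of *how* they voted.
    """

    def __init__(self, registered):
        self._registered = set(registered)
        self._voted = set()

    def check_in(self, voter_id):
        if voter_id not in self._registered:
            return False  # not an authorized voter
        if voter_id in self._voted:
            return False  # already voted in this election
        self._voted.add(voter_id)
        return True
```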
Now For The Tricky Part

With policy statements in hand, we can now see what sort of system would meet the requirements of those policies.

If voting is seen to be difficult because of the security measures, the measures will be worked around, or people will simply not vote.

Only those authorized to access ballots that have been cast should be able to do so. There really should not be an argument against this, but some have demagogued the issue, claiming that identification is an attempt to exclude poor or minority voters, or that it is psychologically unacceptable as a security measure. As long as the difficulty of obtaining identification for purposes of voting is low, and the record of who voted does not show how they voted, vote suppression is a red herring.

As quickly as possible, the information on the ballot should be copied from the physical ballot, or a physical ballot ("paper trail") created from an electronic ballot, and where possible the voter should verify that the copy is correct. Both the physical ballot and the electronic ballot should be transmitted to their secure destinations, by separate means. In no case should the public Internet be used as a means of transmitting official ballots, because this introduces too much shared mechanism: an Internet denial of service would jeopardize voting availability.

While some would step away from anonymous ballots, or even away from the secret ballot altogether, voter intimidation remains a bigger enemy of democracy than voter anonymity. That is, balanced against the ability of others to punish or reward a particular vote, the secret ballot carries the risk of multiple votes per voter or of ineligible people voting. Non-secret voting does not completely cure those problems, and introduces many others besides. The secret ballot follows the principle of Least Privilege: no one but the voter knows how he voted.

The source code (what the programmers edit) for a voting machine really should be available for everyone to see. But companies are wedded to the idea that keeping their code hidden gives them a business advantage, and we must rely on businesses to produce the machines, or rely on the government to make them, a truly intolerable situation. H.R. 811 addresses these concerns by mandating that source code be given to election officials, but it is not clear whether citizens "inspecting" the source code would be allowed to do anything useful with it, or merely inspect it visually on paper. Inspecting the code without being able to execute it in a debugging environment is an unacceptable half measure.

The principle of Open Design does not require that the source code be revealed, however. While hiding the source code makes the attacker's job more difficult, it also lowers overall confidence in the system. If the attacker cannot force the system to behave in an unauthorized way even with the source code, the system can be assumed to be more secure than an equivalent system with hidden code. Hiding the source therefore should not be relied upon as a security measure. Companies who hide their source code should assume that the attackers have somehow obtained it, and design their countermeasures accordingly.
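This is the same reasoning behind modern cryptography: the algorithm is public, and all of the secrecy lives in a key. A short Python illustration using the standard library's HMAC (the key and message here are made up):

```python
import hashlib
import hmac

# The HMAC-SHA256 algorithm is completely public. An attacker who has
# read this code still cannot forge a valid tag without the secret key.
def tag_results(key: bytes, results: bytes) -> str:
    return hmac.new(key, results, hashlib.sha256).hexdigest()

def verify_results(key: bytes, results: bytes, tag: str) -> bool:
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(tag_results(key, results), tag)
```

A results file tagged this way can be checked for tampering by anyone holding the key, and hiding the source code adds nothing to that guarantee.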

An electronic voting system should validate entries before a physical ballot is created, to catch errors early and allow a good ballot to be produced. Processing two votes should take only twice as long as processing one. There should be no special handling required for the elderly or for those for whom typical voting procedures are physically difficult: it should be easy for everyone.

The above can be accomplished in two basic ways, each of which has advantages and disadvantages:
  1. Scanned Ballot: A voter fills out a human-readable form and drops it into a scanner. The scanning process can optionally allow the voter to confirm the ballot, or it can simply acknowledge that the ballot was read properly. The scanner tallies and transmits the votes.
  2. Printed Ballot: The voter interacts with a machine to make his ballot choices. The machine prints out the ballot for the voter's inspection. The voter confirms his choices for the machine, and then drops the ballot into the ballot box. The machine tallies and transmits the votes.
In either case, votes are tallied, batched, and the unofficial electronic results are transmitted automatically when the polls close. The physical ballots in the ballot box are counted when the polling places close, and the official results are transmitted by a different means than the unofficial electronic ones were. The physical ballots are retained indefinitely, for recounts and academic analysis.
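The tally step is simple enough to sketch in a few lines of Python; the data layout here is my own, chosen for illustration:

```python
from collections import Counter

def close_polls(cast_votes):
    """Tally each contest when the polls close.

    cast_votes maps a contest name to the list of choices cast in it;
    the return value is the unofficial-results record to be transmitted.
    """
    return {contest: dict(Counter(choices))
            for contest, choices in cast_votes.items()}
```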

The Scanned Ballot method is simpler and more familiar to voters, and can be mimicked with a purely manual method and minimal communication infrastructure. The Printed Ballot method has better error detection and correction, and removes any doubt about what a ballot actually says, which is sometimes a problem in recounts.

In either case if there is a serious discrepancy between the official and unofficial results, an audit can be performed to uncover the problem, but the physical ballots should be considered authoritative unless there is sufficient evidence of tampering. By gathering and transmitting the unofficial and official results separately, errors (whether accidental or intentional) become very unlikely to affect the result of the election.
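One way such an audit could begin, as a hedged sketch: compare the two per-candidate tallies and flag every discrepancy larger than some threshold. (Real post-election audits use statistical methods; this only shows the reconciliation idea.)

```python
def audit(unofficial, official, threshold=0):
    """Flag candidates whose unofficial (electronic) and official (paper)
    counts differ by more than the threshold. Illustrative only.
    """
    flagged = {}
    for candidate in set(unofficial) | set(official):
        diff = abs(unofficial.get(candidate, 0) - official.get(candidate, 0))
        if diff > threshold:
            flagged[candidate] = diff
    return flagged
```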

Thanks for reading. Those wanting to know more could do worse than to start with Bruce Schneier, who also blogs about squid on Fridays.

(*) I first heard that thought expressed at SANS'99 by a presenter from, and I'm sorry I don't have a better cite than that.
