The UNIVAC I was the first general-purpose computer capable of processing alphanumeric as well as strictly numeric data. In other words, it played host to the first electronic alphabet, a 6-bit character set that mapped more or less directly to a typewriter keyboard. Throughout the 1950s and early 1960s character sets proliferated; IBM, for example, had its own proprietary 8-bit character set called EBCDIC, which saw use on mainframe computers well into the 1980s. By the early 1960s, however, the smart people recognized the need for a standard, and so ASCII was born in 1963 and officially adopted by the American National Standards Institute in 1967. ASCII was (and is) a 7-bit character set; so “A” was represented by the binary string 1000001, and so it ever more would be, world without end, Amen Amen 1000001110110111001011101110.
But of course it’s more complicated than that. National variations soon proliferated under the international standard known as ISO 646; the British variant, for example, substituted the £ sign for the number sign (#). Still, computers like to think in terms of 8-bit “bytes,” or octets, as an 8-bit sequence is more properly known. So, given that ASCII couldn’t accommodate all of the characters necessary for Western European languages, let alone other languages, it wasn’t long before there were 8-bit extensions, which added another 128 characters to the 128 characters of the original 7-bit set for a total of 256. These became codified in the family of ISO-8859 character sets. If you’re reading this in the US or Western Europe, chances are your Web browser is set to display ISO Latin-1, the variant of 8859 that supports the characters needed for the national languages of those countries. (If you like, you can read about these things here.) Recently there has been a movement afoot to replace Latin-1 with Latin-9, which also includes the character code for the Euro sign (“€”). Your browser might display it anyway because it includes native Unicode support, but that’s another story.
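For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The helper name `to_7bit` is made up for illustration, and `iso8859_15` is Python's codec name for Latin-9. It spells out “Amen” as 7-bit codes and then shows that the Euro sign has a slot in Latin-9 but none in Latin-1.

```python
def to_7bit(text):
    """Return each character's ASCII code as a 7-digit binary string.

    Hypothetical helper for illustration; raises UnicodeEncodeError
    if the text contains anything outside 7-bit ASCII.
    """
    return [format(code, "07b") for code in text.encode("ascii")]


amen_bits = to_7bit("Amen")
print(amen_bits)           # one 7-bit string per letter
print("".join(amen_bits))  # the run of bits quoted in the post

# The Euro sign (U+20AC) has a code point in Latin-9 (ISO-8859-15)...
euro_latin9 = "\u20ac".encode("iso8859_15")
print(euro_latin9)

# ...but Latin-1 (ISO-8859-1) has no slot for it at all.
try:
    "\u20ac".encode("latin-1")
    euro_in_latin1 = True
except UnicodeEncodeError:
    euro_in_latin1 = False
print(euro_in_latin1)
```

Joining the four 7-bit strings reproduces the digits that follow the second “Amen” above.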
Obviously the socio-political dimensions here are enormous, and the proliferation of character sets and encoding standards gives the lie to the notion that there is finally any such thing as “plain text” in the electronic sphere.
But really I wanted to mention all this because it leads to a fascinating tidbit in Paul E. Ceruzzi’s A History of Modern Computing (MIT 1998). You might have wondered, above, why ASCII wasn’t established as an 8-bit standard in the first place. There were a couple of reasons, but chief amongst them was the concern that the fragile paper tape that was a common storage medium of the day would be too susceptible to tearing if there were as many as eight holes punched across its width. Seven holes, however, were felt to be within acceptable tolerances. And so we find that ASCII, which defines the conditions of electronic textuality at the most fundamental level, is, quite literally, informed by the materiality of paper.
Posted by mgk at June 25, 2005 05:22 PM
An excellent anecdote about the 7-bitness of 7-bit ASCII! This is even better than the length of a CD being determined by Beethoven's Ninth Symphony - whether http://www.snopes2.com/music/media/cdlength.htm says that's true or not. The ASR33 Teletype, on which Will Crowther wrote the original version of Adventure, used 7-bit ASCII and paper tape; I suppose Crowther may have written Adventure offline on paper tape and then uploaded it. Which brings me to ask: Wasn't error-checking part of the reason for restricting the character set to 7 bits? With the eighth bit free, it could be set to indicate the parity of the other seven and be used to catch errors in transmission.
Also, is Ceruzzi's History of Modern Computing otherwise worthwhile? I've wondered about it, but histories of technology and computing seldom deal with interesting material specifics like these.
Posted by: nick at June 27, 2005 09:32 PM
Hah, I didn't know about the Ninth Symphony meme. I swear there was a point in the mid-1980s when bands were making albums just a little longer than 45 minutes total to discourage taping on one side of a 90 minute cassette.
Yes, the parity bit was definitely also a consideration. I'm not sure which came first (chicken and egg). There doesn't seem to be a whole lot of serious work on the origins of ASCII, beyond the basic narrative.
Ceruzzi's book is definitely worthwhile. He's a serious historian and he knows his stuff (had a chance to have lunch with him once). I also like Martin Campbell-Kelly's recent history of software, which bears the rather ungainly title of From Airline Reservations to Sonic the Hedgehog (also MIT).
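To make the parity idea raised in the comments concrete, here is a rough sketch in Python. It assumes even parity and assumes the parity bit sits in the eighth (highest) bit position; the thread does not specify either choice, and the function names are made up for illustration.

```python
def add_even_parity(code):
    """Pack a 7-bit code into 8 bits, using the top bit as an even-parity bit.

    Assumption for illustration: even parity, parity bit in bit 7.
    """
    if not 0 <= code < 128:
        raise ValueError("expected a 7-bit value")
    parity = bin(code).count("1") % 2  # 1 if the seven data bits hold an odd number of 1s
    return code | (parity << 7)


def check_even_parity(byte):
    """Return True if the 8-bit value contains an even number of 1 bits."""
    return bin(byte).count("1") % 2 == 0


framed = add_even_parity(ord("A"))     # "A" is 1000001: two 1 bits, so the parity bit stays 0
corrupted = framed ^ 0b00000100        # flip a single bit, as line noise might

print(format(framed, "08b"), check_even_parity(framed))
print(format(corrupted, "08b"), check_even_parity(corrupted))
```

A single flipped bit changes the count of 1s from even to odd, so the check catches it. Two flipped bits cancel out and slip through, which is the known limit of a lone parity bit.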