A Brief History of Text and the Computer
1. The central role of text in the development of the computer
The first programmable digital computer was built in the 1940s. The processing of text is
an even more recent practice, dating from the 1960s. It was only in the 1990s that the
computer also began to offer a serious alternative for the distribution of texts. Yet in less
than half a century computers have insinuated themselves into the texture of society to
such an extent that it could not function without them. This chapter examines how the
computer has in such a short time come to play such a predominant role specifically in
the textual world.
Some major milestones in the development of what has since become the digital
textual medium can be identified.
• In the nineteenth century the single-purpose calculating machine was first conceived
to have the potential to be turned into a Universal Machine, capable of performing tasks
that may be expressed by way of an algorithm.
• In the 1940s the first Universal Machines were built.
• In the 1960s the computer as a Universal Machine was enabled to process text, which
gave it a role in the text creation phase.
• In the 1980s the graphical man–machine interface of the computer greatly enhanced
the possibilities for the typographical rendering of text. It enabled the computer to play
a central role in the production of printed matter. The graphic interface also paved the
way for the co-existence of two different ways to treat text digitally: the logical and the
typographical.
• In the 1990s the computer was included in a network, which enlarged its role as a
communication tool from that of an aid for the production of analogue printed matter to
a new, fully digital medium in its own right, also comprising distribution and
consumption.
In his ‘communications circuit’1 Robert Darnton has conceptualised the entire
transmission process of books and other printed text forms as it has functioned for several
centuries. This model visualises a process in which various consecutive actors work
together under varying cultural, economic, and political conditions to disseminate an
author’s text so that it can reach its readers. Laid over the transmission process of digital
text, this model will help to identify the similarities and differences (or continuities and
discontinuities) between the new digital medium and its predecessor, the print medium.
1 See Robert Darnton, ‘What is the History of Books?’, Daedalus, Summer 1982, pp. 65-83.
The process of making texts public and disseminating them comprises various
distinct stages. These roughly correspond to those identified in the communications
circuit: the creation of the text (writing), followed by its production (multiplication),
distribution (the moment the text is made public), and finally consumption (reading).
However, to inspect the different stages in the development of the role of the computer in
the transfer of text more closely, I would like to propose one small adaptation to this chain
of stages. I would like to place a magnifying glass over the first link in the chain, the
‘creation stage’, which is the phase in which the contents and form of the text have not
been finalised. Besides the writing of the text by the author this phase also comprises its
editing, whether this is done by the author or by someone acting on his or her behalf (for
example the publisher’s editor). Technically, this means making a distinction between (a)
text entry, (b) text recording, and (c) the manipulation of the text once it has been entered.
Recognising this fluidity in the creation stage, comprising writing and editing in any
number of iterations, makes it easier to trace the development of the computer’s role in the
writing process. Roughly three stages in that development can be recognised. These partly
overlapped, but they are fundamentally different enough to treat them separately. The
stages are (1) the representation of text on the computer (entry, recording, storage), (2) the
manipulation of stored text for scientific and professional applications, and (3) the actual
word processing on the PC, as an aid in the authorial thinking and writing process.
Among the most popular computer applications today are no doubt chatting, word
processing, emailing and Web browsing, all text-based pursuits. But also outside of these
text applications text is the key to our computer use. In all arithmetic, analytical, medial
and other applications for which the computer as a universal machine lends itself, text has
a central place. On the World Wide Web—and on the Internet in general—text is the most
common way to organise, search, and find information, even when that information itself
is not a text but, for example, a music file or an image. In all daily dealings with the
computer text furnishes the chief interface, of the operating system as well as the
applications. Files are named and stuck in folders, which are again named using text. But
also beyond this kind of daily consumer use language is the basis for all human–machine
interaction. All modern programming languages use a form of natural language. Also
markup (one of the most important ways to encode text on the computer—and the
technical basis of publication on the World Wide Web) is an entirely textual practice.
In the previous chapter I described how western society is shot through with the
social and cultural significance of books as the main means to transmit knowledge. I have
called this the Order of the Book. Against this background it seems only natural, and in
fact almost inevitable, that the computer was to be deployed for textual communication as
soon as this became possible, and that the whole human–computer interaction became a
textual affair. Indeed, the eagerness with which the word processor was embraced in the
1980s seems to confirm that idea. Given the prominence of text-based applications in
popular computer use today, the question even presents itself why the computer was
invented as a calculating machine rather than a language machine. As it is, the computer
continues to have to recalculate all those textual data and instructions that we feed it to the
only meaningful units which it knows: ones and zeros. Why would it not be possible to
calculate with language itself? The idea may seem stranger than it is. His whole life
Wilhelm Leibniz continued to believe in the construction of a language consisting of logical
symbols that could be manipulated by means of a calculator. Such a language, and a
machine to ‘calculate’ it would enable any philosophical debate to be settled with the click
of a button. 2 That Leibniz’ dream has still not been achieved is not so much because such a
logical system of symbols is not viable.3 The real problem is that the subtle shades of
meaning we can—and want to be able to—express with natural human language are simply
not amenable to being reduced to a system of logical symbols.
Zeros and ones it was, then. For the sake of convenience, however, it was felt
necessary to devise a way to cast instructions to the computer into a humanly intelligible
shape. Hence program lines, menus, file names and the like now all have a human-
readable form, even if behind the scenes the computer still calculates with the only
numbers it knows: ones and zeros. No user now stops to think that every keystroke is
converted into a series of binary numbers. In fact, in our perception language is the
primary way in which we deal with the computer today. The numbers that the computer
really crunches appear to play no more than a subordinate role; the numbers seem to
dance to the tune of the text. But once the reverse was the case and, thinking from
the binary heart of the computer, the quest was for a way to represent letters.
Given the enormous importance of text for average daily computer usage it is
striking how much effort it still took before the computer could actually deal with text.
How did that process take place and why did it take so long? What factors impeded and
stimulated it: design and chance, unintended effects, failure of intended effects, etc.? This
chapter will reconstruct that process in general outline.
That text has come to take a central position on the computer appears at first sight
to be only natural—a reflection of the importance of text in society. At the same time some
commentators point out that text is actually beginning to lose its prominence.4 They are
obviously not suggesting that we are about to engage in a direct binary data exchange with
the computer, or that humans have recently acquired massive training and experience in
symbolic logic. What they mean is that in addition to text, other modalities, especially
images, are playing an increasingly important role in digital communication, as in society
at large. This is often referred to as the ascendancy of visual culture. 5 One simple
explanation for that increase of other modalities could be that the digital medium makes it
easy, as a result of the convergence identified in Chapter 2, to integrate modalities such as
images and sound in text. But the notion of a visual culture is not that new, and certainly
predates the advent of the computer. From the beginning of the twentieth century in many
places in the world all kinds of visual languages have been designed for signs, packaging
and other forms of communication. 6 In the middle of the last century De la Fontaine
Verwey found in his contribution to Copy and Print in the Netherlands that the image,
‘[s]uperseded for a time by the book’, ‘has resumed its ancient rights and is engaged in
fulfilling tasks that have for centuries been carried out by the printed word.’7
2 In The Courtier and the Heretic: Leibniz, Spinoza, and the Fate of God in the Modern World (New York
and London, 2006, p. 79), Matthew Stewart gives an account of this ideal of Leibniz.
3 Alfred North Whitehead and Bertrand Russell’s Principia Mathematica (1910-1913) is impressive evidence
that it is, even if Douglas Hofstadter is right with his interpretation of the implications of Kurt Gödel’s
explosive article in 1931 for the fate of Russell’s fortress, which he deemed impregnable (see Douglas
Hofstadter, I Am a Strange Loop, New York, 2007, Chapter 10).
4 Steven Johnson, Interface Culture: How New Technology Transforms the Way We Create and
Communicate (New York, 1997, see pp. 148-52), is one of the exceptions.
5 See, for example, Mitchell Stephens, The Rise of the Image, the Fall of the Word.
Not only are the signs that text is beginning to lose its prominence still rather
faint, the role of text has probably simultaneously been strengthened in other ways, such
as the largely textual interface of the computer and the Internet, but also the
extraordinary popularity of texting on the mobile phone. To judge by the popularity of
social networks, blogs and the comment function on so many websites, it may well be the
case that more people write—at least with a form of publication in mind—than ever before.
The phenomenon is not necessarily always equally visible, however. An example of a less
directly visible use of text is the way keywords are assigned to images and sound in order
to be able to search for them. This may be a transient phenomenon while the searchability
of images and sound through other images and sound is still in its infancy. For the time
being at any rate the entire digital world—including games and chatting—is accessed by
means of text.
If the relationship between text and other modalities is indeed changing, the
change, at least so far, seems not particularly drastic. Nonetheless, in a longer historical
perspective a situation may well be imagined where text need not necessarily be the most
important means of communication. I will return to this speculation in Chapter 6.
2. The history of computers and computing
There was initially little evidence of the important role that text was to play in the digital
world. The history of the advent of text to the computer starts with the major
developmental leaps in the history of the computer itself. Two in particular are important:
(a) that from machines with only one function to multifunction machines, and (b) that
from mechanical to electronic, digital machines. In the category of machinery with only
one function, two are of particular relevance to the history of the computer as a machine
for the processing of text. The first is the calculator, which still forms the heart of every
computer. The second is the typewriter, which delivered, in the shape of the keyboard, the
chief means of input for the computer today. In addition, there are a number of more
specialised machines, some of which I will also briefly mention.
6 There was an enormous belief in the potential of images (in the form of pictograms and icons, but also
image-based statistics) in promoting efficient information transmission. A particularly prominent and
tireless advocate of the use of information graphics was Otto Neurath, the inventor of the Isotype
(International System of Typographic Picture Education) symbols in the 1920s. After fleeing his native
Austria in the 1930s he founded the International Foundation for Visual Education in The Hague, and later
the Isotype Institute in Oxford.
7 H. de la Fontaine Verwey, ‘The twentieth century’, in W. Gs Hellinga, Copy and Print in the Netherlands:
an Atlas of Historical Bibliography, Amsterdam, 1962, pp. 59-67, on p. 59.
The history of the calculator as a forerunner of the computer goes back some four
centuries. In 1623 Wilhelm Schickard (1592-1635) from Tübingen made a six-digit ‘counting
clock’, which could add and subtract. He called his machine a clock because the machinery
was reminiscent of one. The instrument was entirely mechanical. When half a century later
Wilhelm Leibniz began to work out his idea for a digital calculator, he was a great deal
more ambitious. His machine was to be capable of processing universal logical symbols. In
spite of his unbridled ambition and dedication, he never managed to go beyond a kind of
mechanical pocket calculator which could add, subtract, multiply, and divide. Like Leibniz
in the seventeenth century, the British mathematician Charles Babbage in the nineteenth
century had the vision that calculators could be used for purposes other than making
numerical calculations. In the intervening centuries scientific knowledge and instrument
making skills had advanced so much that Babbage was able to take the implementation of
his ideas further than his predecessors. Although Babbage never built more than parts of
his ‘Analytical Engine’, 8 on the strength of his design he can be considered the creator of
the first Universal Machine. Like the calculators of Schickard and Leibniz, it was entirely
mechanical (it was to be powered by steam) and made use of decimal instead of binary
numbers, but it was programmable, separated the data from the program, and was capable
of loops and conditional branching. That was more than most computers were capable of
even a century later. Babbage was even considering exporting the outcome of calculations
to punched cards. This notion, inspired by the Jacquard loom, would have enabled the
machine to write and store its own programs.
Charles Babbage had the vision; Ada, Countess of Lovelace, a fellow mathematician
who heard him expound on it one night over dinner, was one of very few people who
understood its implications. Recognising that on a higher level of abstraction computing
was not counting but the manipulation of symbols, she proceeded to devise a number of
algorithms that might actually have been executed by the Analytical Engine had it ever
been built. When Lovelace translated an article on the Analytical Engine by the Italian
mathematician and military engineer Luigi Menabrea she added some very percipient
notes of her own, amounting to twice the length of the original article. In these notes she
correctly predicted that a machine like the Analytical Engine might be used to compose
music, produce graphics, and perform a variety of scientific tasks:
[I]t might act upon other things besides number, were objects found whose mutual
fundamental relations could be expressed by those of the abstract science of operations,
and which should be also susceptible of adaptations to the action of the operating
notation and mechanism of the engine. Supposing, for instance, that the fundamental
relations of pitched sounds in the science of harmony and of musical composition were
susceptible of such expression and adaptations, the engine might compose elaborate
and scientific pieces of music of any degree of complexity or extent.9
8 Originally Babbage had designed a simpler ‘machine’, which he named the ‘Difference Engine’ because it
was able automatically to generate tables of the intervals (or differences) between sets of numbers resulting
from programmed series of progressive additions. The machine could produce a print of the tables.
9 ‘Sketch of The Analytical Engine Invented by Charles Babbage, Esq.’, by L. F. Menabrea, with notes by Ada
Lovelace, reprinted in Charles Babbage, Science and Reform: Selected Works of Charles Babbage, ed.
Anthony Hyman, CUP, 1989, pp. 243-311, on p. 270. Emphasis in the original.
Evidence that the vision of Babbage and Lovelace could become reality was delivered only
by Alan Turing in the middle of the twentieth century. According to Turing his abstract
‘Turing machine’ was capable of executing all functions which can be calculated in the
form of an algorithmic procedure. Modern digital, electronic programmable computers
which met Turing’s requirements were first developed in the 1940s.
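To make the notion of an ‘algorithmic procedure’ somewhat more concrete, the following is a minimal sketch in Python of a machine in the spirit of Turing’s: a tape of symbols, a read/write head, and a table of rules. The function `run`, the rule table `INCREMENT`, the state name `carry` and the blank symbol `_` are all illustrative assumptions of mine and not Turing’s own notation. The example machine does no more than add one to a binary number written on the tape.

```python
def run(rules, tape, state, head, blank="_", halt="halt", max_steps=10_000):
    """Run a one-tape machine. `rules` maps (state, symbol) to
    (symbol_to_write, move, next_state), with move one of "L", "R", "N"."""
    cells = dict(enumerate(tape))  # sparse tape: position -> symbol
    steps = 0
    while state != halt:
        if steps >= max_steps:
            raise RuntimeError("step limit reached without halting")
        symbol = cells.get(head, blank)
        write, move, state = rules[(state, symbol)]
        cells[head] = write
        head += {"L": -1, "R": 1, "N": 0}[move]
        steps += 1
    if not cells:
        return ""
    span = range(min(cells), max(cells) + 1)
    return "".join(cells.get(i, blank) for i in span).strip(blank)


# Example rule table (an illustrative assumption): add one to a binary
# number, starting with the head on its rightmost digit.
INCREMENT = {
    ("carry", "1"): ("0", "L", "carry"),  # 1 + carry = 0, carry moves left
    ("carry", "0"): ("1", "N", "halt"),   # 0 + carry = 1, finished
    ("carry", "_"): ("1", "N", "halt"),   # past the left edge: new leading 1
}


def increment(binary):
    return run(INCREMENT, binary, state="carry", head=len(binary) - 1)


print(increment("1011"))  # 1100
print(increment("111"))   # 1000
```

The point of the sketch is that the machine itself (`run`) stays the same whatever task it performs; only the rule table changes, which is what makes such a machine universal rather than single-purpose.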
The binary principle was used not only for the calculations themselves, but also for
the way in which the data were encoded. Just as numbers can be represented both in a
binary system and in a decimal one, the same is true in principle also for text, image and
sound. In the case of numbers and text the number of discrete characters is very limited,
and each character can be represented by a limited number of bits. For Latin script, a
single byte (eight bits) can encode 256 unique characters. Modalities like image and sound
are more complicated to encode. Here the signal has to be divided into an arbitrary
number of constituent components. Dividing an image (or sound) into discrete particles
means that transitions will never be continuous, but always incremental. The number of
components per unit of signal (for example, pixels per inch) decides the realism of the
binary representation: the more the better. But however high the number of pixels per
inch, the realism of a digital rendition can in principle never be equal to an analogue
rendition. In spite of all its shortcomings the relevance of binary representation is that all
data in all modalities and all the calculations that could be applied to them, can be encoded
in the same binary fashion. This makes ‘binariness’ the ‘element’ 10 in which the much-
vaunted convergence of modalities (on which more in Chapter 5) can take place.
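The two kinds of encoding described here can be illustrated with a short Python sketch. It assumes Latin-1 as the single-byte character encoding (the text above names no particular encoding), and the helper names `to_bits` and `quantise` are my own. The first function shows a character as eight bits; the second divides a continuous value between -1 and 1 into a fixed number of discrete steps, so that transitions become incremental rather than continuous.

```python
def to_bits(text, encoding="latin-1"):
    """Return the single-byte code of each character as an 8-bit string."""
    return [format(byte, "08b") for byte in text.encode(encoding)]


def quantise(samples, levels):
    """Snap values in the range [-1.0, 1.0] to `levels` evenly spaced steps."""
    step = 2.0 / (levels - 1)
    return [round((s + 1.0) / step) * step - 1.0 for s in samples]


print(2 ** 8)                          # 256 distinct values fit in one byte
print(to_bits("Az"))                   # ['01000001', '01111010']
print(quantise([0.2, -0.7, 0.6], 3))   # [0.0, -1.0, 1.0]
```

With only three levels every sample is forced onto -1, 0 or 1; raising `levels` narrows the gap between the original value and its digital rendition but never removes it entirely.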
The typewriter is the second single-function machine besides the calculator that has
been of great importance in developing text encoding on the computer. Some of the
earliest typewriters were designed for the blind, 11 which nicely illustrates how deep the
divide can be between an inventor’s intent and the actual social use of an invention. Of
special interest for the present topic is the case of the keyboard. Of all the ingenious typing
systems ever designed 12 it was that by Christopher Sholes, the creator of the first
typewriter to be taken into commercial production, that became the standard. This was the
keyboard with the well-known qwerty layout. 13 The most important legacy of the Sholes
keyboard is that the characters found on his keyboard are now still the atomic building
blocks of text on the computer. The standard computer keyboard has no accented
characters, makes no distinction between a hyphen and an em-dash, or between the
decimal point and the full stop, and lacks all sorts of special characters: from typographical
through mathematical to currency signs. Instead it was visual appearance only that
decided whether a separate character was created.14 The computer keyboard encodes
individual letters binarily and enters them into the computer. Just as on a typewriter, this
is done by assigning a single character per key, although that number may be increased by
using the shift key (and on the computer in addition various function keys).
10 The term is Michael Heim’s, from Electric Language, p. 102.
11 Michael H. Adler, The Writing Machine, London, 1973, p. 48.
12 Adler, The Writing Machine, pp. 25-90.
13 The qwerty layout is still in use in many countries, for example, throughout the English speaking world. In
some other countries the layout differs. France uses the azerty keyboard, while Germany and some Eastern
European countries use the qwertz keyboard.
14 Hence on some keyboards no separate figure 1 was included, the letter l being regarded as sufficiently similar
in shape.
Among the many more and less inventive alternative text entry systems that did not make
it, the idea of Douglas Engelbart, also known as the inventor of the computer mouse,15
deserves mention: to enter the 31 characters of the standard 5-bit code of the ‘teletype’ (the
forerunner of the telex) by the simultaneous pressing of five keys (2^5 = 32). 16 Engelbart
worked on this ‘five-key handset’ in the 1960s, as part of his ambitious Framework for the
Augmentation of Man’s Intellect, which will receive more attention later in this chapter.
While the idea was not new (in the course of the nineteenth century several typewriters
with piano-type keyboards had already been designed) and certainly had advantages, it
was no match for the dominance the qwerty keyboard had by then already acquired. The
qwerty keyboard was in use in large parts of the world, and generations of typists had
learned to touch-type using the qwerty layout.
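The arithmetic behind the five-key handset can be checked with a small Python sketch. Five keys that are each either released or pressed give two to the power of five, or 32, combinations; the combination in which no key is pressed cannot signal anything, which leaves 31 usable chords. The assignment of chords to the letters a to z below is an illustrative assumption that is not taken from the sources cited here, and the names `chord_to_letter` and `decode` are my own.

```python
from itertools import product
import string

KEYS = 5

# Every combination of five keys, each either released (0) or pressed (1).
combinations = list(product((0, 1), repeat=KEYS))

# Pressing no key at all sends nothing, so that pattern is unusable.
chords = [c for c in combinations if any(c)]

# Illustrative assumption: the first 26 chords, read as the binary
# numbers 1 to 26, stand for the letters a to z.
chord_to_letter = {
    chord: string.ascii_lowercase[i] for i, chord in enumerate(chords[:26])
}


def decode(pressed):
    """Turn a sequence of chords into the letters they stand for."""
    return "".join(chord_to_letter[chord] for chord in pressed)


print(len(combinations))  # 32
print(len(chords))        # 31
print(decode([(0, 0, 0, 0, 1), (0, 0, 0, 1, 0)]))  # ab
```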
The typewriter did not take the process of creation – editing – production –
publication – distribution – consumption beyond the creation stage. It took care of the
‘data entry’ and ‘storage’ (anachronistic terms for functions that were really only created
by the computer) of a text, but could do little for its reproduction, publication and
distribution.17 As a medium this does not distinguish the typewriter substantially from
manuscript—with the exception perhaps of the degree of readability. In that regard the
typewriter only very partially approaches printing type. That did not, however, keep its
inventors from stressing this property, even in the case of the very earliest machines.18 The
magnifying glass that I placed over the creation and editorial phase shows that the
typewriter is a rather poor performer when it comes to the manipulation of text.
Among the more specialised techniques that are relevant in the history of the
computer, telegraphy certainly deserves a place, in particular the Baudot system for text
input dating from 1874. That is the standard 5-bit code already mentioned in the discussion
of the keyboard. Despite the limited number of characters (a maximum of 32) that could be
encoded with this five-bit system, this character encoding by Emile Baudot (1845-1903)
still remained in use in the digital electronic environment until it was replaced by ASCII in
the middle of the 1960s. 19 Since the Morse system with its dots and dashes is also binary,
Baudot’s encoding system lent itself particularly well to transfer to the computer.
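What a fixed five-bit code means in practice can be sketched as follows in Python. The 32-character repertoire in `ALPHABET` is a hypothetical one chosen for illustration and is not the actual Baudot table; the function names are likewise my own. Every character is written as exactly five bits, so the bit stream can be cut back into characters without any separators, but nothing outside the 32-character repertoire can be represented.

```python
# Hypothetical 32-character repertoire (NOT the actual Baudot table).
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ .,?'-"
WIDTH = 5  # bits per character; 2 ** 5 == 32 possible codes


def encode(text):
    """Write each character as its 5-bit position in ALPHABET."""
    return "".join(format(ALPHABET.index(ch), "05b") for ch in text)


def decode(bits):
    """Cut the bit stream into 5-bit groups and look each one up."""
    groups = (bits[i:i + WIDTH] for i in range(0, len(bits), WIDTH))
    return "".join(ALPHABET[int(group, 2)] for group in groups)


print(len(ALPHABET))                    # 32
print(encode("HI"))                     # 0011101000
print(decode(encode("HELLO, WORLD?")))  # HELLO, WORLD?
```

Calling `encode("hi")` raises a `ValueError`, since lower-case letters are not in this repertoire: a reminder of how tight the limit of 32 characters is.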
Another specialist device for the processing of text that deserves attention was the
typesetting machine used in print production. 20 The typesetting machine used the
typewriter keyboard to its advantage (albeit that the typesetting machine’s keyboard was
equipped with substantially more keys; the Monotype had four complete sets of qwerty
keys, one each for roman, italics, bold, and small caps). At least four major improvements
which were applied early if not for the first time in the typesetting industry have been of
great importance for the development of digital word processing. These were the use of a
storage medium, in the form of the punched tape of the Monotype typesetting machine
from 1887; the application of the teletypewriter for remote typesetting through the 6-bit
code of the TeleTypeSetter (TTS) in the late 1920s; the application of the computer in the
third generation of phototypesetting machines from the late 1960s; and the development
of the concept of markup in the 1960s and 1970s, about which more later.
15 See Thierry Bardini, Bootstrapping: Douglas Engelbart, Coevolution, and the Origins of Personal
Computing, Stanford, 2000, pp. 81-102.
16 See Bardini, Bootstrapping, pp. 58-80.
17 Except of course with the rather limited help of the carbon copy. The stencil machine and the duplicator
can be disregarded. While these techniques make use of a typewriter to record the text, multiplication is a
separate step, which requires a duplicating machine.
18 ‘A British engineer, Henry Mill, was granted British patent no. 395 in 1714 for a device capable of
impressing letters on paper or parchment one after another “as in writing”, the product being so neat and
exact as to be indistinguishable from printing’ (Adler, The Writing Machine, p. 47).
19 Bardini, Bootstrapping, pp. 65-79. More will be said about ASCII (American Standard Code for Information
Interchange) later in this chapter.
20 Other, more marginal systems that can be mentioned were for example those used for the generation of
titles and subtitles for film and television, which were also becoming more sophisticated as time went on.