ASCII TABLE
The term "plain ASCII" refers to the 128 characters in the
following chart. Extended ASCII (sometimes called superASCII) is the next 128
characters, 80h - FFh. Extended ASCII is not a standard, and that character set
varies from machine to machine and from country to country. It also varies from
screen to printer on the same computer!
Standard ASCII uses the lower 7 bits of an eight-bit byte
(0000 0000 to 0111 1111). Counting zero, this means ASCII can encode 128 characters.
Characters sent to printers or computer screens are ASCII. Note that many
characters are control codes, such as carriage return <CR> and line
feed <LF>. The space <SP> is a displayable (blank) character.
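The 7-bit rule above can be checked mechanically. The following is a minimal Python sketch, not part of the original chart; the function names is_plain_ascii and all_plain_ascii are made up for illustration. It tests only whether bit 7 is clear.

```python
def is_plain_ascii(code):
    """Return True when bit 7 is clear, i.e. the code lies in 00h-7Fh."""
    return 0 <= code <= 0xFF and (code & 0x80) == 0


def all_plain_ascii(data):
    """Return True when every byte of a bytes object is plain ASCII."""
    return all(is_plain_ascii(b) for b in data)


# Count how many of the 256 possible byte values are plain ASCII.
PLAIN_COUNT = sum(1 for code in range(256) if is_plain_ascii(code))
```

Under these assumptions PLAIN_COUNT comes out as 128, and any byte from 80h upward is reported as outside plain ASCII.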
Columns - most significant hex digit (high nibble)
Rows - least significant hex digit (low nibble)
Obtain the (hex) code by adding the column value to the row value. Thus the character 'A' = 40h + 1h = 41h.
    00   10   20   30   40   50   60   70
  ------------------------------------------
0 | NUL  DLE  SP   0    @    P    `    p
1 | SOH  DC1  !    1    A    Q    a    q
2 | STX  DC2  "    2    B    R    b    r
3 | ETX  DC3  #    3    C    S    c    s
4 | EOT  DC4  $    4    D    T    d    t
5 | ENQ  NAK  %    5    E    U    e    u
6 | ACK  SYN  &    6    F    V    f    v
7 | BEL  ETB  '    7    G    W    g    w
8 | BS   CAN  (    8    H    X    h    x
9 | HT   EM   )    9    I    Y    i    y
A | LF   SUB  *    :    J    Z    j    z
B | VT   ESC  +    ;    K    [    k    {
C | FF   FS   ,    <    L    \    l    |
D | CR   GS   -    =    M    ]    m    }
E | SO   RS   .    >    N    ^    n    ~
F | SI   US   /    ?    O    _    o    DEL
Thus the ASCII hex code for 'A' is 41h, and for 'a' is 61h. SOH is obtained with
Control-A.
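The chart lookup can be sketched in a few lines of Python. This is an illustration only; ascii_code and chart_position are hypothetical helper names, not anything defined by the ASCII standard.

```python
def ascii_code(column, row):
    """Combine a column label (00h-70h) and a row label (0h-Fh) into a code.

    Adding works because the column holds only the high hex digit and
    the row only the low hex digit, so the two never overlap.
    """
    return column + row


def chart_position(ch):
    """Split a plain ASCII character into its (column, row) labels."""
    code = ord(ch)
    return code & 0x70, code & 0x0F


CODE_A = ascii_code(0x40, 0x1)    # column 40, row 1
POS_LOWER_A = chart_position("a")  # expected in column 60, row 1
```

Here chr(CODE_A) gives back 'A', and the position of 'a' is column 60h, row 1h, matching the chart.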
Organisation
- The hexadecimal labels are a simple means of displaying the binary bits of each
character or control code. In the following we use the notation b0 - b7 to indicate bit 0
(lsb or least significant bit) to bit 7 (msb or most significant bit). b7 is not used in
standard ASCII. Its use generates graphical characters or accented characters and is
not part of the official or standard ASCII code. The character codes from 80h - FFh
all have b7=1. These are called the Extended ASCII codes, and they are by no means
standardised. The Extended ASCII character on your screen may not be the same as
the one printed, or the same as one displayed on another computer.
- The first two columns are control codes; the remaining 6 are alphanumeric (letters plus
numbers) and punctuation, collectively called displayable characters as opposed to control
codes. The one exception is DEL (7Fh), which performs a delete operation on the
current character. On the old paper tape of teletype machines, it was also called
the RUBOUT character, because it punched out all 7 holes in the tape, thereby eradicating
whatever had previously been punched there. The control codes in the first two columns
can be obtained from the keyboard by holding down the Control key while pressing a
displayable key. Such an action zeros bit 6 (b6) of the normal code for that key.
Some control codes have their own keys, such as Enter, which generates the
<CR> code (0Dh), and Backspace, which generates the <BS> code (08h). These codes can
also be generated with Control-M and Control-H respectively.
- NUL does absolutely nothing, but on mechanical printing devices such as a teletype,
a few NULs were inserted after a carriage return (<CR> = 0Dh) to give the printing
mechanism time to cross the page back to the left side. A line feed <LF> would
be issued immediately afterwards, because advancing the paper was a fast mechanical
move. In the Microsoft world, the end-of-line marker is still the pair
<CR><LF>, in that order. The NULs are no longer used, since video
screens do not have the time delay of the old teletypes.
- SOH is Start of Heading (referring to a transmission via teletype radio or land lines).
SOH is obtained from the keyboard by holding down the Control key and
simultaneously typing the letter "A". It is often written "^A",
where the caret is a symbol for the Control key.
- Note the relationship between ^A, A and a. The ASCII hex codes are 01h, 41h and 61h.
Thus by zeroing b6, one converts 0100 0001 (41h) to 0000 0001 (01h).
(This is the same as subtracting 40h from 41h.) To obtain 'a', one sets b5: 0100
0001 --> 0110 0001, which is equivalent to adding 20h to 41h to get 61h. There is
a similar relationship between ^S, S and s, and so on.
- STX means Start of Text, ETX is End of Text and EOT is End of Transmission. Unix systems
use ETX (^C) to abort operations and EOT (^D) as an end-of-file character (EOF),
particularly to end keyboard input when a program is waiting for more.
- ENQ/ACK are enquiry/acknowledge (a means of verifying your recipient is online).
ACK is also used in the protocol of some printers.
- BEL rang the bell on a teletype. It makes a beep on a computer.
- BS is backspace. (NB: on a video monitor, the delete operation is actually
three characters: <BS><SP><BS>.) On a VT100-compatible monitor,
the DEL key performs the same operation.
- HT is the (horizontal) tab character. VT, the vertical tab, was designed for rapid
movement from one data field to another when filling out forms. In the MS-DOS world,
HT now performs that operation.
- Most of the other codes are rarely used (SO = shift out, SI = shift in) but a few have
become special.
- DC3 (Device Control 3) is also the X-OFF character, used even today in some protocols to
stop transmission of data (through a modem for example). DC1 (X-ON) starts it back
up again.
- SUB (1Ah) is the end-of-file character used by older versions of MS-DOS. In
transferring binary files, however (which use all 8 bits), that byte value would
occasionally occur in the data, aborting the transmission. So current techniques count
the characters rather than use an EOF character. SUB is obtained with ^Z from the
keyboard. See EOT for the Unix equivalent.
- ESC is the familiar escape character, used originally to "escape from" a
transmission whose link failed for some reason.
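The bit relationships between a control code, its capital letter and its small letter, as described in the notes above, can be sketched in Python. This is a minimal illustration under the "Control zeros b6" rule stated in this document; control_code and lower_code are hypothetical names, and the sketch assumes the key pressed is a character from columns 40 or 50 (capital letters and the symbols beside them).

```python
B5 = 0x20  # bit 5: set to turn a capital letter into a small letter
B6 = 0x40  # bit 6: zeroed when the Control key is held down


def control_code(ch):
    """Code produced by Control+<ch>, assuming ch is in columns 40 or 50."""
    return ord(ch) & ~B6 & 0x7F


def lower_code(ch):
    """Code of the small letter for a capital letter A-Z (sets b5)."""
    return ord(ch) | B5


# A few of the control codes mentioned in the notes, keyed by caret name.
EXAMPLES = {
    "^A": control_code("A"),  # SOH
    "^C": control_code("C"),  # ETX
    "^D": control_code("D"),  # EOT
    "^H": control_code("H"),  # BS
    "^M": control_code("M"),  # CR
    "^S": control_code("S"),  # DC3 (X-OFF)
    "^Z": control_code("Z"),  # SUB
}
```

With these definitions, control_code("A") is 01h and lower_code("A") is 61h, the same ^A, A, a relationship worked through by hand above.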