GEOS/ASCII/PETSCII Character Sets
Ariel's GEOS Programmer's Reference Guide
GEOS uses a subset of ASCII for text and character encoding, with two extensions. ASCII is a 7-bit encoding: only the code points 0 to 127 ($00 to $7f) are valid, so every character fits in a single byte.
GEOS extends its 7-bit ASCII with a special character code (128, $80) for output and a special meaning of bit 7 for input. In output, the system font (BSW 9) defines the shortcut key logo for character 128: the Commodore "chicken-lips" logo in GEOS and GEOS 128, and the filled-apple logo in Apple GEOS. In input, bit 7 is set if the character's key was pressed together with the shortcut key. For example, [C=]+[I] yields $e9 ($69, "i", with bit 7 set) and [C=]+[Shift]+[I] yields $c9 ($49, "I", with bit 7 set). The geosSym file defines the constant SHORTCUT (128) for these purposes.
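The bit 7 convention for input can be illustrated with a short Python sketch. It is only a model of the arithmetic, not GEOS code: the function name `decode_key` is hypothetical, and the only value taken from the text is SHORTCUT = 128.

```python
SHORTCUT = 0x80  # value the text gives for the geosSym constant SHORTCUT (128)


def decode_key(code):
    """Split an input byte into (shortcut_held, base_character).

    Hypothetical helper: bit 7 flags the shortcut key, and the low
    seven bits are the ordinary 7-bit ASCII code of the key.
    """
    if not 0 <= code <= 0xFF:
        raise ValueError("key code must fit in one byte")
    shortcut_held = bool(code & SHORTCUT)
    base = code & 0x7F  # strip bit 7 to recover the ASCII code
    return shortcut_held, chr(base)


print(decode_key(0xE9))  # [C=]+[I]
print(decode_key(0xC9))  # [C=]+[Shift]+[I]
```

With the two example codes from the text, $e9 decodes to a shortcut-modified "i" and $c9 to a shortcut-modified "I".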
GEOS does not use the "PETSCII" encoding native to the Commodore PET, 64, and 128. The two encodings are largely incompatible, but translation between ASCII and PETSCII is reasonably straightforward for the characters they have in common. In these tables, "PETSCII Set 1" refers to the character displayed when Set 1 is active (normal uppercase mode), and "PETSCII Set 2" refers to the character displayed when Set 2 is active (lowercase mode). No attempt is made to render or describe most of the graphics characters; see the Commodore 64 User's Guide and the Commodore 64 Programmer's Reference Guide for visual depictions.
Characters and Code Points
The 128 character codes are broken into four sets of 32 characters.
- Control codes, 0-31, $00-$1f, are intended to control the flow of the text stream. In GEOS, control codes also change how the characters following them are presented.
- Symbols and digits, 32-63, $20-$3f, contain the digits and most of the punctuation symbols. The digits 0-9 are at code points $30-$39 by design, so the low four bits of a digit's code equal its numeric value.
- Uppercase letters, 64-95, $40-$5f, contain the uppercase letters, the brackets, and the backslash.
- Lowercase letters, 96-127, $60-$7f, contain the lowercase letters, the braces, and the vertical bar.
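Because each set spans exactly 32 code points, the set a character belongs to can be read from bits 5 and 6 of its code, and a digit's value from its low four bits. The following Python sketch illustrates this; the names `SET_NAMES`, `classify`, and `digit_value` are hypothetical and exist only for this example.

```python
# One name per block of 32 code points, in the order listed above.
SET_NAMES = ("control", "symbols and digits", "uppercase", "lowercase")


def classify(code):
    """Return the name of the 32-character set that holds a 7-bit code."""
    if not 0 <= code <= 0x7F:
        raise ValueError("not a 7-bit code point")
    return SET_NAMES[code >> 5]  # bits 5-6 select one of the four sets


def digit_value(code):
    """Return the numeric value of a digit character ($30-$39)."""
    if not 0x30 <= code <= 0x39:
        raise ValueError("not a digit code point")
    return code & 0x0F  # low nibble is the digit's value


print(classify(0x0D), classify(ord("7")), classify(ord("[")), classify(ord("~")))
print(digit_value(ord("7")))
```

Note that the set names describe the bulk of each block: `[` is classified as "uppercase" because it shares the $40-$5f block with the uppercase letters.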
Control Codes
GEOS codes are constants defined in the geosSym file.
GEOS codes marked with an asterisk (*) are valid in PutString but not in PutChar.
GEOS codes marked with two asterisks (**) are technically valid in string text, but PutString either misinterprets them or ignores them.
Byte | Hex | GEOS Code | ASCII Code | PETSCII Code | Purpose in GEOS |
---|---|---|---|---|---|
0 | $00 | NULL | NUL | | String termination character |
1 | $01 | | SOH | | |
2 | $02 | | STX | | |
3 | $03 | | ETX | | |
4 | $04 | | EOT | | |
5 | $05 | | ENQ | White | |
6 | $06 | ESC_PUTSTRING ** | ACK | | Not a text code, should never be sent to PutString or PutChar. In a GraphicsString code string: text string escape, rest of string is PutString text. |
7 | $07 | | BEL | | |
8 | $08 | BACKSPACE | BS | Disable Shift+C= | Erase the previous character. |
9 | $09 | FORWARDSPACE | HT | Enable Shift+C= | Move forward one space width (not implemented in 64 or 128). geoWrite: TAB |
10 | $0a | LF | LF | | Move down currentHeight pixels (one line) |
11 | $0b | HOME | VT | | Move to upper-left screen corner |
12 | $0c | UPLINE | FF | | Move up currentHeight pixels (one line). geoWrite: PAGE_BREAK |
13 | $0d | CR | CR | CR | Move down one line and left to leftMargin |
14 | $0e | ULINEON | SO | Lowercase | Begin underlining |
15 | $0f | ULINEOFF | SI | | End underlining |
16 | $10 | ESC_GRAPHICS * | DLE | | Graphics string escape, rest of string is GraphicsString codes |
17 | $11 | ESC_RULER | DC1 | Cursor down | Ignored by text routines. geoWrite: ruler escape |
18 | $12 | REVON | DC2 | Reverse | Begin reverse-video text |
19 | $13 | REVOFF | DC3 | Home | End reverse-video text |
20 | $14 | GOTOX * | DC4 | Delete | Move to X coordinate encoded in next two bytes |
21 | $15 | GOTOY * | NAK | | Move to Y coordinate encoded in next byte |
22 | $16 | GOTOXY * | SYN | | Move to (X,Y) encoded in next three bytes (two X, one Y) |
23 | $17 | NEWCARDSET * | ETB | | Unimplemented. Next two bytes are skipped. |
24 | $18 | BOLDON | CAN | | Begin boldface text |
25 | $19 | ITALICON | EM | | Begin italicized text |
26 | $1a | OUTLINEON | SUB | | Begin outlined text |
27 | $1b | PLAINTEXT | ESC | | End embellished text modes, resume plain text |
28 | $1c | | FS | Red | |
29 | $1d | | GS | Cursor right | |
30 | $1e | | RS | Green | |
31 | $1f | | US | Blue | |
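Several of the codes above are followed by operand bytes, so a string cannot be scanned one byte at a time: a zero inside a GOTOX operand is a coordinate byte, not a terminator. The Python sketch below walks a string and groups each code with its operands, using the byte values and operand counts from the table. It is an illustration only, not the PutString implementation; the names `OPERAND_BYTES`, `tokenize`, and `x_coordinate` are hypothetical, and the low-byte-first order of the X coordinate is an assumption (the usual 6502 word order) that the table does not state.

```python
NULL = 0x00
ESC_GRAPHICS = 0x10
GOTOX, GOTOY, GOTOXY, NEWCARDSET = 0x14, 0x15, 0x16, 0x17

# Number of operand bytes that follow each code, per the table above.
OPERAND_BYTES = {GOTOX: 2, GOTOY: 1, GOTOXY: 3, NEWCARDSET: 2}


def tokenize(data):
    """Return (code, operands) pairs up to the NULL terminator."""
    tokens = []
    i = 0
    while i < len(data):
        code = data[i]
        i += 1
        if code == NULL:
            break  # string termination character
        count = OPERAND_BYTES.get(code, 0)
        operands = bytes(data[i:i + count])
        if len(operands) < count:
            raise ValueError("string ends inside an operand")
        tokens.append((code, operands))
        i += count
        if code == ESC_GRAPHICS:
            break  # the rest is GraphicsString codes, not text
    return tokens


def x_coordinate(operands):
    """Decode a two-byte X operand. ASSUMPTION: low byte first."""
    return int.from_bytes(operands[:2], "little")


text = bytes([0x18]) + b"Hi" + bytes([GOTOXY, 0x40, 0x01, 0x32]) + b"!\x00"
for code, operands in tokenize(text):
    print(hex(code), operands.hex())
```

In this sample, the GOTOXY token carries the three operand bytes $40, $01, $32; under the little-endian assumption that is X = 320, Y = 50.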
Symbols and Digits
Byte | Hex | GEOS, ASCII, PETSCII | Unicode Name |
---|---|---|---|
32 | $20 | (space) | SPACE |
33 | $21 | ! | EXCLAMATION MARK |
34 | $22 | " | QUOTATION MARK |
35 | $23 | # | NUMBER SIGN |
36 | $24 | $ | DOLLAR SIGN |
37 | $25 | % | PERCENT SIGN |
38 | $26 | & | AMPERSAND |
39 | $27 | ' | APOSTROPHE |
40 | $28 | ( | LEFT PARENTHESIS |
41 | $29 | ) | RIGHT PARENTHESIS |
42 | $2a | * | ASTERISK |
43 | $2b | + | PLUS SIGN |
44 | $2c | , | COMMA |
45 | $2d | - | HYPHEN-MINUS |
46 | $2e | . | FULL STOP |
47 | $2f | / | SOLIDUS |
48 | $30 | 0 | DIGIT ZERO |
49 | $31 | 1 | DIGIT ONE |
50 | $32 | 2 | DIGIT TWO |
51 | $33 | 3 | DIGIT THREE |
52 | $34 | 4 | DIGIT FOUR |
53 | $35 | 5 | DIGIT FIVE |
54 | $36 | 6 | DIGIT SIX |
55 | $37 | 7 | DIGIT SEVEN |
56 | $38 | 8 | DIGIT EIGHT |
57 | $39 | 9 | DIGIT NINE |
58 | $3a | : | COLON |
59 | $3b | ; | SEMICOLON |
60 | $3c | < | LESS-THAN SIGN |
61 | $3d | = | EQUALS SIGN |
62 | $3e | > | GREATER-THAN SIGN |
63 | $3f | ? | QUESTION MARK |
Uppercase Letters
The uppercase letters are rendered as uppercase letters in PETSCII Set 1 and as lowercase letters in PETSCII Set 2.
Byte | Hex | GEOS, ASCII | Unicode Name | PETSCII 1 | PETSCII 2 |
---|---|---|---|---|---|
64 | $40 | @ | COMMERCIAL AT | @ | @ |
65 | $41 | A | LATIN CAPITAL LETTER A | A | a |
66 | $42 | B | LATIN CAPITAL LETTER B | B | b |
67 | $43 | C | LATIN CAPITAL LETTER C | C | c |
68 | $44 | D | LATIN CAPITAL LETTER D | D | d |
69 | $45 | E | LATIN CAPITAL LETTER E | E | e |
70 | $46 | F | LATIN CAPITAL LETTER F | F | f |
71 | $47 | G | LATIN CAPITAL LETTER G | G | g |
72 | $48 | H | LATIN CAPITAL LETTER H | H | h |
73 | $49 | I | LATIN CAPITAL LETTER I | I | i |
74 | $4a | J | LATIN CAPITAL LETTER J | J | j |
75 | $4b | K | LATIN CAPITAL LETTER K | K | k |
76 | $4c | L | LATIN CAPITAL LETTER L | L | l |
77 | $4d | M | LATIN CAPITAL LETTER M | M | m |
78 | $4e | N | LATIN CAPITAL LETTER N | N | n |
79 | $4f | O | LATIN CAPITAL LETTER O | O | o |
80 | $50 | P | LATIN CAPITAL LETTER P | P | p |
81 | $51 | Q | LATIN CAPITAL LETTER Q | Q | q |
82 | $52 | R | LATIN CAPITAL LETTER R | R | r |
83 | $53 | S | LATIN CAPITAL LETTER S | S | s |
84 | $54 | T | LATIN CAPITAL LETTER T | T | t |
85 | $55 | U | LATIN CAPITAL LETTER U | U | u |
86 | $56 | V | LATIN CAPITAL LETTER V | V | v |
87 | $57 | W | LATIN CAPITAL LETTER W | W | w |
88 | $58 | X | LATIN CAPITAL LETTER X | X | x |
89 | $59 | Y | LATIN CAPITAL LETTER Y | Y | y |
90 | $5a | Z | LATIN CAPITAL LETTER Z | Z | z |
91 | $5b | [ | LEFT SQUARE BRACKET | [ | [ |
92 | $5c | \ | REVERSE SOLIDUS | £ (pound) | £ (pound) |
93 | $5d | ] | RIGHT SQUARE BRACKET | ] | ] |
94 | $5e | ^ | CIRCUMFLEX ACCENT | ↑ (up arrow) | ↑ (up arrow) |
95 | $5f | _ | LOW LINE | ← (left arrow) | ← (left arrow) |
Lowercase Letters
The lowercase letters are rendered as graphics characters in PETSCII Set 1 and as uppercase letters in PETSCII Set 2.
Byte | Hex | GEOS, ASCII | Unicode Name | PETSCII 1 | PETSCII 2 |
---|---|---|---|---|---|
96 | $60 | ` | GRAVE ACCENT | (graphics) | (graphics) |
97 | $61 | a | LATIN SMALL LETTER A | (graphics) | A |
98 | $62 | b | LATIN SMALL LETTER B | (graphics) | B |
99 | $63 | c | LATIN SMALL LETTER C | (graphics) | C |
100 | $64 | d | LATIN SMALL LETTER D | (graphics) | D |
101 | $65 | e | LATIN SMALL LETTER E | (graphics) | E |
102 | $66 | f | LATIN SMALL LETTER F | (graphics) | F |
103 | $67 | g | LATIN SMALL LETTER G | (graphics) | G |
104 | $68 | h | LATIN SMALL LETTER H | (graphics) | H |
105 | $69 | i | LATIN SMALL LETTER I | (graphics) | I |
106 | $6a | j | LATIN SMALL LETTER J | (graphics) | J |
107 | $6b | k | LATIN SMALL LETTER K | (graphics) | K |
108 | $6c | l | LATIN SMALL LETTER L | (graphics) | L |
109 | $6d | m | LATIN SMALL LETTER M | (graphics) | M |
110 | $6e | n | LATIN SMALL LETTER N | (graphics) | N |
111 | $6f | o | LATIN SMALL LETTER O | (graphics) | O |
112 | $70 | p | LATIN SMALL LETTER P | (graphics) | P |
113 | $71 | q | LATIN SMALL LETTER Q | (graphics) | Q |
114 | $72 | r | LATIN SMALL LETTER R | (graphics) | R |
115 | $73 | s | LATIN SMALL LETTER S | (graphics) | S |
116 | $74 | t | LATIN SMALL LETTER T | (graphics) | T |
117 | $75 | u | LATIN SMALL LETTER U | (graphics) | U |
118 | $76 | v | LATIN SMALL LETTER V | (graphics) | V |
119 | $77 | w | LATIN SMALL LETTER W | (graphics) | W |
120 | $78 | x | LATIN SMALL LETTER X | (graphics) | X |
121 | $79 | y | LATIN SMALL LETTER Y | (graphics) | Y |
122 | $7a | z | LATIN SMALL LETTER Z | (graphics) | Z |
123 | $7b | { | LEFT CURLY BRACKET | (graphics) | (graphics) |
124 | $7c | \| | VERTICAL LINE | (graphics) | (graphics) |
125 | $7d | } | RIGHT CURLY BRACKET | (graphics) | (graphics) |
126 | $7e | ~ | TILDE | π (pi) | (graphics) |
127 | $7f | | DELETE | (graphics) | (graphics) |
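Read together, the tables suggest a simple translation for text shown in PETSCII Set 2 (lowercase mode): the symbols and digits, `@`, `[`, `]`, and CR keep their codes, while letters swap case, which amounts to flipping bit 5. The Python sketch below follows only what the tables on this page show; the function name `ascii_to_petscii_set2` is hypothetical, and characters without a common counterpart (such as the backslash, which PETSCII renders as £) are reported as `None` rather than guessed.

```python
def ascii_to_petscii_set2(code):
    """Map a 7-bit ASCII code to the PETSCII byte that shows the same
    character in Set 2, or None if the tables show no common character.
    """
    if code == 0x0D:                  # CR is CR in both encodings
        return code
    if 0x20 <= code <= 0x3F:          # symbols and digits are shared
        return code
    if code in (0x40, 0x5B, 0x5D):    # @ [ ] render identically
        return code
    if 0x41 <= code <= 0x5A or 0x61 <= code <= 0x7A:
        return code ^ 0x20            # letters: swap case by flipping bit 5
    return None                       # control codes, \ ^ _ ` { | } ~, DEL


print([ascii_to_petscii_set2(b) for b in b"Hi!"])
```

Because the letter mapping is its own inverse and the other shared characters map to themselves, applying the same function to a Set 2 PETSCII byte converts it back to ASCII for these characters.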