Remove most VT52, VT100-VT525, ANSI escape sequences
Posted: Thu Sep 28, 2023 3:22 pm
It would be good if all escape sequences were removed when printing to the screen, but left intact when echoing to a serial terminal or to the console the emulator runs in, since a device should ignore terminal escape sequences it does not support. This would also make it possible to run more programs that use e.g. ANSI escape sequences without modification, and a connected terminal could act on those sequences. As it is now, Escape itself is ignored when printing to the screen, but the characters after it are not.
The command sed 's/\x1b[[(]*[ ?!,0-9;]*[\x0-\x1a=>@-Za-z~]//g;s/\x1b.*[\\\x7\$]//g' removes most terminal escape sequences and doesn't remove real text except in these cases:
ESC 7 Save Cursor Position in Memory DECSC
ESC 8 Restore Cursor Position from Memory DECRC
ESC ( 0 Designate Character Set – DEC Line Drawing
ESC Y rc VT52 Direct Cursor Addressing
There might be other, more obscure Escape codes that are also not terminated correctly, but since a control character or letter usually follows soon afterwards, not many characters would be ignored.
ESC 7 and ESC 8 can sometimes be replaced with ESC [ s and ESC [ u.
For the following sequence no real text is removed, but the two visible characters for the coordinates are printed on the screen:
ESC Y rc VT52 Direct Cursor Addressing
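To make the exceptions above concrete, here is a rough Python translation of the two sed expressions. It is only a sketch: the name `sed_like_strip` is made up for illustration, and the patterns are an approximation of how GNU sed reads the originals, applied to one line of text at a time.

```python
import re

# Approximate Python equivalents of the two sed expressions
# (assumption: GNU sed semantics, one line of input at a time).
CSI_LIKE = re.compile(r"\x1b[\[(]*[ ?!,0-9;]*[\x00-\x1a=>@-Za-z~]")
STRING_LIKE = re.compile(r"\x1b.*[\\\x07$]")

def sed_like_strip(line: str) -> str:
    """Remove most escape sequences from one line (hypothetical helper)."""
    line = CSI_LIKE.sub("", line)
    return STRING_LIKE.sub("", line)

print(repr(sed_like_strip("\x1b[31mred\x1b[0m")))      # colour codes removed
print(repr(sed_like_strip("\x1b7Hello")))              # ESC 7 also eats the 'H'
print(repr(sed_like_strip("\x1b(0abc")))               # ESC ( 0 also eats the 'a'
print(repr(sed_like_strip("\x1bY5!Hi")))               # ESC Y: coordinates '5!' remain
print(repr(sed_like_strip("\x1b]0;title\x07text")))    # BEL-terminated string removed
```

The ESC 7 and ESC ( 0 lines show real text being swallowed, and the ESC Y line shows the two coordinate characters being left behind.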
If the X16 encounters the Escape character in the output stream, it should not print it to the screen, and it should ignore the following characters up to and including the first character with a code less than 27 or greater than 63. This is a simple rule because an escape sequence can contain ESC (27) and ? (63), but it can be ended with e.g. BEL (7) or a character equal to or greater than @ (64). This is a simpler rule than the sed command above.
I think this could be implemented so that printing is hardly slowed down, since it is similar to quote mode. It's also better to start implementing correct Escape handling now, so that people don't come to depend on the current non-standard behavior after Escape.