Ncurses can colorize text, but GNU utilities like ls and diff apparently colorize text without calling Ncurses. Can I, too, portably colorize text without calling Ncurses? For example, in C:
printf("the word \033[32mgreen\033[0m is printed in color\n");
This works on my installation but does not look very portable. On the other hand, if ls and diff do it more or less in this way, then who am I to call the technique nonportable?
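One common convention, which to my knowledge GNU ls and diff follow in their --color=auto mode, is to emit the escape codes only when the output is a terminal. The following is a minimal sketch of that idea, assuming a POSIX system. The helper names want_color and paint are hypothetical, and the check on TERM is a rough heuristic, not a capability lookup.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: decide whether to emit color on descriptor fd.
   Assumption: color is wanted only on a terminal whose TERM is set and
   is not "dumb". This is a heuristic, not a terminfo lookup. */
int want_color(int fd)
{
    const char *term = getenv("TERM");

    if (!isatty(fd))
        return 0;
    return term != NULL && *term != '\0' && strcmp(term, "dumb") != 0;
}

/* Hypothetical helper: write text into buf, wrapped in the SGR sequence
   for the given code (e.g. 32 for a green foreground) when enable is
   nonzero, or unchanged otherwise. Returns what snprintf returns. */
int paint(char *buf, size_t size, const char *text, int code, int enable)
{
    if (enable)
        return snprintf(buf, size, "\033[%dm%s\033[0m", code, text);
    return snprintf(buf, size, "%s", text);
}
```

A caller might fill a buffer with paint(buf, sizeof buf, "green", 32, want_color(STDOUT_FILENO)) and then print buf, so that output redirected to a file or pipe stays free of escape codes.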
Examining GNU sources, I notice that ls uses dircolors or $LS_COLORS, but I am not sure that this is relevant to anything but ls. At any rate, as far as I can see, diff colorizes using neither dircolors nor $LS_COLORS nor Ncurses.
Moreover, less -r seems to handle my example's output without trouble.
Am I missing something? Is issuing raw escape codes like \033[32m for green really the conventional way to colorize text whenever the full machinery of Ncurses is unwanted? Or does there exist a standard, more orderly lightweight technique of which I am unaware?
REFERENCES
A question from Stack Overflow's early days treats the topic.
For further information and convenience of reference, the escape sequences of VT100/ANSI/ECMA-48, including colorizers, are explained and cataloged in the Ncurses source toward the end of the source file misc/terminfo.src, excerpted as follows.
#### VT100/ANSI/ECMA-48
#
# ANSI Standard (X3.64) Control Sequences for Video Terminals and Peripherals
# and ECMA-48 Control Functions for Coded Character Sets.
#
# Much of the content of this comment is adapted from a table prepared by
# Richard Shuford, based on a 1984 Byte article. Terminfo correspondences,
# discussion of some terminfo-related issues, and updates to capture ECMA-48
# have been added. Control functions described in ECMA-48 only are tagged
# with * after their names.
#
# The table is a complete list of the defined ANSI X3.64/ECMA-48 control
# sequences. In the main table, \E stands for an escape (\033) character,
# SPC for space. Pn stands for a single numeric parameter to be inserted
# in decimal ASCII. Ps stands for a list of such parameters separated by
# semicolons. Parameter meanings for most parametrized sequences are
# described in the notes.
#
# Sequence Sequence Parameter or
# Mnemonic Name Sequence Value Mode terminfo
# -----------------------------------------------------------------------------
# APC Applicatn Program Command \E _ - Delim -
# BEL Bell * ^G - - bel
# BPH Break Permitted Here * \E B - * -
# BS Backspace * ^H - EF -
# CAN Cancel * ^X - - - (A)
# CBT Cursor Backward Tab \E [ Pn Z 1 eF cbt
# CCH Cancel Previous Character \E T - - -
# CHA Cursor Horizntal Absolute \E [ Pn G 1 eF hpa (B)
# CHT Cursor Horizontal Tab \E [ Pn I 1 eF tab (C)
# CMD Coding Method Delimiter * \E
# CNL Cursor Next Line \E [ Pn E 1 eF nel (D)
# CPL Cursor Preceding Line \E [ Pn F 1 eF -
# CPR Cursor Position Report \E [ Pn ; Pn R 1, 1 - - (E)
# CSI Control Sequence Intro \E [ - Intro -
# CTC Cursor Tabulation Control \E [ Ps W 0 eF - (F)
# CUB Cursor Backward \E [ Pn D 1 eF cub
# CUD Cursor Down \E [ Pn B 1 eF cud
# CUF Cursor Forward \E [ Pn C 1 eF cuf
# CUP Cursor Position \E [ Pn ; Pn H 1, 1 eF cup (G)
# CUU Cursor Up \E [ Pn A 1 eF cuu
# CVT Cursor Vertical Tab \E [ Pn Y - eF - (H)
# DA Device Attributes \E [ Pn c 0 - -
# DAQ Define Area Qualification \E [ Ps o 0 - -
# DCH Delete Character \E [ Pn P 1 eF dch
# DCS Device Control String \E P - Delim -
# DL Delete Line \E [ Pn M 1 eF dl
# DLE Data Link Escape * ^P - - -
# DMI Disable Manual Input \E \ - Fs -
# DSR Device Status Report \E [ Ps n 0 - - (I)
# DTA Dimension Text Area * \E [ Pn ; Pn SPC T - PC -
# EA Erase in Area \E [ Ps O 0 eF - (J)
# ECH Erase Character \E [ Pn X 1 eF ech
# ED Erase in Display \E [ Ps J 0 eF ed (J)
# EF Erase in Field \E [ Ps N 0 eF -
# EL Erase in Line \E [ Ps K 0 eF el (J)
# EM End of Medium * ^Y - - -
# EMI Enable Manual Input \E b Fs -
# ENQ Enquire ^E - - -
# EOT End Of Transmission ^D - * -
# EPA End of Protected Area \E W - - - (K)
# ESA End of Selected Area \E G - - -
# ESC Escape ^[ - - -
# ETB End Transmission Block ^W - - -
# ETX End of Text ^C - - -
# FF Form Feed ^L - - -
# FNK Function Key * \E [ Pn SPC W - - -
# GCC Graphic Char Combination* \E [ Pn ; Pn SPC B - - -
# FNT Font Selection \E [ Pn ; Pn SPC D 0, 0 FE -
# GSM Graphic Size Modify \E [ Pn ; Pn SPC B 100, 100 FE - (L)
# GSS Graphic Size Selection \E [ Pn SPC C none FE -
# HPA Horz Position Absolute \E [ Pn ` 1 FE - (B)
# HPB Char Position Backward \E [ j 1 FE -
# HPR Horz Position Relative \E [ Pn a 1 FE - (M)
# HT Horizontal Tab * ^I - FE - (N)
# HTJ Horz Tab w/Justification \E I - FE -
# HTS Horizontal Tab Set \E H - FE hts
# HVP Horz & Vertical Position \E [ Pn ; Pn f 1, 1 FE - (G)
# ICH Insert Character \E [ Pn @ 1 eF ich
# IDCS ID Device Control String \E [ SPC O - * -
# IGS ID Graphic Subrepertoire \E [ SPC M - * -
# IL Insert Line \E [ Pn L 1 eF il
# IND Index \E D - FE -
# INT Interrupt \E a - Fs -
# JFY Justify \E [ Ps SPC F 0 FE -
# IS1 Info Separator #1 * ^_ - * -
# IS2 Info Separator #2 * ^^ - * -
# IS3 Info Separator #3 * ^] - * -
# IS4 Info Separator #4 * ^\ - * -
# LF Line Feed ^J - - -
# LS1R Locking Shift Right 1 * \E ~ - - -
# LS2 Locking Shift 2 * \E n - - -
# LS2R Locking Shift Right 2 * \E } - - -
# LS3 Locking Shift 3 * \E o - - -
# LS3R Locking Shift Right 3 * \E | - - -
# MC Media Copy \E [ Ps i 0 - - (S)
# MW Message Waiting \E U - - -
# NAK Negative Acknowledge * ^U - * -
# NBH No Break Here * \E C - - -
# NEL Next Line \E E - FE nel (D)
# NP Next Page \E [ Pn U 1 eF -
# NUL Null * ^@ - - -
# OSC Operating System Command \E ] - Delim -
# PEC Pres. Expand/Contract * \E Pn SPC Z 0 - -
# PFS Page Format Selection * \E Pn SPC J 0 - -
# PLD Partial Line Down \E K - FE - (T)
# PLU Partial Line Up \E L - FE - (U)
# PM Privacy Message \E ^ - Delim -
# PP Preceding Page \E [ Pn V 1 eF -
# PPA Page Position Absolute * \E [ Pn SPC P 1 FE -
# PPB Page Position Backward * \E [ Pn SPC R 1 FE -
# PPR Page Position Forward * \E [ Pn SPC Q 1 FE -
# PTX Parallel Texts * \E [ \ - - -
# PU1 Private Use 1 \E Q - - -
# PU2 Private Use 2 \E R - - -
# QUAD Typographic Quadding \E [ Ps SPC H 0 FE -
# REP Repeat Char or Control \E [ Pn b 1 - rep
# RI Reverse Index \E M - FE - (V)
# RIS Reset to Initial State \E c - Fs -
# RM Reset Mode * \E [ Ps l - - - (W)
# SACS Set Add. Char. Sep. * \E [ Pn SPC / 0 - -
# SAPV Sel. Alt. Present. Var. * \E [ Ps SPC ] 0 - - (X)
# SCI Single-Char Introducer \E Z - - -
# SCO Sel. Char. Orientation * \E [ Pn ; Pn SPC k - - -
# SCS Set Char. Spacing * \E [ Pn SPC g - - -
# SD Scroll Down \E [ Pn T 1 eF rin
# SDS Start Directed String * \E [ Pn ] 1 - -
# SEE Select Editing Extent \E [ Ps Q 0 - - (Y)
# SEF Sheet Eject & Feed * \E [ Ps ; Ps SPC Y 0,0 - -
# SGR Select Graphic Rendition \E [ Ps m 0 FE sgr (O)
# SHS Select Char. Spacing * \E [ Ps SPC K 0 - -
# SI Shift In ^O - - - (P)
# SIMD Sel. Imp. Move Direct. * \E [ Ps ^ - - -
# SL Scroll Left \E [ Pn SPC @ 1 eF -
# SLH Set Line Home * \E [ Pn SPC U - - -
# SLL Set Line Limit * \E [ Pn SPC V - - -
# SLS Set Line Spacing * \E [ Pn SPC h - - -
# SM Select Mode \E [ Ps h none - - (W)
# SO Shift Out ^N - - - (Q)
# SOH Start Of Heading * ^A - - -
# SOS Start of String * \E X - - -
# SPA Start of Protected Area \E V - - - (Z)
# SPD Select Pres. Direction * \E [ Ps ; Ps SPC S 0,0 - -
# SPH Set Page Home * \E [ Ps SPC G - - -
# SPI Spacing Increment \E [ Pn ; Pn SPC G none FE -
# SPL Set Page Limit * \E [ Ps SPC j - - -
# SPQR Set Pr. Qual. & Rapid. * \E [ Ps SPC X 0 - -
# SR Scroll Right \E [ Pn SPC A 1 eF -
# SRCS Set Reduced Char. Sep. * \E [ Pn SPC f 0 - -
# SRS Start Reversed String * \E [ Ps [ 0 - -
# SSA Start of Selected Area \E F - - -
# SSU Select Size Unit * \E [ Pn SPC I 0 - -
# SSW Set Space Width * \E [ Pn SPC [ none - -
# SS2 Single Shift 2 (G2 set) \E N - Intro -
# SS3 Single Shift 3 (G3 set) \E O - Intro -
# ST String Terminator \E \ - Delim -
# STAB Selective Tabulation * \E [ Pn SPC ^ - - -
# STS Set Transmit State \E S - - -
# STX Start of Text * ^B - - -
# SU Scroll Up \E [ Pn S 1 eF indn
# SUB Substitute * ^Z - - -
# SVS Select Line Spacing * \E [ Pn SPC \ 1 - -
# SYN Synchronous Idle * ^F - - -
# TAC Tabul. Aligned Centered * \E [ Pn SPC b - - -
# TALE Tabul. Al. Leading Edge * \E [ Pn SPC a - - -
# TATE Tabul. Al. Trailing Edge* \E [ Pn SPC ` - - -
# TBC Tab Clear \E [ Ps g 0 FE tbc
# TCC Tabul. Centered on Char * \E [ Pn SPC c - - -
# TSR Tabulation Stop Remove * \E [ Pn SPC d - FE -
# TSS Thin Space Specification \E [ Pn SPC E none FE -
# VPA Vert. Position Absolute \E [ Pn d 1 FE vpa
# VPB Line Position Backward * \E [ Pn k 1 FE -
# VPR Vert. Position Relative \E [ Pn e 1 FE - (R)
# VT Vertical Tabulation * ^K - FE -
# VTS Vertical Tabulation Set \E J - FE -
#
# ---------------------------------------------------------------------------
#
# Notes:
#
# Some control characters are listed in the ECMA-48 standard without
# being assigned functions relevant to terminal control there (they
# referred to other standards such as ISO 1745 or ECMA-35). They are listed
# here anyway for completeness.
#
# (A) ECMA-48 calls this "CancelCharacter" but retains the CCH abbreviation.
#
# (B) There seems to be some confusion abroad between CHA and HPA. Most
# `ANSI' terminals accept the CHA sequence, not the HPA, but terminfo calls
# the capability (hpa). ECMA-48 calls this "Cursor Character Absolute" but
# preserved the CHA abbreviation.
#
# (C) CHT corresponds to terminfo (tab). Usually it has the value ^I.
# Occasionally (as on, for example, certain HP terminals) this has the HTJ
# value. ECMA-48 calls this "Cursor Forward Tabulation" but preserved the
# CHT abbreviation.
#
# (D) terminfo (nel) is usually \r\n rather than ANSI \EE.
#
# (E) ECMA-48 calls this "Active Position Report" but preserves the CPR
# abbreviation.
#
# (F) CTC parameter values: 0 = set char tab, 1 = set line tab, 2 = clear
# char tab, 3 = clear line tab, 4 = clear all char tabs on current line,
# 5 = clear all char tabs, 6 = clear all line tabs.
#
# (G) CUP and HVP are identical in effect. Some ANSI.SYS versions accept
# HVP, but always allow CUP as an alternate. ECMA-48 calls HVP "Character
# Position Absolute" but retains the HVP abbreviation.
#
# (H) ECMA calls this "Cursor Line Tabulation" but preserves the CVT
# abbreviation.
#
# (I) DSR parameter values: 0 = ready, 1 = busy, 2 = busy, will send DSR
# later, 3 = malfunction, 4 = malfunction, will send DSR later, 5 = request
# DSR, 6 = request CPR response.
#
# (J) ECMA calls ED "Erase In Page". EA/ED/EL parameters: 0 = clear to end,
# 1 = clear from beginning, 2 = clear.
#
# (K) ECMA calls this "End of Guarded Area" but preserves the EPA abbreviation.
#
# (L) The GSM parameters are vertical and horizontal parameters to scale by.
#
# (M) Some ANSI.SYS versions accept HPR, but more commonly `ANSI' terminals
# use CUF for this function and ignore HPR. ECMA-48 calls this "Character
# Position Relative" but retains the HPR abbreviation.
#
# (N) ECMA-48 calls this "Character Tabulation" but retains the HT
# abbreviation.
#
# (O) SGR parameter values: 0 = default mode (attributes off), 1 = bold,
# 2 = dim, 3 = italicized, 4 = underlined, 5 = slow blink, 6 = fast blink,
# 7 = reverse video, 8 = invisible, 9 = crossed-out (marked for deletion),
# 10 = primary font, 10 + n (n in 1..9) = nth alternative font, 20 = Fraktur,
# 21 = double underline, 22 = turn off 2, 23 = turn off 3, 24 = turn off 4,
# 25 = turn off 5, 26 = proportional spacing, 27 = turn off 7, 28 = turn off
# 8, 29 = turn off 9, 30 = black fg, 31 = red fg, 32 = green fg, 33 = yellow
# fg, 34 = blue fg, 35 = magenta fg, 36 = cyan fg, 37 = white fg, 38 = set
# fg color as in CCITT T.416, 39 = set default fg color, 40 = black bg,
# 41 = red bg, 42 = green bg, 43 = yellow bg, 44 = blue bg, 45 = magenta bg,
# 46 = cyan bg, 47 = white bg, 48 = set bg color as in CCITT T.416, 49 = set
# default bg color, 50 = turn off 26, 51 = framed, 52 = encircled, 53 =
# overlined, 54 = turn off 51 & 52, 55 = not overlined, 56-59 = reserved,
# 61-65 = variable highlights for ideograms.
#
# (P) SI is also called LS0, Locking Shift Zero.
#
# (Q) SO is also called LS1, Locking Shift One.
#
# (R) Some ANSI.SYS versions accept VPR, but more commonly `ANSI' terminals
# use CUD for this function and ignore VPR. ECMA calls it `Line Position
# Absolute' but retains the VPA abbreviation.
#
# (S) MC parameters: 0 = start xfer to primary aux device, 1 = start xfer from
# primary aux device, 2 = start xfer to secondary aux device, 3 = start xfer
# from secondary aux device, 4 = stop relay to primary aux device, 5 =
# start relay to primary aux device, 6 = stop relay to secondary aux device,
# 7 = start relay to secondary aux device.
#
# (T) ECMA-48 calls this "Partial Line Forward" but retains the PLD
# abbreviation.
#
# (U) ECMA-48 calls this "Partial Line Backward" but retains the PLU
# abbreviation.
#
# (V) ECMA-48 calls this "Reverse Line Feed" but retains the RI abbreviation.
#
# (W) RM/SM modes are as follows: 1 = Guarded Area Transfer Mode (GATM),
# 2 = Keyboard Action Mode (KAM), 3 = Control Representation Mode (CRM),
# 4 = Insertion Replacement Mode, 5 = Status Report Transfer Mode (SRTM),
# 6 = Erasure Mode (ERM), 7 = Line Editing Mode (LEM), 8 = Bi-Directional
# Support Mode (BDSM), 9 = Device Component Select Mode (DCSM),
# 10 = Character Editing Mode (HEM), 11 = Positioning Unit Mode (PUM),
# 12 = Send/Receive Mode, 13 = Format Effector Action Mode (FEAM),
# 14 = Format Effector Transfer Mode (FETM), 15 = Multiple Area Transfer
# Mode (MATM), 16 = Transfer Termination Mode, 17 = Selected Area Transfer
# Mode, 18 = Tabulation Stop Mode, 19 = Editing Boundary Mode, 20 = Line Feed
# New Line Mode (LF/NL), 21 = Graphic Rendition Combination Mode (GRCM), 22 =
# Zero Default Mode (ZDM). The EBM and LF/NL modes have actually been removed
# from ECMA-48's 5th edition but are listed here for reference.
#
# (X) Select Alternate Presentation Variants is used only for non-Latin
# alphabets.
#
# (Y) "Select Editing Extent" (SEE) was ANSI "Select Edit Extent Mode" (SEM).
#
# (Z) ECMA-48 calls this "Start of Guarded Area" but retains the SPA
# abbreviation.
#
# ---------------------------------------------------------------------------
#
# Abbreviations:
#
# Intro an Introducer of some kind of defined sequence; the normal 7-bit
# X3.64 Control Sequence Introducer is the two characters "Escape ["
#
# Delim a Delimiter
#
# x/y identifies a character by position in the ASCII table (column/row)
#
# eF editor function (see explanation)
#
# FE format effector (see explanation)
#
# F is a Final character in
# an Escape sequence (F from 3/0 to 7/14 in the ASCII table)
# a control sequence (F from 4/0 to 7/14)
#
# Gs is a graphic character appearing in strings (Gs ranges from
# 2/0 to 7/14) in the ASCII table
#
# Ce is a control represented as a single bit combination in the C1 set
# of controls in an 8-bit character set
#
# C0 the familiar set of 7-bit ASCII control characters
#
# C1 roughly, the set of control chars available only in 8-bit systems.
# This is too complicated to explain fully here, so read Jim Fleming's
# article in the February 1983 BYTE, especially pages 214 through 224.
#
# Fe is a Final character of a 2-character Escape sequence that has an
# equivalent representation in an 8-bit environment as a Ce-type
# (Fe ranges from 4/0 to 5/15)
#
# Fs is a Final character of a 2-character Escape sequence that is
# standardized internationally with identical representation in 7-bit
# and 8-bit environments and is independent of the currently
# designated C0 and C1 control sets (Fs ranges from 6/0 to 7/14)
#
# I is an Intermediate character from 2/0 to 2/15 (inclusive) in the
# ASCII table
#
# P is a parameter character from 3/0 to 3/15 (inclusive) in the ASCII
# table
#
# Pn is a numeric parameter in a control sequence, a string of zero or
# more characters ranging from 3/0 to 3/9 in the ASCII table
#
# Ps is a variable number of selective parameters in a control sequence
# with each selective parameter separated from the other by the code
# 3/11 (which usually represents a semicolon); Ps ranges from
# 3/0 to 3/9 and includes 3/11
#
# * Not relevant to terminal control, listed for completeness only.
#
# Format Effectors versus Editor Functions
#
# A format effector specifies how following output is to be displayed.
# An editor function allows you to modify the display. Informally
# editor functions may be destructive; format effectors should not be.
#
# For instance, a format effector that moves the "active position" (the
# cursor or equivalent) one space to the left would be useful when you want to
# create an overstrike, a compound character made of two standard characters
# overlaid. Control-H, the Backspace character, is actually supposed to be a
# format effector, so you can do this. But many systems use it in a
# nonstandard fashion, as an editor function, deleting the character to the
# left of the cursor and moving the cursor left. When Control-H is assumed to
# be an editor function, you cannot predict whether its use will create an
# overstrike unless you also know whether the output device is in an "insert
# mode" or an "overwrite mode". When Control-H is used as a format effector,
# its effect can always be predicted. The familiar characters carriage
# return, linefeed, formfeed, etc., are defined as format effectors.
#
# NOTES ON THE DEC VT100 IMPLEMENTATION
#
# Control sequences implemented in the VT100 are as follows:
#
# CPR, CUB, CUD, CUF, CUP, CUU, DA, DSR, ED, EL, HTS, HVP, IND,
# LNM, NEL, RI, RIS, RM, SGR, SM, TBC
#
# plus several private DEC commands.
#
# Erasing parts of the display (EL and ED) in the VT100 is performed thus:
#
# Erase from cursor to end of line Esc [ 0 K or Esc [ K
# Erase from beginning of line to cursor Esc [ 1 K
# Erase line containing cursor Esc [ 2 K
# Erase from cursor to end of screen Esc [ 0 J or Esc [ J
# Erase from beginning of screen to cursor Esc [ 1 J
# Erase entire screen Esc [ 2 J
#
# Some brain-damaged terminal/emulators respond to Esc [ J as if it were
# Esc [ 2 J, but this is wrong; the default is 0.
#
# The VT100 responds to receiving the DA (Device Attributes) control
#
# Esc [ c (or Esc [ 0 c)
#
# by transmitting the sequence
#
# Esc [ ? 1 ; Ps c
#
# where Ps is a character that describes installed options.
#
# The VT100's cursor location can be read with the DSR (Device Status
# Report) control
#
# Esc [ 6 n
#
# The VT100 reports by transmitting the CPR sequence
#
# Esc [ Pl ; Pc R
#
# where Pl is the line number and Pc is the column number (in decimal).
#
# The specification for the DEC VT100 is document EK-VT100-UG-003.
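The CPR reply described in the notes above has a fixed shape, so it can be parsed with a few lines of C. The following is a minimal sketch, assuming the reply has already been read from the terminal into a NUL-terminated string. The function name parse_cpr is hypothetical, and reading the reply itself, which normally requires putting the terminal into a non-canonical input mode, is left out.

```c
#include <stdio.h>

/* Hypothetical helper: parse a CPR reply of the form ESC [ Pl ; Pc R.
   On success, stores the line number in *line and the column number in
   *col and returns 1. Returns 0 if the string does not have that form. */
int parse_cpr(const char *reply, int *line, int *col)
{
    int l, c;
    char final;

    if (sscanf(reply, "\033[%d;%d%c", &l, &c, &final) != 3 || final != 'R')
        return 0;
    *line = l;
    *col = c;
    return 1;
}
```

To request the report, a program would first write the DSR sequence "\033[6n" to the terminal and then collect the bytes the terminal sends back.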
#### ANSI.SYS
#
# Here is a description of the color and attribute controls supported in
# the ANSI.SYS driver under MS-DOS. Most console drivers and ANSI
# terminal emulators for Intel boxes obey these. They are a proper subset
# of the ECMA-48 escapes.
#
# 0 all attributes off
# 1 foreground bright
# 4 underscore on
# 5 blink on/background bright (not reliable with brown)
# 7 reverse-video
# 8 set blank (non-display)
# 10 set primary font
# 11 set first alternate font (on PCs, display ROM characters 1-31)
# 12 set second alternate font (on PCs, display IBM high-half chars)
#
# Color attribute sets
# 3n set foreground color / 0=black, 1=red, 2=green, 3=brown,
# 4n set background color \ 4=blue, 5=magenta, 6=cyan, 7=white
# Bright black becomes gray. Bright brown becomes yellow.
# These coincide with the prescriptions of the ISO 6429/ECMA-48 standard.
#
# * If the 5 attribute is on and you set a background color (40-47) it is
# supposed to enable bright background.
#
# * Many VGA cards (such as the Paradise and compatibles) do the wrong thing
# when you try to set a "bright brown" (yellow) background with attribute
# 5 (you get a blinking yellow foreground instead). A few displays
# (including the System V console) support an attribute 6 that undoes this
# braindamage (this is required by iBCS2).
#
# * Some older versions of ANSI.SYS have a bug that causes them to require
# ESC [ Pn k as EL rather than the ANSI ESC [ Pn K. (This is not ECMA-48
# compatible.)
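As a worked illustration of the numbering in the excerpt, an SGR sequence can be assembled from an optional attribute plus 30 + n for the foreground and 40 + n for the background. This is a minimal sketch only. The function name sgr_color and its parameters are hypothetical, and it covers only the eight basic colors listed above.

```c
#include <stdio.h>

/* Color numbers n as listed for ANSI.SYS: the foreground code is 30 + n
   and the background code is 40 + n. */
enum { BLACK, RED, GREEN, BROWN, BLUE, MAGENTA, CYAN, WHITE };

/* Hypothetical helper: write into buf the SGR sequence that selects
   foreground fg and background bg (each 0..7), preceded by attribute 1
   (foreground bright) when bright is nonzero. Returns what snprintf
   returns, or -1 if a color is out of range. */
int sgr_color(char *buf, size_t size, int fg, int bg, int bright)
{
    if (fg < 0 || fg > 7 || bg < 0 || bg > 7)
        return -1;
    if (bright)
        return snprintf(buf, size, "\033[1;%d;%dm", 30 + fg, 40 + bg);
    return snprintf(buf, size, "\033[%d;%dm", 30 + fg, 40 + bg);
}
```

For green on black this yields the bytes ESC [ 3 2 ; 4 0 m. Printing "\033[0m" afterward turns all attributes off again.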