In a bash script, I try to read lines from standard input using the built-in read command after setting IFS=$'\n'. The lines are truncated at the 4095-character limit if I paste input to the read. This limitation seems to come from reading from a terminal, because this worked perfectly fine:
fill=
for i in $(seq 1 94); do fill="${fill}x"; done
for i in $(seq 1 100); do printf "%04d00$fill" $i; done | (read line; echo $line)
I see the same behavior with a Python script (it did not accept input longer than 4095 characters from the terminal, but accepted it from a pipe):
#!/usr/bin/python
from sys import stdin
line = stdin.readline()
print('%s' % line)
Even a C program behaves the same way, using read(2):
#include <stdio.h>
#include <unistd.h>
int main(void)
{
    char buf[32768];
    /* read(2) returns ssize_t; -1 signals an error */
    ssize_t sz = read(0, buf, sizeof(buf) - 1);
    if (sz < 0) {
        perror("read");
        return 1;
    }
    buf[sz] = '\0';
    printf("READ LINE: [%s]\n", buf);
    return 0;
}
In all cases, I cannot enter more than about 4095 characters. The input prompt stops accepting characters.
Question-1: Is there a way to interactively read more than 4095 characters from a terminal on Linux systems (at least Ubuntu 10.04 and 13.04)?
Question-2: Where does this limitation come from?
Systems affected: I noticed this limitation in Ubuntu 10.04/x86 and 13.04/x86, but Cygwin (a recent version at least) does not truncate even at over 10000 characters (I did not test further since I need to get this script working in Ubuntu). Terminals used: virtual console and KDE konsole (Ubuntu 13.04) and gnome-terminal (Ubuntu 10.04).
Please refer to the termios(3) manual page, section "Canonical and noncanonical mode".
By default the terminal (standard input) is in canonical mode; in this mode the kernel buffers the input line before returning it to the application. The hard-coded limit for Linux (maybe N_TTY_BUF_SIZE, defined in ${linux_source_path}/include/linux/tty.h) is 4096, allowing input of 4095 characters, not counting the terminating newline. In noncanonical mode the kernel does no line buffering by default, and the read(2) system call returns as soon as a single character of input is available (a key is pressed). You can manipulate the terminal settings to read a specified number of characters or to set a time-out in noncanonical mode, but then too the hard-coded limit is 4095 per the termios(3) manual page.
The bash read builtin command still works in noncanonical mode, as the following demonstrates:
IFS=$'\n' # Allow spaces and other white spaces.
stty -icanon # Disable canonical mode.
read line # Now we can read without inhibitions set by terminal.
stty icanon # Re-enable canonical mode (assuming it was enabled to begin with).
After adding stty -icanon you can paste a string longer than 4096 characters and read it successfully with the bash built-in read command (I successfully tried more than 10000 characters).
If you put this in a file, i.e. make it a script, you can use strace to see the system calls made, and you will see read(2) called multiple times, each time returning a single character.
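Since the question also shows the problem with a Python script, here is a minimal sketch of the same idea using Python's standard termios module. The helper names noncanonical and read_long_line are made up for this illustration, and the chunk size is an arbitrary choice; treat it as one possible approach under those assumptions rather than a definitive fix.

```python
import contextlib
import os
import termios


@contextlib.contextmanager
def noncanonical(fd):
    """Temporarily switch the terminal on fd to noncanonical mode.

    Hypothetical helper for illustration; fd must refer to a terminal.
    """
    saved = termios.tcgetattr(fd)
    attrs = termios.tcgetattr(fd)
    attrs[3] &= ~termios.ICANON   # index 3 holds the local-mode flags
    attrs[6][termios.VMIN] = 1    # read() returns once 1 byte is available
    attrs[6][termios.VTIME] = 0   # no inter-byte timeout
    termios.tcsetattr(fd, termios.TCSANOW, attrs)
    try:
        yield
    finally:
        # Put the original settings back even if reading fails.
        termios.tcsetattr(fd, termios.TCSANOW, saved)


def read_long_line(fd, chunk_size=4096):
    """Collect bytes from fd up to the first newline or end of input."""
    parts = []
    while True:
        data = os.read(fd, chunk_size)
        if not data:
            break  # end of input before any newline
        head, newline, _rest = data.partition(b"\n")
        parts.append(head)
        if newline:
            break  # anything after the newline in this chunk is dropped
    return b"".join(parts)
```

A script would wrap the read like this: `with noncanonical(0): line = read_long_line(0)`. As with stty -icanon, line editing (backspace, Ctrl-U) is no longer handled by the kernel while the mode is switched off, and the context manager restores the saved settings on exit.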
I do not have a workaround for you, but I can answer question 2.
In Linux, PIPE_BUF is set to 4096 (in limits.h). A write of more than 4096 bytes to a pipe is no longer guaranteed to be atomic.
From /usr/include/linux/limits.h:
#ifndef _LINUX_LIMITS_H
#define _LINUX_LIMITS_H
#define NR_OPEN 1024
#define NGROUPS_MAX 65536 /* supplemental group IDs are available */
#define ARG_MAX 131072 /* # bytes of args + environ for exec() */
#define LINK_MAX 127 /* # links a file may have */
#define MAX_CANON 255 /* size of the canonical input queue */
#define MAX_INPUT 255 /* size of the type-ahead buffer */
#define NAME_MAX 255 /* # chars in a file name */
#define PATH_MAX 4096 /* # chars in a path name including nul */
#define PIPE_BUF 4096 /* # bytes in atomic write to a pipe */
#define XATTR_NAME_MAX 255 /* # chars in an extended attribute name */
#define XATTR_SIZE_MAX 65536 /* size of an extended attribute value (64k) */
#define XATTR_LIST_MAX 65536 /* size of extended attribute namelist (64k) */
#define RTSIG_MAX 32
#endif
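The comment on PIPE_BUF in the listing above describes atomic writes to a pipe. One way to check what happens to larger writes is a small round-trip test; the sketch below is only an illustration (pipe_roundtrip is a hypothetical helper name), pushing a payload through a pipe and counting what arrives at the reading end.

```python
import os
import threading


def pipe_roundtrip(size):
    """Write size bytes into a pipe and count how many come out the other end."""
    read_fd, write_fd = os.pipe()
    payload = b"x" * size

    def writer():
        view = memoryview(payload)
        while len(view) > 0:
            sent = os.write(write_fd, view)  # may be a partial write
            view = view[sent:]
        os.close(write_fd)  # signals end of input to the reader

    thread = threading.Thread(target=writer)
    thread.start()

    received = 0
    while True:
        chunk = os.read(read_fd, 65536)
        if not chunk:
            break  # writer closed its end
        received += len(chunk)

    thread.join()
    os.close(read_fd)
    return received
```

Calling pipe_roundtrip(10 * 4096) and comparing the result with the requested size shows whether anything was lost. This matches the piped test in the question, where 10000 characters passed through a pipe into read.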
The problem is definitely not read() itself, as it can read up to any valid integer value. The problem comes from the heap memory or the pipe size, as they are the only possible limiting factors on the size.