I have constructed the following program to try to pipe in my own shell. A StringArray
is simply a char**
I have constructed. The code runs fine, but when I put in cat txt.txt | grep a
, nothing prints back to the screen. When debugging, I saw that the code seems to stop around like 152 (where the print-out command is), where pid==0
and i==0
.
For context, I'm calling this function in another function after a pipe has been detected.
void doPipe(StringArray sa)
{
printf("In 69\n");
int filedes[2]; // pos. 0 output, pos. 1 input of the pipe
int filedes2[2];
int num_cmds = 0;
char *command[256];
pid_t pid;
int err = -1;
int end = 0;
// Variables used for the different loops
int i = 0;
int j = 0;
int k = 0;
int l = 0;
// First we calculate the number of commands (they are separated
// by '|')
while (sa[l] != NULL){
if (strcmp(sa[l],"|") == 0){
num_cmds++;
}
l++;
}
num_cmds++;
// Main loop of this method. For each command between '|', the
// pipes will be configured and standard input and/or output will
// be replaced. Then it will be executed
while (sa[j] != NULL && end != 1){
k = 0;
// We use an auxiliary array of pointers to store the command
// that will be executed on each iteration
while (strcmp(sa[j],"|") != 0){
command[k] = sa[j];
j++;
if (sa[j] == NULL){
// 'end' variable used to keep the program from entering
// again in the loop when no more arguments are found
end = 1;
k++;
break;
}
k++;
}
// Last position of the command will be NULL to indicate that
// it is its end when we pass it to the exec function
command[k] = NULL;
j++;
printf("In 121\n");
// Depending on whether we are in an iteration or another, we
// will set different descriptors for the pipes inputs and
// output. This way, a pipe will be shared between each two
// iterations, enabling us to connect the inputs and outputs of
// the two different commands.
if (i % 2 != 0){
pipe(filedes); // for odd i
}else{
pipe(filedes2); // for even i
}
pid=fork();
if(pid==-1){
if (i != num_cmds - 1){
if (i % 2 != 0){
close(filedes[1]); // for odd i
}else{
close(filedes2[1]); // for even i
}
}
printf("Child process could not be created\n");
return;
}
if(pid==0){
printf("In 148\n");
// If we are in the first command
if (i == 0){
printf("In 152\n");
dup2(filedes2[1], STDOUT_FILENO);
}
// If we are in the last command, depending on whether it
// is placed in an odd or even position, we will replace
// the standard input for one pipe or another. The standard
// output will be untouched because we want to see the
// output in the terminal
else if (i == num_cmds - 1){
printf("In 162\n");
if (num_cmds % 2 != 0){ // for odd number of commands
dup2(filedes[0],STDIN_FILENO);
printf("In 166\n");
}else{ // for even number of commands
dup2(filedes2[0],STDIN_FILENO);
printf("In 166\n");
}
// If we are in a command that is in the middle, we will
// have to use two pipes, one for input and another for
// output. The position is also important in order to choose
// which file descriptor corresponds to each input/output
}else{ // for odd i
if (i % 2 != 0){
dup2(filedes2[0],STDIN_FILENO);
dup2(filedes[1],STDOUT_FILENO);
}else{ // for even i
dup2(filedes[0],STDIN_FILENO);
dup2(filedes2[1],STDOUT_FILENO);
}
}
if (execvp(command[0],command)==err){
kill(getpid(),SIGTERM);
}
}
// CLOSING DESCRIPTORS ON PARENT
if (i == 0){
close(filedes2[1]);
}
else if (i == num_cmds - 1){
if (num_cmds % 2 != 0){
close(filedes[0]);
}else{
close(filedes2[0]);
}
}else{
if (i % 2 != 0){
close(filedes2[0]);
close(filedes[1]);
}else{
close(filedes[0]);
close(filedes2[1]);
}
}
waitpid(pid,NULL,0);
i++;
}
}
One of your big problems may be doing waitpid
on each iteration of the pipeline construction. The waiting should be done at the end (remembering the pids in a list).
I had some difficulty understanding your code, so I did some simplification and cleanup. In particular, doing if (i % 2 ...)
everywhere made things harder.
I've cleaned up and fixed the code. I added a struct to make things easier to manage [please pardon the gratuitous style cleanup]:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
typedef struct {
int pipe_fildes[2];
} pipectl_t;
#define CLOSEME(_fd) \
do { \
close(_fd); \
_fd = -1; \
} while (0)
void
doPipe(char **sa)
{
pipectl_t pipes[2];
pipectl_t *pipein;
pipectl_t *pipeout;
pipectl_t *pipetmp;
int num_cmds = 0;
char *command[256];
pid_t pidlist[256];
pid_t pid;
int err = -1;
int end = 0;
// Variables used for the different loops
int icmd = 0;
int j = 0;
int k = 0;
int l = 0;
// First we calculate the number of commands (they are separated
// by '|')
for (int l = 0; sa[l] != NULL; ++l) {
if (strcmp(sa[l], "|") == 0)
num_cmds++;
}
num_cmds++;
for (int ipipe = 0; ipipe <= 1; ++ipipe) {
pipes[ipipe].pipe_fildes[0] = -1;
pipes[ipipe].pipe_fildes[1] = -1;
}
pipein = &pipes[0];
pipeout = &pipes[1];
// Main loop of this method. For each command between '|', the
// pipes will be configured and standard input and/or output will
// be replaced. Then it will be executed
while (sa[j] != NULL && end != 1) {
// We use an auxiliary array of pointers to store the command
// that will be executed on each iteration
k = 0;
while (strcmp(sa[j], "|") != 0) {
command[k] = sa[j];
j++;
k++;
if (sa[j] == NULL) {
// 'end' variable used to keep the program from entering
// again in the loop when no more arguments are found
end = 1;
break;
}
}
// Last position of the command will be NULL to indicate that
// it is its end when we pass it to the exec function
command[k] = NULL;
j++;
// swap input and output, so previous child's output becomes the new
// child's input
// NOTE: by doing this here, in one place, we eliminate all the i % 2
// if statements
pipetmp = pipein;
pipein = pipeout;
pipeout = pipetmp;
// are we the last command?
int lastflg = (icmd == (num_cmds - 1));
// last command does _not_ have an output pipe, so don't create one
if (! lastflg)
pipe(pipeout->pipe_fildes);
pid = fork();
// NOTE: fork failure almost never happens and is fatal
if (pid == -1) {
printf("Child process could not be created\n");
return;
}
// process child
if (pid == 0) {
// NOTE: after we've dup'ed a file descriptor, we close it
// first command does _not_ have a pipe for input
if (icmd > 0)
dup2(pipein->pipe_fildes[0],STDIN_FILENO);
CLOSEME(pipein->pipe_fildes[0]);
// last command does _not_ have a pipe for output
if (! lastflg)
dup2(pipeout->pipe_fildes[1],STDOUT_FILENO);
CLOSEME(pipeout->pipe_fildes[1]);
// close the parent sides of the pipes (in this child)
// close previous child's output descriptor (the feed for our input)
CLOSEME(pipein->pipe_fildes[1]);
// close next child's input descriptor (our feed for its input)
CLOSEME(pipeout->pipe_fildes[0]);
if (execvp(command[0], command) == err) {
#if 0
kill(getpid(), SIGTERM);
#else
exit(1);
#endif
}
}
// close all input descriptors for _this_ child
CLOSEME(pipein->pipe_fildes[0]);
CLOSEME(pipein->pipe_fildes[1]);
// close output side of _this_ child's output pipe [which becomes next
// child's input pipe]
CLOSEME(pipeout->pipe_fildes[1]);
pidlist[icmd] = pid;
icmd++;
}
// wait for all pids _after_ the entire pipeline is constructed
for (int icmd = 0; icmd < num_cmds; ++icmd)
waitpid(pidlist[icmd], NULL, 0);
}
// main -- main program
int
main(int argc,char **argv)
{
char *cp;
char *bp;
char buf[1000];
char **av;
char *avlist[256];
--argc;
++argv;
for (; argc > 0; --argc, ++argv) {
cp = *argv;
if (*cp != '-')
break;
switch (cp[1]) {
default:
break;
}
}
while (1) {
printf("> ");
fflush(stdout);
cp = fgets(buf,sizeof(buf),stdin);
if (cp == NULL)
break;
av = avlist;
bp = buf;
while (1) {
cp = strtok(bp," \t\r\n");
bp = NULL;
if (cp == NULL)
break;
*av++ = cp;
}
*av = NULL;
doPipe(avlist);
}
return 0;
}
UPDATE:
When I run this code, the same command cat txt.txt | grep a
only appears to do the first command, and not the second after the pipe. (It cats out the txt file but does not grep)
I tested the entire program before I posted. I just retested using a cat/grep
command. It worked, but that was my program unchanged.
Any ideas why this could be happening? I implemented your doPipe method in my code and passed in my StringArray sa which is just a char ** as well.
My suggestions are:
- Verify that my unchanged version works for you.
- Using
gdb
breakpoint on doPipe
and look at the arguments. For both programs, they should be the same.
- If
StringArray
is truly char **
, replace it in your version to ensure it makes no difference. That is void doPipe(char **sa)
and see if your code still compiles. In gdb
at the breakpoint, you should be able to do ptype sa
on both programs
- The
StringArray
looks a bit "Java-esque" to me :-) I'd avoid it, particularly here since execvp
wants a char **
- Verify that
sa
is properly NULL
terminated. If it isn't the last command in the pipeline may be bogus/garbage and the error checking for a failed execvp
isn't that robust.
- Verify that
num_cmds
is the same.
- Try
cat txt.txt | grep a | sed -e s/a/b/
. If you get the cat
and grep
, but not the sed
, this means num_cmds
is not correct
- Verify that caller's parsing of the buffer puts the
"|"
in a separate token. That is, this code works with cat txt.txt | grep a
but it will not work with: cat txt.txt|grep a
UPDATE #2:
BTW, if your pipe code still isn't working (e.g. the last command is not executed), check to see if the last token has newline on it (i.e. the newline wasn't stripped correctly).
I've tried all of this but still can't get my redirection code to work with this. Essentially, I'm confused as to where in this code I should be checking for '<' or '>'
Doing the general parsing to support redirection (e.g. <
or >
), pipes (e.g. |
), multiple commands per line (e.g. ;
), embedded sub-shells (e.g. (echo the date is ; date)
, and detached jobs (e.g. &
) can require a bit of care and you need a multilevel approach.
I suspect that after you get pipes and/or redirection working, you're tasked with implementing even more shell syntax. I've done this before, so, rather than you figuring it out piecemeal, here is what you'll need to do ...
You'll need to scan the input buffer char-by-char and save off tokens into a "token" struct that also has a type. You'll need a linked list of those structs. More on this below.
When you encounter a quoted string, you'll need to strip off the quotes: "abc"
--> abc
, being mindful of escaped quotes: "ab\"c
--> ab"c
.
Also, you have to be careful about quoted strings abutting [what perl
calls] "bareword" strings: echo abc
. If we have abc"d ef"ghi
, this needs to be concatenated into a single string token: abcd efghi
Backslashes on redirectors must also be accounted for. echo abc > def
is a redirection that will put abc
into the file def
. But, echo abc \> def
should just output abc > def
literally to stdout. Backslash on the other "punctuation" is similar.
You'll also have to handle the fact that punctuation doesn't have to have whitespace around it. That is, echo abc>def
has to be handled just as if it were echo abc > def
.
Also, punctuation inside a quoted string should be treated as if it were escaped above. That is, echo abc ">" def
is not a redirection and [again] should be treated as a simple command.
Also, if the current line ends in a backslash (e.g. \<newline>
), this means that the next line is a "continuation" line. You should strip the backslash and the newline. Then, read another line and continue to build up the token list.
Further, while &
can be for detached jobs, as in: date &
, it can also be part of a redirection, as in gcc -o myshell myshell.c 2>&1 >logfile
Okay, so to manage all this we need types for tokens and a token struct:
// token types
typedef enum {
TOKEN_NORMAL, // simple token/string
TOKEN_QUO1, // quoted string
TOKEN_QUO2, // quoted string
TOKEN_SEMI, // command separater (e.g. ;)
TOKEN_OREDIR, // output redirector (e.g. >)
TOKEN_IREDIR, // input redirector (e.g. <)
TOKEN_PIPE, // pipe separater (e.g. |)
TOKEN_AMP // an & (can be detach or redirect)
} toktype_t;
// token control
typedef struct token token_t;
struct token {
token_t *tok_next; // forward link
token_t *tok_prev; // backward link
toktype_t tok_type; // token type
char tok_str[256]; // token value
};
// token list
typedef struct tlist tlist_t;
struct token {
tlist_t *tlist_next; // forward link
tlist_t *tlist_prev; // backward link
token_t *tlist_head; // pointer to list head
token_t *tlist_tail; // pointer to list tail
};
Initially, after parsing an input line [being mindful of the continuations], we have a single tlist
.
If the list has ;
separaters in it, we split on them to create sublists. We then loop on the sublists and execute the commands in order.
When looking at a subcommand, if it ends in &
, the command must be run detached. We note that and pop it off the back of the list.
Okay, now we have a list that might be of the form:
cat < /etc/passwd | grep root | sed -e s/root/admin/ > /tmp/out
Now, we do a further split on the |
so we have a list that has three elements:
cat < /etc/passwd
grep root
sed -e s/root/admin/ > /tmp/out
Actually, each of those "lines" is a tlist
and this is a two dimensional list of lists:
list_of_tlists:
|
|
tlist[0] --> cat --> < --> /etc/passwd
|
|
tlist[1] --> grep --> root
|
|
tlist[2] --> sed --> -e --> s/root/admin/ --> > /tmp/out
As we create the pipeline, we note the redirections and do file open
instead of pipe
as needed.
Okay, that's the abstract.
See my answer here: Implementing input/output redirection in a Linux shell using C for a full and complete implementation.
On that page, there is code for just doing redirections. It could probably be adapted to include pipes by merging that code with the code I've posted here.
That OP asked for help in doing redirections and pipes.
Side note: At that time, there was a spate of shell implementation questions. So, I ended up producing a full shell that does pretty much everything. But, that version was too large to post on SO. So, in that page, find the pastebin link I posted. It has full source code. It can be downloaded, built, and run.
You may not want to use that code directly, but it should give you some ideas. Also, the full version may do things a little bit differently than what I've described above.