int execve(const char *filename, char *const argv[],
char *const envp[]);
#! interpreter [optional-arg]
For details of the latter case, see "Interpreter scripts" below.
argv is an array of argument strings passed to the new program. By convention, the first of these strings should contain the filename associated with the file being executed. envp is an array of strings, conventionally of the form key=value, which are passed as environment to the new program. Both argv and envp must be terminated by a null pointer. The argument vector and environment can be accessed by the called program's main function, when it is defined as:
int main(int argc, char *argv[], char *envp[])
execve() does not return on success, and the text, data, bss, and stack of the calling process are overwritten by that of the program loaded.
If the current program is being ptraced, a SIGTRAP is sent to it after a successful execve().
If the set-user-ID bit is set on the program file pointed to by filename, and the underlying filesystem is not mounted nosuid (the MS_NOSUID flag for mount(2)), and the calling process is not being ptraced, then the effective user ID of the calling process is changed to that of the owner of the program file. Similarly, when the set-group-ID bit of the program file is set the effective group ID of the calling process is set to the group of the program file.
The effective user ID of the process is copied to the saved set-user-ID; similarly, the effective group ID is copied to the saved set-group-ID. This copying takes place after any effective ID changes that occur because of the set-user-ID and set-group-ID permission bits.
If the executable is an a.out dynamically linked binary executable containing shared-library stubs, the Linux dynamic linker ld.so(8) is called at the start of execution to bring needed shared libraries into memory and link the executable with them.
If the executable is a dynamically linked ELF executable, the interpreter named in the PT_INTERP segment is used to load the needed shared libraries. This interpreter is typically /lib/ld-linux.so.2 for binaries linked with glibc.
All process attributes are preserved during an execve(), except the following:
The process attributes in the preceding list are all specified in POSIX.1-2001. The following Linux-specific process attributes are also not preserved during an execve():
Note the following further points:
#! interpreter [optional-arg]
The interpreter must be a valid pathname for an executable which is not itself a script. If the filename argument of execve() specifies an interpreter script, then interpreter will be invoked with the following arguments:
interpreter [optional-arg] filename arg...
where arg... is the series of words pointed to by the argv argument of execve(), starting at argv[1].
For portable use, optional-arg should either be absent, or be specified as a single word (i.e., it should not contain white space); see NOTES below.
On Linux prior to kernel 2.6.23, the memory used to store the environment and argument strings was limited to 32 pages (defined by the kernel constant MAX_ARG_PAGES). On architectures with a 4-kB page size, this yields a maximum size of 128 kB.
On kernel 2.6.23 and later, most architectures support a size limit derived from the soft RLIMIT_STACK resource limit (see getrlimit(2)) that is in force at the time of the execve() call. (Architectures with no memory management unit are excepted: they maintain the limit that was in effect before kernel 2.6.23.) This change allows programs to have a much larger argument and/or environment list. For these architectures, the total size is limited to 1/4 of the allowed stack size. (Imposing the 1/4-limit ensures that the new program always has some stack space.) Since Linux 2.6.25, the kernel places a floor of 32 pages on this size limit, so that, even when RLIMIT_STACK is set very low, applications are guaranteed to have at least as much argument and environment space as was provided by Linux 2.6.23 and earlier. (This guarantee was not provided in Linux 2.6.23 and 2.6.24.) Additionally, the limit per string is 32 pages (the kernel constant MAX_ARG_STRLEN), and the maximum number of strings is 0x7FFFFFFF.
The result of mounting a filesystem nosuid varies across Linux kernel versions: some will refuse execution of set-user-ID and set-group-ID executables when this would give the user powers she did not have already (and return EPERM), some will just ignore the set-user-ID and set-group-ID bits and exec() successfully. On Linux, argv and envp can be specified as NULL. In both cases, this has the same effect as specifying the argument as a pointer to a list containing a single null pointer. Do not take advantage of this misfeature! It is nonstandard and nonportable: on most other UNIX systems doing this will result in an error (EFAULT).
POSIX.1-2001 says that values returned by sysconf(3) should be invariant over the lifetime of a process. However, since Linux 2.6.23, if the RLIMIT_STACK resource limit changes, then the value reported by _SC_ARG_MAX will also change, to reflect the fact that the limit on space for holding command-line arguments and environment variables has changed.
In most cases where execve() fails, control returns to the original executable image, and the caller of execve() can then handle the error. However, in (rare) cases (typically caused by resource exhaustion), failure may occur past the point of no return: the original executable image has been torn down, but the new image could not be completely built. In such cases, the kernel kills the process with a SIGKILL signal.
The semantics of the optional-arg argument of an interpreter script vary across implementations. On Linux, the entire string following the interpreter name is passed as a single argument to the interpreter, and this string can include white space. However, behavior differs on some other systems. Some systems use the first white space to terminate optional-arg. On some systems, an interpreter script can have multiple arguments, and white spaces in optional-arg are used to delimit the arguments.
Linux ignores the set-user-ID and set-group-ID bits on scripts.
The EAGAIN error can occur when a preceding call to setuid(2), setreuid(2), or setresuid(2) caused the real user ID of the process to change, and that change caused the process to exceed its RLIMIT_NPROC resource limit (i.e., the number of processes belonging to the new real UID exceeds the resource limit). From Linux 2.6.0 to 3.0, this caused the set*uid() call to fail. (Prior to 2.6, the resource limit was not imposed on processes that changed their user IDs.)
Since Linux 3.1, the scenario just described no longer causes the set*uid() call to fail, because it too often led to security holes where buggy applications didn't check the return status and assumed that---if the caller had root privileges---the call would always succeed. Instead, the set*uid() calls now successfully change the real UID, but the kernel sets an internal flag, named PF_NPROC_EXCEEDED, to note that the RLIMIT_NPROC resource limit has been exceeded. If the PF_NPROC_EXCEEDED flag is set and the resource limit is still exceeded at the time of a subsequent execve() call, that call fails with the error EAGAIN. This kernel logic ensures that the RLIMIT_NPROC resource limit is still enforced for the common privileged daemon workflow---namely, fork(2) + set*uid() + execve().
If the resource limit was not still exceeded at the time of the execve() call (because other processes belonging to this real UID terminated between the set*uid() call and the execve() call), then the execve() call succeeds and the kernel clears the PF_NPROC_EXCEEDED process flag. The flag is also cleared if a subsequent call to fork(2) by this process succeeds.
/* myecho.c */ #include <stdio.h> #include <stdlib.h> int main(int argc, char *argv[]) { int j; for (j = 0; j < argc; j++) printf("argv[%d]: %s\n", j, argv[j]); exit(EXIT_SUCCESS); }
This program can be used to exec the program named in its command-line argument:
/* execve.c */ #include <stdio.h> #include <stdlib.h> #include <unistd.h> int main(int argc, char *argv[]) { char *newargv[] = { NULL, "hello", "world", NULL }; char *newenviron[] = { NULL }; if (argc != 2) { fprintf(stderr, "Usage: %s <file-to-exec>\n", argv[0]); exit(EXIT_FAILURE); } newargv[0] = argv[1]; execve(argv[1], newargv, newenviron); perror("execve"); /* execve() only returns on error */ exit(EXIT_FAILURE); }
We can use the second program to exec the first as follows:
$ cc myecho.c -o myecho $ cc execve.c -o execve $ ./execve ./myecho argv[0]: ./myecho argv[1]: hello argv[2]: world
We can also use these programs to demonstrate the use of a script interpreter. To do this we create a script whose "interpreter" is our myecho program:
$ cat > script.sh #! ./myecho script-arg ^D $ chmod +x script.sh
We can then use our program to exec the script:
$ ./execve ./script.sh argv[0]: ./myecho argv[1]: script-arg argv[2]: ./script.sh argv[3]: hello argv[4]: world