Files and Directories
Introduction
In the previous chapter we covered the basic functions that perform I/O. The discussion centered on I/O for regular files—opening a file, and reading or writing a file. We’ll now look at additional features of the file system and the properties of a file. We’ll start with the stat functions and go through each member of the stat structure, looking at all the attributes of a file. In this process, we’ll also describe each of the functions that modify these attributes: change the owner, change the permissions, and so on. We’ll also look in more detail at the structure of a UNIX file system and symbolic links. We finish this chapter with the functions that operate on directories, and we develop a function that descends through a directory hierarchy.
stat, fstat, fstatat, and lstat Functions
The discussion in this chapter centers on the four stat functions and the information they return.
#include <sys/stat.h>
int stat(const char *restrict pathname, struct stat *restrict buf);
int fstat(int fd, struct stat *buf);
int lstat(const char *restrict pathname, struct stat *restrict buf);
int fstatat(int fd, const char *restrict pathname, struct stat *restrict buf, int flag);
All four return: 0 if OK, -1 on error
Given a pathname, the stat function returns a structure of information about the named file. The fstat function obtains information about the file that is already open on the descriptor fd. The lstat function is similar to stat, but when the named file is a symbolic link, lstat returns information about the symbolic link, not the file referenced by the symbolic link. (We’ll need lstat in Section 4.22 when we walk down a directory hierarchy. We describe symbolic links in more detail in Section 4.17.)
The fstatat function provides a way to return the file statistics for a pathname relative to an open directory represented by the fd argument. The flag argument controls whether symbolic links are followed; when the AT_SYMLINK_NOFOLLOW flag is set, fstatat will not follow symbolic links, but rather returns information about the link itself. Otherwise, the default is to follow symbolic links, returning information about the file to which the symbolic link points. If the fd argument has the value AT_FDCWD and the pathname argument is a relative pathname, then fstatat evaluates the pathname argument relative to the current directory. If the pathname argument is an absolute pathname, then the fd argument is ignored. In these two cases, fstatat behaves like either stat or lstat, depending on the value of flag.
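The equivalence between fstatat and the stat/lstat pair can be sketched as follows. This is only an illustration: the helper name stat_via_fstatat and its follow parameter are hypothetical, and the feature-test macro is an assumption about how the declarations are requested on a given system.

```c
#define _XOPEN_SOURCE 700   /* assumption: request POSIX.1-2008 declarations */
#include <fcntl.h>          /* AT_FDCWD, AT_SYMLINK_NOFOLLOW */
#include <sys/stat.h>

/*
 * Hypothetical helper: stat a pathname relative to the current directory.
 * With follow != 0 it behaves like stat; with follow == 0, like lstat.
 */
int
stat_via_fstatat(const char *pathname, struct stat *buf, int follow)
{
    int flag = follow ? 0 : AT_SYMLINK_NOFOLLOW;

    /* AT_FDCWD makes a relative pathname resolve against the current directory */
    return fstatat(AT_FDCWD, pathname, buf, flag);
}
```

Passing a descriptor for an open directory instead of AT_FDCWD would resolve a relative pathname against that directory.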
The buf argument is a pointer to a structure that we must supply. The functions fill in the structure. The definition of the structure can differ among implementations, but it could look like
struct stat {
mode_t st_mode; /* file type & mode (permissions) */
ino_t st_ino; /* i-node number (serial number) */
dev_t st_dev; /* device number (file system) */
dev_t st_rdev; /* device number for special files */
nlink_t st_nlink; /* number of links */
uid_t st_uid; /* user ID of owner */
gid_t st_gid; /* group ID of owner */
off_t st_size; /* size in bytes, for regular files */
struct timespec st_atim; /* time of last access */
struct timespec st_mtim; /* time of last modification */
struct timespec st_ctim; /* time of last file status change */
blksize_t st_blksize; /* best I/O block size */
blkcnt_t st_blocks; /* number of disk blocks allocated */
};
The st_rdev, st_blksize, and st_blocks fields are not required by POSIX.1. They are defined as part of the XSI option in the Single UNIX Specification.
The timespec structure type defines time in terms of seconds and nanoseconds. It includes at least the following fields:
time_t tv_sec;
long tv_nsec;
Prior to the 2008 version of the standard, the time fields were named st_atime, st_mtime, and st_ctime, and were of type time_t (expressed in seconds). The timespec structure enables higher-resolution timestamps. The old names can be defined in terms of the tv_sec members for compatibility. For example, st_atime can be defined as st_atim.tv_sec.
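As a rough sketch of how the higher-resolution fields might be read, the following helpers fetch and print a file’s modification time. The names get_mtime and print_mtime are hypothetical, and the code assumes a system whose stat structure provides the st_mtim member described above.

```c
#define _XOPEN_SOURCE 700   /* assumption: request POSIX.1-2008 declarations */
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

/*
 * Hypothetical helper: copy a file's last-modification time into *ts.
 * Returns 0 if OK, -1 on error (the same convention as stat).
 */
int
get_mtime(const char *pathname, struct timespec *ts)
{
    struct stat sb;

    if (stat(pathname, &sb) < 0)
        return -1;
    *ts = sb.st_mtim;       /* seconds plus nanoseconds */
    return 0;
}

/* Print the time as seconds.nanoseconds since the Epoch. */
void
print_mtime(const char *pathname)
{
    struct timespec ts;

    if (get_mtime(pathname, &ts) == 0)
        printf("%s: %lld.%09ld\n", pathname,
               (long long)ts.tv_sec, ts.tv_nsec);
}
```

Code that needs only whole seconds can use ts.tv_sec alone, which corresponds to the older st_mtime name.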
Note that most members of the stat structure are specified by a primitive system data type (see Section 2.8). We’ll go through each member of this structure to examine the attributes of a file.
The biggest user of the stat functions is probably the ls -l command, which calls them to obtain all the information about a file.
File Types
We’ve talked about two different types of files so far: regular files and directories. Most files on a UNIX system are either regular files or directories, but there are additional types of files. The types are
1) Regular file. The most common type of file, which contains data of some form. There is no distinction to the UNIX kernel whether this data is text or binary. Any interpretation of the contents of a regular file is left to the application processing the file.
2) Directory file. A file that contains the names of other files and pointers to information on these files. Any process that has read permission for a directory file can read the contents of the directory, but only the kernel can write directly to a directory file. Processes must use the functions described in this chapter to make changes to a directory.
3) Block special file. A type of file providing buffered I/O access in fixed-size units to devices such as disk drives.
4) Character special file. A type of file providing unbuffered I/O access in variable-sized units to devices. All devices on a system are either block special files or character special files.
5) FIFO. A type of file used for communication between processes. It’s sometimes called a named pipe. We describe FIFOs in Section 15.5.
6) Socket. A type of file used for network communication between processes. A socket can also be used for non-network communication between processes on a single host. We use sockets for interprocess communication in Chapter 16.
7) Symbolic link. A type of file that points to another file. We talk more about symbolic links in Section 4.17.
The type of a file is encoded in the st_mode member of the stat structure. We can determine the file type with the macros shown in Figure 4.1. The argument to each of these macros is the st_mode member from the stat structure.
Macro | Type of file |
S_ISREG() | regular file |
S_ISDIR() | directory file |
S_ISCHR() | character special file |
S_ISBLK() | block special file |
S_ISFIFO() | pipe or FIFO |
S_ISLNK() | symbolic link |
S_ISSOCK() | socket |
Figure 4.1 File type macros in <sys/stat.h>
POSIX.1 allows implementations to represent interprocess communication (IPC) objects, such as message queues and semaphores, as files. The macros shown in Figure 4.2 allow us to determine the type of IPC object from the stat structure. Instead of taking the st_mode member as an argument, these macros differ from those in Figure 4.1 in that their argument is a pointer to the stat structure.
Macro | Type of object |
S_TYPEISMQ() | message queue |
S_TYPEISSEM() | semaphore |
S_TYPEISSHM() | shared memory object |
Figure 4.2 IPC type macros in <sys/stat.h>
Message queues, semaphores, and shared memory objects are discussed in Chapter 15. However, none of the various implementations of the UNIX System discussed in this book represent these objects as files.
Example
The program in Figure 4.3 prints the type of file for each command-line argument.
Figure 4.3 Print type of file for each command-line argument
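A minimal sketch of the core of such a program, not the original listing, might look like the following. The helper names file_type_name and print_file_type are hypothetical; the macros are those of Figure 4.1.

```c
#define _XOPEN_SOURCE 700   /* assumption: request POSIX.1-2008 declarations */
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: map an st_mode value to a description of the file type. */
const char *
file_type_name(mode_t mode)
{
    if (S_ISREG(mode))  return "regular";
    if (S_ISDIR(mode))  return "directory";
    if (S_ISCHR(mode))  return "character special";
    if (S_ISBLK(mode))  return "block special";
    if (S_ISFIFO(mode)) return "fifo";
    if (S_ISLNK(mode))  return "symbolic link";
    if (S_ISSOCK(mode)) return "socket";
    return "** unknown mode **";
}

/*
 * Print "pathname: type" for one pathname. lstat is used so that a
 * symbolic link is reported as a link instead of being followed.
 * Returns 0 if OK, -1 on error.
 */
int
print_file_type(const char *pathname)
{
    struct stat sb;

    if (lstat(pathname, &sb) < 0) {
        perror(pathname);
        return -1;
    }
    printf("%s: %s\n", pathname, file_type_name(sb.st_mode));
    return 0;
}
```

A main function would call print_file_type(argv[i]) once for each command-line argument.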
Sample output from Figure 4.3 is
(Here, we have explicitly entered a backslash at the end of the first command line, telling the shell that we want to continue entering the command on another line. The shell then prompted us with its secondary prompt, >, on the next line.) We have specifically used the lstat function instead of the stat function to detect symbolic links. If we used the stat function, we would never see symbolic links.
Historically, early versions of the UNIX System didn’t provide the S_ISxxx macros. Instead, we had to logically AND the st_mode value with the mask S_IFMT and then compare the result with the constants whose names are S_IFxxx. Most systems define this mask and the related constants in the file <sys/stat.h>. If we examine this file, we’ll find the S_ISDIR macro defined something like
#define S_ISDIR(mode) (((mode) & S_IFMT) == S_IFDIR)
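The mask-and-compare technique can also drive a switch statement, which is one plausible way to classify a mode without the S_ISxxx macros. In this sketch the helper names are hypothetical, and the type letters are the ones conventionally printed by ls -l.

```c
#define _XOPEN_SOURCE 700   /* assumption: exposes S_IFMT and the S_IFxxx constants */
#include <sys/stat.h>

/* Hypothetical helper: 1 if mode describes a directory, else 0. */
int
is_dir_by_mask(mode_t mode)
{
    return (mode & S_IFMT) == S_IFDIR;
}

/* Hypothetical helper: return a one-letter code for the file type. */
char
type_letter(mode_t mode)
{
    switch (mode & S_IFMT) {        /* isolate the file type field */
    case S_IFREG:  return '-';
    case S_IFDIR:  return 'd';
    case S_IFCHR:  return 'c';
    case S_IFBLK:  return 'b';
    case S_IFIFO:  return 'p';
    case S_IFLNK:  return 'l';
    case S_IFSOCK: return 's';
    default:       return '?';
    }
}
```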
We’ve said that regular files are predominant, but it is interesting to see what percentage of the files on a given system are of each file type. Figure 4.4 shows the counts and percentages for a Linux system that is used as a single-user workstation. This data was obtained from the program shown in Section 4.22.
File type | Count | Percentage(%) |
regular file | 415,803 | 79.77 |
directory | 62,197 | 11.93 |
symbolic link | 40,018 | 8.25 |
character special | 155 | 0.03 |
block special | 47 | 0.01 |
socket | 45 | 0.01 |
FIFO | 0 | 0.00 |
Figure 4.4 Counts and percentages of different file types
Set-User-ID and Set-Group-ID
Every process has six or more IDs associated with it. These are shown in Figure 4.5.
real user ID | who we really are |
real group ID | |
effective user ID | used for file access permission checks |
effective group ID | |
supplementary group IDs | |
saved set-user-ID | saved by exec functions |
saved set-group-ID | |
Figure 4.5 User IDs and group IDs associated with each process
- The real user ID and real group ID identify who we really are. These two fields are taken from our entry in the password file when we log in. Normally, these values don’t change during a login session, although there are ways for a superuser process to change them, which we describe in Section 8.11.
- The effective user ID, effective group ID, and supplementary group IDs determine our file access permissions, as we describe in the next section. (We defined supplementary group IDs in Section 1.8.)
- The saved set-user-ID and saved set-group-ID contain copies of the effective user ID and the effective group ID, respectively, when a program is executed. We describe the function of these two saved values when we describe the setuid function in Section 8.11.
The saved IDs are required as of the 2001 version of POSIX.1. They were optional in older versions of POSIX. An application can test for the constant _POSIX_SAVED_IDS at compile time or can call sysconf with the _SC_SAVED_IDS argument at runtime, to see whether the implementation supports this feature.
Normally, the effective user ID equals the real user ID, and the effective group ID equals the real group ID.
Every file has an owner and a group owner. The owner is specified by the st_uid member of the stat structure; the group owner, by the st_gid member.
When we execute a program file, the effective user ID of the process is usually the real user ID, and the effective group ID is usually the real group ID. However, we can also set a special flag in the file’s mode word (st_mode) that says, ‘‘When this file is executed, set the effective user ID of the process to be the owner of the file (st_uid).’’ Similarly, we can set another bit in the file’s mode word that causes the effective group ID to be the group owner of the file (st_gid). These two bits in the file’s mode word are called the set-user-ID bit and the set-group-ID bit.
For example, if the owner of the file is the superuser and if the file’s set-user-ID bit is set, then while that program file is running as a process, it has superuser privileges. This happens regardless of the real user ID of the process that executes the file. As an example, the UNIX System program that allows anyone to change his or her password, passwd(1), is a set-user-ID program. This is required so that the program can write the new password to the password file, typically either /etc/passwd or /etc/shadow, files that should be writable only by the superuser. Because a process that is running set-user-ID to some other user usually assumes extra permissions, it must be written carefully. We’ll discuss these types of programs in more detail in Chapter 8.
Returning to the stat function, the set-user-ID bit and the set-group-ID bit are contained in the file’s st_mode value. These two bits can be tested against the constants S_ISUID and S_ISGID, respectively.
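Testing these two bits amounts to a bitwise AND against the constants. A small sketch, with hypothetical helper names:

```c
#define _XOPEN_SOURCE 700   /* assumption: request POSIX.1-2008 declarations */
#include <sys/stat.h>

/* Hypothetical helper: 1 if the set-user-ID bit is on in mode, else 0. */
int
is_set_user_id(mode_t mode)
{
    return (mode & S_ISUID) != 0;
}

/* Hypothetical helper: 1 if the set-group-ID bit is on in mode, else 0. */
int
is_set_group_id(mode_t mode)
{
    return (mode & S_ISGID) != 0;
}
```

The argument would normally be the st_mode member filled in by one of the stat functions.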
File Access Permissions
The st_mode value also encodes the access permission bits for the file. When we say file, we mean any of the file types that we described earlier. All the file types (directories, character special files, and so on) have permissions. Many people think of only regular files as having access permissions.
There are nine permission bits for each file, divided into three categories. They are shown in Figure 4.6.
st_mode mask | Meaning |
S_IRUSR | user-read |
S_IWUSR | user-write |
S_IXUSR | user-execute |
S_IRGRP | group-read |
S_IWGRP | group-write |
S_IXGRP | group-execute |
S_IROTH | other-read |
S_IWOTH | other-write |
S_IXOTH | other-execute |
Figure 4.6 The nine file access permission bits, from <sys/stat.h>
The term user in the first three rows in Figure 4.6 refers to the owner of the file. The chmod(1) command, which is typically used to modify these nine permission bits, allows us to specify u for user (owner), g for group, and o for other. Some books refer to these three as owner, group, and world; this is confusing, as the chmod command uses o to mean other, not owner. We’ll use the terms user, group, and other, to be consistent with the chmod command.
The three categories in Figure 4.6 (read, write, and execute) are used in various ways by different functions. We’ll summarize them here, and return to them when we describe the actual functions.
- The first rule is that whenever we want to open any type of file by name, we must have execute permission in each directory mentioned in the name, including the current directory, if it is implied. This is why the execute permission bit for a directory is often called the search bit.
  For example, to open the file /usr/include/stdio.h, we need execute permission in the directory /, execute permission in the directory /usr, and execute permission in the directory /usr/include. We then need appropriate permission for the file itself, depending on how we’re trying to open it: read-only, read-write, and so on. If the current directory is /usr/include, then we need execute permission in the current directory to open the file stdio.h. This is an example of the current directory being implied, not specifically mentioned. It is identical to our opening the file ./stdio.h.
  Note that read permission for a directory and execute permission for a directory mean different things. Read permission lets us read the directory, obtaining a list of all the filenames in the directory. Execute permission lets us pass through the directory when it is a component of a pathname that we are trying to access. (We need to search the directory to look for a specific filename.)
  Another example of an implicit directory reference is if the PATH environment variable, described in Section 8.10, specifies a directory that does not have execute permission enabled. In this case, the shell will never find executable files in that directory.
- The read permission for a file determines whether we can open an existing file for reading: the O_RDONLY and O_RDWR flags for the open function.
- The write permission for a file determines whether we can open an existing file for writing: the O_WRONLY and O_RDWR flags for the open function.
- We must have write permission for a file to specify the O_TRUNC flag in the open function.
- We cannot create a new file in a directory unless we have write permission and execute permission in the directory.
- To delete an existing file, we need write permission and execute permission in the directory containing the file. We do not need read permission or write permission for the file itself.
- Execute permission for a file must be on if we want to execute the file using any of the seven exec functions (Section 8.10). The file also has to be a regular file.
The kernel performs the following file access tests each time a process opens, creates, or deletes a file. The tests depend on the owners of the file (st_uid and st_gid) and on the effective user ID, effective group ID, and supplementary group IDs of the process.
1) If the effective user ID of the process is 0 (the superuser), access is allowed. This gives the superuser free rein throughout the entire file system.
2) If the effective user ID of the process equals the owner ID of the file (i.e., the process owns the file), access is allowed if the appropriate user access permission bit is set. Otherwise, permission is denied. By appropriate access permission bit, we mean that if the process is opening the file for reading, the user-read bit must be on. If the process is opening the file for writing, the user-write bit must be on. If the process is executing the file, the user-execute bit must be on.
3) If the effective group ID of the process or one of the supplementary group IDs of the process equals the group ID of the file, access is allowed if the appropriate group access permission bit is set. Otherwise, permission is denied.
4) If the appropriate other access permission bit is set, access is allowed. Otherwise, permission is denied.
These four steps are tried in sequence. Note that if the process owns the file (step 2), access is granted or denied based only on the user access permissions; the group permissions are never looked at. Similarly, if the process does not own the file but belongs to an appropriate group, access is granted or denied based only on the group access permissions; the other permissions are not looked at.
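The sequence of tests can be modeled as a pure function. This is a simplified sketch of the steps as described here, not kernel code. The name may_access and its parameters are hypothetical, and the shifts assume the conventional octal layout of the permission bits (0400 for user-read down to 0001 for other-execute).

```c
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical model of the access tests. want is a 3-bit mask:
 * 4 = read, 2 = write, 1 = execute. Returns 1 if access would be
 * allowed, 0 if it would be denied.
 */
int
may_access(uid_t euid, gid_t egid, const gid_t *groups, size_t ngroups,
           uid_t file_uid, gid_t file_gid, mode_t file_mode, mode_t want)
{
    size_t i;
    int    in_group;

    /* Superuser: access is allowed. */
    if (euid == 0)
        return 1;

    /* Owner: only the user bits are consulted. */
    if (euid == file_uid)
        return ((file_mode >> 6) & want) == want;

    /* Group member: only the group bits are consulted. */
    in_group = (egid == file_gid);
    for (i = 0; i < ngroups && !in_group; i++)
        if (groups[i] == file_gid)
            in_group = 1;
    if (in_group)
        return ((file_mode >> 3) & want) == want;

    /* Everyone else: the other bits decide. */
    return (file_mode & want) == want;
}
```

With mode 0077, for example, the owner is denied read access even though the group and other bits would allow it, because the tests stop at the first category that matches.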
Ownership of New Files and Directories
When we described the creation of a new file in Chapter 3 using either open or creat, we never said which values were assigned to the user ID and group ID of the new file. We’ll see how to create a new directory in Section 4.21 when we describe the mkdir function. The rules for the ownership of a new directory are identical to the rules in this section for the ownership of a new file.
The user ID of a new file is set to the effective user ID of the process. POSIX.1 allows an implementation to choose one of the following options to determine the group ID of a new file:
- The group ID of a new file can be the effective group ID of the process.
- The group ID of a new file can be the group ID of the directory in which the file is being created.
FreeBSD 8.0 and Mac OS X 10.6.8 always copy the new file’s group ID from the directory. Several Linux file systems allow the choice between the two options to be selected using a mount(1) command option. The default behavior for Linux 3.2.0 and Solaris 10 is to determine the group ID of a new file depending on whether the set-group-ID bit is set for the directory in which the file is created. If this bit is set, the new file’s group ID is copied from the directory; otherwise, the new file’s group ID is set to the effective group ID of the process.
Using the second option—inheriting the directory’s group ID—assures us that all files and directories created in that directory will have the same group ID as the directory. This group ownership of files and directories will then propagate down the hierarchy from that point. This is used in the Linux directory /var/mail, for example.
As we mentioned earlier, this option for group ownership is the default for FreeBSD 8.0 and Mac OS X 10.6.8, but an option for Linux and Solaris. Under Solaris 10, and by default under Linux 3.2.0, we have to enable the set-group-ID bit, and the mkdir function has to propagate a directory’s set-group-ID bit automatically for this to work. (This is described in Section 4.21.)
access and faccessat Functions
As we described earlier, when we open a file, the kernel performs its access tests based on the effective user and group IDs. Sometimes, however, a process wants to test accessibility based on the real user and group IDs. This is useful when a process is running as someone else, using either the set-user-ID or the set-group-ID feature. Even though a process might be set-user-ID to root, it might still want to verify that the real user can access a given file. The access and faccessat functions base their tests on the real user and group IDs. (Replace effective with real in the four steps at the end of Section 4.5.)
#include <unistd.h>
int access(const char *pathname, int mode);
int faccessat(int fd, const char *pathname, int mode, int flag);
Both return: 0 if OK, -1 on error
The mode is either the value F_OK to test if a file exists, or the bitwise OR of any of the flags shown in Figure 4.7.
mode | Description |
R_OK | test for read permission |
W_OK | test for write permission |
X_OK | test for execute permission |
Figure 4.7 The mode flags for access function, from <unistd.h>
The faccessat function behaves like access when the pathname argument is absolute or when the fd argument has the value AT_FDCWD and the pathname argument is relative. Otherwise, faccessat evaluates the pathname relative to the open directory referenced by the fd argument.
The flag argument can be used to change the behavior of faccessat. If the AT_EACCESS flag is set, the access checks are made using the effective user and group IDs of the calling process instead of the real user and group IDs.
Example
Figure 4.8 shows the use of the access function.
Figure 4.8 Example of access function
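A sketch along these lines, not the original listing, compares the result of access with an actual attempt to open the file. The helper names are hypothetical, and the feature-test macro is an assumption about how faccessat and AT_EACCESS are made visible.

```c
#define _XOPEN_SOURCE 700   /* assumption: request POSIX.1-2008 declarations */
#include <fcntl.h>          /* open, AT_FDCWD, AT_EACCESS */
#include <stdio.h>
#include <unistd.h>         /* access, faccessat, close */

/* Hypothetical helper: 1 if the real user and group IDs allow mode, else 0. */
int
real_ids_allow(const char *pathname, int mode)
{
    return access(pathname, mode) == 0;
}

/* Hypothetical helper: the same test, made with the effective IDs. */
int
effective_ids_allow(const char *pathname, int mode)
{
    return faccessat(AT_FDCWD, pathname, mode, AT_EACCESS) == 0;
}

/* Report the access test and then the result of really opening the file. */
void
report_read_access(const char *pathname)
{
    int fd;

    printf("%s: read access %s for real IDs\n", pathname,
           real_ids_allow(pathname, R_OK) ? "OK" : "denied");
    if ((fd = open(pathname, O_RDONLY)) < 0) {
        perror("open");
    } else {
        printf("%s: open for reading OK\n", pathname);
        close(fd);
    }
}
```

In a set-user-ID program the two lines printed by report_read_access can disagree, because open is checked against the effective IDs.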
Here is a sample session with this program:
In this example, the set-user-ID program can determine that the real user cannot normally read the file, even though the open function will succeed.
In the preceding example and in Chapter 8, we’ll sometimes switch to become the superuser to demonstrate how something works. If you’re on a multiuser system and do not have superuser permission, you won’t be able to duplicate these examples completely.
umask Function
Now that we’ve described the nine permission bits associated with every file, we can describe the file mode creation mask that is associated with every process.
The umask function sets the file mode creation mask for the process and returns the previous value. (This is one of the few functions that doesn’t have an error return.)
#include <sys/stat.h>
mode_t umask(mode_t cmask);
Returns: previous file mode creation mask
The cmask argument is formed as the bitwise OR of any of the nine constants from Figure 4.6: S_IRUSR, S_IWUSR, and so on.
The file mode creation mask is used whenever the process creates a new file or a new directory. (Recall from Sections 3.3 and 3.4 our description of the open and creat functions. Both accept a mode argument that specifies the new file’s access permission bits.) We describe how to create a new directory in Section 4.21. Any bits that are on in the file mode creation mask are turned off in the file’s mode.
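The effect of the mask on a requested mode is a single bitwise operation. The following sketch, with the hypothetical name apply_creation_mask, computes the permission bits a new file would receive under a given mask.

```c
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: bits that are on in cmask are turned off in the
 * requested mode; all other requested bits pass through unchanged.
 */
mode_t
apply_creation_mask(mode_t requested, mode_t cmask)
{
    return requested & ~cmask;
}
```

For instance, requesting read and write for user, group, and other under a mask of S_IWGRP | S_IWOTH leaves read and write for the user and read-only for group and other.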
Example
The program in Figure 4.9 creates two files: one with a umask of 0 and one with a umask that disables all the group and other permission bits.
Figure 4.9 Example of umask function
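A sketch of the idea, not the original listing, is a helper that sets the mask, creates a file requesting read and write for everyone, and reports which permission bits the file actually received. The name create_with_mask and the RWRWRW macro are hypothetical. O_EXCL is used here because the mode argument of open takes effect only when the file is actually created.

```c
#define _XOPEN_SOURCE 700   /* assumption: request POSIX.1-2008 declarations */
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

#define RWRWRW (S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH)

/*
 * Hypothetical helper: create pathname under the mask cmask and return
 * the nine permission bits of the new file, or (mode_t)-1 on error.
 * The previous mask is restored before returning.
 */
mode_t
create_with_mask(const char *pathname, mode_t cmask)
{
    struct stat sb;
    mode_t      old, result = (mode_t)-1;
    int         fd;

    old = umask(cmask);             /* umask returns the previous mask */
    fd = open(pathname, O_WRONLY | O_CREAT | O_EXCL, RWRWRW);
    if (fd >= 0) {
        if (fstat(fd, &sb) == 0)
            result = sb.st_mode & (S_IRWXU | S_IRWXG | S_IRWXO);
        close(fd);
    }
    umask(old);
    return result;
}
```

Calling it once with a mask of 0 and once with S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH corresponds to the two files described above.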
If we run this program, we can see how the permission bits have been set.
Most users of UNIX systems never deal with their umask value. It is usually set once, on login, by the shell’s start-up file, and never changed. Nevertheless, when writing programs that create new files, if we want to ensure that specific access permission bits are enabled, we must modify the umask value while the process is running. For example, if we want to ensure that anyone can read a file, we should set the umask to 0. Otherwise, the umask value that is in effect when our process is running can cause permission bits to be turned off.
In the preceding example, we use the shell’s umask command to print the file mode creation mask both before we run the program and after it completes. This shows us that changing the file mode creation mask of a process doesn’t affect the mask of its parent (often a shell). All of the shells have a built-in umask command that we can use to set or print the current file mode creation mask.
Users can set the umask value to control the default permissions on the files they create. This value is expressed in octal, with one bit representing one permission to be masked off, as shown in Figure 4.10. Permissions can be denied by setting the corresponding bits. Some common umask values are 002 to prevent others from writing your files, 022 to prevent group members and others from writing your files, and 027 to prevent group members from writing your files and others from reading, writing, or executing your files.
Mask bit | Meaning |
0400 | user-read |
0200 | user-write |
0100 | user-execute |
0040 | group-read |
0020 | group-write |
0010 | group-execute |
0004 | other-read |
0002 | other-write |
0001 | other-execute |
Figure 4.10 The umask file access permission bits
The Single UNIX Specification requires that the umask command support a symbolic mode of operation. Unlike the octal format, the symbolic format specifies which permissions are to be allowed (i.e., clear in the file creation mask) instead of which ones are to be denied (i.e., set in the file creation mask). Compare both forms of the command, shown below.
chmod, fchmod, and fchmodat Functions
The chmod, fchmod, and fchmodat functions allow us to change the file access permissions for an existing file.
#include <sys/stat.h>
int chmod(const char *pathname, mode_t mode);
int fchmod(int fd, mode_t mode);
int fchmodat(int fd, const char *pathname, mode_t mode, int flag);
All three return: 0 if OK, -1 on error
The chmod function operates on the specified file, whereas the fchmod function operates on a file that has already been opened. The fchmodat function behaves like chmod when the pathname argument is absolute or when the fd argument has the value AT_FDCWD and the pathname argument is relative. Otherwise, fchmodat evaluates the pathname relative to the open directory referenced by the fd argument. The flag argument can be used to change the behavior of fchmodat: when the AT_SYMLINK_NOFOLLOW flag is set, fchmodat doesn’t follow symbolic links.
To change the permission bits of a file, the effective user ID of the process must be equal to the owner ID of the file, or the process must have superuser permissions.
The mode is specified as the bitwise OR of the constants shown in Figure 4.11.
mode | Description |
S_ISUID | set-user-ID on execution |
S_ISGID | set-group-ID on execution |
S_ISVTX | saved-text (sticky bit) |
S_IRWXU | read, write, and execute by user (owner) |
S_IRUSR | read by user (owner) |
S_IWUSR | write by user (owner) |
S_IXUSR | execute by user (owner) |
S_IRWXG | read, write, and execute by group |
S_IRGRP | read by group |
S_IWGRP | write by group |
S_IXGRP | execute by group |
S_IRWXO | read, write, and execute by other (world) |
S_IROTH | read by other (world) |
S_IWOTH | write by other (world) |
S_IXOTH | execute by other (world) |
Figure 4.11 The mode constants for chmod functions, from <sys/stat.h>
Note that nine of the entries in Figure 4.11 are the nine file access permission bits from Figure 4.6. We’ve added the two set-ID constants (S_ISUID and S_ISGID), the saved-text constant (S_ISVTX), and the three combined constants (S_IRWXU, S_IRWXG, and S_IRWXO).
The saved-text bit (S_ISVTX) is not part of POSIX.1. It is defined in the XSI option in the Single UNIX Specification. We describe its purpose in the next section.
Example
Recall the final state of the files foo and bar when we ran the program in Figure 4.9 to demonstrate the umask function:
Figure 4.12 Example of chmod function
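A sketch of the two kinds of change, not the original listing, is given below: one helper sets the permissions relative to their current state, and the other sets them to an absolute value. The helper names are hypothetical, and the absolute mode rw-r--r-- is chosen only for illustration.

```c
#define _XOPEN_SOURCE 700   /* assumption: exposes S_ISVTX and chmod */
#include <sys/stat.h>

/* Hypothetical helper: turn on set-group-ID and turn off group-execute. */
mode_t
sgid_without_group_exec(mode_t mode)
{
    return (mode & ~S_IXGRP) | S_ISGID;
}

/*
 * Relative change: fetch the current mode with stat, modify it, and
 * pass only the mode bits (not the file type bits) to chmod.
 */
int
chmod_relative(const char *pathname)
{
    struct stat sb;
    mode_t      bits = S_ISUID | S_ISGID | S_ISVTX |
                       S_IRWXU | S_IRWXG | S_IRWXO;

    if (stat(pathname, &sb) < 0)
        return -1;
    return chmod(pathname, sgid_without_group_exec(sb.st_mode) & bits);
}

/* Absolute change: set rw-r--r-- regardless of the current bits. */
int
chmod_absolute(const char *pathname)
{
    return chmod(pathname, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
}
```

Both helpers return 0 if OK and -1 on error, following the chmod convention.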
After running the program in Figure 4.12, we see that the final state of the two files is
In this example, we have set the permissions of the file bar to an absolute value, regardless of the current permission bits. For the file foo, we set the permissions relative to their current state. To do this, we first call stat to obtain the current permissions and then modify them. We have explicitly turned on the set-group-ID bit and turned off the group-execute bit. Note that the ls command lists the group-execute permission as S to signify that the set-group-ID bit is set without the group-execute bit being set.
On Solaris, the ls command displays an l instead of an S to indicate that mandatory file and record locking has been enabled for this file. This behavior applies only to regular files, but we’ll discuss this more in Section 14.3.
Finally, note that the time and date listed by the ls command did not change after we ran the program in Figure 4.12. We’ll see in Section 4.19 that the chmod function updates only the time that the i-node was last changed. By default, ls -l lists the time when the contents of the file were last modified.
The chmod functions automatically clear two of the permission bits under the following conditions:
- On systems, such as Solaris, that place special meaning on the sticky bit when used with regular files, if we try to set the sticky bit (S_ISVTX) on a regular file and do not have superuser privileges, the sticky bit in the mode is automatically turned off. (We describe the sticky bit in the next section.) To prevent malicious users from setting the sticky bit and adversely affecting system performance, only the superuser can set the sticky bit of a regular file.
  In FreeBSD 8.0 and Solaris 10, only the superuser can set the sticky bit on a regular file. Linux 3.2.0 and Mac OS X 10.6.8 place no such restriction on the setting of the sticky bit, because the bit has no meaning when applied to regular files on these systems. Although the bit also has no meaning when applied to regular files on FreeBSD, everyone except the superuser is prevented from setting it on a regular file.
- The group ID of a newly created file might potentially be a group that the calling process does not belong to. Recall from Section 4.6 that it’s possible for the group ID of the new file to be the group ID of the parent directory. Specifically, if the group ID of the new file does not equal either the effective group ID of the process or one of the process’s supplementary group IDs and if the process does not have superuser privileges, then the set-group-ID bit is automatically turned off. This prevents a user from creating a set-group-ID file owned by a group that the user doesn’t belong to.
  FreeBSD 8.0 fails an attempt to set the set-group-ID in this case. The other systems silently turn the bit off, but don’t fail the attempt to change the file access permissions.
Sticky Bit
chown, fchown, fchownat, and lchown Functions
File Size
File Systems
link, linkat, unlink, unlinkat, and remove Functions
rename and renameat Functions
Symbolic Links
Creating and Reading Symbolic Links
File Times
futimens, utimensat, and utimes Functions
mkdir, mkdirat, and rmdir Functions
Reading Directories
chdir, fchdir, and getcwd Functions
Device Special Files
Summary of File Access Permission Bits
Summary