Unix FilesystemsHistorically, Unix has provided four basic filesystem-related abstractions: files, directory entries, inodes, and mount points. A filesystem is a hierarchical storage of data adhering to a specific structure. Filesystems contain files, directories, and associated control information. Typical operations performed on filesystems are creation, deletion, and mounting. In Unix, filesystems are mounted at a specific mount point in a global hierarchy[1] known as a namespace. This enables all mounted filesystems to appear as entries in a single tree[2].
A file is an ordered string of bytes. The first byte marks the beginning of the file and the last byte marks the end of the file. Each file is assigned a human-readable name for identification by both the system and the user. Typical file operations are read, write, create, and delete. The Unix concept of the file is in stark contrast to record-oriented filesystems, such as OpenVMS's Files-11. Record-oriented filesystems provide a richer, more structured representation of files than Unix's simple byte-stream abstraction, at the cost of simplicity and ultimate flexibility. Files are organized in directories. A directory is analogous to a folder and usually contains related files. Directories can also contain subdirectories; in this fashion, directories may be nested to form paths. Each component of a path is called a directory entry. A path example is "/home/wolfman/butter"the root directory /, the directories home and wolfman, and the file butter are all directory entries, called dentries. In Unix, directories are actually normal files that simply list the files contained therein. Because a directory is a file to the VFS, the same operations performed on files can be performed on directories. Unix systems separate the concept of a file from any associated information about it, such as access permissions, size, owner, creation time, and so on. This information is sometimes called file metadata (that is, data about the file's data) and is stored in a separate data structure from the file, called the inode. This name is short for index node, although these days the term "inode" is much more ubiquitous. All this information is tied together with the filesystem's control information, which is stored in the superblock. The superblock is a data structure containing information about the filesystem as a whole. Sometimes the collective data is referred to as filesystem metadata. Filesystem metadata includes information about both the individual files and the filesystem as a whole. Traditional, Unix filesystems implement these notions as part of their physical on-disk layout. For example, file information is stored as an inode in a separate block on the disk; directories are files; control information is stored centrally in a superblock, and so on. The Linux VFS is designed to work with filesystems that understand and implement such concepts. Non-Unix filesystems, such as FAT or NTFS, still work in Linux, but their filesystem code must provide the appearance of these concepts. For example, even if a filesystem does not support distinct inodes, it must assemble the inode data structure in memory as if it did. Or, if a filesystem treats directories as a special object, to the VFS they must represent directories as mere files. Oftentimes, this involves some special processing done on the fly by the non-Unix filesystems to cope with the Unix paradigm and the requirements of the VFS. Such filesystems still work, however, and usually suffer very little. |