Chapter 7 - File systems. This chapter discusses a programmer’s view of files and the file system. It describes fundamental file organizations, directory structures, operations on files and directories, and file sharing semantics, which specify the manner in which results of file manipulations performed by concurrent processes are visible to one another.
Goals of a File system
• Convenient and fast access to files
– Different file organizations to suit different user requirements
• Reliable storage of files
– Prevents damage due to crashes, etc.
• Controlled sharing of files
– Unauthorized persons should not be able to access files
• … also ensure efficient use of I/O devices
– We discuss this aspect in Chapter 12
Structure of a File system
• A file system is structured into a hierarchy of layers
– File system layer
* Deals with files as objects
Creates files, allocates disk space to them
Implements file sharing and prevents unauthorized access
Prevents damage due to crashes
– Input-Output Control System (IOCS) layer
* Implements file system operations, ensuring
Efficient access to records in files
Efficient operation of I/O devices
File system and IOCS layers
• Each layer contains policy and mechanism modules
• A mechanism module invokes modules of the lower layer
File system
• A Directory
– Groups a set of files
* It contains an entry for each of these files
* The entry contains information useful for accessing the file
– Users have different directories
* Helps to separate files of different users
* Provides file naming freedom
– Directories are organized in a hierarchical structure
* Permits a user to organize her files logically, e.g., according to activities
Logical organization in file systems
• Two files named beta exist. The file system must open the correct
one when a process executes open (beta, )
• Records may be organized differently in different files
Logical organization in file systems
• Comments on the previous slide
– Two files named beta exist in the file system
* Thus users enjoy file naming freedom
* The rules of sharing are determined by file sharing semantics
– Files beta and phi have different organizations and are accessed differently
* File beta is a sequential file; its records are read in a sequence
* File phi is a direct file; its records can be read in random order
Overview of file processing
• Implementation of file processing activities
– File processing activities are implemented through combined
actions of a compiler, modules of the file system and IOCS, and the kernel
* The compiler replaces open and read/write statements in a program with calls on file system and IOCS modules
These modules implement the desired file operations
These file system and IOCS modules are linked to the program
* A process executing the program invokes these modules during its operation to perform file operations
The modules may make system calls to invoke functionalities of the kernel
File processing in a program
[Figure: source program, compiled program, linked program]
File Types
• Files can be of two types
– Structured files
* A file contains records
* Each record contains a key field
Key fields of records in a file contain unique values
* Records in a file can be accessed in different ways
Records and fields in a file
• File employee_info contains records for employees
File operations performed by processes
• File operations
– Opening a file
* FS opens the file only if user possesses necessary privileges
– Reading or writing a record
* FS considers organization of the file and performs the operation accordingly
– Closing a file
* FS updates information concerning file size in its directory entry
– File creation, deletion and renaming
– Specifying who can access a file
* The information is entered in the file’s directory entry
File organizations
• Record access pattern
– Definition
* The order in which a process accesses records in a file
– Two fundamental record access patterns in applications
* Sequential access
A process always performs an operation on the next record
Suits batched operations; e.g., payroll
* Random access
A process may access any record in a file
– File system should effectively support the record access pattern
of an application
File organizations
• File organization
– Method of organizing and accessing data in a file
– Sequential file organization
* Records in the file are accessed sequentially
This file organization is independent of characteristics of I/O devices
– Direct file organization
* Records are accessed randomly; a record is found using an address calculation formula applied to the value in its key field
Files tend to be device dependent; may contain dummy records
– Index sequential file organization
* File has a hierarchical organization. To locate a record, a region in the file is first located and then searched sequentially
Sequential and direct files
• In sequential files, an operation is always performed on the
next record
• In direct files, an address calculation formula computes
location of a record from its key value
• Dummy records have to exist for the formula to work
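The address calculation idea can be sketched as follows. This is a minimal illustration, not the scheme of any particular file system: the record size, the base key, and the function names are all assumptions, and an all-zero slot is assumed to mark a dummy record.

```python
RECORD_SIZE = 32                    # assumed fixed record length in bytes
BASE_KEY = 1000                     # assumed smallest key value in the file
DUMMY = b"\x00" * RECORD_SIZE       # assumed marker for a dummy record

def record_offset(key):
    """Address calculation formula: map a key value to a byte offset."""
    return (key - BASE_KEY) * RECORD_SIZE

def make_direct_file(records, max_key):
    """Give every possible key its own slot; unused slots hold dummy records."""
    image = bytearray(DUMMY * (max_key - BASE_KEY + 1))
    for key, data in records.items():
        off = record_offset(key)
        image[off:off + RECORD_SIZE] = data.ljust(RECORD_SIZE, b"\x00")
    return image

def read_direct(image, key):
    """Read the record for `key` without touching any other record."""
    off = record_offset(key)
    slot = bytes(image[off:off + RECORD_SIZE])
    return None if slot == DUMMY else slot.rstrip(b"\x00")

# Only keys 1000 and 1003 are in use; slots for 1001 and 1002 are dummies.
image = make_direct_file({1000: b"alice", 1003: b"dave"}, max_key=1003)
```

Half of the slots in this small file are dummy records, which illustrates why a direct file can waste space when key values are sparse.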
An Index Sequential file
• The higher level index is searched to find a group of tracks
where a record may exist
• The track index is searched to find the track which may contain the record
• The track is searched to check whether the record exists
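The three search steps above can be sketched as follows, assuming a hypothetical layout in which each index entry records the highest key found in the region it covers. The track contents, index tables, and the name `lookup` are illustrative only.

```python
from bisect import bisect_left

# Hypothetical tracks; each holds (key, value) records sorted by key.
tracks = [
    [(3, "c"), (7, "g")],
    [(12, "l"), (15, "o")],
    [(21, "u"), (28, "x")],
    [(33, "y"), (40, "z")],
]
# One track index per group of tracks: (highest key on track, track number).
track_indexes = [
    [(7, 0), (15, 1)],      # group 0
    [(28, 2), (40, 3)],     # group 1
]
# Higher level index: highest key in each group of tracks.
higher_index = [15, 40]

def lookup(key):
    # Step 1: the higher level index selects a group of tracks.
    g = bisect_left(higher_index, key)
    if g == len(higher_index):
        return None                         # key exceeds every key in the file
    # Step 2: the track index of that group selects one track.
    highs = [high for high, _ in track_indexes[g]]
    t = track_indexes[g][bisect_left(highs, key)][1]
    # Step 3: the track is searched sequentially.
    for k, value in tracks[t]:
        if k == key:
            return value
    return None
```

For example, `lookup(12)` selects group 0, then track 1, and finds the record there, while `lookup(13)` searches the same track and reports that no such record exists.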
A typical directory entry
• Location info indicates where records of the file are located
• Protection info indicates who can access the file and in
what manner
• Flags indicate whether the file is itself a directory, etc.
Two-level directories: Master and user directories
• A separate user directory exists for each user. This arrangement
provides file naming freedom
• A user can access files of other users by going to that
user’s directory through the master directory
Generalizations of the two-level directory structure
• Generalization provides several benefits
– Directory as a file
* A directory can exist in another directory
* It leads to a directory hierarchy
* User can form groups and subgroups of files
– Generalized syntax for accessing files
* A path name permits any file in the directory hierarchy to be
accessed (subject to access privileges); e.g., /projects/alpha
• Accessing a file
– Each user has a home directory and a current directory
– A file is accessed using absolute or relative path names
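A minimal sketch of absolute and relative path names, assuming a toy hierarchy in which a Python dict stands for a directory and a string stands for a file. The directory contents and the function `resolve` are hypothetical.

```python
# Hypothetical directory hierarchy: a dict is a directory, a string is a file.
root = {
    "projects": {"alpha": "alpha data", "beta": {"notes": "beta notes"}},
    "home": {"A": {"todo": "A's list"}},
}

def resolve(path, current=()):
    """Return the object named by `path`.

    A path starting with '/' is absolute; any other path is relative to
    the current directory, given as a tuple of directory names.
    """
    parts = [] if path.startswith("/") else list(current)
    parts += [p for p in path.split("/") if p]
    node = root
    for name in parts:
        if not isinstance(node, dict) or name not in node:
            raise FileNotFoundError(path)
        node = node[name]
    return node
```

With the current directory set to `("projects",)`, the relative name `beta/notes` and the absolute name `/projects/beta/notes` refer to the same file.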
Sample directory hierarchy
• A rectangle represents a directory, a circle represents a file
Directory graphs
• Links
– A link provides an alternative method of naming a file
* It associates a name in a specified directory with a file that may exist anywhere in the directory hierarchy
* It can be used to provide a shortcut to access the file
– Use of links makes a directory hierarchy into a graph
* Alternative path names exist for the same file
• link (~B, ~A/projects/beta, showcase) creates a link named
‘showcase’ from directory B to file ~A/projects/beta
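The effect of the link command can be sketched as follows, assuming directories are modelled as dicts that map names to file objects. The class `File` and the function `link` are illustrative; they mirror the argument order of the command above but are not a real system call.

```python
class File:
    def __init__(self, data):
        self.data = data

# Hypothetical home directories of users A and B.
home_A = {"projects": {"beta": File("beta v1")}}
home_B = {}

def link(directory, target, name):
    """Enter `name` in `directory` as another name for the existing file `target`."""
    if name in directory:
        raise FileExistsError(name)
    directory[name] = target

link(home_B, home_A["projects"]["beta"], "showcase")
home_B["showcase"].data = "beta v2"     # an update made through the link
```

Both path names now lead to the same file object, so the update made through `showcase` is also seen through `~A/projects/beta`. This is what turns the directory hierarchy into a graph.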
Mounting of a file system
• An OS may have several file systems
– Each file system has a separate directory hierarchy
– Directory hierarchy of two file systems can be combined through mounting
– A mount operation ‘attaches’ a file system to a directory in
another file system
* Mounting is typically a privileged operation
* It has a temporary effect that lasts until an unmount operation is performed
Mounting of a file system
(a) Directory hierarchies of two file systems
(b) Command mount (meeting, ~A/admin) combines them as shown
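A minimal sketch of mounting, assuming each file system is a nested dict and that a hypothetical `mount_table` records which directory each mounted file system is attached to. The directory contents and all names are illustrative.

```python
# Two hypothetical file systems, each with its own directory hierarchy.
fs_main = {"A": {"admin": {}, "projects": {"alpha": "alpha data"}}}
fs_meeting = {"agenda": "agenda data", "minutes": {"jan": "jan minutes"}}

mount_table = {}    # mount point (tuple of names) -> root of the mounted file system

def mount(fs_root, mount_point):
    mount_table[tuple(mount_point)] = fs_root

def unmount(mount_point):
    del mount_table[tuple(mount_point)]

def resolve(path):
    node, seen = fs_main, ()
    for name in path.strip("/").split("/"):
        node = node[name]
        seen += (name,)
        # At a mount point, continue in the root of the mounted file system.
        node = mount_table.get(seen, node)
    return node

mount(fs_meeting, ["A", "admin"])
```

While the mount is in effect, `resolve("/A/admin/agenda")` crosses into the mounted file system. After `unmount(["A", "admin"])` the same directory is empty again, which reflects the temporary nature of mounting.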
Interface between the file system and IOCS
• The file system and the IOCS jointly implement
operations on files
– A file system creates, deletes and protects files
* It allocates disk space to a file
* It permits a user to specify protection information
– The file system also performs opening and closing of files
* It performs path name resolution at open; it also checks whether the
user is authorized to access the file
– File operations are performed by the IOCS
• The file control block (FCB) is the interface between the
file system and the IOCS
File Control Block (FCB)
• The FCB provides an interface between the file system and the IOCS
– The file system builds an FCB for a file when the file is opened
* The FCB contains all information required in processing the file
Some of this information is put by the language compiler
Some information is copied from the directory entry of the file
Some information is put by the file system and IOCS modules
– The IOCS uses the FCB to implement file processing
– The FCB is destroyed when the file is closed
• Thus, the FCB contains all information needed to
process a file
Fields in the file control block (FCB)
• Three kinds of fields exist in the FCB
– File organization
* File name, type, organization and access method
* Device type and address
* Size of record and block @, number of buffers @
– Directory information
* Address of parent directory’s FCB
* Address of the file map table @
– Current state of file processing
* Address of the next record
@: The terms block, buffer and file map table are defined later
Overview of file operations
• File organization information is added to FCB by the compiled program
• File system adds directory information to the FCB and maintains
information about current state of file processing in the FCB
Disk space allocation
• Disk space allocation is performed by the file system
– A file map table (FMT) is constructed for each file
– FMT contains pointers to disk blocks allocated to the file
Disk space allocation
• Data and meta data
– Data written into a file is called file data
* A disk block containing only file data is called a data block
– Data used by the file system to control its own operation is called
control data or meta data, e.g.,
* Address of a disk block
* The file map table (FMT)
* An index block (it is defined later)
Linked allocation
• Loc info field of directory entry points to a linked list of blocks
• The free list links unused disk blocks
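Linked allocation can be sketched as follows, assuming a hypothetical disk of 8 blocks in which every block carries a pointer to the next block, one record per block, and a `directory` dict that plays the role of the loc info field. All names are illustrative.

```python
NIL = -1
# Hypothetical disk of 8 blocks; each block holds [data, pointer to next block].
disk = [[None, i + 1] for i in range(8)]
disk[7][1] = NIL
free_head = 0                       # the free list initially links all blocks
directory = {}                      # file name -> address of first block (loc info)

def allocate_block():
    """Take the first block off the free list."""
    global free_head
    if free_head == NIL:
        raise OSError("disk full")
    addr = free_head
    free_head = disk[addr][1]
    disk[addr][1] = NIL
    return addr

def append(name, data):
    """Add one block of data to the end of the file's chain."""
    addr = allocate_block()
    disk[addr][0] = data
    if name not in directory:
        directory[name] = addr
        return
    last = directory[name]
    while disk[last][1] != NIL:     # walk the chain to its last block
        last = disk[last][1]
    disk[last][1] = addr

def read_all(name):
    out, addr = [], directory.get(name, NIL)
    while addr != NIL:
        out.append(disk[addr][0])
        addr = disk[addr][1]
    return out

for rec in ("r1", "r2", "r3"):
    append("alpha", rec)
```

Reaching the last block requires following every pointer in the chain, which is why linked allocation suits sequential access better than random access.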
File allocation table (FAT)
• FAT is a variant of linked allocation
– It stores the control data separately from the file data; it is used
Indexed allocation and file map table (FMT)
• The file map table (FMT) contains addresses of data blocks allocated to a file
• The loc info field of directory entry points to the FMT
Multi-level indexed allocation
• Multi-level allocation is used to reduce the size of the
FMT
– An entry in the FMT may contain address of
* An index block which has entries that contain addresses of disk
blocks or addresses of other index blocks
* A data block; i.e., a disk block that contains only file data
– Several levels of indirection would be involved in accessing a disk block, which may slow down file processing
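A minimal sketch of multi-level indexed allocation, assuming each FMT or index block entry is tagged as pointing either to a data block or to an index block. The block addresses, the tags, and the function `data_blocks` are hypothetical.

```python
# Hypothetical disk: block address -> contents.
# A data block holds file data; an index block holds a list of entries.
disk = {
    10: "d0", 11: "d1", 12: "d2", 13: "d3", 14: "d4",
    20: [("data", 12), ("data", 13)],       # index block
    21: [("index", 20), ("data", 14)],      # index block that points to another index block
}
# FMT of one file: each entry addresses either a data block or an index block.
fmt = [("data", 10), ("data", 11), ("index", 21)]

def data_blocks(entries, depth=0):
    """Yield (block address, levels of indirection) for the file's data blocks, in order."""
    for kind, addr in entries:
        if kind == "data":
            yield addr, depth
        else:
            yield from data_blocks(disk[addr], depth + 1)

layout = list(data_blocks(fmt))
```

In this example the FMT itself has only three entries, yet blocks 12 and 13 are reached only after two levels of indirection, which is the access cost the last bullet refers to.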
Two-level indexed allocation
• An index block points to disk blocks containing file data
• In multi-level allocation, an index block may also point to other index blocks
Disk status map (DSM)
• How to keep track of free disk space?
– Free list of disk blocks
– Disk status map (DSM)
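A disk status map can be sketched as one flag per disk block. The class name, the first-free allocation policy, and the 4-block disk are assumptions made for illustration.

```python
class DiskStatusMap:
    """One flag per disk block: True means the block is allocated."""

    def __init__(self, n_blocks):
        self.allocated = [False] * n_blocks

    def allocate(self):
        """Return the address of the first free block and mark it allocated."""
        for addr, used in enumerate(self.allocated):
            if not used:
                self.allocated[addr] = True
                return addr
        raise OSError("no free disk block")

    def free(self, addr):
        self.allocated[addr] = False

dsm = DiskStatusMap(4)
a, b = dsm.allocate(), dsm.allocate()
dsm.free(a)
c = dsm.allocate()      # reuses the block that was just freed
```

Unlike a free list, the map needs no pointers inside the free blocks themselves; the status of any block can be checked directly from its address.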
File Processing
its records and closing the file
– A file control block (FCB) is created for a file when the file is
opened
– The FCB is stored in a table of open files, called Active files table
(AFT)
* The offset of the FCB in the AFT is passed back to the process
It is called the internal id of the file
– The process uses it to perform operations on the file
* FCB is passed as a parameter to a system call used to invoke an IOCS functionality, such as reading or writing of a record
Active files table (AFT)
• Each process has an active files table
• The offset of a file’s FCB in the AFT is called the internal id of the file
• It is passed back to the process that opened the file
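The role of the AFT and the internal id can be sketched as follows. The FCB here is reduced to three fields, one record is assumed per disk block, and the classes and method names are hypothetical rather than an actual file system interface.

```python
from dataclasses import dataclass

@dataclass
class FCB:
    name: str
    fmt: list               # addresses of the file's disk blocks, from its directory entry
    next_record: int = 0    # current state of file processing

class Process:
    def __init__(self):
        self.aft = []       # active files table of this process

    def open(self, name, directory):
        """Build an FCB, store it in the AFT, and return its offset (the internal id)."""
        fcb = FCB(name, list(directory[name]))
        for i, slot in enumerate(self.aft):
            if slot is None:            # reuse a slot released by close
                self.aft[i] = fcb
                return i
        self.aft.append(fcb)
        return len(self.aft) - 1

    def read(self, internal_id, disk):
        """Read the next record of the file identified by its internal id."""
        fcb = self.aft[internal_id]
        if fcb.next_record >= len(fcb.fmt):
            return None                 # end of file
        record = disk[fcb.fmt[fcb.next_record]]
        fcb.next_record += 1
        return record

    def close(self, internal_id):
        self.aft[internal_id] = None    # the FCB is destroyed at close

disk = {5: "rec0", 9: "rec1"}           # assumed: one record per disk block
directory = {"alpha": [5, 9]}
p = Process()
fid = p.open("alpha", directory)
```

After the open, the process never names the file again; every later operation passes only `fid`, and the FCB found at that offset supplies the rest.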
File system actions at open
• Perform path name resolution
– For each component in the path name, locate the correct
directory or file
* Use two pointers called Directory FCB pointer and File FCB pointer
– Handle path names passing through mount points
* A file should be allocated disk space in its own file system
• Retain sufficient information to perform a close operation
on the file
– Close may have to update the file’s entry in the parent directory
– It may cause changes in the parent directory’s entry in ancestor directories
File system actions while opening file info/alpha
• Build FCB for file info. Copy information from its directory entry
• Build FCB for file alpha. Copy information from its directory entry
• Put address of info’s FCB in alpha’s FCB
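The three steps above can be sketched with the two pointers used in path name resolution. Directory entries are modelled as dicts with a hypothetical `contents` field, and the `FCB` class and `open_path` function are illustrative only.

```python
class FCB:
    def __init__(self, name, entry, parent):
        self.name = name
        self.entry = dict(entry)    # information copied from the directory entry
        self.parent = parent        # address of the parent directory's FCB

def open_path(path, current_dir_fcb):
    directory_fcb = current_dir_fcb     # Directory FCB pointer
    file_fcb = None                     # File FCB pointer
    for name in path.split("/"):
        if file_fcb is not None:
            directory_fcb = file_fcb    # the previous component is the directory to search
        entry = directory_fcb.entry["contents"][name]
        file_fcb = FCB(name, entry, parent=directory_fcb)
    return file_fcb

# Hypothetical current directory containing directory info, which contains file alpha.
current_entry = {"contents": {"info": {"contents": {"alpha": {"size": 3}}}}}
current = FCB("~", current_entry, parent=None)
alpha = open_path("info/alpha", current)
```

The chain `alpha.parent` then leads to the FCB of info, which is the information a later close operation needs in order to update the file's entry in its parent directory.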
File system actions while closing file info/phi
(a) Arrangement before the close operation
(b) After the close operation: A directory entry has been created
Concurrent sharing of a file
• File sharing semantics (also called consistency
semantics)
– Rules governing file updates made by concurrent processes
* These rules are implemented by the file system
• Important file sharing semantics
– Immutable files
* A single file image is shared by all processes
– Unix semantics: Single image mutable files
* Changes in a file made by one process are immediately visible to other processes that have opened the file
– Session semantics: Multiple image mutable files
* Concurrent processes see different images of the file; changes made are not visible to one another
File sharing semantics in Unix
• A single copy of the file and its FMT exists
• Every FCB for the file points to this copy of file, through its FMT
• Updates made by one process are immediately visible to other processes
Session semantics
• Concurrent processes using a file are grouped into sessions
• Processes in a session share a copy of the file. Their updates
are immediately visible to one another
• Processes in different sessions use different copies of the file
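The difference between Unix semantics and session semantics can be sketched as follows. The classes are hypothetical stand-ins: an open under Unix semantics refers to the single file image, while a session is assumed to receive its own copy when it starts.

```python
import copy

class SharedFile:
    def __init__(self, records):
        self.records = list(records)

class UnixOpen:
    """Unix semantics: every open refers to the single image of the file."""
    def __init__(self, f):
        self.image = f

class Session:
    """Session semantics: each session works on its own copy of the file."""
    def __init__(self, f):
        self.image = copy.deepcopy(f)

f = SharedFile(["r1"])

p1, p2 = UnixOpen(f), UnixOpen(f)
p1.image.records.append("r2")       # immediately visible to p2

s1, s2 = Session(f), Session(f)
s1.image.records.append("r3")       # visible only within session s1
```

How and when the copies of different sessions are reconciled is not covered by this sketch.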
File system reliability
• File system operation should not be affected by crashes
– File system deals with two kinds of data
* File data
Data stored in files
* Control data (also called meta data)
Data used by the file system, e.g., FCB, DSM, FMT, …
• If a crash occurs:
– File data should be preserved
– Control data should be consistent
File system reliability
• Consequences of a crash
– If control data becomes inconsistent
* File data may be lost, or data in different files may be mixed up
* File system operation may not be possible
– For example, consider linked allocation of disk space
* Each disk block allocated to a file contains a pointer to the next disk block allocated to the file
* If a failure occurs before the pointers are consistently modified, some file data may be lost or may become mixed up
Loss of data due to a crash
A disk block d_j is being added to a file when a crash occurs
(a) Before the crash
(b) After the crash: d_1 points to d_j, but d_j has not been updated to point
to d_2. Hence d_2 and subsequent blocks become inaccessible
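The loss of data described above can be reproduced in a small simulation. The dict `next_ptr` stands for the pointer stored in each disk block, and the order of the two pointer updates is an assumption chosen to match the figure.

```python
def reachable(next_ptr, first):
    """Follow the chain of pointers from the first block of the file."""
    chain, addr = [], first
    while addr is not None:
        chain.append(addr)
        addr = next_ptr[addr]
    return chain

def insert_after(next_ptr, prev, new, crash_midway=False):
    successor = next_ptr[prev]
    next_ptr[prev] = new            # step 1: d_1 now points to d_j
    if crash_midway:
        return                      # crash: step 2 never reaches the disk
    next_ptr[new] = successor       # step 2: d_j points to d_2

before = {"d1": "d2", "d2": "d3", "d3": None, "dj": None}

ok = dict(before)
insert_after(ok, "d1", "dj")

crashed = dict(before)
insert_after(crashed, "d1", "dj", crash_midway=True)
```

After the simulated crash, `reachable(crashed, "d1")` stops at d_j, so d_2 and d_3 still hold file data but can no longer be found through the file's chain.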
Mix-up of data in linked allocation due to a crash
Block d_k is to be deleted from beta and d_j is to be added to alpha
(a) Initial situation of files alpha and beta
(b) The operation is performed with d_j = d_k and alpha is closed, so its
blocks are updated; a crash occurs, so blocks of beta are not updated
File System Reliability Techniques
• Recovery
– Store the state of the file system periodically
* Full and incremental back-ups
– Revert to a previous recorded state when a failure occurs
– Cost of recovery: Cost of creating backups + cost of re-execution
• Fault tolerance
– Ability to operate correctly despite faults
– Only anticipated faults can be tolerated
– We discuss two fault tolerance techniques
* Stable storage