Advanced Operating Systems - Lecture 33: Disk interaction. This lecture will cover the following: physical disk structure; disk interaction; current disks provide a higher-level interface (SCSI); disk scheduling; BSD 4.4 fast file system (FFS); FFS allocation policies;...
CS703 Advanced Operating Systems
By Mr Farhan Zaidi
Lecture 33
Specifying disk requests requires a lot of info:
Cylinder #, surface #, track #, sector #, transfer size, etc.
Current disks provide a higher-level interface (SCSI)
The disk exports its data as a logical array of blocks [0 … N]
Disk maps logical blocks to cylinder/surface/track/sector.
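As a rough illustration of the mapping the disk hides from the OS, the sketch below converts a logical block number into a (cylinder, head, sector) triple and back. It assumes an idealized geometry in which every track holds the same number of sectors; the constants and function names are hypothetical. Real drives use zoned recording and remap bad sectors, so their internal mapping is not this simple.

```python
# Assumed, idealized geometry (not taken from any real drive).
HEADS_PER_CYLINDER = 16   # number of surfaces
SECTORS_PER_TRACK = 63    # same on every track in this sketch

def lba_to_chs(lba):
    """Map a logical block number to (cylinder, head, sector), all 0-based."""
    cylinder, rest = divmod(lba, HEADS_PER_CYLINDER * SECTORS_PER_TRACK)
    head, sector = divmod(rest, SECTORS_PER_TRACK)
    return cylinder, head, sector

def chs_to_lba(cylinder, head, sector):
    """Inverse mapping under the same assumed geometry."""
    return (cylinder * HEADS_PER_CYLINDER + head) * SECTORS_PER_TRACK + sector
```

Consecutive logical blocks land on the same track first, then on the next surface of the same cylinder, which is why sequential logical access avoids seeks in this model.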
Disk reads/writes in terms of sectors, not bytes
read/write single sector or adjacent groups
How to write a single byte? “Read-modify-write”
read in sector containing the byte
modify that byte
write entire sector back to disk
key: if the sector is already cached, the read can be skipped
Sector = unit of atomicity
sector write done completely, even if crash in middle
(disk saves up enough momentum to complete)
larger atomic units have to be synthesized by OS
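The read-modify-write sequence above can be sketched with a toy in-memory disk that only transfers whole sectors. The class, the 512-byte sector size, and the dictionary used as a cache are assumptions made for illustration, not part of any real driver interface.

```python
SECTOR_SIZE = 512  # assumed sector size in bytes

class SectorDisk:
    """Toy disk that reads and writes whole sectors only (hypothetical)."""
    def __init__(self, num_sectors):
        self.sectors = [bytes(SECTOR_SIZE) for _ in range(num_sectors)]
        self.reads = 0
        self.writes = 0

    def read_sector(self, n):
        self.reads += 1
        return self.sectors[n]

    def write_sector(self, n, data):
        if len(data) != SECTOR_SIZE:
            raise ValueError("the disk only accepts whole sectors")
        self.writes += 1
        self.sectors[n] = bytes(data)

def write_byte(disk, cache, offset, value):
    """Write one byte at a byte offset using read-modify-write."""
    n, within = divmod(offset, SECTOR_SIZE)
    if n not in cache:                     # read only if not already cached
        cache[n] = bytearray(disk.read_sector(n))
    cache[n][within] = value               # modify that byte
    disk.write_sector(n, cache[n])         # write the entire sector back
```

A second write to the same sector finds it in the cache, so it costs one disk write and no disk read.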
Because seeks are so expensive (milliseconds!), the OS tries to schedule disk requests that are queued waiting for the disk
FCFS (do nothing)
Reasonable when load is low
Long waiting times for long request queues
SSTF (shortest seek time first)
Minimize arm movement (seek time), maximize request rate
Favors middle blocks
SCAN (elevator)
Service requests in one direction until done, then reverse
C-SCAN
Like SCAN, but only go in one direction (typewriter)
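The four policies can be compared with a small simulation. This is a minimal sketch under stated assumptions: requests are plain cylinder numbers, all requests are queued up front, SCAN and C-SCAN sweep toward higher cylinders first, and both turn around at the last pending request rather than travelling to the disk edge. The function names are illustrative.

```python
def fcfs(head, requests):
    """Serve requests in arrival order."""
    return list(requests)

def sstf(head, requests):
    """Always serve the pending request closest to the current head position."""
    pending, order = list(requests), []
    while pending:
        nearest = min(pending, key=lambda c: abs(c - head))
        pending.remove(nearest)
        order.append(nearest)
        head = nearest
    return order

def scan(head, requests):
    """Sweep upward serving requests, then reverse and sweep downward."""
    up = sorted(c for c in requests if c >= head)
    down = sorted((c for c in requests if c < head), reverse=True)
    return up + down

def c_scan(head, requests):
    """Sweep upward only; jump back to the lowest request and sweep up again."""
    up = sorted(c for c in requests if c >= head)
    wrapped = sorted(c for c in requests if c < head)
    return up + wrapped

def total_seek(head, order):
    """Total cylinders travelled when serving requests in the given order."""
    distance = 0
    for cylinder in order:
        distance += abs(cylinder - head)
        head = cylinder
    return distance
```

With the head at cylinder 53 and the queue 98, 183, 37, 122, 14, 124, 65, 67, the total head movement in this model is 640 cylinders for FCFS, 236 for SSTF, 299 for SCAN, and 322 for C-SCAN. C-SCAN travels slightly further than SCAN but treats low and high cylinders more evenly.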
Disk bandwidth and cost/bit improving exponentially
similar to CPU speed, memory size, etc.
Seek time and rotational delay improving *very* slowly
why? require moving physical object (disk arm)
Some implications:
disk accesses are a huge system bottleneck, and getting worse
bandwidth increase lets system (pre-)fetch large chunks for about the same cost as a small chunk
Result? Can improve performance if you can read lots of related stuff.
How to get related stuff? Cluster together on disk
Memory size increasing faster than typical workload size
More and more of workload fits in file cache
disk traffic changes: mostly writes and new data
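The claim that a large chunk costs about the same as a small one follows from simple arithmetic: positioning time dominates transfer time. The sketch below uses assumed round numbers (8 ms seek, 4 ms rotational delay, 100 MB/s bandwidth), not measurements of any particular drive.

```python
def access_time_ms(size_kb, seek_ms=8.0, rotation_ms=4.0, bandwidth_mb_s=100.0):
    """Time for one request: positioning cost plus transfer time.

    All default parameters are assumed, illustrative values.
    """
    transfer_ms = size_kb / 1024 / bandwidth_mb_s * 1000
    return seek_ms + rotation_ms + transfer_ms

small = access_time_ms(4)     # one 4 KB block
large = access_time_ms(256)   # 64 blocks fetched in one request
```

Under these assumptions the 256 KB request moves 64 times as much data as the 4 KB request for roughly 20% more time, which is why clustering related data and prefetching it pays off.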
Uses a minimum disk block size of 4096 bytes
Records the block size in superblock
Multiple file systems with different block sizes can co-reside
Improves performance in several ways
Superblock is replicated to provide fault tolerance
1. Allocate file inodes close to their containing directories.
For mkdir, select a cylinder group with a more-than-average number of free inodes.
For creat, place inode in the same group as the parent.
2. Concentrate related file data blocks in cylinder groups.
Most files are read and written sequentially.
Place initial blocks of a file in the same group as its inode.
How should we handle directory blocks?
Place adjacent logical blocks in the same cylinder group.
Logical block n+1 goes in the same group as block n.
Switch to a different group for each indirect block.
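The two policies above can be written as small placement functions. This is a sketch of the rules as stated on the slide, not the BSD implementation: the CylinderGroup record, the choice of the first qualifying group for mkdir, and moving to the next group number when an indirect block starts are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class CylinderGroup:
    index: int
    free_inodes: int

def group_for_mkdir(groups):
    """New directory: pick a group with more free inodes than the average."""
    average = sum(g.free_inodes for g in groups) / len(groups)
    candidates = [g for g in groups if g.free_inodes > average]
    # If every group sits exactly at the average, fall back to the first group.
    return (candidates or groups)[0].index

def group_for_creat(parent_dir_group):
    """New file: its inode goes in the same group as the parent directory."""
    return parent_dir_group

def group_for_block(inode_group, prev_block_group, starts_indirect, num_groups):
    """Choose a group for the next logical data block of a file."""
    if prev_block_group is None:
        return inode_group                          # first block: near the inode
    if starts_indirect:
        return (prev_block_group + 1) % num_groups  # assumed way to switch groups
    return prev_block_group                         # block n+1 follows block n
```

Switching groups at each indirect block keeps one large file from filling a single cylinder group and crowding out the small files that belong there.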
Internal fragmentation in the file system blocks can waste significant space for small files.
FFS solution: optimize small files for space efficiency.
Subdivide blocks into 2, 4, or 8 fragments (or just frags).
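The space saving from fragments is easy to quantify. The sketch below assumes a 4096-byte block and a simplified model in which only the tail of a file is stored in fragments; the function name and parameters are illustrative.

```python
def space_used(file_size, block_size=4096, frags_per_block=1):
    """Bytes of disk consumed when the file's tail is stored in fragments.

    frags_per_block=1 models a file system without fragments.
    """
    frag_size = block_size // frags_per_block
    full_blocks, tail = divmod(file_size, block_size)
    tail_frags = -(-tail // frag_size)   # ceiling division
    return full_blocks * block_size + tail_frags * frag_size
```

A 100-byte file occupies a full 4096-byte block without fragments, but only one 512-byte fragment when blocks are split into 8 frags.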
Clustering improves bandwidth utilization for large files read and written sequentially.
FFS can allocate contiguous runs of blocks “most of the time” on disks with sufficient free space
FFS consistency and recovery
Reconstructs free list and reference counts on reboot
Enforces two invariants:
directory names always reference valid inodes
no block claimed by more than one inode
Does this with three ordering rules:
write newly allocated inode to disk before name entered in directory
remove directory name before inode deallocated
write deallocated inode to disk before its blocks are placed on free list
File creation and deletion take 2 synchronous writes
Why does FFS need the third rule? Inode recovery
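Why the write order matters can be checked by simulating a crash after every step of a file creation. This is a toy model under stated assumptions: the disk state is a set of valid inode numbers plus one directory mapping names to inode numbers, each step is one synchronous write, and the inode number 7 and the name "f" are arbitrary.

```python
def crash_states(steps):
    """Yield the on-disk state at every point where a crash could occur."""
    inodes, directory = set(), {}
    yield set(inodes), dict(directory)
    for step in steps:
        step(inodes, directory)
        yield set(inodes), dict(directory)

def names_valid(inodes, directory):
    """Invariant: every directory name references a valid inode."""
    return all(ino in inodes for ino in directory.values())

def write_inode(inodes, directory):
    inodes.add(7)            # newly allocated inode reaches the disk

def add_name(inodes, directory):
    directory["f"] = 7       # name entered in the directory

safe_order = [write_inode, add_name]     # follows the first ordering rule
unsafe_order = [add_name, write_inode]   # violates it

safe_ok = all(names_valid(*state) for state in crash_states(safe_order))
unsafe_ok = all(names_valid(*state) for state in crash_states(unsafe_order))
```

In the safe order a crash can at worst leave an inode with no name, which recovery can clean up. In the unsafe order a crash between the two writes leaves a name pointing at an invalid inode.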
FFS: inode recovery
Files can be lost if a directory is destroyed or a crash happens before the link can be set
New twist: FFS can find lost inodes
Facts:
FFS pre-allocates inodes in known locations on disk
Free inodes are initialized to all 0s
So?
Fact 1 lets FFS find all inodes (whether or not there are any pointers to them)
Fact 2 tells FFS that any inode with non-zero contents is (probably) still in use
fsck places unreferenced inodes with non-zero contents in the lost+found directory
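The recovery idea can be sketched as a scan over the pre-allocated inode table. This is a simplified model, not fsck itself: the inode table is assumed to be a list of raw byte strings indexed by inode number, and all directories are flattened into one mapping from name to inode number.

```python
def find_lost_inodes(inode_table, directory_entries):
    """Return inode numbers that look in use but have no directory entry.

    inode_table: list of raw inode contents (bytes), index = inode number.
    directory_entries: dict mapping file name to inode number.
    """
    referenced = set(directory_entries.values())
    lost = []
    for ino, raw in enumerate(inode_table):
        in_use = any(raw)    # Fact 2: non-zero contents, so probably in use
        if in_use and ino not in referenced:
            lost.append(ino)
    return lost

def link_into_lost_and_found(lost):
    """Give each lost inode a name so the user can inspect it (assumed naming)."""
    return {f"#{ino}": ino for ino in lost}
```

Fact 1 is what makes the loop possible: because inodes live at known locations, the scan can visit every one of them without following any pointers.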