COMP 3511: Lecture 22
Date: 2024-11-19 14:35:50
Reviewed:
Topic / Chapter:
Summary
❓ Questions
- ❓ as programmers, do we need to take locality & page faults into account?
- 👨‍🏫 no, but at a low level, the OS paging code might need to
Notes
Other considerations (cont.)
Program structure
int data[128][128];
- each row: stored in 1 page
- 1 page = 128 ints
- program 1 (only 1 frame allocated)
for (j = 0; j < 128; j++) for (i = 0; i < 128; i++) data[i][j] = 0;
- picks a different row (i.e. a different page) on every access
- resulting in: 128 × 128 = 16,384 page faults
- program 2
for (i = 0; i < 128; i++) for (j = 0; j < 128; j++) data[i][j] = 0;
- resulting in: only 128 page faults
- one fault on every row change
Mass Storage Systems
Moving head disk mechanism
- each disk platter: w/ flat circular shape
- diameter of 1.8, 2.5, or 3.5 inches
- two surfaces of a platter: covered with a magnetic material
- stores information
- read-write head "flies": just above each surface of every platter
- heads: attached to disk arm, moving all heads as a unit
- 👨‍🏫 back and forth!
- platter surface: logically divided into circular tracks
- then subdivided into hundreds of sectors
- set of tracks at one arm position: makes a cylinder
- thousands of concentric cylinders in a disk drive!
- 👨‍🎓 lovely!
Mass storage structure overview
- magnetic disks: provide the bulk of secondary storage in modern computers
- drives: rotate 60-250 times / second
- transfer rate: rate of data flow between drive and computer
- positioning time (random-access time): time to
- move disk arm to desired cylinder (seek time)
- + time for desired sector to rotate under the disk head (rotational latency)
- head crash: from disk head making (physical) contact w/ disk surface
- 👨‍🏫 bad
- disks: can be removable
- drive: attached to computer via I/O bus
- busses: vary
- including EIDE, ATA, SATA, USB, Fibre Channel, SCSI, SAS, Firewire
- host controller in computer: uses bus to talk to disk controller
- disk controller: built into drive / storage array
Hard disk drive
- platters: range from 0.85" to 14" (historically)
- commonly: 3.5", 2.5", and 1.8"
- capacity: ranges from 30 GB to 3 TB per drive
- performance
- transfer rate: (theoretical) 6 Gb / sec
- effective transfer rate: (real) 1 Gb / sec
- seek time: from 3 ms to 12 ms
- average seek time: measured based on 1/3 of tracks
- RPM: typically 5400, 7200, 10000, and 15000 rpm
- latency: based on spindle speed
- one rotation: 1 / (RPM / 60) = 60 / RPM seconds
- average latency: 1/2 of the full rotational latency
- e.g. 7200 RPM = 120 rps
- average latency: 1/2 × 1/120 s = 4.17 milliseconds
- average latency by spindle speed:
- 4200 RPM → 7.14 ms
- 5400 RPM → 5.56 ms
- 7200 RPM → 4.17 ms
- 10000 RPM → 3 ms
- 15000 RPM → 2 ms
HDD performance
- access latency / average access time: average seek time + average latency
- for fast disk: 3 ms + 2 ms = 5 ms
- for slow disk: 9 ms + 5.56 ms = 14.56 ms
- average I/O time: average access time + (amount to transfer / transfer rate) + controller overhead
- e.g. 4 KB block on a 7200 RPM disk, w/ 5 ms average seek time, 1 Gb/s transfer rate, 0.1 ms controller overhead:
- 5 ms + 4.17 ms + 4 KB / (1 Gb / s) + 0.1 ms
- (notice: one is bytes, the other bits)
- 4 KB / (1 Gb / s) = 2^{-15} s = 0.0305 ms
- total: 9.27 ms + 0.0305 ms ≈ 9.3 ms
- taking 9.3 ms on average to transfer 4 KB → effective bandwidth of 4 KB / 9.3 ms ≈ 3.5 Mb / sec only
- despite a transfer rate of 1 Gb/s, given the overhead
- huge gap in HDD between random & sequential workloads
- a disk w/ 300 GB capacity, avg. seek time 4 ms, 15,000 RPM = 250 RPS (avg. latency = 2 ms), transfer rate 125 MB/s, 4 KB read at a random location
- average I/O time = 4 ms + 2 ms + (4 / 125,000) s = 6.03 ms
- effective bandwidth: 4 KB / 6 ms ≈ 0.66 MB/s
- w/ sequential access of a 100 MB file: only one seek & one rotation needed (ideally)
- can yield: effective bandwidth / transfer rate close to 125 MB/s
Solid state disk
- SSD: a nonvolatile memory (NVM) used like a hard drive
- many tech variations, e.g. from DRAM w/ battery to maintain its state in a power failure
- to flash-memory technology
- single-level cell (SLC)
- multi-level cell (MLC)
- SSDs: can be more reliable than HDDs, as there are no moving / mechanical parts
- much faster: no seek time / rotational latency
- consumes less power: power efficient
- yet: more expensive per MB, w/ less capacity & shorter life span
- as it's much faster than magnetic disks: a standard bus interface might be too slow, becoming a bottleneck
- some: connect directly to the system bus (e.g. PCI)
- some: use SSDs as a new cache tier
- moving data between magnetic disk, SSD, and memory: to optimize performance
Magnetic tape
- magnetic tape: an early secondary storage medium
- relatively permanent & can hold large quantities of data
- access time: slow
- moving to correct spot might take minutes
- random access: β 1000 times slower than magnetic disk
- not useful for secondary storage in modern computer systems
- mainly used for backup, storage of infrequently used data
- or: medium of transferring information from one system to another
- tape capacities: vary greatly
- depending on particular kind of tape drive
- w/ current capacities exceeding several terabytes
- typically between 200 GB and 1.5 TB
- depending on particular kind of tape drive
Disk Storage & Scheduling
Disk structure
- drives: addressed as large 1d arrays of logical blocks
- logical block: smallest unit of transfer
- disk: represented as a no. of disk blocks
- each block: w/ unique block number: disk address
- size of a logical block: usually 512 bytes
- low-level formatting: creates logical blocks on the physical media
- 1d array of logical blocks: mapped onto the sectors of the disk (sequentially)
- sector 0: first sector of first track
- on the outermost cylinder
- mapping proceeds in order through that track, then the rest of the tracks in the cylinder
- then through the rest of the cylinders: from outermost to innermost
- logical to physical address translation (physical address: cylinder no., track no. within cylinder, sector no. within track): should be easy, except:
- defective sectors: mapping hides these by substituting spare sectors from elsewhere on the disk
- no. of sectors / track: may not be constant on some devices
- non-constant no. of sectors / track: via constant angular velocity
Disk scheduling
- OS: responsible for using hardware efficiently
- for disk drives: means having fast access time & large disk bandwidth
- seek time: time for the disk arm to move the heads to the cylinder containing the desired sector
- measured by: seek distance, in terms of no. of cylinders / tracks
- rotational latency: additional time for the disk to rotate the desired sector to the disk head
- disk bandwidth: total no. of bytes transferred / total time between the first request for service and completion of the transfer
- improve access time & bandwidth by: managing the order in which disk I/O requests are serviced
- many sources of disk I/O requests
- from OS, system processes, and user processes
- I/O requests: include
- I/O mode (input / output)
- disk address
- memory address
- no. of sectors to transfer
- OS: maintains a queue of requests, per disk or device
- in a multiprogramming system w/ many processes: disk queue often has several pending requests
- idle disk: can immediately work on an I/O request
- for a busy disk: requests must be queued
- optimization: only possible when a queue of requests exists
- disk drive controllers: w/ small buffers, managing a queue of I/O requests (of varying "depth")
- which request to be selected & completed next?
- disk scheduling
- now: illustrate scheduling algorithms w/ a request queue, w/ cylinder numbers 0-199
- e.g. queue: 98, 183, 37, 122, 14, 124, 65, 67
- current head position: 53
FCFS / FIFO
- intrinsically: fair, yet doesn't provide the fastest service
- resulting in: total head movement of 640 cylinders
SSTF: Shortest seek time first
- selects req. w/ least seek time from current head position
- i.e. pending request closest to current head position
- SSTF: a form of SJF scheduling (greedy)
- may cause: starvation of some requests
- results in: 236 cylinders
SCAN scheduling
- disk arm: starts at one end of the disk, moving towards the other end
- on the way: services requests as it reaches each cylinder
- at the other end: direction of head movement is reversed, and servicing continues
- aka: elevator algorithm
- note: if requests are uniformly distributed across cylinders
- just after the head reverses direction: the heaviest density of requests is at the other end of the disk, and those have waited the longest
- must know: direction of head movement
- results in: 236 cylinders
LOOK scheduling
- similar to SCAN, but disk arm only goes as far as the final request in each direction
- NOT the end of the disk
- 👨‍🏫 guaranteed to be no worse than SCAN
- results in: 208 cylinders
C-SCAN
- Circular-SCAN: provides a more uniform waiting time than SCAN
- head: moves from one end of the disk to the other, servicing requests as it goes
- upon reaching the other end: immediately returns to the beginning of the disk
- not serving any requests on the way back
- treats: cylinders as a circular list, wrapping around from the last cylinder to the first one
- results in: 382 cylinders
C-LOOK
- LOOK version of C-SCAN
- disk arm: goes only as far as the last request in each direction
- then reversing the direction
- better than C-SCAN, for the very same reason
- results in: 322 (C-LOOK) vs. 208 (LOOK) cylinders
Selecting a disk-scheduling algorithm
- common solution: SSTF
- increases performance of FCFS
- SCAN, C-SCAN: performs better for systems w/ heavy load on disk
- as: they are less likely to cause starvation problems
- scheduling performance: depends on the no. & type of requests
- w/ only 1 pending request: all scheduling algorithms behave the same
- requests for disk services: greatly influenced by the file-allocation method (to be discussed)
- contiguously allocated file: generate several requests, close together on the disk
- resulting in limited head movement
- linked or indexed file: may include blocks widely scattered on the disk
- resulting in greater head movement
- location of directories & index blocks: also important
- as: they are accessed frequently
- directory entry and file data on different cylinders
- => cause excessive head movement
- caching directory & index block in memory: helps
- disk-scheduling algorithm: written as a separate module of the OS
- allowing to be replaced w/ different algorithm if necessary
- either SSTF / LOOK: a reasonable choice as the default algorithm
RAID
RAID: improving reliability via redundancy
- by: Randy Katz & David Patterson, late '80s-'90s
- 👨‍🏫 I know one of them very well
- RAID: Redundant Arrays of Independent Disks
- in the past: composed of small, cheap disks
- viewed as: a cost-effective alternative to large & expensive disks
- once called: redundant arrays of inexpensive disks
- now: used for higher reliability via redundancy & higher data-transfer rate (accessing disks in parallel)
- increases: mean time to failure