COMP 3511: Lecture 22

Date: 2024-11-19 14:35:50

Reviewed:

Topic / Chapter:

Summary

❓Questions
  • ❓ as a programmer, do we need to take locality & page faults into account?
    • 👨‍🏫 no, but at a low level, the OS paging code might need to

Notes

Other considerations (cont.)
  • Program structure

    • Int[128,128] data;
    • each row: stored in 1 page = 128 ints
    • program 1 (1 page allocated)
      for (j = 0; j < 128; j++)
          for (i = 0; i < 128; i++)
              data[i,j] = 0;
      
      • picks a different row on each access
        • resulting in: 128 × 128 = 16,384 page faults (one per access)
    • program 2
      for (i = 0; i < 128; i++)
          for (j = 0; j < 128; j++)
              data[i,j] = 0;
      
      • resulting in: 128 page faults
        • one on every row change
Mass Storage Systems
  • Moving head disk mechanism

    • each disk platter: w/ flat circular shape
      • diameter of 1.8, 2.5, or 3.5 inches
    • two surfaces of a platter: covered with magnetic material
      • stores information
    • read-write head "flies": just above each surface of every platter
      • heads: attached to disk arm, moving all heads as a unit
      • 👨‍🏫 back and forth!
    • platter surface: logically divided into circular tracks
      • then subdivided into hundreds of sectors
    • set of tracks at one arm position: makes a cylinder
      • thousands of concentric cylinders in a disk drive!
    • 01_moving_head_disk
    • 02_real_head_disk
      • 👨‍🎓 lovely!
  • Mass storage structure overview

    • magnetic disks: provide bulk of secondary storage of modern computers
      • drives: rotate at 60-250 times / second
      • transfer rate: rate of data flow between drive and computer
      • positioning time (random-access time): time to
        1. move disk arm to desired cylinder (seek time)
        2. time for desired sector to rotate under the disk head (rotational latency)
      • head crash: from disk head making (physical) contact w/ disk surface
        • 👨‍🏫 bad
    • disks: can be removable
    • drive: attached to computer via IO bus
      • busses: vary
        • including EIDE, ATA, SATA, USB, Fibre Channel, SCSI, SAS, Firewire
        • host controller in computer: uses bus to talk to disk controller
          • disk controller: built into drive / storage array
  • Hard disk drive

    • platters: range from .85" to 14" (historically)
      • commonly: 3.5", 2.5", and 1.8"
    • range from: 30GB to 3TB per drive
    • performance
      • transfer rate: (theoretical) 6 Gb / sec
      • effective transfer rate: (real) 1 Gb / sec
      • seek time: from 3 ms to 12 ms
        • average seek time: measured based on 1/3 of tracks
        • RPM: typically 5400, 7200, 10000, and 15000 rpm
        • latency: based on spindle speed
          • 1 / (RPM / 60) = 60 / RPM
      • average latency: 1/2 of a full rotation
      • e.g. 7200 RPM = 120 RPS
        • average latency: 1/2 × 1/120 s = 4.17 milliseconds
    spindle (RPM)    average latency (ms)
    4200             7.14
    5400             5.56
    7200             4.17
    10000            3
    15000            2
  • HDD performance

    • access latency (average access time): average seek time + average latency
      • for fast disk: 3 ms + 2 ms = 5 ms
      • for slow disk: 9 ms + 5.56 ms = 14.56 ms
    • average IO time: average access time + (amount to transfer / transfer rate) + controller overhead
      • e.g. 4KB block on a 7200 RPM disk, w/ 5 ms average seek time & 1 Gb/s transfer rate w/ 0.1 ms controller overhead:
        • 5 ms + 4.17 ms + 4 KB / (1 Gb / s) + 0.1 ms =
          • (notice: one is bytes, the other bits)
          • 4 KB / (1 Gb / s) = 2^{-15} s = 0.0305 ms
        • 9.27 ms + 0.0305 ms = 9.3 ms
        • taking on average 9.3 ms to transfer 4 KB, w/ effective bandwidth of 4 KB / 9.3 ms ≈ 3.5 Mb / sec only
        • w/ transfer rate at 1 Gb/s given the overhead
      • huge gap in HDD between random & sequential workloads
        • a disk w/ 300 GB capacity, avg. seek time 4 ms, 15,000 RPM = 250 RPS (avg. latency = 2 ms), transfer rate 125 MB/s, 4 KB read at a random location
          • average IO time = 4 ms + (4 / 125,000) s + 2 ms = 6.03 ms
          • effective bandwidth: 4 KB / 6.03 ms ≈ 0.66 MB/s
        • w/ sequential access of a 100 MB file: one seek & rotation needed (ideally)
          • can yield: effective bandwidth / transfer rate close to 125 MB/s
  • Solid state disk

    • SSD: a nonvolatile memory (NVM) used like a hard drive
      • many tech variations, e.g. from DRAM w/ battery to maintain its state across power failures
        • w/ flash-memory technology
          • single-level cell (SLC)
          • multi-level cell (MLC)
      • SSDs: can be more reliable than HDD as no moving / mechanical parts
      • much faster: as there is no seek time / rotational latency
      • consumes less power: power efficiency
      • yet, more expensive per MB, w/ less capacity & shorter life span
    • as it's much faster than magnetic disks: standard bus interfaces might be too slow, becoming a bottleneck
      • some: connect directly to system bus (e.g. PCI)
      • some: use them as a new cache tier
        • moving data between magnetic disk, SSDs, and memory: for optimizing performance
  • Magnetic tape

    • magnetic tape: an early secondary storage medium
      • relatively permanent & can hold large quantities of data
      • access time: slow
        • moving to correct spot might take minutes
      • random access: ≈ 1000 times slower than magnetic disk
        • not useful for secondary storage in modern computer systems
    • mainly used for backup, storage of infrequently used data
      • or: medium of transferring information from one system to another
    • tape capacities: vary greatly
      • depending on particular kind of tape drive
        • w/ current capacities exceeding several terabytes
      • typically between 200 GB and 1.5 TB
Disk Storage & Scheduling
  • Disk structure

    • drives: addressed as large 1d arrays of logical blocks
      • logical block: smallest unit of transfer
        • disk: represented as a no. of disk blocks
          • each block: w/ unique block number: disk address
        • size of a logical block: usually 512 bytes
        • low-level formatting: creates logical blocks on physical media
    • 1d array of logical blocks: mapped into sectors of the disk (sequentially)
      • sector 0: first sector of first track
        • on the outermost cylinder
      • mapping proceeds in order through track, then rest of tracks in the cylinder
        • then rest of cylinders: from outermost to innermost
      • logical to physical address translation (cylinder number, track number in cylinder, sector no. in track): should be easy, except:
        • defective sectors: mapping: hides this by substituting spare sectors from elsewhere on disk
        • no. of sectors / track: may not be constant on some devices
          • keeping bit density constant (constant linear velocity) puts more sectors on outer tracks; constant-angular-velocity drives keep sectors / track constant instead
  • Disk scheduling

    • OS: responsible for using hardware efficiently
      • for disk drives: means having fast access time & large disk bandwidth
    • seek time: time for disk head arm to move the head to corresponding cylinder containing desired sector
      • measured by: seek distance in terms of no. of cylinders / tracks
    • rotational latency: additional time for the disk to rotate desired sector to disk head
    • disk bandwidth: total no. of bytes transferred / total time between first service req. and completion of transfer
      • improve access time & bandwidth by: managing order of disk IO requests serviced
    • many sources of disk IO requests
      • from OS, system processes, and user processes
      • IO requests: include
        • IO modes
        • disk address
        • memory address
        • no. of sectors to transfer
    • OS: maintains a queue of requests / disk or device
      • in multiprogramming system w/ many processes: disk queue often has several pending requests
    • idle disk: can immediately work on IO request
      • for busy disk: requests must be queued
      • optimization: only possible when a queue of pending requests exists
    • disk drive controllers: w/ small buffers & manage a queue of IO requests (of varying "depth")
      • which request to be completed & selected next?
      • disk scheduling
    • now: illustrate scheduling algorithms w/ a request queue, having 0-199 as cylinder numbers
      • e.g. 98, 183, 37, 122, 14, 124, 65, 67
      • current head position: 53
  • FCFS / FIFO

    • intrinsically fair, yet doesn't provide the fastest service
    • resulting in: total head movement of 640 cylinders
    • 03_fcfs_diagram
  • SSTF: Shortest seek time first

    • selects req. w/ least seek time from current head position
      • i.e. pending request closest to current head position
    • SSTF: a form of SJF scheduling (greedy)
      • may cause: starvation of some requests
    • results in: 236 cylinders
    • 04_sstf_diagram
  • SCAN scheduling

    • disk: starting at one end of the disk, moving towards the other end
      • on the way: servicing requests as it reaches each cylinder
    • at the other end: direction of head movement: reversed, and continue the service
      • aka: elevator algorithm
    • note: if requests are uniformly distributed across cylinders
      • heaviest density of requests: at the other end of the disk, having waited the longest
      • must know: direction of head movement
    • results in: 236 cylinders
    • 05_scan_diagram
  • LOOK scheduling

    • similar to SCAN, but disk arm only goes as far as the final request in each direction
      • NOT the end of the disk
    • 👨‍🏫 guaranteed to be no worse than SCAN
    • results in: 208 cylinders
    • 06_look_diagram
  • C-SCAN

    • Circular-SCAN: provides a more uniform waiting time than SCAN
    • head: moves from one end of the disk to the other, servicing requests as it goes
      • upon reaching the other end: immediately returns to the beginning of the disk
        • not serving any requests on the way back
      • treats cylinders as a circular list, wrapping around
        • from the last cylinder to the first one
    • results in: 382 cylinders
    • 07_cscan_diagram
  • C-LOOK

    • LOOK version of C-SCAN
    • disk arm: goes far as the last request in each direction
      • then reversing the direction
    • better than C-SCAN, for the very same reason
    • results in: 322 cylinders (C-LOOK; cf. 208 for LOOK)
    • 08_clook_diagram
  • Selecting a disk-scheduling algorithm

    • common solution: SSTF
      • increases performance over FCFS
      • SCAN, C-SCAN: performs better for systems w/ heavy load on disk
        • as: they are less likely to cause starvation problems
    • scheduling performance: depends on the no. & type of requests
      • w/ 1 request: all scheduling algorithm must behave the same
    • requests for disk services: greatly influenced by the file-allocation method (to be discussed)
      • contiguously allocated file: generate several requests, close together on the disk
        • resulting in limited head movement
      • linked or indexed file: may include blocks widely scattered on the disk
        • resulting in greater head movement
      • location of directories & index blocks: also important
        • as: they are accessed frequently
        • directory entry and file data on different cylinders
          • => cause excessive head movement
        • caching directory & index block in memory: helps
    • disk-scheduling algorithm: written as a separate module of the OS
      • allowing to be replaced w/ different algorithm if necessary
      • either SSTF / LOOK: a reasonable choice as the default algorithm
RAID
  • RAID: improving reliability via redundancy

    • by: Randy Katz & David Patterson, late 1980s
      • 👨‍🏫 I know one of them very well
    • RAID: Redundant Arrays of Independent Disks
      • in the past: composed of small, cheap disks
        • viewed as: a cost-effective alternative to large & expensive disks
          • once called: redundant arrays of inexpensive disks
      • now: used for higher reliability via redundancy & higher data-transfer rate (access in parallel)
    • increases: mean time to failure