COMP 3511: Lecture 23

Date: 2024-11-21 15:02:33

Reviewed:

Topic / Chapter:

summary

❓Questions

Notes

RAID (cont.)
  • RAID (cont.)

    • increases mean time to failure
      • chance that some disk out of N disks fails: much higher than the chance that one specific disk fails
      • if: mean time to failure of a single disk is 100,000 hours
        • mean time to failure of some disk in an array of 100 disks: 100,000 / 100 = 1,000 hours ≈ 41.67 days!
    • data loss rate: unacceptable if only 1 copy of data is stored
      • solution: introducing redundancy
      • simplest & most expensive approach: mirroring
        • duplicate every disk
      • every write: carried out on two physical disks
        • data: lost only if the second disk fails before first failed disk gets replaced
    • mean time to repair: average time needed to replace a failed disk and restore its data
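A quick check of the arithmetic above, as a minimal Python sketch. The function names are made up for illustration. The mirrored-pair formula is the usual textbook approximation (independent failures), and the 10-hour mean time to repair is an assumed value, not a number from the lecture.

```python
HOURS_PER_DAY = 24
HOURS_PER_YEAR = 24 * 365

def array_mttf_hours(disk_mttf_hours, num_disks):
    """Mean time until *some* disk in the array fails (independent failures)."""
    return disk_mttf_hours / num_disks

def mirrored_pair_mttdl_hours(disk_mttf_hours, mttr_hours):
    """Approximate mean time to data loss for one mirrored pair:
    data is lost only if the second disk fails during the repair window."""
    return disk_mttf_hours ** 2 / (2 * mttr_hours)

first_failure = array_mttf_hours(100_000, 100)
print(first_failure, "hours =", round(first_failure / HOURS_PER_DAY, 2), "days")

# 10 hours to repair is an assumption for illustration only
loss = mirrored_pair_mttdl_hours(100_000, 10)
print(round(loss / HOURS_PER_YEAR), "years until expected data loss (mirrored pair)")
```

The point of the second number: even with a short repair window, mirroring pushes expected data loss far beyond the lifetime of a single disk.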
  • RAID performance improvement

    • parallelism (disk system) via data striping: two main goals
      • increase: throughput of multiple small accesses by load balancing
      • reduce: response time of large accesses
    • bit-level striping
      • array of 8 disks: can be treated as a single disk
        • w/ sectors that are 8 times the normal sector size
      • bit level striping
    • block-level striping
      • blocks of a file: striped across multiple disks
      • with n disks: block i of a file goes to disk (i mod n) + 1
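A small sketch of block-level striping, assuming the common textbook placement where, with n disks, block i of a file goes to disk (i mod n) + 1. The function names are hypothetical.

```python
def disk_for_block(block_index, num_disks):
    """Disk (numbered from 1) holding a given file block under block-level striping."""
    return (block_index % num_disks) + 1

def layout(num_blocks, num_disks):
    """Group block indices by the disk they land on."""
    disks = {d: [] for d in range(1, num_disks + 1)}
    for i in range(num_blocks):
        disks[disk_for_block(i, num_disks)].append(i)
    return disks

# 8 blocks over 4 disks: consecutive blocks sit on different disks,
# so a large sequential read can use all disks in parallel
print(layout(8, 4))
```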
  • RAID structure

    • mirroring: providing high reliability, but expensive
    • striping: providing high data-transfer rates, not providing reliability
    • 01_RAID_structure
  • RAID (0+1) and (1+0)

    • both striped mirrors (RAID 1+0) and mirrored stripes (RAID 0+1):
      • provides high performance (RAID 0)
      • and high reliability (RAID 1)
        • 👨‍🏫 can also rely on error correcting code!
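    • (RAID 0+1): a set of drives is striped, and the stripe is then mirrored to another, equivalent stripe
    • (RAID 1+0): drives are mirrored in pairs, and the resulting mirrored pairs are then striped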
    • 02_RAID_01_10
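To compare the two layouts, here is a rough sketch that counts which two-disk failures each one survives. It assumes 4 disks, with disks 0 and 1 forming one group and disks 2 and 3 the other, and it uses the simple model where a stripe becomes unusable as soon as any of its disks fails. All names are illustrative.

```python
from itertools import combinations

DISKS = [0, 1, 2, 3]
GROUPS = [{0, 1}, {2, 3}]  # assumed grouping of the 4 disks

def survives_raid01(failed):
    """RAID 0+1 (mirrored stripes): each group is a stripe; need one fully intact stripe."""
    return any(not (stripe & failed) for stripe in GROUPS)

def survives_raid10(failed):
    """RAID 1+0 (striped mirrors): each group is a mirrored pair; every pair needs a live disk."""
    return all(pair - failed for pair in GROUPS)

two_disk_failures = [set(c) for c in combinations(DISKS, 2)]
ok_01 = sum(survives_raid01(f) for f in two_disk_failures)
ok_10 = sum(survives_raid10(f) for f in two_disk_failures)
print("RAID 0+1 survives", ok_01, "of", len(two_disk_failures), "two-disk failures")
print("RAID 1+0 survives", ok_10, "of", len(two_disk_failures), "two-disk failures")
```

Under these assumptions, RAID 1+0 tolerates more of the two-disk failure combinations, since it only loses data when both disks of the same pair fail.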
  • Other features

    • regardless of what RAID implemented:
      • useful features can be added at each level
    • snapshot: view of file before the last update took place
      • for recovery
    • replication
    • hot spare: an unused disk kept on standby, used automatically to replace a failed disk
File-System Interface
  • File concept

    • contiguous logical address space
    • types
      • data
        • numeric
        • character
        • binary
      • program
    • content defined by the file creator
      • many types: e.g. text, source, and executable files
  • File attributes

    • name: information kept in human readable form
    • identifier
    • type: needed by systems that support different types
      • 👨‍🏫 in UNIX/Linux: a positive integer
    • location: pointer to file location on device
      • 👨‍🏫 can be very complex
      • depends on: how disk space is allocated to the file!
    • size: current file size
    • protection
    • time, date, and user identification: data for protection, security, and usage monitoring
    • information about files: kept in directory structure
      • maintained on the disk
        • parts currently in use: can be cached in main memory
      • many variations: extended file attributes like file checksum
  • File operations

    • file: an abstract data type (ADT)
    • create:
    • write:
    • read:
    • reposition within file: seek
      • 👨‍🏫 not an issue unless you use magnetic tape
    • delete:
    • truncate:
    • open:
    • close:
    • such operations: involve changes of various OS kernel data structures
  • Open files

    • several data structures: needed to manage open files
      • open-file tables: track open files; a system-wide open-file table
        • and a per-process open-file table
      • file pointer: pointer to last read / write location
        • per process that has the file open
      • file-open count: counts the no. of processes that have the file open
        • allowing: removal of data from open-file table
          • when the last process closes it (when file-open count is 0)
      • disk locations of a file: cache of data access information
      • access rights: per-process access mode information
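The bookkeeping above can be sketched as two small Python classes. This is a toy model, not how any real kernel lays out its tables; the class and field names are assumptions made for illustration.

```python
class SystemOpenFileTable:
    """System-wide table: one entry per open file, with a file-open count."""
    def __init__(self):
        self.entries = {}  # file name -> {"open_count": int, "disk_location": ...}

    def open(self, name, disk_location):
        entry = self.entries.setdefault(
            name, {"open_count": 0, "disk_location": disk_location})
        entry["open_count"] += 1

    def close(self, name):
        entry = self.entries[name]
        entry["open_count"] -= 1
        if entry["open_count"] == 0:   # last process closed it: drop the entry
            del self.entries[name]


class Process:
    """Per-process table: file pointer and access mode for each file it has open."""
    def __init__(self, system_table):
        self.system_table = system_table
        self.open_files = {}

    def open(self, name, mode, disk_location=0):
        self.system_table.open(name, disk_location)
        self.open_files[name] = {"file_pointer": 0, "mode": mode}

    def close(self, name):
        del self.open_files[name]
        self.system_table.close(name)


table = SystemOpenFileTable()
p1, p2 = Process(table), Process(table)
p1.open("notes.txt", "r")
p2.open("notes.txt", "rw")
p1.close("notes.txt")
print(table.entries["notes.txt"]["open_count"])  # p2 still has the file open
p2.close("notes.txt")
print("notes.txt" in table.entries)              # entry removed after the last close
```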
  • File types

    • table columns: file type, usual extension, function
    • file types listed:
      • executable
      • object
      • source code
      • batch
      • text
      • word processor
      • library
      • print / view
      • archive
      • multimedia
  • Access methods

    • sequential access: simplest access method
      read next
      write next
      reset 
      // no read after last write (rewrite)
      
    • direct access: file is made up of fixed-length logical records
      read n
      write n
      position to n
          read next
          write next
      rewrite n
      
      • n: relative block number
    • relative block numbers: allow the OS to decide where the file should be placed
      • more in disk block allocation problem
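Sequential access can be simulated on top of direct access by keeping a current position `cp`. Below is a minimal sketch using an in-memory byte buffer as the "file". The 4-byte record size and the class name are assumptions for illustration.

```python
import io

RECORD_SIZE = 4  # assumed fixed record length in bytes

class DirectAccessFile:
    def __init__(self, data):
        self.f = io.BytesIO(data)
        self.cp = 0  # current position, as a relative block number

    # direct access: n is a relative block number
    def read(self, n):
        self.f.seek(n * RECORD_SIZE)
        return self.f.read(RECORD_SIZE)

    def write(self, n, record):
        assert len(record) == RECORD_SIZE
        self.f.seek(n * RECORD_SIZE)
        self.f.write(record)

    # sequential access, built from the direct-access operations
    def reset(self):
        self.cp = 0

    def read_next(self):
        record = self.read(self.cp)
        self.cp += 1
        return record

    def write_next(self, record):
        self.write(self.cp, record)
        self.cp += 1


f = DirectAccessFile(b"AAAABBBBCCCC")
print(f.read(2))                      # jump straight to record 2
f.reset()
print(f.read_next(), f.read_next())   # then walk from the start
```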
  • Other access methods

    • other file access methods: can be built on top of direct access
    • generally: involve creation of an index for a file
    • keep index in memory for fast location of data
    • e.g. IBM: indexed sequential-access method (ISAM)
    • 03_access_methods
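A rough sketch of an index built on direct access: the index lives in memory and maps a key to a relative record number, so a lookup needs only one direct access to the file. The record layout (4-byte key followed by 4-byte value) and the sample data are made up for illustration.

```python
KEY_SIZE = 4
RECORD_SIZE = 8   # assumed layout: 4-byte key + 4-byte value

def build_index(data):
    """Map each record's key to its relative record number."""
    index = {}
    for n in range(len(data) // RECORD_SIZE):
        start = n * RECORD_SIZE
        index[data[start:start + KEY_SIZE]] = n
    return index

def lookup(data, index, key):
    """Consult the in-memory index, then fetch that one record directly."""
    n = index[key]
    record = data[n * RECORD_SIZE:(n + 1) * RECORD_SIZE]
    return record[KEY_SIZE:]

data = b"ann 0001bob 0002cat 0003"
index = build_index(data)
print(index)
print(lookup(data, index, b"bob "))
```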
Structures
  • Directory structure

    • collection of nodes containing information about all files
    • 04_directory_structure
    • both directory structure & files: reside on disk
  • Disk structure

    • disk: can be subdivided into partitions
    • disks / partitions: can be RAID protected against failure
    • partitions: aka minidisks / slices
    • volume: entity on a disk containing a file system
      • each volume containing a file system: also keeps track of that file system's info
        • in device directory or volume table of contents
    • other than general purpose file systems, many special purpose file systems exist
      • frequently within the same OS / computing system
      • 👨‍🏫 e.g. backup...
  • Typical file-system organization

    • 05_partition
  • Operations performed on directory

    • search for a file
    • create a file
    • delete a file
    • list a directory
    • rename a file
    • traverse the file system
  • Organize the directory (logically) to obtain..

    • efficiency: locating a file quickly
    • naming: convenient to users
      • two users: might have same name for different files
      • same file: can have several different names
    • grouping: logical grouping of files by properties
      • e.g. all Java programs, games, COMP 3511 ...
  • Single-level directory

    • single directory for all users
    • 06_single_directory
  • Two-level directory

    • e.g. separate directory for each user
    • 07_two_level_directory
    • a path name is required
      • to identify files accurately (e.g. /user1/cat)
    • same file name may exist under different users
    • more efficient searching than single-level directory
      • yet no grouping capability
  • Tree-structured directory

    • 08_tree_directory
    • efficient searching & grouping capability
    • current directory (working directory)
      cd /spell/mail/prog
      type list
      
    • path name: either absolute or relative
      • 👨‍🏫 absolute: always unique
    • creating a new file: done in the current directory
    • delete a file in the current directory
      • rm <file-name>
    • creating a new subdirectory: done in the current directory
      • mkdir <dir-name>
    • 👨‍🏫 almost perfect... but it doesn't support shared files!
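Absolute vs. relative path names can be sketched with a toy directory tree. Directories are dicts and files are plain strings here. The tree contents and the `resolve` helper are hypothetical, and `.` / `..` are not handled.

```python
# Toy tree: the names echo the examples in these notes, the contents are made up
ROOT = {
    "spell": {"mail": {"prog": {"list": "file contents"}}},
    "user1": {"cat": "another file"},
}

def resolve(path, cwd="/"):
    """Return the node named by an absolute or relative path."""
    if not path.startswith("/"):           # relative: interpret from the current directory
        path = cwd.rstrip("/") + "/" + path
    node = ROOT
    for part in path.split("/"):
        if part:                           # skip empty components from leading slashes
            node = node[part]
    return node

print(resolve("/spell/mail/prog/list"))            # absolute path
print(resolve("list", cwd="/spell/mail/prog"))     # relative to the current directory
```

Both calls reach the same file: the relative name is simply completed with the current directory before the walk from the root.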
  • Acyclic-graph directories

    • 👨‍🏫 cycle: must be avoided as we might fall into infinite loop when we traverse!
    • now: we have shared subdirectories & files
      • more flexible and complex
    • 09_acyclic_graph_directory
    • new directory entry type
      • link: another name (pointer) to an existing file
      • resolve the link: follow pointer to locate the file
    • aliasing: 2 different path names can refer to the same file
      • ensure: shared structures are not traversed more than once
      • deletion of one: might lead to dangling pointers
        • pointing to empty files, or even wrong files
    • also: difficulty ensuring there is no cycle in graph
      • either ban directory sharing (inconvenient), or run cycle detection on every new link (time consuming)
      • 👨‍🏫 no free lunch!
    • 10_general_graph_directory
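Running cycle detection on every new link could look roughly like this: before adding an entry parent -> child, check whether parent is already reachable from child. The graph representation (a dict of entry lists) and the function names are assumptions for this sketch.

```python
def reachable(graph, start, target):
    """Depth-first search: can we get from start to target by following entries?"""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return False

def add_link(graph, parent, child):
    """Add the entry parent -> child only if it does not close a cycle."""
    if reachable(graph, child, parent):
        return False
    graph.setdefault(parent, []).append(child)
    return True

dirs = {"/": ["a", "b"], "a": ["c"], "b": [], "c": []}
print(add_link(dirs, "b", "c"))   # share c between a and b: still acyclic
print(add_link(dirs, "c", "a"))   # would create a -> c -> a, so it is refused
```

The search may visit the whole graph in the worst case, which is the time cost the notes mention.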
  • File system mounting

  • File sharing

  • Protection

    • file owner / creator: should be able to control
      • what can be done
      • by whom
    • types of access
      • read
      • write
      • execute
      • as well as...
        • append
        • delete
        • list
  • Access lists and groups

    • mode of access: read, write, execute
      • a 3-bit integer, or 0-7 in decimal
      • when bit=1: has permission
    • three classes of users: on UNIX / Linux
      1. owner
      2. group
      3. general / public
      • thus, three 3-bit integers to specify permission
    • sample
      • starts w/ d if directory, - otherwise
      • -rw-rw-r--: file w/ 664 permission
      • drwxrwxrwx: directory w/ 777 permission
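The mapping between the octal digits and the `rwx` string can be checked with a short sketch. The helper names are made up; each octal digit is treated as 3 bits (r = 4, w = 2, x = 1) for owner, group, and public in that order.

```python
def mode_string(digits, is_dir=False):
    """Render e.g. '664' as '-rw-rw-r--' (owner, group, public)."""
    out = "d" if is_dir else "-"
    for digit in digits:
        bits = int(digit, 8)
        out += "r" if bits & 4 else "-"
        out += "w" if bits & 2 else "-"
        out += "x" if bits & 1 else "-"
    return out

def to_octal(mode_str):
    """Inverse: '-rw-rw-r--' -> '664' (the leading type character is ignored)."""
    perms = mode_str[1:]
    digits = ""
    for i in range(0, 9, 3):
        r, w, x = perms[i:i + 3]
        digits += str((r == "r") * 4 + (w == "w") * 2 + (x == "x") * 1)
    return digits

print(mode_string("664"))
print(mode_string("777", is_dir=True))
print(to_octal("-rw-rw-r--"))
```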
File-System Implementation
  • File system structure

    • disk: provides most of secondary storage
      • on which: file systems are maintained
    • two characteristics: make disk convenient for this usage
      1. can be rewritten in place
        • can read a block, modify it, and write it back to the same place
      2. disk: can directly access any block of info. it contains
        • simple to access any file either sequentially / randomly
        • switching from one file to another: requires only moving read-write heads & waiting for disk to rotate
    • to improve IO efficiency: IO transfers between memory & disk are done in units of blocks
      • each block: 1 or more sectors
      • sector size: varies from 32 bytes to 4KB
        • usually 512 bytes
    • file structure
      • logical storage unit
      • collection of related information
    • file system: resides on secondary storage (hard drive / disks)
      • provides: efficient & convenient access to disk
        • by allowing data to be: stored, located, and retrieved easily
      • provides: user interface: file and file attributes, operations on files, directory for organizing files
      • provides: data structure and algorithms for:
        • mapping logical file system onto physical secondary storage devices
    • file systems: organized into different layers
  • Layered file system

    • 👨‍🏫 multi layer file system: one (not the only) way
    application programs 
    => logical file system
    => file-organization module
    => basic file system
    => IO control
    => devices
    
  • File system layers

    • IO control & device drivers: manage IO devices at the IO control layer
    • basic file system: issues generic commands to the appropriate device driver to read & write physical blocks of disk
      • caches: hold frequently used file-system metadata to improve performance
    • file organization module: knows files & their logical blocks
      • as well as physical blocks
      • translates logical block addr. to physical block addr.
        • pass this to basic file system for transfer
      • manages free disk space
        • disk block allocation
    • logical file system: manages metadata information
      • metadata: includes all of the file-system structure
        • except the actual data (i.e. contents of file)
      • managing: directory structure to provide: info needed by the file-organization module
      • translates: file name into file no. / file handle / location
        • by maintaining file control blocks
      • file control block (FCB) (aka inode in UNIX)
        • contains all info about file, including:
          • ownership, permissions, location of the file contents (on the disk)
        • also responsible for file protection
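How the layers hand work to each other can be sketched as a toy read path: name -> FCB -> logical block -> physical block -> data. Everything here (the dict-based "disk", the FCB fields, the tiny 8-byte block size, the function names) is an assumption for illustration, not a real file-system layout.

```python
BLOCK_SIZE = 8  # unrealistically small, to keep the example readable

# stand-in for the device: physical block no. -> block contents
DISK = {7: b"hello, w", 3: b"orld!\0\0\0"}

# logical file system: directory maps a file name to its FCB
DIRECTORY = {
    "greeting.txt": {
        "owner": "user1",
        "permissions": "rw-r--r--",
        "size": 13,            # bytes of real data
        "blocks": [7, 3],      # location of file contents on the disk
    },
}

def basic_read_block(physical_block):
    """Basic file system: issue a generic 'read physical block' command."""
    return DISK[physical_block]

def logical_to_physical(fcb, logical_block):
    """File-organization module: translate a logical block no. to a physical one."""
    return fcb["blocks"][logical_block]

def read_file(name):
    """Logical file system: find the FCB by name, then read block by block."""
    fcb = DIRECTORY[name]
    data = b""
    for logical_block in range(len(fcb["blocks"])):
        data += basic_read_block(logical_to_physical(fcb, logical_block))
    return data[:fcb["size"]]  # drop padding in the last block

print(read_file("greeting.txt"))
```

Note that the file's blocks are not contiguous or even in order on the "disk"; only the FCB knows where they are, which is why the upper layers never deal with physical block numbers directly.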