COMP 3511: Lecture 23

Date: 2024-11-21 15:02:33

Reviewed:

Topic / Chapter:

summary

❓Questions

Notes

RAID (cont.)

RAID (cont.)
- goal: increase mean time to failure (MTTF) of the storage system as a whole
  - chance that some disk out of $N$ disks fails: much higher than the chance that one specific disk fails
  - if: mean time to failure of a single disk is 100,000 hours
    - mean time to failure of some disk in an array of 100 disks: 100,000 / 100 = 1,000 hours ≈ 41.67 days!
- data loss rate: unacceptable if only 1 copy of data is stored
  - solution: introducing redundancy
  - simplest & most expensive approach: mirroring
    - duplicate every disk
  - every write: carried out on two physical disks
    - data: lost only if the second disk fails before the first failed disk is replaced
- mean time to repair: average time needed to replace a failed disk & restore its data
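
The arithmetic above can be checked with a short sketch. It assumes independent disk failures. The mirrored-pair formula $\text{MTTF}^2 / (2 \cdot \text{MTTR})$ and the 10-hour repair time are common textbook assumptions, not figures from these notes.

```python
HOURS_PER_DAY = 24

def array_mttf(disk_mttf_hours: float, n_disks: int) -> float:
    """Expected time until *some* disk in an array of n disks fails.

    Assumes failures are independent and equally likely on every disk.
    """
    return disk_mttf_hours / n_disks

def mirrored_pair_mttdl(disk_mttf_hours: float, mttr_hours: float) -> float:
    """Mean time to data loss for one mirrored pair.

    Assumption (not from the notes): data is lost only if the second disk
    fails within the repair window of the first: MTTF^2 / (2 * MTTR).
    """
    return disk_mttf_hours ** 2 / (2 * mttr_hours)

if __name__ == "__main__":
    mttf = array_mttf(100_000, 100)
    print(f"{mttf:.0f} hours = {mttf / HOURS_PER_DAY:.2f} days")
    # 10-hour repair time is a hypothetical value for illustration
    print(f"mirrored pair: {mirrored_pair_mttdl(100_000, 10):.3g} hours to data loss")
```

Under these assumptions a short repair time matters a lot: halving the mean time to repair doubles the mean time to data loss.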
RAID performance improvement
- parallelism (disk system) via data striping: two main goals
  - increase: throughput of multiple small accesses by load balancing
  - reduce: response time of large access
- bit-level striping
  - bits of each byte: split across multiple disks
  - array of 8 disks: can be treated as a single disk
    - w/ each sector 8 times the normal sector size
- block-level striping
  - blocks of a file: striped across multiple disks
  - with $n$ disks: block $i$ of a file goes to disk $(i \bmod n) + 1$
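
A minimal sketch of block-level striping, assuming the usual mapping of block $i$ to disk $(i \bmod n) + 1$ with blocks numbered from 0 and disks numbered from 1. The function names are hypothetical.

```python
def disk_for_block(i: int, n: int) -> int:
    """Disk (numbered 1..n) that holds logical block i (numbered from 0)."""
    return (i % n) + 1

def stripe_layout(num_blocks: int, n: int) -> dict:
    """Map each disk number to the list of logical blocks stored on it."""
    layout = {disk: [] for disk in range(1, n + 1)}
    for i in range(num_blocks):
        layout[disk_for_block(i, n)].append(i)
    return layout

if __name__ == "__main__":
    for disk, blocks in stripe_layout(10, 4).items():
        print(f"disk {disk}: blocks {blocks}")
```

Consecutive blocks land on different disks, so a large sequential read can be served by all $n$ disks in parallel.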
RAID structure
- mirroring: providing high reliability, but expensive
- striping: providing high data-transfer rates, not providing reliability
RAID (0+1) and (1+0)
- both striped mirrors (RAID 1+0) and mirrored stripes (RAID 0+1):
  - provides high performance (RAID 0)
  - and high reliability (RAID 1)
    - 👨‍🏫 can also rely on error correcting code!
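- (RAID 0+1): a set of drives is striped, and then the stripe is mirrored to another equivalent stripe
- (RAID 1+0): drives are mirrored in pairs, and then the resulting mirrored pairs are striped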
Other features
- regardless of what RAID implemented:
  - useful features can be added at each level
- snapshot: view of the file system before the last update took place
  - for recovery
- replication
- hot spare: disk not used for data, kept as a replacement in case a disk fails

File-System Interface

File concept
- contiguous logical address space
- types
  - data
    - numeric
    - character
    - binary
  - program
- content defined by the file creator
  - many types: e.g. text, source, and executable files
File attributes
- name: information kept in human readable form
- identifier
- type: needed by systems that support different types
  - 👨‍🏫 in UNIX/Linux: a positive integer
- location: pointer to file location on device
  - 👨‍🏫 can be very complex
  - depends on: how disk space is allocated to the file!
- size: current file size
- protection
- time, date, and user identification: data for protection, security, and usage monitoring
- information about files: kept in directory structure
  - maintained on the disk
    - the part currently in use: can be cached in main memory
  - many variations: extended file attributes like file checksum
File operations
- file: an abstract data type (ADT)
- create:
- write:
- read:
- reposition within file: seek
  - 👨‍🏫 not an issue unless you use magnetic tape
- delete:
- truncate:
- open:
- close:
- such operations: involve changes of various OS kernel data structures
Open files
- several data structures: needed to manage open files
  - open-file tables: track open files; a system-wide open-file table
    - and a per-process open-file table
  - file pointer: pointer to last read / write location
    - per process that has the file open
  - file-open count: no. of processes that currently have the file open
    - allowing: removal of data from open-file table
      - when the last process closes it (when file-open count is 0)
  - disk locations of a file: cache of data access information
  - access rights: per-process access mode information
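
The bookkeeping above can be sketched as two tables. This is a toy model under stated assumptions, not how any particular kernel lays out its structures: the class and field names are hypothetical, and file descriptors are simply the smallest unused integer per process.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SystemEntry:
    """System-wide entry: process-independent info plus the file-open count."""
    open_count: int = 0
    disk_location: Optional[int] = None  # hypothetical cached location info

@dataclass
class ProcessEntry:
    """Per-process entry: file pointer and access mode for one open file."""
    name: str
    mode: str
    file_pointer: int = 0

class OpenFileTables:
    def __init__(self) -> None:
        self.system = {}       # file name -> SystemEntry
        self.per_process = {}  # pid -> {fd -> ProcessEntry}

    def open(self, pid: int, name: str, mode: str = "r") -> int:
        entry = self.system.setdefault(name, SystemEntry())
        entry.open_count += 1
        table = self.per_process.setdefault(pid, {})
        fd = 0
        while fd in table:  # pick the smallest unused descriptor
            fd += 1
        table[fd] = ProcessEntry(name, mode)
        return fd

    def close(self, pid: int, fd: int) -> None:
        proc_entry = self.per_process[pid].pop(fd)
        sys_entry = self.system[proc_entry.name]
        sys_entry.open_count -= 1
        if sys_entry.open_count == 0:  # last process closed the file
            del self.system[proc_entry.name]
```

Two processes opening the same file share one system-wide entry but keep separate file pointers, which is why the pointer lives in the per-process table.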

File types

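file type	usual extension	function
executable	exe, com, bin, or none	ready-to-run machine-language program
object	obj, o	compiled, machine language, not linked
source code	c, cc, java, perl, asm	source code in various languages
batch	bat, sh	commands to the command interpreter
text	txt, doc	textual data, documents
word processor	xml, rtf, docx	various word-processor formats
library	lib, a, so, dll	libraries of routines for programmers
print / view	gif, pdf, jpg	ASCII or binary file in a format for printing or viewing
archive	rar, zip, tar	related files grouped into one file, sometimes compressed
multimedia	mpeg, mov, mp3, mp4, avi	binary file containing audio or A/V information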

Access methods
- sequential access: simplest access method
```
read next
write next
reset 
// no read after last write (rewrite)
```
- direct access: file is made up of fixed-length logical records
```
read n
write n
position to n
    read next
    write next
rewrite n
```
  - n: relative block number
- relative block numbers: allow OS to decide where file should be placed
  - more in disk block allocation problem
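
Sequential access can be simulated on top of direct access by keeping a current position `cp`. A minimal sketch, assuming an in-memory list stands in for the file's blocks; the class name is hypothetical.

```python
class DirectAccessFile:
    """A file viewed as a numbered sequence of fixed-length records (blocks)."""

    def __init__(self, num_blocks: int) -> None:
        self.blocks = [None] * num_blocks
        self.cp = 0  # current position, as a relative block number

    # direct access: n is a relative block number
    def read(self, n: int):
        return self.blocks[n]

    def write(self, n: int, data) -> None:
        self.blocks[n] = data

    # sequential access, built from the direct-access operations
    def reset(self) -> None:
        self.cp = 0

    def read_next(self):
        data = self.read(self.cp)
        self.cp += 1
        return data

    def write_next(self, data) -> None:
        self.write(self.cp, data)
        self.cp += 1
```

The reverse direction (direct access on a purely sequential device) would require scanning from the start, which is why the simulation is normally done this way round.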
Other access methods
- other file access methods: can be built on top of direct access
- generally: involve creation of an index for a file
- keep index in memory for fast location of data
- e.g. IBM indexed sequential-access method (ISAM)
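
A rough sketch of an index built over a direct-access file. It assumes records are sorted by key and that the in-memory index holds the smallest key of each block; this illustrates the general idea only and is not a description of any specific product's format.

```python
from bisect import bisect_right

def build_index(blocks):
    """In-memory index: the first (smallest) key stored in each block."""
    return [block[0][0] for block in blocks]

def lookup(blocks, index, key):
    """Find the value for key with one index search plus one block read."""
    pos = bisect_right(index, key) - 1  # last block whose first key <= key
    if pos < 0:
        return None
    for k, value in blocks[pos]:        # scan only the selected block
        if k == key:
            return value
    return None

# each block is a list of (key, value) records, sorted by key
blocks = [[(1, "a"), (3, "b")], [(5, "c"), (8, "d")], [(9, "e")]]
index = build_index(blocks)
```

Only the small index has to stay in memory; each lookup then touches a single block of the file.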

Structures

Directory structure
- collection of nodes containing information about all files
- both directory structure & files: reside on disk
Disk structure
- disk: can be subdivided into partitions
- disks / partitions: can be RAID protected against failure
- partitions: aka minidisks / slices
- volume: entity on a disk containing a file system
  - each volume containing a file system: keeps track of the file system info
    - in device directory or volume table of contents
- other than general purpose file systems, many special purpose file systems exist
  - frequently within the same OS / computing system
  - 👨‍🏫 e.g. backup...
Typical file-system organization
Operations performed on directory
- search for a file
- create a file
- delete a file
- list a directory
- rename a file
- traverse the file system
Organize the directory (logically) to obtain..
- efficiency: locating a file quickly
- naming: convenient to users
  - two users: might have same name for different files
  - same file: can have several different names
- grouping: logical grouping of files by properties
  - e.g. all Java programs, games, COMP 3511 ...
Single-level directory
- single directory for all users
Two-level directory
- e.g. separate directory for each user
- a path name is required
  - to identify files accurately (e.g. /user1/cat)
- same file name may exist under different users
- more efficient searching than single-level directory
  - yet no grouping capability
Tree-structured directory
- efficient searching & grouping capability
- current directory (working directory)
```
cd /spell/mail/prog
type list
```
- path name: either absolute or relative
  - 👨‍🏫 absolute: always unique
- creating a new file: done in the current directory
- delete a file in the current directory
  - rm <file-name>
- creating a new subdirectory: done in the current directory
  - mkdir <dir-name>
- 👨‍🏫 almost perfect... but it doesn't support shared files!
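
A small sketch of how a relative path name is interpreted against the current directory, using Python's standard `posixpath` module. The helper name `resolve` is hypothetical.

```python
import posixpath

def resolve(cwd: str, path: str) -> str:
    """Absolute path named by `path` when the current directory is `cwd`.

    posixpath.join discards `cwd` when `path` is already absolute, and
    normpath collapses "." and ".." components.
    """
    return posixpath.normpath(posixpath.join(cwd, path))
```

For example, with current directory `/spell/mail`, the relative name `prog/list` and the absolute name `/spell/mail/prog/list` refer to the same file.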
Acyclic-graph directories
- 👨‍🏫 cycle: must be avoided as we might fall into infinite loop when we traverse!
- now: we have shared subdirectories & files
  - more flexible and complex
- new directory entry type
  - link: another name (pointer) to an existing file
  - resolve the link: follow pointer to locate the file
- aliasing: 2 different path names may refer to the same file
  - ensure: shared structures are not traversed more than once
  - deletion of one: might lead to dangling pointers
    - pointing to empty files, or even wrong files
- also: difficulty ensuring there is no cycle in graph
  - either ban directory sharing (inconvenient), or run cycle detection on every new link (time consuming)
  - 👨‍🏫 no free lunch!
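
The "cycle detection on every new link" option can be sketched as a reachability check: a link parent -> child closes a cycle exactly when parent is already reachable from child. The directory graph here is a hypothetical dict of sets, not a real file-system structure.

```python
def reachable(graph, start, target) -> bool:
    """Depth-first search: can `target` be reached from `start`?"""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return False

def add_link(graph, parent, child) -> None:
    """Add a directory link parent -> child unless it would close a cycle."""
    if reachable(graph, child, parent):
        raise ValueError(f"link {parent} -> {child} would create a cycle")
    graph.setdefault(parent, set()).add(child)
```

The search may visit the whole subtree under `child`, which is the time cost the notes refer to.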
File system mounting
File sharing
Protection
- file owner / creator: should be able to control
  - what can be done
  - by whom
- types of access
  - read
  - write
  - execute
  - as well as...
    - append
    - delete
    - list
Access lists and groups
- mode of access: read, write, execute
  - a 3-bit integer, or 0-7 in decimal
  - when bit=1: has permission
- three classes of users: on UNIX / Linux
  1. owner
  2. group
  3. general / public
  - thus, three 3-bit integers to specify permission
- sample
  - starts w/ d if directory, - otherwise
  - -rw-rw-r--: file w/ 664 permission
  - drwxrwxrwx: directory w/ 777 permission
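
A minimal sketch that turns a numeric mode into the `ls -l` style string shown in the samples. The function name is hypothetical; only the nine owner / group / public bits are handled.

```python
def mode_string(mode: int, is_dir: bool = False) -> str:
    """Render a permission mode such as 0o664 as e.g. '-rw-rw-r--'."""
    out = "d" if is_dir else "-"
    for shift in (6, 3, 0):  # owner, group, public
        bits = (mode >> shift) & 0b111
        out += "r" if bits & 0b100 else "-"
        out += "w" if bits & 0b010 else "-"
        out += "x" if bits & 0b001 else "-"
    return out
```

Each octal digit is one 3-bit class: 6 = `rw-`, 4 = `r--`, 7 = `rwx`.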

File-System Implementation

File system structure
- disk: provides most of the secondary storage
  - on which: file systems are maintained
- two characteristics: make disk convenient for this usage
  1. can be rewritten in place
    - can read a block, modify it, and write it back to the same place
  2. disk: can access directly any block of info. it contains
    - simple to access any file either sequentially / randomly
    - switching from one file to another: requires only moving read-write heads & waiting for disk to rotate
- to improve IO efficiency: IO transfers between memory & disk: done in units of blocks
  - each block: 1 or more sectors
  - sector size: varies from 32 bytes to 4KB
    - usually 512 bytes
- file structure
  - logical storage unit
  - collection of related information
- file system: resides on secondary storage (hard drive / disks)
  - provides: efficient & convenient access to disk
    - by allowing data to be: stored, located, and retrieved easily
  - provides: user interface: files and file attributes, operations on files, directories for organizing files
  - provides: data structure and algorithms for:
    - mapping logical file system onto physical secondary storage devices
- file systems: organized into different layers

Layered file system

👨‍🏫 multi-layer file system: one way (not the only way) to structure it

application programs 
=> logical file system
=> file-organization module
=> basic file system
=> IO control
=> devices

File system layers
- IO control & device drivers: manage IO devices at the IO control layer
- basic file system: issues generic commands to the appropriate device driver to read & write physical blocks on disk
  - caches: hold frequently used file-system metadata to improve performance
- file organization module: knows files & their logical blocks
  - as well as physical blocks
  - translates logical block addr. to physical block addr.
    - pass this to basic file system for transfer
  - manages free disk space
    - disk block allocation
- logical file system: manages metadata information
  - metadata: includes all of the file-system structure
    - except the actual data (i.e. contents of file)
  - managing: directory structure to provide: info needed by the file-organization module
  - translates: file name into file no. / file handle / location
    - by maintaining file control blocks
  - file control block (FCB) (aka inode in UNIX)
    - contains all info about file, including:
      - ownership, permissions, location of the file contents (on the disk)
    - also responsible for file protection
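
The division of work between the layers can be sketched with a toy model. Everything here is hypothetical: the `FCB` fields, the per-file `block_map` (an indexed-style allocation chosen only for illustration), and the in-memory `DISK` dict standing in for the device.

```python
from dataclasses import dataclass

@dataclass
class FCB:
    """File control block: ownership, permissions, location of contents."""
    owner: str
    permissions: int
    block_map: list  # logical block number -> physical block number

# toy device: physical block number -> block contents
DISK = {17: b"hell", 4: b"o wo", 42: b"rld!"}

# directory structure kept by the logical file system: name -> FCB
DIRECTORY = {"/user1/cat": FCB("user1", 0o644, [17, 4, 42])}

def basic_read_block(physical: int) -> bytes:
    """Basic file system: read one physical block from the device."""
    return DISK[physical]

def file_org_translate(fcb: FCB, logical: int) -> int:
    """File-organization module: logical block addr. -> physical block addr."""
    return fcb.block_map[logical]

def logical_read(path: str, logical: int) -> bytes:
    """Logical file system: file name -> FCB, then delegate downward."""
    fcb = DIRECTORY[path]
    return basic_read_block(file_org_translate(fcb, logical))
```

Each layer only uses the one directly below it, so the allocation scheme could change without touching the name lookup.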

COMP 3511: Operating Systems