COMP 3511: Lecture 23
Date: 2024-11-21 15:02:33
Reviewed:
Topic / Chapter:
summary
Questions
Notes
RAID (cont.)
-
RAID (cont.)
- increases mean time to failure
- chance that some disk out of N disks fails: much higher than the chance of one specific disk failing
- if: mean time to failure of a single disk is 100,000 hours
- mean time to failure of some disk in an array of 100 disks: 100,000 / 100 = 1,000 hours ≈ 41.67 days!
- data loss rate: unacceptable if only 1 copy of data is stored
- solution: introducing redundancy
- simplest (but most expensive) approach: mirroring
- duplicate every disk
- every write: carried out on two physical disks
- data: lost only if the second disk fails before the first failed disk gets replaced
- this window: the mean time to repair
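The arithmetic above can be checked with a short sketch. The function names and the 10-hour repair time are assumptions for illustration. The mirrored-pair estimate MTTF² / (2 × MTTR) is the usual approximation under independent disk failures; it is not stated in these notes.

```python
HOURS_PER_DAY = 24

def array_mttf_hours(single_disk_mttf_hours, num_disks):
    """Expected time until some disk in the array fails (independent failures assumed)."""
    return single_disk_mttf_hours / num_disks

def mirrored_pair_mttdl_hours(single_disk_mttf_hours, mttr_hours):
    """Approximate mean time to data loss for a mirrored pair: MTTF^2 / (2 * MTTR)."""
    return single_disk_mttf_hours ** 2 / (2 * mttr_hours)

# 100 disks, each with an MTTF of 100,000 hours (figures from the notes)
mttf = array_mttf_hours(100_000, 100)
print(mttf, "hours =", round(mttf / HOURS_PER_DAY, 2), "days")

# Mirrored pair with an assumed (hypothetical) 10-hour mean time to repair
print(mirrored_pair_mttdl_hours(100_000, 10), "hours")
```

The second figure shows why mirroring helps: data is lost only when the partner disk fails inside the short repair window.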
-
RAID performance improvement
- parallelism (disk system) via data striping: two main goals
- increase: throughput of multiple small accesses by load balancing
- reduce: response time of large accesses
- bit-level striping
- array of 8 disks: can be treated as a single disk
- w/ each sector being 8 times the normal sector size
- block-level striping
- blocks of a file: striped across multiple disks
- with n disks: block i of a file goes to disk (i mod n) + 1
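A minimal sketch of block-level striping, assuming the common round-robin placement in which block i goes to disk (i mod n) + 1 with disks numbered from 1. The function names are hypothetical.

```python
def disk_for_block(block_index, num_disks):
    """Return the 1-based disk number that holds logical block `block_index`."""
    return (block_index % num_disks) + 1

def stripe_layout(num_blocks, num_disks):
    """Map each disk number to the list of logical blocks placed on it."""
    layout = {disk: [] for disk in range(1, num_disks + 1)}
    for block in range(num_blocks):
        layout[disk_for_block(block, num_disks)].append(block)
    return layout

# 8 blocks of one file spread over 4 disks
print(stripe_layout(8, 4))
```

Consecutive blocks land on different disks, so a large sequential read can be served by all disks in parallel.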
-
RAID structure
- mirroring: providing high reliability, but expensive
- striping: providing high data-transfer rates, not providing reliability
-
RAID (0+1) and (1+0)
- both striped mirrors (RAID 1+0) and mirrored stripes (RAID 0+1):
- provide high performance (RAID 0)
- and high reliability (RAID 1)
- 👨‍🏫 can also rely on error correcting code!
- (RAID 0+1): drives are striped, and the stripe is then mirrored
- (RAID 1+0): drives are mirrored in pairs, and the mirrored pairs are then striped
-
Other features
- regardless of which RAID level is implemented:
- useful features can be added at each level
- snapshot: view of file before the last update took place
- for recovery
- replication
- hot spare: dis
File-System Interface
-
File concept
- contiguous logical address space
- types
- data
- numeric
- character
- binary
- program
- content defined by the file creator
- many types: e.g. text, source, and executable files
-
File attributes
- name: information kept in human-readable form
- identifier
- type: needed by systems that support different types
- 👨‍🏫 in UNIX/Linux: a positive integer
- location: pointer to file location on device
- 👨‍🏫 can be very complex
- depends on: how disk space is allocated to the file!
- size: current file size
- protection
- time, date, and user identification: data for protection, security, and usage monitoring
- information about files: kept in directory structure
- maintained on the disk
- the part currently in use: can be cached in main memory
- many variations: extended file attributes like file checksum
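As an illustration of the attributes listed above, the sketch below reads them for a temporary file through Python's `os.stat`. The mapping of attribute names to `stat` fields and the helper name `file_attributes` are assumptions for this example; on UNIX-like systems the identifier corresponds to the inode number.

```python
import os
import stat
import tempfile
import time

def file_attributes(path):
    """Collect the attributes an OS typically keeps for a file."""
    st = os.stat(path)
    if stat.S_ISDIR(st.st_mode):
        kind = "directory"
    elif stat.S_ISREG(st.st_mode):
        kind = "regular file"
    else:
        kind = "other"
    return {
        "name": os.path.basename(path),           # human-readable name
        "identifier": st.st_ino,                  # inode number on UNIX-like systems
        "type": kind,
        "size": st.st_size,                       # current size in bytes
        "protection": stat.filemode(st.st_mode),  # e.g. a string such as -rw-r--r--
        "owner_uid": st.st_uid,                   # user identification
        "modified": time.ctime(st.st_mtime),      # time and date
    }

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.txt")
    with open(demo_path, "w") as f:
        f.write("hello")
    demo_attrs = file_attributes(demo_path)
    print(demo_attrs)
```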
-
File operations
- file: an abstract data type (ADT)
- create
- write
- read
- reposition within file: seek
- 👨‍🏫 not an issue unless you use magnetic tape
- delete
- truncate
- open
- close
- such operations: involve changes of various OS kernel data structures
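The operations above map closely onto the low-level calls in Python's `os` module. The sketch below walks through them once on a scratch file; the function name and file name are hypothetical, and the code assumes a UNIX-like system.

```python
import os
import tempfile

def demo_file_operations(directory):
    """Run create, write, seek, read, truncate, close, delete on one file."""
    path = os.path.join(directory, "ops.bin")
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)  # create + open
    os.write(fd, b"hello world")                        # write
    os.lseek(fd, 6, os.SEEK_SET)                        # reposition within file (seek)
    tail = os.read(fd, 5)                               # read from the new position
    os.ftruncate(fd, 5)                                 # truncate: keep the first 5 bytes
    size_after_truncate = os.fstat(fd).st_size
    os.close(fd)                                        # close
    os.remove(path)                                     # delete
    return tail, size_after_truncate, os.path.exists(path)

with tempfile.TemporaryDirectory() as tmp:
    print(demo_file_operations(tmp))
```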
-
Open files
- several data structures: needed to manage open files
- open-file tables: track open files
- system-wide open-file table
- and per-process open-file table
- file pointer: pointer to last read / write location
- per process that has the file open
- file-open count: the no. of processes that currently have the file open
- allowing: removal of data from open-file table
- when the last process closes it (when file-open count is 0)
- disk locations of a file: cache of data access information
- access rights: per-process access mode information
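A toy model of the two tables described above: a system-wide table holding the file-open count and disk location, and a per-process table holding the file pointer and access mode. All class and field names are hypothetical; a real kernel keeps considerably more state.

```python
class SystemOpenFileTable:
    """System-wide table: one entry per open file, shared by all processes."""

    def __init__(self):
        self.entries = {}  # file name -> {"open_count": int, "disk_location": int}

    def open(self, name, disk_location):
        entry = self.entries.setdefault(
            name, {"open_count": 0, "disk_location": disk_location}
        )
        entry["open_count"] += 1

    def close(self, name):
        entry = self.entries[name]
        entry["open_count"] -= 1
        if entry["open_count"] == 0:  # last process closed the file
            del self.entries[name]


class Process:
    """Per-process table: file pointer and access mode for each descriptor."""

    def __init__(self, system_table):
        self.system_table = system_table
        self.open_files = {}  # fd -> {"name", "file_pointer", "mode"}
        self.next_fd = 0

    def open(self, name, mode, disk_location=0):
        self.system_table.open(name, disk_location)
        fd = self.next_fd
        self.next_fd += 1
        self.open_files[fd] = {"name": name, "file_pointer": 0, "mode": mode}
        return fd

    def close(self, fd):
        info = self.open_files.pop(fd)
        self.system_table.close(info["name"])
```

Two processes opening the same file share one system-wide entry with a count of 2, while each keeps its own file pointer.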
-
File types
- table columns: file type | usual extension | function
- file types listed: executable, object, source code, batch, text, word processor, library, print / view, archive, multimedia
-
Access methods
- sequential access: simplest access method
- operations: read next, write next, reset
- no read after last write (rewrite)
- direct access: file is fixed-length logical records
- operations: read n, write n, position to n, read next, write next, rewrite n
- n: relative block number
- relative block numbers: allow the OS to decide where the file should be placed
- more in disk block allocation problem
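The two access methods can be contrasted with a small in-memory sketch. `RecordFile` is a hypothetical class that models a file of fixed-length records: `read_next` / `write_next` / `reset` give sequential access, while `read(n)` / `write(n, ...)` give direct access by relative record number.

```python
class RecordFile:
    """In-memory file of fixed-length records, numbered from 0."""

    def __init__(self, record_size, num_records):
        self.record_size = record_size
        self.num_records = num_records
        self.data = bytearray(record_size * num_records)
        self.position = 0  # current record number for sequential access

    def _offset(self, n):
        if not 0 <= n < self.num_records:
            raise IndexError(f"record {n} out of range")
        return n * self.record_size

    # direct access
    def read(self, n):
        start = self._offset(n)
        return bytes(self.data[start:start + self.record_size])

    def write(self, n, record):
        if len(record) != self.record_size:
            raise ValueError("record has the wrong length")
        start = self._offset(n)
        self.data[start:start + self.record_size] = record

    # sequential access, built on top of the direct operations
    def reset(self):
        self.position = 0

    def read_next(self):
        record = self.read(self.position)
        self.position += 1
        return record

    def write_next(self, record):
        self.write(self.position, record)
        self.position += 1
```

Sequential access falls out of direct access by keeping a current position, which is why direct access is the more general method.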
-
Other access methods
- other file access methods: can be built on top of direct access
- generally: involve creation of an index for a file
- keep index in memory for fast location of data
- IBM
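A minimal sketch of an index kept in memory on top of direct access. Here a Python list stands in for the direct-access file (position n in the list plays the role of "read n"); the function names and sample records are hypothetical.

```python
def build_index(records, key_of):
    """Map each record's key to its relative record number."""
    return {key_of(record): n for n, record in enumerate(records)}

def lookup(records, index, key):
    """Find a record with one index probe followed by one direct read."""
    n = index.get(key)
    return None if n is None else records[n]

# Hypothetical records: (product name, price)
records = [("apple", 3), ("pear", 5), ("plum", 2)]
index = build_index(records, key_of=lambda record: record[0])
print(lookup(records, index, "pear"))
```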
Structures
-
Directory structure
- collection of nodes containing information about all files
- both directory structure & files: reside on disk
-
Disk structure
- disk: can be subdivided into partitions
- disks / partitions: can be RAID protected against failure
- partitions: aka minidisks / slices
- volume: entity on a disk containing a file system
- each volume containing a file system: keeps track of the file system info
- in device directory or volume table of contents
- other than general purpose file systems, many special purpose file systems exist
- frequently within the same OS / computing system
- 👨‍🏫 e.g. backup...
-
Typical file-system organization
-
Operations performed on directory
- search for a file
- create a file
- delete a file
- list a directory
- rename a file
- traverse the file system
-
Organize the directory (logically) to obtain..
- efficiency: locating a file quickly
- naming: convenient to users
- two users: might have same name for different files
- same file: can have several different names
- grouping: logical grouping of files by properties
- e.g. all Java programs, games, COMP 3511 ...
-
Single-level directory
- single directory for all users
-
Two-level directory
- e.g. separate directory for each user
- a path name is required
- to identify files accurately (e.g. /user1/cat)
- same file name may exist under different users
- more efficient searching than single-level directory
- yet no grouping capability
-
Tree-structured directory
- efficient searching & grouping capability
- current directory (working directory)
- e.g. cd /spell/mail/prog, then type list
- path name: either absolute or relative
- 👨‍🏫 absolute: always unique
- creating a new file: done in the current directory
- delete a file in the current directory: rm <file-name>
- creating a new subdirectory: done in the current directory: mkdir <dir-name>
- 👨‍🏫 almost perfect... but it doesn't support shared files!
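Absolute and relative path names can be illustrated with a toy directory tree built from nested dictionaries. The tree contents and the `resolve` helper are hypothetical; `.` and `..` are deliberately not handled.

```python
# Directories are dicts; files are plain strings.
ROOT = {"spell": {"mail": {"prog": {"list": "file"}}}, "bin": {}}

def resolve(path, cwd="/"):
    """Return the node named by `path`, absolute or relative to `cwd`."""
    full = path if path.startswith("/") else cwd.rstrip("/") + "/" + path
    node = ROOT
    for part in full.split("/"):
        if part == "":
            continue  # skip the empty pieces produced by leading or doubled slashes
        if not isinstance(node, dict) or part not in node:
            raise FileNotFoundError(full)
        node = node[part]
    return node

# Same file, named absolutely and relative to the current directory
print(resolve("/spell/mail/prog/list"))
print(resolve("list", cwd="/spell/mail/prog"))
```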
-
Acyclic-graph directories
- 👨‍🏫 cycle: must be avoided as we might fall into an infinite loop when we traverse!
- now: we have shared subdirectories & files
- more flexible and complex
- new directory entry type
- link: another name (pointer) to an existing file
- resolve the link: follow pointer to locate the file
- aliasing: one file may have 2 different path names
- ensure: not traversing shared structure more than once
- deletion of one: might lead to dangling pointers
- pointing to empty files, or even wrong files
- also: difficulty ensuring there is no cycle in graph
- either ban directory sharing (inconvenient), or run cycle detection per every new link (time consuming)
- 👨‍🏫 no free lunch!
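A sketch of the "run cycle detection on every new link" option. The directory graph is modelled as an adjacency dictionary, and a link from `parent` to `child` is refused if `parent` is already reachable from `child`. The function names and graph representation are assumptions for illustration.

```python
def reachable(graph, start, target):
    """Depth-first search: can `target` be reached from `start`?"""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue  # shared nodes are visited only once
        seen.add(node)
        stack.extend(graph.get(node, []))
    return False

def add_link(graph, parent, child):
    """Add the edge parent -> child unless it would create a cycle."""
    if reachable(graph, child, parent):
        return False
    graph.setdefault(parent, []).append(child)
    return True

graph = {"/": ["a"], "a": ["b"], "b": []}
print(add_link(graph, "/", "b"))  # sharing b under the root keeps the graph acyclic
print(add_link(graph, "b", "a"))  # b -> a would close the loop a -> b -> a
```

Each new link costs a traversal of the subgraph below the target, which is the time overhead the notes refer to.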
-
File system mounting
-
File sharing
-
Protection
- file owner / creator: should be able to control
- what can be done
- by whom
- types of access
read
write
execute
- as well as...
append
delete
list
-
Access lists and groups
- mode of access: read, write, execute
- a 3-bit integer, or 0-7 in decimal
- when bit=1: has permission
- three classes of users: on UNIX / Linux
- owner
- group
- general / public
- thus, three 3-bit integers to specify permissions
- sample
- starts w/ d if directory, - otherwise
- -rw-rw-r--: file w/ 664 permission
- drwxrwxrwx: directory w/ 777 permission
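The encoding above (three 3-bit groups for owner, group, and public) can be decoded with a few bit operations. `permission_string` is a hypothetical helper written for this example; Python's standard `stat.filemode` offers similar output for real file modes.

```python
def permission_string(mode, is_directory=False):
    """Render a 9-bit permission value (e.g. 0o664) in ls-style notation."""
    parts = ["d" if is_directory else "-"]
    for shift in (6, 3, 0):  # owner, group, public
        bits = (mode >> shift) & 0b111
        parts.append("r" if bits & 0b100 else "-")
        parts.append("w" if bits & 0b010 else "-")
        parts.append("x" if bits & 0b001 else "-")
    return "".join(parts)

print(permission_string(0o664))
print(permission_string(0o777, is_directory=True))
```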
File-System Implementation
-
File system structure
- disk: provides most of secondary storage
- on which: file systems are maintained
- two characteristics: make disk convenient for this usage
- can be rewritten in place
- a block can be read, modified, and written back to the same place
- disk: can access directly any block of info. it contains
- simple to access any file either sequentially / randomly
- switching from one file to another: requires only moving read-write heads & waiting for disk to rotate
- to improve IO efficiency: IO transfers between memory & disk are done in units of blocks
- each block: 1 or more sectors
- sector size: varies from 32 bytes to 4KB
- usually 512 bytes
- file structure
- logical storage unit
- collection of related information
- file system: resides on secondary storage (hard drive / disks)
- provides: efficient & convenient access to disk
- by allowing data to be: stored, located, and retrieved easily
- provides: UI: file and file attributes, operations on files, directory for organizing files
- provides: data structure and algorithms for:
- mapping logical file system onto physical secondary storage devices
- file systems: organized into different layers
-
Layered file system
- 👨‍🏫 multi-layer file system: one (not the only) way
application programs => logical file system => file-organization module => basic file system => IO control => devices
-
File system layers
- IO control & device drivers: manage IO devices at the IO control layer
- basic file system: issues generic commands to the appropriate device driver to read & write physical blocks of disk
- caches: hold frequently used file-system metadata to improve performance
- file organization module: knows files & their logical blocks
- as well as physical blocks
- translates logical block addr. to physical block addr.
- pass this to basic file system for transfer
- manages free disk space
- disk block allocation
- logical file system: manages metadata information
- metadata: includes all of the file-system structure
- except the actual data (i.e. contents of file)
- managing: directory structure to provide: info needed by the file-organization module
- translates: file name into file no. / file handle / location
- by maintaining file control blocks
- file control block (FCB) (aka inode in UNIX)
- contains all info about file, including:
- ownership, permissions, location of the file contents (on the disk)
- also responsible for file protection
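The division of labour between the layers can be sketched with three small classes. This is a toy model under stated assumptions: the class names, the dictionary used as a file control block, and the explicit list of physical blocks per file are all hypothetical, and free-space management, caching, and device drivers are left out.

```python
class BasicFileSystem:
    """Reads and writes physical blocks (stands in for driver commands)."""

    def __init__(self, num_blocks, block_size):
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]

    def read_block(self, physical):
        return self.blocks[physical]

    def write_block(self, physical, data):
        self.blocks[physical] = data


class FileOrganizationModule:
    """Translates a file's logical block number into a physical block number."""

    def __init__(self, basic):
        self.basic = basic

    def read(self, fcb, logical_block):
        physical = fcb["blocks"][logical_block]
        return self.basic.read_block(physical)


class LogicalFileSystem:
    """Manages metadata: maps file names to file control blocks (FCBs)."""

    def __init__(self, organization):
        self.organization = organization
        self.directory = {}  # file name -> FCB

    def create(self, name, owner, physical_blocks):
        self.directory[name] = {
            "owner": owner,
            "permissions": 0o644,
            "blocks": list(physical_blocks),  # location of the file contents
        }

    def read(self, name, logical_block):
        fcb = self.directory[name]  # translate file name into FCB
        return self.organization.read(fcb, logical_block)
```

A read request travels down the layers: name to FCB, logical block to physical block, physical block to data.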