Kevin Song
I'm a student at UT (that's the one in Austin) who studies things.
Processes And Scheduling
Threads + Synchronization
Memory
1. How do we deal with external devices?
- How do we get data to/from the device?
- How do we deal with the fact that the device might not be ready yet?
2. A special external device: the disk
- Disk speed (or lack thereof)
- I/O Characteristics of disks
3. How to use disk head scheduling to speed up disk access
4. How to dodge the problem by using SSDs instead
There are four pieces of hardware associated with an I/O device:
What is the bound on how long it takes to read a keypress? The OS must wait until the external resource is available.
Instead of having the CPU check, the device notifies the CPU (by sending a message on the bus) when the I/O operation is complete.
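A toy simulation can make the difference between the two waiting strategies concrete. This is only a sketch: `SimulatedDevice`, `run_polling`, and `run_interrupt_driven` are hypothetical names, and "ticks" stand in for real time. When polling, every loop iteration is a wasted status check; with interrupts, the loop stands in for the CPU running other code until the device's handler fires.

```python
class SimulatedDevice:
    """Hypothetical device that becomes ready a fixed number of ticks after a request."""

    def __init__(self, ticks_until_ready):
        if ticks_until_ready < 1:
            raise ValueError("the device needs at least one tick to finish")
        self.remaining = ticks_until_ready
        self.on_complete = None  # interrupt handler, if one is registered

    def ready(self):
        return self.remaining == 0

    def tick(self):
        """Advance simulated time by one step; fire the handler on completion."""
        if self.remaining > 0:
            self.remaining -= 1
            if self.remaining == 0 and self.on_complete is not None:
                self.on_complete()


def run_polling(ticks_until_ready):
    """Busy-wait on the status check; returns how many checks the CPU made."""
    device = SimulatedDevice(ticks_until_ready)
    status_checks = 0
    while True:
        status_checks += 1
        if device.ready():
            return status_checks
        device.tick()


def run_interrupt_driven(ticks_until_ready):
    """Do other work until the handler runs; returns units of useful work done."""
    device = SimulatedDevice(ticks_until_ready)
    completed = []
    device.on_complete = lambda: completed.append(True)
    useful_work = 0
    while not completed:
        useful_work += 1  # stand-in for running some other process
        device.tick()
    return useful_work
```

For a device that needs 5 ticks, `run_polling(5)` spends 6 iterations doing nothing but checking, while `run_interrupt_driven(5)` gets 5 units of other work done in the same time.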
Direct memory access is a hardware-mediated protocol (i.e. the hardware needs to support it) which allows the device to write directly into main memory.
Instead of data-in/data-out registers, we have an address register and a DMA controller. The CPU tells the device the destination of the transfer.
The DMA controller reads/writes main memory directly over the system bus, and notifies the CPU only once the entire transfer is complete, rather than after every word.
DMA can lead to bus or memory contention when heavily used, but it still performs better than interrupting the CPU for every word.
Image Credits: Danlash @ en.wikipedia.org
Faster, Smaller, More Expensive
For a relatively fast hard drive, the average data access has a latency (request made to data available) of 12.0 ms.
Action | Latency | 1ns = 1s | Conversation Action |
---|---|---|---|
L1 Cache Hit | 1ns | 1s | Talking Normally |
L2 Cache Hit | 4ns | 4s | Long Pause |
Cache Miss* | 100ns | 1m 40s | Go upstairs to ask Dr. Norman |
Disk Seek | 12ms | 138d 21hr+ | Walk to Washington D.C. and back |
* a.k.a. Main Memory Reference
Values taken from "Numbers Every Programmer Should Know", 2020 edition
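The "1ns = 1s" column is just a unit swap, which a few lines of Python can reproduce. This is a minimal sketch; `human_scale` is a hypothetical helper name, not something from the slides.

```python
def human_scale(latency_ns):
    """Pretend every nanosecond is a second and format the result."""
    total_seconds = round(latency_ns)
    days, rem = divmod(total_seconds, 86_400)
    hours, rem = divmod(rem, 3_600)
    minutes, seconds = divmod(rem, 60)
    units = [(days, "d"), (hours, "h"), (minutes, "m"), (seconds, "s")]
    parts = [f"{value}{suffix}" for value, suffix in units if value]
    return " ".join(parts) or "0s"


print(human_scale(1))           # L1 cache hit
print(human_scale(100))         # main memory reference
print(human_scale(12_000_000))  # 12 ms disk seek
```

The 12 ms disk seek comes out to 138 days, 21 hours, and 20 minutes on the human scale.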
DISK ACCESS
Other Destinations:
Platter: a thin metal disk that is coated in magnetic material. Magnetic material on the surface stores the data.
Each platter has two surfaces (top and bottom)
Drive Head: Reads by sensing magnetic field, writes by creating magnetic field.
Floats on an "air" cushion created by the spinning disk.
Sector: Smallest unit that the drive can read, usually 512 bytes large.
Platters are mounted on a spindle and spin around it (between 5,000 and 15,000 rpm)
Sectors are organized in circular tracks. Data on the same track can be read without moving the drive head.
Cylinder: The set of tracks on different surfaces that share the same track index.
Comb: A set of drive heads.
Even though we don't usually exploit this in the operating system.
Why? Because more disk travels past the read head per unit time closer to the edge of the platter!
Look at the diagram: even if the rings were the same thickness (which they aren't because I screwed up), the orange arc would clearly be larger than the blue one.
In fact, if you do some calculus, you find that 2x the distance from the spindle = 2x the area per second
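A sketch of that argument, using symbols that are not in the original slides (\( \omega \) for the platter's angular speed, \( r \) for the distance from the spindle, \( w \) for the radial width of a track):
\( v(r) = \omega r \)
\( \text{area passing the head per second} = v(r) \cdot w = \omega r w \)
Both quantities are linear in \( r \), so doubling the distance from the spindle doubles the amount of disk surface that passes under the head each second.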
The CPU presents the disk with the address of a sector:
Disk controller moves heads to the appropriate track.
Wait for sector to appear under the drive head.
Read or write the sector as it spins by.
Disk I/O time =
seek time + rotation time + transfer time
Remember that these times may change depending on where on the disk the data is stored and what the disk is currently doing!
Specification | Value |
---|---|
Platters/Heads | 6/12 |
Capacity | 8 TB |
Rotation Speed | 7200 RPM |
Average Seek Time | 8.5 ms |
Track-to-track seek | 1.0 ms |
Surface Transfer BW (avg) | 180 MB/s |
Surface Transfer BW (max) | 220 MB/s |
Host transfer rate | 600 MB/s |
Cache size | 256 MB |
Power-at-idle | 7.2 W |
Power-in-use | 8.8 W |
For each read, need to seek, rotate, and transfer
Ouch.
We need to seek/settle/rotate once for all the reads, then transfer data from the disk.
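To put rough numbers on the two cases, here is a sketch that plugs the drive's specs into the disk I/O time formula. Several things here are assumptions rather than facts from the slides: the workload (500 reads of one 512-byte sector each), an average rotational delay of half a revolution, 1 MB = 10^6 bytes, and no settle time. The function names are hypothetical.

```python
AVG_SEEK_MS = 8.5        # from the spec table
RPM = 7200               # from the spec table
SURFACE_BW_MB_S = 180    # average surface transfer bandwidth, from the spec table
SECTOR_BYTES = 512       # usual sector size


def avg_rotation_ms(rpm):
    """Assumed average rotational delay: half of one full revolution."""
    return 0.5 * 60_000 / rpm


def transfer_ms(num_bytes, bandwidth_mb_s):
    """Time to stream num_bytes off the surface, assuming 1 MB = 10**6 bytes."""
    return num_bytes / (bandwidth_mb_s * 1_000_000) * 1_000


def random_read_ms(num_sectors):
    """Every read pays its own seek, rotation, and transfer."""
    per_read = (AVG_SEEK_MS + avg_rotation_ms(RPM)
                + transfer_ms(SECTOR_BYTES, SURFACE_BW_MB_S))
    return num_sectors * per_read


def sequential_read_ms(num_sectors):
    """Seek and rotate once, then transfer everything in one go."""
    return (AVG_SEEK_MS + avg_rotation_ms(RPM)
            + transfer_ms(num_sectors * SECTOR_BYTES, SURFACE_BW_MB_S))


print(f"random:     {random_read_ms(500):.1f} ms")
print(f"sequential: {sequential_read_ms(500):.1f} ms")
```

Under these assumptions the random reads take several seconds while the sequential reads take about 14 ms, which is the whole motivation for caring about where data sits on the disk.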
Outstanding Requests:
150 | 16 | 147 | 14 | 72 | 83 |
---|---|---|---|---|---|
Running these requests in FIFO order has resulted in the disk head moving 550 tracks. Can we do better than this?
Our first two moves are clearly dumb.
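Counting head movement for a given access order is a short loop. In this sketch the starting track of 65 is an assumption: the slides' text does not state it, and 65 was picked because it reproduces the 221-track and 187-track totals quoted for SSTF and LOOK. With that start, FIFO comes out to 552 tracks, which is close to (but not exactly) the 550 quoted above. `total_head_movement` is a hypothetical helper name.

```python
def total_head_movement(start, order):
    """Sum the track-to-track distances when servicing requests in the given order."""
    position, moved = start, 0
    for track in order:
        moved += abs(track - position)
        position = track
    return moved


START_TRACK = 65  # assumption: not stated in the slides
requests = [150, 16, 147, 14, 72, 83]

# FIFO just services the requests in arrival order.
fifo_distance = total_head_movement(START_TRACK, requests)
print(fifo_distance)
```

Note how the first two moves (65 to 150, then 150 to 16) account for 219 of those tracks on their own.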
Shortest Seek Time First (SSTF): rearrange our access order to greedily minimize the distance traveled between tracks.
Outstanding Requests:
150 | 16 | 147 | 14 | 72 | 83 |
---|---|---|---|---|---|
Access Order:
72 | 83 | 147 | 150 | 16 | 14 |
---|---|---|---|---|---|
Running these requests in SSTF order has resulted in the disk head moving 221 tracks. Can we do better?
Notice that the jump in red is a particularly bad one. It occurs because SSTF walks us towards the high end of the disk, then "traps" us so that we have no choice but to jump all the way across the disk.
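The greedy rule is easy to write down. This is a minimal sketch, assuming the head starts at track 65 (not stated in the slides, but it reproduces the 221-track total); `sstf_order` and `total_head_movement` are hypothetical names.

```python
def total_head_movement(start, order):
    """Sum the track-to-track distances when servicing requests in the given order."""
    position, moved = start, 0
    for track in order:
        moved += abs(track - position)
        position = track
    return moved


def sstf_order(start, requests):
    """Repeatedly service whichever pending request is closest to the head."""
    pending = list(requests)
    position, order = start, []
    while pending:
        nearest = min(pending, key=lambda track: abs(track - position))
        pending.remove(nearest)
        order.append(nearest)
        position = nearest
    return order


START_TRACK = 65  # assumption: not stated in the slides
order = sstf_order(START_TRACK, [150, 16, 147, 14, 72, 83])
print(order, total_head_movement(START_TRACK, order))
```

The 150 to 16 hop in the resulting order is the "trap" described above: it alone costs 134 of the 221 tracks.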
Move the head in one direction until we hit the edge of the disk, then reverse (SCAN).
Simple optimization: reverse as soon as no more requests exist between the current position and the edge of the disk (called LOOK).
Outstanding Requests:
150 | 16 | 147 | 14 | 72 | 83 |
---|---|---|---|---|---|
Access Order:
16 | 14 | 72 | 83 | 147 | 150 |
---|---|---|---|---|---|
Running these requests in LOOK order has resulted in the disk head moving 187 tracks. Can we do better?
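LOOK amounts to two sorted sweeps. The sketch below assumes the head starts at track 65 and initially moves toward lower track numbers; neither is stated in the slides, but together they reproduce the access order and the 187-track total. `look_order` and `total_head_movement` are hypothetical names.

```python
def total_head_movement(start, order):
    """Sum the track-to-track distances when servicing requests in the given order."""
    position, moved = start, 0
    for track in order:
        moved += abs(track - position)
        position = track
    return moved


def look_order(start, requests, toward_lower=True):
    """Sweep one way until no requests remain ahead, then reverse and sweep back."""
    if toward_lower:
        first = sorted((t for t in requests if t <= start), reverse=True)
        second = sorted(t for t in requests if t > start)
    else:
        first = sorted(t for t in requests if t >= start)
        second = sorted((t for t in requests if t < start), reverse=True)
    return first + second


START_TRACK = 65  # assumption: not stated in the slides
order = look_order(START_TRACK, [150, 16, 147, 14, 72, 83])
print(order, total_head_movement(START_TRACK, order))
```

Unlike SSTF, the head never crosses the disk more than once per sweep, so the single long jump disappears.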
Move the head towards the inner edge until it reaches the spindle, then reset to the outer edge (C-SCAN).
Reset immediately once there are no more requests towards the inner edge: C-LOOK.
Takes advantage of a hardware feature that allows the disk head to reset to the outer edge almost instantly.
Outstanding Requests:
150 | 16 | 147 | 14 | 72 | 83 |
---|---|---|---|---|---|
Access Order (R = reset to the outer edge):
16 | 14 | R | 150 | 147 | 83 | 72 |
---|---|---|---|---|---|---|
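C-LOOK always sweeps in the same direction, so both halves of the order are sorted the same way. In this sketch, the starting track of 65 and the convention that lower track numbers are closer to the spindle are assumptions inferred from the access order shown; `c_look_order` is a hypothetical name.

```python
def c_look_order(start, requests):
    """Sweep toward lower tracks, reset, then sweep the remaining requests the same way.

    Returns (serviced_before_reset, serviced_after_reset).
    """
    before_reset = sorted((t for t in requests if t <= start), reverse=True)
    after_reset = sorted((t for t in requests if t > start), reverse=True)
    return before_reset, after_reset


START_TRACK = 65  # assumption: not stated in the slides
before, after = c_look_order(START_TRACK, [150, 16, 147, 14, 72, 83])
print(before + ["R"] + after)
```

Because every request is serviced during a sweep in the same direction, requests near either edge wait about as long as requests in the middle.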
Disks can be partitioned into logically separate disks. Partitions can be used to pretend that there are more disks in a system than there actually are, but are also useful for performance, since access within a partition has shorter seeks.
In cases where extreme performance is needed (usually only for database servers), we can short-stroke the disk by restricting the partitions to the outer 1/3rd of the disk.
Solid state drives have no moving parts. This results in:
They are typically implemented using a technology called NAND flash.
set(4)
set(1)
unset(1)? BANNED
unset_all()
This makes it nearly impossible to overwrite data in-place!
Like an Etch-A-Sketch!
You can set whatever pixels you want by turning the knobs, but the only way to erase them is to reset everything at once (by shaking the unit).
Pages function similarly to sectors on a hard drive: reads and writes occur in units of at least one page.
Please write to pages 0, 15, 30, and 45!
Okay!
Please write to pages 0, 15, 30, and 45!
Uh.....okay.
Please write to pages 0, 15, 30, and 45!
...how?
I guess we gotta reset the blocks...
We had 14 unused pages in each block, yet we reset the entire block so that we could write to one page?
Please write to pages 0, 15, 30, and 45!
REMAP 0 to 1. 0 is unusable.
REMAP 15 to 16. 15 is unusable.
We fix this the same way we fixed disk operations: try to avoid having to perform the slow operations by optimizing data layouts!
Write 1 page, erase 1 block, etc. 128 times
\( 128 \times \left( 50 \times 10^{-3} \text{ms} + 3 \text{ms} \right) \approx 390 \text{ms} \)
or 3.05 ms per page.
Write 1 page, write a second page, etc. 128 times, then erase
\( 128 \times \left( 50 \times 10^{-3} \text{ms} \right) + 3 \text{ms} \approx 9.4 \text{ms} \)
or 0.07 ms per page.
By amortizing erase costs, we get a roughly 41.5x speedup!
When a user "erases" a page, lie and say the page is erased, without actually erasing, but mark the page as unusable. Then, at some later time, erase a bunch of unusable pages together.
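The remap-then-erase-later idea can be sketched as a tiny translation layer. This is a heavily simplified illustration, not how any real SSD controller is implemented: `TinyFTL` and all of its methods are hypothetical, it tracks individual pages rather than erase blocks, and it never has to move live data before erasing.

```python
class TinyFTL:
    """Hypothetical translation layer: overwrites are remapped, erases are deferred."""

    def __init__(self, num_pages):
        self.free = list(range(num_pages))  # erased physical pages, ready to write
        self.mapping = {}                   # logical page -> physical page
        self.unusable = set()               # stale physical pages awaiting erase
        self.data = {}                      # physical page -> contents
        self.erase_batches = 0              # how many times we actually erased

    def _retire(self, physical):
        """Lie about erasing: just mark the page unusable for now."""
        self.unusable.add(physical)
        self.data.pop(physical, None)

    def write(self, logical, value):
        old = self.mapping.pop(logical, None)
        if old is not None:
            self._retire(old)               # never overwrite in place
        if not self.free:
            self.erase_unusable()           # only erase when we run out of pages
        if not self.free:
            raise RuntimeError("no free pages left")
        physical = self.free.pop(0)
        self.mapping[logical] = physical
        self.data[physical] = value

    def read(self, logical):
        return self.data[self.mapping[logical]]

    def discard(self, logical):
        """User-visible 'erase': the page is retired, not physically erased."""
        old = self.mapping.pop(logical, None)
        if old is not None:
            self._retire(old)

    def erase_unusable(self):
        """Erase every unusable page in one batch, amortizing the erase cost."""
        if self.unusable:
            self.free.extend(sorted(self.unusable))
            self.unusable.clear()
            self.erase_batches += 1


ftl = TinyFTL(num_pages=4)
ftl.write(0, "a")
ftl.write(0, "b")  # remapped to physical page 1; physical page 0 becomes unusable
```

After the second write, logical page 0 lives in physical page 1 and no erase has happened yet, which mirrors the "REMAP 0 to 1. 0 is unusable." step in the slides.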
Flash memory can stop reliably storing data:
To improve durability, the SSD's controller will use several tricks:
Metric | Spinning Disk | NAND Flash |
---|---|---|
Capacity/Cost | Excellent | Good |
Sequential IO/Cost | Good | Good |
Random IO/Cost | Poor | Good |
Power Usage | Fair | Good |
Physical Size | Fair | Excellent |
Resistance to Physical Damage | Poor | Good |
As we can see, SSDs provide solid all-around performance while being small. This is why they have essentially taken over the personal electronics arena (most phones, tablets, and laptops come with SSDs these days).
The capacity/cost ratio of hard drives means that they are a mainstay in servers and anything else that needs to store large amounts of data.
It may be a very powerful potato, but it's still a potato. The ability to communicate with external devices is required if a computer is to be useful.
These devices all have different properties and access patterns.
To do this, it uses one of three mechanisms to wait for I/O and do the actual data transfer:
Disks function as stable storage for the OS, as well as providing a large, cheap store of space.
A single disk read might take tens of millions of CPU cycles. This is because disks are complex mechanical devices that can only move so fast.
There are several major scheduling algorithms, including:
We can also partition a disk to reduce seek time for requests in a partition.
SSDs are much faster than disk drives (on average), but they don't have as much storage capacity and need extra code in the OS to support certain operations.
The questions in the following slides are intended to provoke some thought and tie back what you've learned in this section to things you've studied previously in this course, or to really test your knowledge of what you learned in this unit.