Although the internal buffer clearly improves performance, it has limitations. It helps very little if you are doing many random accesses to data in different parts of the disk, because data that the disk has not loaded recently will not be in the cache.
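The effect of access pattern on a cache can be illustrated with a small simulation. This is only a sketch: it assumes a least-recently-used (LRU) replacement policy, which a real drive may or may not use, and the disk size, cache size, and workloads are invented for illustration.

```python
import random
from collections import OrderedDict


def hit_rate(accesses, cache_blocks):
    """Return the fraction of accesses served from an LRU cache."""
    cache = OrderedDict()  # block number -> None, oldest entry first
    hits = 0
    total = 0
    for block in accesses:
        total += 1
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = None            # "load" the block from the platters
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / total if total else 0.0


rng = random.Random(0)   # fixed seed so the run is repeatable
DISK_BLOCKS = 100_000    # assumed disk size, in blocks
CACHE_BLOCKS = 64        # assumed cache size, in blocks
N = 10_000               # number of accesses in each workload

# Workload 1: repeated accesses within a small region of the disk.
local = [rng.randrange(32) for _ in range(N)]
# Workload 2: accesses scattered uniformly over the whole disk.
scattered = [rng.randrange(DISK_BLOCKS) for _ in range(N)]

local_rate = hit_rate(local, CACHE_BLOCKS)
scattered_rate = hit_rate(scattered, CACHE_BLOCKS)
print(f"local: {local_rate:.1%}  scattered: {scattered_rate:.1%}")
```

Under these assumptions the localized workload is served almost entirely from the cache, while the scattered workload almost never finds its block there.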
The buffer is also of little help if you are reading a large amount of data from the disk, because the buffer is normally very small compared with the data being read. For example, if you are copying a 50 MB file on a typical disk with a 512 KB buffer, at most about 1% of the file could be in the buffer, and the rest must be read from the disk itself.
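The arithmetic behind this limitation is easy to check. The sketch below assumes a 512 KiB buffer purely for illustration; the helper name `cached_fraction` is hypothetical, and the result is only an upper bound, since a real buffer is shared with other data.

```python
def cached_fraction(buffer_bytes, file_bytes):
    """Upper bound on the fraction of a file that can be held in the buffer."""
    if file_bytes <= 0:
        raise ValueError("file_bytes must be positive")
    return min(1.0, buffer_bytes / file_bytes)


KIB = 1024
MIB = 1024 * KIB

file_size = 50 * MIB       # the 50 MB file from the text
buffer_size = 512 * KIB    # assumed buffer size, for illustration only

fraction = cached_fraction(buffer_size, file_size)
print(f"At most {fraction:.1%} of the file fits in the buffer")
```

With these figures the buffer can hold no more than about one hundredth of the file at any moment, so nearly every read has to go to the platters.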
Due to these limitations, the cache does not have as much of an impact on overall system performance as you might think. How much it helps depends partly on its size, but at least as much on the intelligence of its circuitry, just as with the hard disk's logic overall. And as with that logic, it is often hard to determine exactly what the cache logic on a given drive is like. Even so, the size of the cache remains an important factor in how much it improves system performance.
Caching reads from the hard disk and caching writes to the hard disk are similar in some ways but very different in others. They share the same overall objective: to decouple the fast computer from the slow mechanics of the hard disk. The key difference is that a write involves a change to the hard disk, while a read does not.
With no write caching, every write to the hard disk involves a performance hit while the system waits for the drive to reach the correct location and write the data. This takes at least 10 milliseconds on most drives, which is a long time in the computer world and really slows down performance. This mode of operation is called write-through caching.
When write caching is enabled and the system sends a write to the hard disk, the drive's logic records the write in its much faster cache and then immediately sends an acknowledgement back to the operating system that the write is complete. The rest of the system can then proceed on its way without having to wait for the actuator to position, the disk to spin, and so on. This is called write-back caching, because the data is stored in the cache and only written back to the platters later on. Write-back caching improves performance, but it comes with a risk.
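The difference in how long the host waits under the two modes can be estimated with a simple model. This is a rough sketch: the 10 ms figure is the per-write cost mentioned in the text, the cache write time is an assumed value, and the model ignores the case where the cache fills up and the drive has to stall.

```python
MECHANICAL_WRITE_MS = 10.0  # per-write cost from the text (positioning, rotation, write)
CACHE_WRITE_MS = 0.01       # assumed time to record a write in the drive's cache


def host_wait_ms(num_writes, write_back):
    """Total time the host spends waiting for write acknowledgements."""
    per_write = CACHE_WRITE_MS if write_back else MECHANICAL_WRITE_MS
    return num_writes * per_write


writes = 1_000
through = host_wait_ms(writes, write_back=False)
back = host_wait_ms(writes, write_back=True)
print(f"write-through: {through:.0f} ms, write-back: {back:.0f} ms")
```

The drive still has to perform the mechanical work eventually in the write-back case. The model counts only the time the rest of the system spends waiting for an acknowledgement.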
Since cache memory is volatile, its contents are lost if the power goes out. Any pending writes in the cache that had not yet been written to the disk are gone forever, and the rest of the system has no way to know this, because the hard disk has already reported those writes as complete. Therefore, not only is some data lost, but the system does not know which data, or even that any loss occurred. The end result can be file consistency problems, operating system corruption, and so on. Due to this risk, in some situations write caching is not used at all.
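The failure mode can be made concrete with a toy model. The class and method names below are hypothetical and do not correspond to any real drive interface; the sketch only shows how an acknowledged write can disappear without the host being told.

```python
class WriteBackDrive:
    """Toy model of a drive with a volatile write-back cache."""

    def __init__(self):
        self.platters = {}  # block number -> data that survives power loss
        self.cache = {}     # pending writes, held in volatile memory

    def write(self, block, data):
        self.cache[block] = data
        return "ok"         # acknowledged before the data reaches the platters

    def commit_pending(self):
        # The drive eventually writes its pending data back to the platters.
        self.platters.update(self.cache)
        self.cache.clear()

    def power_loss(self):
        self.cache.clear()  # volatile contents vanish; the host is not notified


drive = WriteBackDrive()
acks = [drive.write(1, "journal entry"), drive.write(2, "file data")]
drive.commit_pending()
acks.append(drive.write(3, "directory update"))
drive.power_loss()

acknowledged = len(acks)
durable = len(drive.platters)
print(f"acknowledged: {acknowledged}, on platters: {durable}")
```

The host received an acknowledgement for every write, yet one of them never reached the platters, and nothing in the acknowledgements distinguishes it from the others.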
This is especially true for applications where data integrity is critical. Because of the performance improvement that write caching offers, however, it is increasingly used despite the risk, and the risk is mitigated through additional technology.
The most common technique is simply ensuring that the power does not go off, typically by running the system from an uninterruptible power supply (UPS). For added peace of mind, better drives that employ write caching have a write flush feature that tells the drive to write any pending writes in its cache to the platters immediately. This command would commonly be sent before the UPS batteries run out, if the system detects a power interruption, or just before the system is shut down for any other reason.
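Application software usually cannot send a flush command to the drive directly, but it can ask the operating system to push a file's data out to the storage device. The sketch below shows that host-side request using Python's `os.fsync`; the file name is hypothetical, and whether the request also empties the drive's own write cache depends on the operating system, the file system, and the drive.

```python
import os
import tempfile


def durable_write(path, data):
    """Write bytes to path and ask the OS to push them to the storage device."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # empty Python's user-space buffer
        os.fsync(f.fileno())  # ask the OS to write the file through to the device


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "example.bin")  # hypothetical file name
    durable_write(target, b"important record")
    with open(target, "rb") as f:
        read_back = f.read()
```

Calling `durable_write` before a planned shutdown is the application-level counterpart of the flush described above: it trades some speed for a stronger assurance that the data has left volatile memory.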