CCoH, Tiering and USB Cache Offload
Or how flash took over our lives…
Tiering
• Tiering Disk (TD) – will combine multiple VDs into a single VD
• The data that is accessed the most will live on the faster-media VD (e.g. SSD)
• Over time, data that is accessed less often will be migrated to the slower media (e.g. HDD)
• Only the TD will be exposed to the OS; the member VDs are hidden
  • Even passthrough commands will not be allowed
• Apps will show the TD with its own icon
• The VD/PD mapping will survive power-failure conditions and will be kept in the DDF
• Data will be moved from LBA to LBA based on a window size (1 MB was proposed); see the sketch after this list
• Member VDs with different parameters will be allowed
  • Except for FDE and DIF, where they must match
• There will be no Init nor Recon at the member-VD level
• A TD can add additional VDs
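A minimal sketch of the window-based migration idea, assuming per-window access counting and a periodic rebalance; the class, method names, and thresholds are illustrative, not the firmware's data structures.

```python
# Sketch only: track access heat per 1 MB window and keep the hottest
# windows on the SSD tier; everything else stays on (or returns to) HDD.
WINDOW_SIZE = 1 * 1024 * 1024  # 1 MB window, the proposed size


class TieredDisk:
    def __init__(self, ssd_windows):
        self.ssd_capacity = ssd_windows  # how many windows the SSD tier can hold
        self.location = {}               # window index -> "SSD" or "HDD"
        self.heat = {}                   # window index -> access count

    def _window_of(self, lba, sector_size=512):
        return (lba * sector_size) // WINDOW_SIZE

    def access(self, lba):
        """Record a host I/O and return the tier currently serving it."""
        w = self._window_of(lba)
        self.location.setdefault(w, "HDD")   # new data starts on the slow tier
        self.heat[w] = self.heat.get(w, 0) + 1
        return self.location[w]

    def rebalance(self):
        """Promote the hottest windows to SSD; demote the rest to HDD."""
        by_heat = sorted(self.heat, key=self.heat.get, reverse=True)
        hot = set(by_heat[: self.ssd_capacity])
        for w in self.location:
            self.location[w] = "SSD" if w in hot else "HDD"


td = TieredDisk(ssd_windows=1)
for _ in range(100):
    td.access(lba=0)              # window 0 becomes hot
td.access(lba=10_000_000)         # a cold window far away
td.rebalance()
print(td.location)                # window 0 maps to SSD, the cold window to HDD
```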
Tiering - Continued
• Rebuild, Copyback and CC are allowed on the member VDs
• Importing a TD will only be allowed if all member VDs are available
• Member VDs cannot be deleted while the TD is online
• Deleting a TD will result in the deletion of all of its member VDs
• If a member VD dies and a read is issued for data inside that VD, a medium error is returned to the host
• Punctures will be migrated along with the data (e.g. if a window is moved into the SSD tier, any punctures in that window will need to be remapped as well); a sketch of that remapping follows this list
• Statistics are gathered at the TD level, not at the VD level
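A minimal sketch of puncture remapping during a window migration; the data structures, names, and sector math are assumptions for illustration, not the actual DDF/puncture bookkeeping.

```python
# Sketch only: when a window moves tiers, punctures inside it keep their
# offset relative to the window and follow the data to the new location.
WINDOW_SECTORS = 2048  # 1 MB window at 512-byte sectors


def migrate_window(src_start_lba, dst_start_lba, punctures):
    """Return the puncture LBAs re-expressed at the window's new location."""
    remapped = set()
    for lba in punctures:
        offset = lba - src_start_lba
        if 0 <= offset < WINDOW_SECTORS:
            remapped.add(dst_start_lba + offset)  # puncture moves with the data
        else:
            remapped.add(lba)                     # puncture outside the window stays put
    return remapped


# A puncture 100 sectors into the window keeps that offset on the new tier.
print(migrate_window(src_start_lba=4096, dst_start_lba=0,
                     punctures={4196, 9000}))     # -> {100, 9000}
```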
CacheCade on Host
• Utilizes a WarpDrive to cache data for other drives on the server
• It is based on the CacheCade concept of using SSD drives to improve the performance of HDD drives
• The key difference is that it will increase performance for drives connected to “any” controller on the server
• A WarpDrive is a PCI controller with built-in “SSD drives”
• Multiple WarpDrives can be used
• Will support (a minimal sketch of these policies follows this list):
  • Read Cache (apparently the only supported option in phase 1)
    • RO – accelerates reads; populated based on heuristics
    • WT – as above, but can also populate the cache upon a write
  • Write Cache (WB)
    • Populates the cache for reads as WT above
    • Completes writes to the host as soon as the cache has the data
• It will be possible to add VDs/PDs to be cached after the initial configuration (the same applies to removal)
• Boot devices cannot be cached (at time of scoping)
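A minimal sketch of the three cache policies described above, assuming a simple LBA-keyed cache; the class and method names are illustrative and not the CCoH driver's API.

```python
# Sketch only: RO caches reads and bypasses writes, WT also populates the
# cache on a write-through, WB completes the write once the cache has it.
class HostCache:
    def __init__(self, policy):
        assert policy in ("RO", "WT", "WB")
        self.policy = policy
        self.cache = {}      # lba -> data held on the WarpDrive
        self.dirty = set()   # lbas written in WB mode but not yet destaged

    def read(self, backend, lba):
        if lba in self.cache:
            return self.cache[lba]       # hit, served from flash
        data = backend[lba]              # miss, go to the slow device
        self.cache[lba] = data           # populate on read (all policies)
        return data

    def write(self, backend, lba, data):
        if self.policy == "RO":
            self.cache.pop(lba, None)    # assumption: drop any stale copy
            backend[lba] = data          # writes bypass the cache
        elif self.policy == "WT":
            backend[lba] = data          # write-through: device first...
            self.cache[lba] = data       # ...then the cache is populated too
        else:                            # WB: complete once the cache has it
            self.cache[lba] = data
            self.dirty.add(lba)

    def flush(self, backend):
        for lba in self.dirty:           # destage dirty WB data later
            backend[lba] = self.cache[lba]
        self.dirty.clear()


hdd = {0: b"old"}                        # stand-in for the slow backing device
cc = HostCache("WB")
cc.write(hdd, 0, b"new")                 # completes without touching the HDD
print(hdd[0], cc.read(hdd, 0))           # b'old' b'new' until flush() destages
cc.flush(hdd)
```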
CacheCade on Host - Continued
• Phase 1 limits:
  • A maximum of 64 VDs can be cached
  • Upon a reboot, all read-cache data will be lost
  • Up to 8 WarpDrives will be supported
  • Cache Groups will have a name
  • WarpDrive will be the only caching device supported in the first release
    • In theory CCoH could support any SSD device
  • No PFK will be used; the presence of a WarpDrive will allow activation
• In theory the CCoH driver can be loaded/unloaded
• We should test CCoH in a hypervisor environment
• The failure of one WarpDrive should not affect the underlying VDs/PDs or the other WarpDrives
• The failure of a VD being cached by CCoH will not be reported to the OS until an I/O arrives for that device
CacheCade on Host
• How does it work?
  • Windows are created (the suggested default size is 1 MB)
  • The number of windows available is the size of the WarpDrive divided by the window size
  • Cache Groups are created, and the devices (VDs, PDs) in a Cache Group will be cached
    • All devices in a group will have the same cache policy
    • The caching WarpDrive(s) are also part of the Cache Group
    • Phase 1 may only support a single Cache Group
    • If multiple Cache Groups do exist, memory would not be shared across groups
  • Data will be cached for Read or Write per the settings
    • As data is read multiple times, it populates the cache
    • If data is no longer being read (goes cold) while other data is being read often (goes hot), the cold data in the cache is replaced by the hot data using LRU; see the sketch after this list
  • Multiple modes
    • Auto
      • All, all local, all SAN
    • Manual
    • Off
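A minimal sketch of the window count and LRU replacement described above; the sizes and names are toy values for illustration, and the real population heuristics are not shown.

```python
# Sketch only: windows available = WarpDrive size / window size, and cold
# windows are evicted least-recently-used first when new hot data arrives.
from collections import OrderedDict

WINDOW_SIZE = 1 * 1024 * 1024                   # suggested default: 1 MB
WARPDRIVE_SIZE = 8 * 1024 * 1024                # toy capacity for the example
NUM_WINDOWS = WARPDRIVE_SIZE // WINDOW_SIZE     # number of cache windows


class WindowCache:
    def __init__(self, capacity=NUM_WINDOWS):
        self.capacity = capacity
        self.windows = OrderedDict()            # window id -> data, in LRU order

    def touch(self, window_id, data=None):
        """Record a read of a window; the coldest window is evicted on overflow."""
        if window_id in self.windows:
            self.windows.move_to_end(window_id)     # reheat: most recently used
        else:
            if len(self.windows) >= self.capacity:
                self.windows.popitem(last=False)    # evict the coldest window
            self.windows[window_id] = data
        return self.windows[window_id]


cache = WindowCache()
for w in range(10):              # more windows than capacity: early ones go cold
    cache.touch(w)
print(list(cache.windows))       # only the 8 most recently used windows remain
```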
USB Cache Offload
• Utilizes a SuperCap (multiple large capacitors) to keep the memory circuitry alive long enough for the contents of DDR to be copied into flash
• Why did we need another “BBU” solution?
  • In theory, once the memory has been copied into flash, it can stay there for 10 years
  • As the cache has become larger and more energy-hungry, it has become harder and harder to retain the data (e.g. we used to say 72 hours, now we say 48)
• How does it work? (a minimal sketch follows this list)
  • Upon a power loss, the contents of DDR are copied into flash
  • When power is restored, the contents of flash are copied back into DDR
  • We ensure that there is enough power to handle a situation where power is lost while memory is being copied from flash to DDR
• Cache Offload is a Premium Feature
• There are some “new” AENs
  • MR_EVT_BBU_REPLACEMENT_NEEDED_SOH_BAD
  • MR_EVT_BBU_REPLACEMENT_NEEDED
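A minimal sketch of the offload/restore flow described above, modelled as two event handlers; the class, dictionaries, and method names are assumptions for illustration, not the controller firmware.

```python
# Sketch only: on power loss the SuperCap keeps things alive long enough to
# copy DDR into flash; on restore the flash contents are copied back to DDR.
class CacheOffload:
    def __init__(self):
        self.ddr = {}     # controller write cache (volatile DDR)
        self.flash = {}   # on-board flash backing store (non-volatile)

    def on_power_loss(self):
        # SuperCap holds the circuitry up just long enough for this copy.
        self.flash = dict(self.ddr)
        self.ddr.clear()

    def on_power_restore(self):
        # Flash contents are copied back into DDR before normal operation.
        self.ddr = dict(self.flash)
        self.flash.clear()


offload = CacheOffload()
offload.ddr[0x1000] = b"dirty write data"
offload.on_power_loss()        # DDR contents land in flash and can sit there for years
offload.on_power_restore()     # cache is back in DDR, ready to be destaged
print(offload.ddr)
```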
USB Cache Offload - Continued
• Testing Cache Offload is very similar to testing BBUs
  • Test that the cache is flushed
  • Verify that pinned cache is destaged
  • Offload information needs to be tested, as well as the existing BBU events
  • Side note: learn cycles could exist; however, since the SuperCap can hold enough of a charge, transparent learn cycles may be supported
• However, there are other areas that we need to be concerned about
  • There is a set of conditions referred to as Double Dip that needs to be considered (a sketch of the test permutations follows this list)
    • Loss of power during a flush
    • Reset during a flush
    • Permutations of those two
  • Surprisingly, if power is restored too quickly, it could also cause problems, so this area needs to be tested as well
    • “Too quickly” means power comes back before the full contents of DDR have been copied to flash
  • It is important to verify that the correct data was flushed
    • And that it was flushed to the correct drive
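A minimal sketch of enumerating the Double Dip permutations called out above, the kind of matrix a test harness might iterate over; the scenario names and structure are ours, not an existing test plan.

```python
# Sketch only: build the permutation matrix of interruption type, phase, and
# repeated hits (the second hit while recovering from the first is the double dip).
from itertools import product

interruptions = ("power_loss", "reset")
phases = ("during_offload_to_flash", "during_restore_to_ddr",
          "power_restored_too_quickly")
hits = (1, 2)

test_matrix = [
    {"interruption": i, "phase": p, "hits": h}
    for i, p, h in product(interruptions, phases, hits)
]

for case in test_matrix:
    # Each case would drive a power/reset injection rig, then verify that the
    # correct data was flushed, and flushed to the correct drive.
    print(case)
```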