Optimizing energy-hungry data centers

Now that digital video is ubiquitous, vast oceans of data are being stored in gigantic data centers around the world. To ensure the accessibility and integrity of all that data, multiple copies of the files are kept, sometimes within the same data center and sometimes at completely different locations. What if it were possible to reduce that redundancy without compromising the user experience or increasing the risk of data loss? Data centers could then host the same amount of data on fewer pieces of hardware, reducing their energy use, which is significant and growing fast.
That's exactly what a group of researchers from Alcatel-Lucent’s Bell Labs and MIT claim can be achieved thanks to their smart model of Storage Area Networks with Network Coding.
Not being an IT expert, I find that a lot of what is in their paper (pdf) is over my head, but their conclusion is clear: they claim that a 20 to 50% reduction in energy use is possible for enterprise-level storage area networks.
MIT's Technology Review has a simpler explanation, but it doesn't quite give enough details to truly understand how this works:
So-called storage area networks within data center servers rely on a tremendous amount of redundancy to make sure that downloading videos and other content is a smooth, unbroken experience for consumers. Portions of a given video are stored on different disk drives in a data center, with each sequential piece cued up and buffered on your computer shortly before it’s needed. In addition, copies of each portion are stored on different drives, to provide a backup in case any single drive is jammed up. A single data center often serves millions of video requests at the same time.
The new technology, called network coding, cuts way back on the redundancy without sacrificing the smooth experience. Algorithms transform the data that makes up a video into a series of mathematical functions that can, if needed, be solved not just for that piece of the video, but also for different parts. This provides a form of backup that doesn’t rely on keeping complete copies of the data. Software at the data center could simply encode the data as it is stored and decode it as consumers request it. (source)
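To make the idea above a bit more concrete, here is a minimal sketch of the principle using simple XOR parity. Real network coding uses more general codes (the paper and article describe mathematical functions that can be solved for different parts of the data), so the function names and structure here are purely illustrative, not taken from the researchers' work:

```python
# Illustrative sketch only: XOR parity is the simplest form of coded
# redundancy. Instead of mirroring every piece (doubling storage), we
# store the pieces plus one extra "coded" chunk that can rebuild any
# single lost piece.

def encode(chunks):
    """Return the data chunks plus one parity chunk (XOR of them all)."""
    parity = bytes(len(chunks[0]))
    for chunk in chunks:
        parity = bytes(a ^ b for a, b in zip(parity, chunk))
    return chunks + [parity]

def recover(stored, missing_index):
    """Rebuild one lost chunk by XOR-ing all the surviving ones."""
    result = bytes(len(stored[0]))
    for i, chunk in enumerate(stored):
        if i != missing_index:
            result = bytes(a ^ b for a, b in zip(result, chunk))
    return result

# Three 4-byte video "pieces" become 4 stored chunks instead of the
# 6 that full mirroring would require.
pieces = [b"vid1", b"vid2", b"vid3"]
stored = encode(pieces)
stored[1] = bytes(4)            # the drive holding piece 2 fails
print(recover(stored, 1))       # b'vid2'
```

With three data pieces, mirroring needs six stored chunks while this scheme needs four, which is the flavor of saving the researchers are after, though their codes tolerate more complex failure patterns than a single lost piece.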
So if you want all the details, check out the paper.
The authors will test their idea to make sure that the practice matches the theory, but if it does, they claim that nothing keeps it from being deployed fairly quickly (and data center operators will probably welcome the idea, since energy costs are at the top of their expense list).