2021 Data Storage Trends: What They Mean for Business Leaders
Here are some of the latest storage trends businesses can look out for as we enter a new year:
- Increasing significance of hierarchical security (for data at rest and in flight).
The trend toward hyperscale software ecosystems continues to grow, allowing applications to be developed and deployed on smaller "atomic units" for businesses and locations that may not have the required connectivity infrastructure. More and more cloud-native applications run in points of presence or colocation facilities around the world. As this asset partnership model becomes more common, data must be protected at every step of the process. Protecting user data both in flight and at rest is critical in a more distributed deployment model.
Advice for business leaders: Data-at-rest encryption is becoming mandatory in a growing number of industries as a way to guard against both external and insider threats. Although it might not be mandatory in your particular industry today, it might well become so tomorrow. Therefore, Seagate recommends moving to encrypted disks as soon as possible so that operations are not disrupted if the requirement sneaks up on you.
- A broader adoption of object storage by enterprises.
With the explosion of useful data, object storage is becoming the standard for mass capacity. It offers advantages over traditional file stores, including prescriptive metadata, scalability, and a flat (non-hierarchical) data structure. Systems benefit from greater intelligence incorporated in data sets, and object stores provide this intelligence. Storage types include block, file, and object. Block storage is essential for the many mission-critical applications that are performance sensitive. File storage has served legacy applications and provided a robust architecture for years. Object storage is focused on new application development, working in combination with block storage to provide scale and performance in a symbiotic fashion. Many legacy file applications are also migrating to object storage infrastructure to take advantage of the economies of scale that object storage enables.
Advice to business leaders: Object storage is quickly becoming the de facto standard for capacity storage, augmenting and displacing file storage thanks to its improved economic efficiency and scalability. Additionally, new programmers graduating today increasingly build workflows that assume object storage interfaces. Hire those people. If you haven't yet added object storage to your data center, now is the time to do so.
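For readers who want to see what "prescriptive metadata" and "no hierarchical data structure" look like in practice, here is a minimal Python sketch. It is an illustration only, not any vendor's interface: the class TinyObjectStore, its methods, and the sample keys and metadata fields are all hypothetical. Each object lives under a flat key and carries its own metadata, so data is found by describing it rather than by walking folders.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes
    metadata: dict = field(default_factory=dict)


class TinyObjectStore:
    """Hypothetical in-memory object store: flat keys, no directory tree."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, **metadata):
        # Size and a checksum are recorded alongside the caller's metadata.
        meta = dict(metadata, size=len(data),
                    sha256=hashlib.sha256(data).hexdigest())
        self._objects[key] = StoredObject(data, meta)

    def get(self, key):
        return self._objects[key].data

    def find(self, **criteria):
        # Query by metadata instead of traversing folders.
        return sorted(
            key for key, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        )


# Illustrative data only; the keys and fields are made up.
store = TinyObjectStore()
store.put("scans/2021/a.dcm", b"...", modality="MRI", site="east")
store.put("scans/2021/b.dcm", b"....", modality="CT", site="east")
store.put("logs/app.log", b"hello", site="west")

mri_keys = store.find(modality="MRI")
east_keys = store.find(site="east")
```

The slashes in the keys are only a naming convention. The store treats each key as an opaque string, which is part of why this model scales more easily than a directory tree.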
- A greater adoption of composability.
While the idea of separating systems into independent units that can be combined with other independent units is not new, a broader, open-source-reliant adoption of composability is underway. Kubernetes, the open-source system for automating deployment, scaling, and management of containerized applications, is at the core of this trend. Open source is the future of application development because it enables a much larger community to work on the problems that challenge many industries, and it also allows for domain-specific solutions that build on open architectures. Composing the hardware to optimally meet software or business needs is a natural next step.
Advice to business leaders: Today's data centers are moving toward composability because it provides easier deployment and redeployment of resources without requiring a priori configurations and statically configured ratios between compute, memory, and storage. Containers and Kubernetes are the core mechanisms of composability and it behooves all data centers to start embracing these technologies if they haven't already.
- Tiering in mass storage deployment (locate hot data on flash and all else on disk).
NVIDIA's GPU designs divide memory into different levels (registers, shared, and global), each with different properties. Registers have low latency but small capacity. Global memory has high latency but large capacity. NVIDIA provides a software interface that allows programmers to take advantage of the tiered memory and write solutions optimized for that architecture. Similarly, SSDs and HDDs can be viewed as different tiers. There is simply too much valuable data being created to consider a homogeneous storage strategy efficient.
Why does all this matter? A storage system that consists entirely of high-performance storage devices is likely more expensive than it needs to be. And a storage system that consists entirely of mass-capacity devices is likely not as performant as it needs to be. That’s how we arrived at the current tiering trend: it’s the way to strike the most efficient balance between cost and performance needs. With the advent of additional technologies (such as storage class memories), the need for architectures that can extract the most value from all classes of storage is paramount.
Advice to business leaders: In a world with infinite budget, data centers would be composed solely of very expensive storage media such as Intel's 3D XPoint. Unfortunately, economic realities preclude this and dictate hierarchical tiering, in which hot data resides on high-cost, high-performance media and less frequently accessed data resides on affordable, mass-capacity media. Fortunately, data center software has become increasingly adept at identifying hot and cold data and migrating it accordingly. If your data center is not yet taking advantage of heterogeneous media for this purpose, you are either losing performance or paying more than is necessary for your storage.
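To make the hot and cold idea concrete, the following Python sketch shows one simple way tiering software could decide placement. It is a toy model under stated assumptions, not how any particular product works: the TieredStore class, the flash_slots parameter, and the policy of ranking purely by read count are all hypothetical, and real systems weigh recency, object size, and cost as well.

```python
from collections import Counter


class TieredStore:
    """Hypothetical two-tier placement: a small flash tier, a large disk tier."""

    def __init__(self, flash_slots):
        self.flash_slots = flash_slots  # how many items fit on flash
        self.flash = set()
        self.disk = set()
        self.hits = Counter()

    def write(self, key):
        # New data lands on the capacity (disk) tier by default.
        self.disk.add(key)

    def read(self, key):
        # Count the access and report which tier served it.
        self.hits[key] += 1
        return "flash" if key in self.flash else "disk"

    def rebalance(self):
        # Keep the most frequently read keys on flash, the rest on disk.
        everything = self.flash | self.disk
        ranked = sorted(everything, key=lambda k: (-self.hits[k], k))
        self.flash = set(ranked[: self.flash_slots])
        self.disk = everything - self.flash
        self.hits.clear()  # start a fresh observation window


store = TieredStore(flash_slots=1)
for key in ("a", "b", "c"):
    store.write(key)

for _ in range(5):
    store.read("b")  # "b" is the hot item
store.read("a")

store.rebalance()
tier_of_b = store.read("b")
tier_of_a = store.read("a")
```

After the rebalance, the frequently read item is served from flash while the rest stays on disk, which is the cost and performance balance described above.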
- Formative AI increases the usefulness of data.
Not only is data creation exploding, but the share of that data that is useful is growing as well. Even archived data is being resurrected, because advances in AI/ML allow users to mine additional information from once-archived data. Enterprise leaders must be prepared to store more data than ever: to train models, to mine critical information, and to archive data whose useful life is likely extending.
Formative AI is a means through which data becomes more insightful. Gartner defines formative AI as "a type of AI capable of dynamically changing to respond to a situation." IDC sees formative AI as "an umbrella term given to a range of emerging AI and related technologies that can dynamically change in response to situational variances."
Formative AI relates to the tiering trend because it depends on a flexible architecture that can react to changes intelligently. For example, you might be monitoring an AI model and get a signal that it is drifting. You could then use another model to search for the appropriate training data on your disk tier and automatically move it to the flash tier so that retraining goes faster. The disk tier would also likely be an object store, which ties in the object storage trend as well. The advantages are speed (since the data is moved to your fast tier automatically) and cost (since you can keep the data on inexpensive disk in an easy-to-access format until you need it).
Advice to business leaders: Recent innovations in machine learning have finally unlocked the long-promised potential of artificial intelligence. Now these machine learning techniques clamor for ever larger data sets from which to extract ever more accurate insights. Because future insights and advancements in machine learning are challenging to predict, businesses today should start saving as much of their data as they possibly can to ensure that any and all future analyses can be done with the best possible training data.
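The drift scenario described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the function names, the tolerance value, the catalog layout, and the sample error figures are all hypothetical, and a plain metadata match stands in for the second model that would search for training data.

```python
from statistics import mean


def is_drifting(baseline_errors, recent_errors, tolerance=0.05):
    """Flag drift when the recent mean error exceeds the baseline by more than `tolerance`."""
    return mean(recent_errors) - mean(baseline_errors) > tolerance


def select_training_keys(catalog, label):
    """Pick disk-tier objects whose metadata matches the drifting label."""
    return sorted(
        key for key, meta in catalog.items()
        if meta["tier"] == "disk" and meta["label"] == label
    )


def promote(catalog, keys):
    """Mark the selected objects as resident on the flash tier."""
    for key in keys:
        catalog[key]["tier"] = "flash"


# Illustrative catalog of archived objects; all values are made up.
catalog = {
    "img/001": {"tier": "disk", "label": "forklift"},
    "img/002": {"tier": "disk", "label": "pallet"},
    "img/003": {"tier": "disk", "label": "forklift"},
}

promoted = []
if is_drifting(baseline_errors=[0.10, 0.12, 0.11],
               recent_errors=[0.20, 0.22, 0.19]):
    promoted = select_training_keys(catalog, label="forklift")
    promote(catalog, promoted)
```

Only the data relevant to the drifting model moves to the fast tier. Everything else stays on inexpensive disk, which is where the speed and cost advantages come from.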
Dr. John Morris is the Senior Vice President and Chief Technology Officer at Seagate.