Impact of AI on Storage Requirements

AI Adoption Will Impact Corporate Storage Requirements

By Mitch Klaassen

AI adoption is expected to drive exponential growth in data storage demand through 2028. In a Recon Analytics survey on this topic, commissioned by Seagate Technology and conducted in November 2024, 61% of infrastructure buyers who predominantly use cloud storage for AI data management said they expect their storage requirements to at least double by 2028. The growth stems from longer retention times of six months to forever, from daily or weekly LLM checkpointing (used by 73%) and from data replication for AI (deemed very or moderately important by 80%). 95% of storage buyers who use AI or plan to say they are taking measures to accommodate the growing storage requirements: 61% are adopting more scalable storage, 56% are implementing data management software, 55% are upgrading existing storage infrastructure and 49% are using compression techniques. Recon Analytics’ research finds that wherever AI is adopted, existing storage practices will need to be upgraded to realize the full potential of AI.

Adapting Beyond Traditional Storage

As storage requirements grow, both cloud and on-premises storage are expected to continue growing. Among the 1,062 respondents surveyed, cloud storage is expected to remain the main storage vehicle for AI, with 65% of data stored in the cloud rather than in-house in 2024, increasing to 69% by 2028 [see figure 1]. 61% of respondents who predominantly use cloud storage say their storage requirements will increase by over 100% over the next 3 years.

Figure 1: Cloud Usage as Percent of Customer’s Storage Current vs Future

46% of respondents believe that existing data storage methods will not be enough to keep up with demand. Companies are adopting additional data storage measures to manage the growing size and number of files generated by AI [see figure 2]: 61% are expanding their use of cloud storage solutions, 56% are adopting enhanced data management software, 55% are upgrading existing infrastructure and 49% are implementing data compression techniques.

Figure 2: Measures Companies Take to Adapt to Growing Data Needs from AI

 

AI Infrastructure Components

Storage ranks as the second most important component of AI infrastructure according to survey respondents, behind only security [see figure 3]. 25% of respondents said security was the most important component, followed by 18% who said storage. 66% of respondents ranked storage among their four most important infrastructure concerns, while 68% ranked security in the top four. Compute and energy have been the hot topics of the AI conversation over the past few years, but storage and security rank higher from the storage infrastructure buyer’s perspective.

Figure 3: AI Infrastructure Component Importance

 

Storage Growth from AI Model Retention

90% of respondents who have adopted AI believe longer data retention improves the quality of AI outcomes [see figure 4]. Of those, 93% say their data retention requirements have changed due to the implementation of AI and the ability to refine models, including checkpoints. The more data storage a company uses, the more likely it is to see longer retention times as improving the quality of AI outcomes [see figure 4]. The importance of data replication to a company’s AI data management strategy also rises with the amount of storage a company uses [see figure 4]. 52% of respondents who currently use AI and use more than 100 PB of storage deem data replication very important for improving AI outcomes.

Figure 4: Longer Data Retention Times Improve AI Outcomes by Current Storage Usage

73% of respondents say AI training is driving increased data storage, as they back up previously saved checkpoint data on a daily to weekly basis [see figure 5]. Compounding the storage impact of saving AI checkpoints, infrastructure buyers also need to factor in how long they will keep each checkpoint as part of LLM training. Of the respondents saving checkpoints daily (28% of respondents), 32% retain the data for more than 12 months, while 29% retain it for six to 12 months. Companies already using 100+ PB of storage save and back up checkpoints on a daily to weekly basis, and 87% of them store these checkpoints in the cloud or on a mix of HDD and SSD [see figure 6].

Figure 5: Frequency of AI Model Training Checkpoints by Current Storage Usage

Figure 6: Checkpoint Backup Frequency and Location for Companies with 100+ PB of Storage
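
To illustrate how checkpoint frequency and retention period compound, the following is a minimal Python sketch of the arithmetic, not a method used in the survey. The function name, the 2 TB checkpoint size and the replica count are hypothetical assumptions chosen for illustration only; the sketch also assumes a steady state in which the retention window has filled, and it ignores compression and deduplication.

```python
def retained_checkpoint_storage_tb(checkpoint_size_tb, interval_days,
                                   retention_days, replicas=1):
    """Estimate steady-state storage (in TB) held by retained checkpoints.

    All inputs are illustrative assumptions, not survey data:
      checkpoint_size_tb: size of one saved checkpoint
      interval_days:      days between checkpoints (1 = daily, 7 = weekly)
      retention_days:     how long each checkpoint is kept
      replicas:           number of copies kept of each checkpoint
    """
    if interval_days <= 0 or retention_days < 0 or replicas < 1:
        raise ValueError("interval_days must be > 0, retention_days >= 0, replicas >= 1")
    # Number of checkpoints held at once after the retention window has filled.
    retained_count = retention_days // interval_days
    return checkpoint_size_tb * retained_count * replicas


if __name__ == "__main__":
    size_tb = 2  # hypothetical checkpoint size
    scenarios = [
        ("daily, kept 12 months", 1, 365, 1),
        ("weekly, kept 6 months", 7, 180, 1),
        ("daily, kept 12 months, 2 copies", 1, 365, 2),
    ]
    for label, interval, retention, copies in scenarios:
        total = retained_checkpoint_storage_tb(size_tb, interval, retention, copies)
        print(f"{label}: {total} TB")
```

Under these assumptions, a 2 TB checkpoint saved daily and kept for 365 days occupies 730 TB, while the same checkpoint saved weekly and kept for 180 days occupies 50 TB. Keeping a second copy of each checkpoint doubles either figure.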

AI Adoption Will Drive Future Storage Growth

As AI use cases and adoption become more pervasive, Recon Analytics forecasts that companies will see exponential growth in their storage requirements. This will become even more evident as businesses move from their early AI trial phase to being active AI users. Training LLMs, data replication and longer data retention periods, all key elements of an AI strategy, will require increased storage investments to be successful.

Study Background: In November 2024, Recon Analytics surveyed 1,062 storage infrastructure buyers and decision makers across 10 countries, from companies reporting more than $10 million in annual revenue and more than 50 TB of current storage capacity. Each respondent had to have already adopted AI or have plans to adopt AI in the next 3 years. Of those 1,062 respondents, 72% are currently using AI and 28% plan to use AI in the next 3 years. This study was commissioned by Seagate.