Efficient analysis of large-scale simulation data with Los Alamos and AirMettle partnership


Los Alamos National Laboratory and AirMettle have collaborated on a way to efficiently analyze complex data from large-scale simulation campaigns while keeping that data secure. According to a press release, the approach performs some of the data analytics close to where the data is stored, which minimizes data transfer.

Los Alamos National Laboratory announced, "These enhancements build on the benefits of AirMettle’s existing unique architecture."

On October 10, 2023, Los Alamos National Laboratory unveiled a collaborative effort with AirMettle to introduce standardized APIs for computational storage devices. According to a press release, the work builds on AirMettle's Real-Time Smart Data Lake (RT-SDL) architecture and establishes a common application programming interface (API) that extends the Non-Volatile Memory Express (NVMe) standard. The API enables in-place analytics, significantly reducing the cost and time required to reach scientific insights, while erasure coding within the RT-SDL framework maintains robust data protection.

“Our scientific large-scale simulations can generate hundreds of petabytes of highly dimensional floating-point data,” said Gary Grider, High Performance Computing division leader at Los Alamos, according to a press release. “But the data associated with a scientific feature of interest can be orders of magnitude smaller than the written data, so a key challenge is quickly and efficiently finding what’s relevant in this sea of data. To optimize this process, we’ve been drawn towards computational storage — processing data in-place and near storage — to eliminate unnecessary data movement while maintaining parallelism and adequate data protection.”
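
The data-movement saving Grider describes can be illustrated with a toy Python sketch. Everything in it is an assumption made for illustration: the StorageNode class, its method names, and the threshold filter are hypothetical and do not represent the AirMettle or NVMe computational storage API. It only compares how many bytes leave storage when a filter runs next to the data rather than on the client.

```python
import struct


class StorageNode:
    """Hypothetical stand-in for a storage device holding simulation output
    as packed little-endian 64-bit floats. Not a real vendor API."""

    def __init__(self, values):
        self._blob = struct.pack(f"<{len(values)}d", *values)

    def read_all(self):
        # Conventional path: every stored byte is shipped to the client.
        return self._blob

    def select_greater_than(self, threshold):
        # In-place path: the filter runs beside the data, and only the
        # matching values are shipped back.
        count = len(self._blob) // 8
        values = struct.unpack(f"<{count}d", self._blob)
        matches = [v for v in values if v > threshold]
        return struct.pack(f"<{len(matches)}d", *matches)


def bytes_moved(node, threshold):
    """Return (bytes moved without pushdown, bytes moved with pushdown)."""
    full = len(node.read_all())
    filtered = len(node.select_greater_than(threshold))
    return full, filtered


if __name__ == "__main__":
    node = StorageNode([float(i) for i in range(1000)])
    full, filtered = bytes_moved(node, 989.5)
    print(f"full read: {full} bytes, filtered read: {filtered} bytes")
```

In this toy case the feature of interest is the small set of values above the threshold, so the filtered read moves far fewer bytes than the full read; the gap widens as the stored data grows relative to the feature.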

According to the AirMettle website, the company is building a real-time smart data lake system to streamline big data analytics and substantially boost processing speed. The system operates within the data lake storage layer, where it manages extract, transform, load (ETL) processes and computations, reduces network congestion, improves data currency, and enables real-time operations.

“Accelerating analytics of vast volumes of experiment and simulation data is a key requirement and challenge for the scientific community,” said Donpaul Stephens, founder and CEO of AirMettle, Inc., according to a press release.