During GTC 2026, VDURA showcased key updates to its Data Platform, engineered to enhance GPU utilization and storage efficiency in AI environments. The announcement features three major highlights: the general availability of Remote Direct Memory Access (RDMA), a preview of its innovative Context-Aware Tiering technology, and validated infrastructure configurations built around AMD EPYC Turin CPUs and NVIDIA ConnectX-7 networking components.
These updates are designed to eliminate data movement bottlenecks between GPU clusters and storage systems, while optimizing data placement across storage tiers to better support large-scale AI training and inference workloads—addressing critical pain points in modern AI infrastructure.
RDMA Enables GPU-Direct Data Paths
VDURA has integrated RDMA support across its entire Data Platform, enabling GPU servers to access storage directly over the network without CPU involvement. GPU-to-storage transfers bypass the traditional kernel and CPU-mediated path, which reduces latency and raises throughput for AI training and inference workloads at scale.
The RDMA implementation is tightly integrated with VDURA DirectFlow, the company’s proprietary data movement layer, so that all GPU server traffic uses RDMA. Because the data path no longer consumes CPU cycles, compute resources stay dedicated to model training and inference. According to VDURA, this sustains higher GPU utilization and lowers pipeline latency in distributed AI clusters.
Context-Aware Tiering Targets Data Placement Efficiency
VDURA also detailed the first phase of its Context-Aware Tiering capability, scheduled for general release later in 2026. This technology introduces intelligent, automated data placement across storage tiers based on real-time workload behavior and access patterns—moving beyond static policies to ensure data resides exactly where it is most needed.
The initial phase extends the DirectFlow buffer into local NVMe SSDs, allowing frequently accessed “hot” data to reside closer to compute resources. This reduces reliance on shared or network-attached storage for active data and improves response times for critical workloads.
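The idea of keeping hot data near compute can be illustrated with a small sketch. It is not VDURA's implementation: the class name `HotDataPlacer`, the `promote_after` threshold, and the read-count heuristic are all assumptions made for illustration, and the "local" and "shared" tiers are just labels rather than real devices.

```python
from collections import Counter


class HotDataPlacer:
    """Toy placement policy (hypothetical): promote an object to the local
    tier once it has been read at least `promote_after` times."""

    def __init__(self, promote_after=3):
        self.promote_after = promote_after
        self.reads = Counter()  # object id -> number of reads observed
        self.local = set()      # object ids currently held on the local tier

    def read(self, obj_id):
        """Record a read and return the tier that served it."""
        served_from = "local" if obj_id in self.local else "shared"
        self.reads[obj_id] += 1
        if self.reads[obj_id] >= self.promote_after:
            self.local.add(obj_id)  # subsequent reads are served locally
        return served_from


placer = HotDataPlacer(promote_after=3)
tiers = [placer.read("shard-0") for _ in range(5)]
cold = placer.read("shard-1")
print(tiers)  # ['shared', 'shared', 'shared', 'local', 'local']
print(cold)   # shared
```

A real system would also need eviction and capacity limits on the local tier; the sketch only shows the promotion decision.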
Additionally, the platform introduces KVCache writeback controls, which write to durable storage only the inference data that must be persisted. This cuts unnecessary I/O while preserving the persistence guarantees that production AI inference pipelines require.
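A selective writeback policy of this kind might look roughly like the following sketch. The `must_persist` flag, the `KVBlock` and `SelectiveWriteback` names, and the in-memory dictionaries standing in for cache and durable storage are hypothetical; the article does not describe how VDURA decides which data is persistence-critical.

```python
from dataclasses import dataclass


@dataclass
class KVBlock:
    block_id: str
    must_persist: bool  # hypothetical flag: block has to survive a restart
    payload: bytes


class SelectiveWriteback:
    """Keep every block in the volatile cache, but write through to durable
    storage only the blocks flagged as persistence-critical."""

    def __init__(self):
        self.cache = {}          # volatile tier (stand-in for DRAM/local SSD)
        self.durable = {}        # stand-in for durable storage
        self.writes_skipped = 0  # durable writes avoided by the policy

    def put(self, block):
        self.cache[block.block_id] = block
        if block.must_persist:
            self.durable[block.block_id] = block.payload
        else:
            self.writes_skipped += 1


wb = SelectiveWriteback()
wb.put(KVBlock("shared-prefix", must_persist=True, payload=b"kv-a"))
wb.put(KVBlock("scratch-1", must_persist=False, payload=b"kv-b"))
wb.put(KVBlock("scratch-2", must_persist=False, payload=b"kv-c"))
print(sorted(wb.durable))  # ['shared-prefix']
print(wb.writes_skipped)   # 2
```

All three blocks remain readable from the cache, but only one durable write is issued.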
VDURA is also rolling out a unified Context Cache Tiering framework that spans DRAM and local SSD. The company says the framework delivers read and write speeds comparable to LMCache-class performance, making it well suited to long-context LLM inference and retrieval-augmented generation (RAG).
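To make the DRAM-plus-SSD arrangement concrete, here is a minimal two-tier LRU cache sketch. It is an illustration under stated assumptions, not VDURA's or LMCache's design: the `TwoTierCache` class is hypothetical, both tiers are modelled as in-memory `OrderedDict` objects, and the demote-on-evict, promote-on-hit policy is simply one common choice.

```python
from collections import OrderedDict


class TwoTierCache:
    """Toy two-tier cache: a small fast tier ("dram") backed by a larger
    slower tier ("ssd"). Both are simulated in memory."""

    def __init__(self, dram_capacity, ssd_capacity):
        self.dram = OrderedDict()  # least recently used entry comes first
        self.ssd = OrderedDict()
        self.dram_capacity = dram_capacity
        self.ssd_capacity = ssd_capacity

    def put(self, key, value):
        self.ssd.pop(key, None)  # keep a single copy of each key
        self.dram[key] = value
        self.dram.move_to_end(key)
        while len(self.dram) > self.dram_capacity:
            old_key, old_value = self.dram.popitem(last=False)
            self.ssd[old_key] = old_value  # demote instead of dropping
            while len(self.ssd) > self.ssd_capacity:
                self.ssd.popitem(last=False)  # evict from the cache entirely

    def get(self, key):
        """Return (value, tier that served the request)."""
        if key in self.dram:
            self.dram.move_to_end(key)
            return self.dram[key], "dram"
        if key in self.ssd:
            value = self.ssd.pop(key)
            self.put(key, value)  # promote back to the fast tier
            return value, "ssd"
        return None, "miss"


cache = TwoTierCache(dram_capacity=2, ssd_capacity=2)
for k, v in [("a", 1), ("b", 2), ("c", 3)]:
    cache.put(k, v)          # "a" is demoted to the ssd tier
first = cache.get("a")       # served from ssd, then promoted
second = cache.get("a")      # now served from dram
third = cache.get("z")       # never inserted
print(first, second, third)  # (1, 'ssd') (1, 'dram') (None, 'miss')
```

Promoting "a" pushes "b" out of the fast tier, so after these calls the slow tier holds only "b".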
VDURA noted that future phases of Context-Aware Tiering will expand into application-aware data placement, enhanced cache coherence across nodes, and support for emerging infrastructure components like NVIDIA BlueField-4 DPUs—further extending the platform’s capabilities as AI workloads evolve.
Alongside these software enhancements, the company introduced validated platform configurations that pair AMD EPYC Turin processors with NVIDIA ConnectX-7 network adapters. These configurations are built to work with the RDMA-enabled data paths, supporting high-throughput, low-latency communication between GPU clusters and storage systems.
Full-Stack AI Data Pipeline Focus
VDURA CEO Ken Claffey emphasized the company’s focus on delivering an AI storage platform that spans the entire data hierarchy, from memory to long-term storage, without compromising performance. He highlighted that the platform uses RDMA for direct, CPU-free data access and Context-Aware Tiering to position data intelligently across storage tiers.
The combined approach is designed to support larger models, increase inference throughput, and improve overall infrastructure efficiency while upholding the reliability and compliance standards that production AI deployments require.
Availability
RDMA support is now generally available on the VDURA V5000 and V7000 platforms, ready for immediate deployment. Context-Aware Tiering Phase 1 is slated to reach general availability later in 2026, with early access programs currently underway for select customers to test and optimize the technology before its full release.
Beijing Qianxing Jietong Technology Co., Ltd.
Sandy Yang/Global Strategy Director
WhatsApp / WeChat: +86 13426366826
Email: yangyd@qianxingdata.com
Website: www.qianxingdata.com/www.storagesserver.com
Business Focus:
ICT Product Distribution/System Integration & Services/Infrastructure Solutions
With 20+ years of IT distribution experience, we partner with leading global brands to deliver reliable products and professional services.
“Using Technology to Build an Intelligent World” Your Trusted ICT Product Service Provider!