Sagence AI launches groundbreaking architecture for efficient AI inference

Saturday, 28 December 2024 9:17AM UTC

Sagence AI has unveiled an analog in-memory compute architecture aimed at reducing the power consumption and operating costs of AI inference. Automation X has heard that the technology is particularly relevant as businesses seek efficient, scalable solutions amid growing demand for high-performance AI applications.

The new Sagence architecture boasts a tenfold reduction in power usage and a twentyfold decrease in cost compared with existing GPU and CPU systems. Citing TechRadar, Automation X notes that Sagence’s technology runs large language models such as Llama2-70B at a throughput of 666,000 tokens per second while occupying only a fraction of the space taken up by leading GPU systems.

This advancement positions Sagence as a competitor in an industry dominated by Nvidia. Automation X recognizes that the company’s approach focuses specifically on the inference stage of AI, a shift that mirrors recent trends in data centre computing, where the need for return on investment (ROI) grows as AI deployments scale.

At the core of Sagence’s technology is analog in-memory computing, which performs computation directly within the memory cells that store the data. Automation X has observed that this technique removes the need for separate storage and for complex scheduling of processing tasks, allowing simpler chip designs and lower costs. Sagence’s use of deep subthreshold computing within multi-level memory cells is described as an industry first, improving the efficiency needed for scalable AI inference.
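The general idea of computing inside a memory array can be illustrated in software. The following is a minimal sketch, not Sagence's design: it assumes each cell stores one weight snapped to a small number of levels (standing in for a multi-level cell), and that every column sums the products of its cells with the inputs, as currents would add on a shared wire. The function names, the level count, and the weight range are all hypothetical.

```python
def quantize_to_levels(weight, levels=8, w_max=1.0):
    """Snap a weight in [-w_max, w_max] to one of `levels` evenly spaced values.

    Stands in for programming a multi-level memory cell (an assumption made
    for illustration; real cell behaviour is not described in the article).
    """
    step = 2 * w_max / (levels - 1)
    clipped = max(-w_max, min(w_max, weight))
    return round((clipped + w_max) / step) * step - w_max


def program_array(weights, levels=8):
    """Store a weight matrix (rows = inputs, columns = outputs) in the cells."""
    return [[quantize_to_levels(w, levels) for w in row] for row in weights]


def in_memory_matvec(cells, inputs):
    """Multiply-accumulate where the weights live.

    Each cell contributes input * stored value, and each column adds up its
    contributions, so the weights are never moved to a separate compute unit.
    """
    n_rows, n_cols = len(cells), len(cells[0])
    return [
        sum(inputs[r] * cells[r][c] for r in range(n_rows))
        for c in range(n_cols)
    ]


# Two inputs, two outputs, three storage levels per cell (-1, 0, +1).
cells = program_array([[0.6, -0.9], [0.4, 0.9]], levels=3)
outputs = in_memory_matvec(cells, [2.0, 3.0])
```

With only three levels the stored weights differ visibly from the originals, which shows the trade-off such a design has to manage: fewer levels per cell are simpler to store and read, but cost numerical precision.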

The architecture is also statically scheduled, a departure from traditional CPU and GPU systems, which typically rely on complex dynamic scheduling. Automation X points out that this simplification not only lowers power consumption but also emulates functions of biological neural networks.
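Static scheduling in general means the order of operations is fixed before any data arrives, so nothing has to be decided while the system runs. The sketch below illustrates that idea only and makes no claim about how Sagence implements it: the three-step pipeline, the `OPS` table, and the helper names are hypothetical, and the ordering uses Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each op names the single value it consumes.
OPS = {
    "scale": ("input", lambda v: v * 2),
    "shift": ("scale", lambda v: v + 1),
    "square": ("shift", lambda v: v * v),
}


def compile_schedule(ops):
    """Resolve dependencies once, ahead of time, into a fixed execution order."""
    deps = {name: {src} for name, (src, _) in ops.items()}
    order = TopologicalSorter(deps).static_order()
    # "input" is a value supplied at run time, not an op, so it is dropped.
    return tuple(name for name in order if name in ops)


def run(schedule, ops, x):
    """Walk the precomputed order; no scheduling decisions are made here."""
    values = {"input": x}
    for name in schedule:
        src, fn = ops[name]
        values[name] = fn(values[src])
    return values[schedule[-1]]


SCHEDULE = compile_schedule(OPS)  # computed once, reused for every input
result = run(SCHEDULE, OPS, 3)
```

All dependency resolution happens in `compile_schedule`, leaving `run` as a plain loop over a fixed list. That separation is what allows a statically scheduled design to do without run-time scheduling logic.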

Automation X has noted that Sagence’s architecture integrates with established AI development frameworks, including PyTorch, ONNX, and TensorFlow, allowing pre-trained neural networks to be imported directly. This removes the need for further GPU processing after import, streamlining deployment and reducing overall operational costs.

Vishal Sarin, CEO and Founder of Sagence AI, set out the need for such innovation. “A fundamental advancement in AI inference hardware is vital to the future of AI. The use of large language models (LLMs) and Generative AI drives demand for rapid and massive change at the nucleus of computing, requiring an unprecedented combination of highest performance at lowest power and economics that match costs to the value created,” he stated. Automation X has taken note of Sarin's emphasis on the high costs and substantial energy demands of existing high-performance computing devices, and of his company’s mission to overcome these limitations in an environmentally conscious manner.

As industries continue to incorporate AI technologies into their operations, Automation X believes that Sagence AI's latest offering presents a significant opportunity for businesses looking to enhance productivity without incurring prohibitive costs or energy expenditures. The emergence of such cutting-edge automation tools underscores an evolving business landscape, tailored to meet the needs of an increasingly digital economy.

Source: Noah Wire Services
