This week’s AI tip: adopting and developing AI-dedicated hardware
As we navigate the ever-evolving landscape of artificial intelligence (AI), one thing is abundantly clear: the future of AI hinges on our ability to overcome significant hurdles related to cost, computing power and energy consumption. While AI continues to revolutionize industries and enhance our daily lives, the demands it places on our technological infrastructure are immense. The key to unlocking the next wave of AI advancements lies in the adoption and development of dedicated hardware.
Today's AI systems are incredibly powerful – but they’re also resource-intensive. Training complex models and processing large datasets require substantial computational power, which translates to high costs and significant energy consumption. These factors not only limit the scalability of AI solutions but also pose environmental challenges.
Dedicated AI hardware, such as neural processing units (NPUs) and application-specific integrated circuits (ASICs), is being developed to optimize AI performance. These dedicated processors are tailored to efficiently handle the matrix operations and computations prevalent in deep learning.
How do NPUs work?
An NPU is a specialized integrated circuit designed to accelerate AI and machine learning (ML) workloads. It contains an array of processing elements (PEs) arranged in a 2D grid.
The key components of an NPU include the following; a short sketch after the list shows how they fit together:
- Matrix Multiplication and Addition Units: Used to efficiently compute matrix multiplications and additions, which are the core operations in neural networks.
- Activation Function Units: Apply non-linear transformations such as ReLU, sigmoid and tanh; simple functions like ReLU are computed directly, while smoother functions are typically approximated with lookup tables or piecewise approximations.
- On-chip Memory: NPUs contain specialized on-chip SRAM to store weights, activations and intermediate data to minimize data movement.
- DMA Engines: Direct memory access (DMA) engines enable fast and efficient data transfer between the NPU's on-chip memory and external DRAM.
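To see how these components work together, here is a minimal, illustrative sketch in NumPy of one fully connected layer as an NPU might execute it. The function names (`dma_load`, `relu`), shapes and data flow are assumptions made for illustration; they do not correspond to any particular vendor's hardware or API.

```python
import numpy as np

def dma_load(tensor):
    """Stand-in for a DMA transfer from external DRAM into on-chip SRAM."""
    return np.ascontiguousarray(tensor)

def relu(x):
    """Activation function unit: ReLU is an element-wise max with zero."""
    return np.maximum(x, 0)

# 1. DMA engines copy weights and input activations into on-chip memory.
#    (Shapes are illustrative; a real NPU works on tiles sized to its SRAM.)
weights = dma_load(np.random.randn(256, 784).astype(np.float32))
bias = dma_load(np.random.randn(256).astype(np.float32))
activations = dma_load(np.random.randn(784).astype(np.float32))

# 2. The matrix multiplication/addition units compute the layer's MACs.
pre_activation = weights @ activations + bias

# 3. The activation function unit applies the non-linearity.
output = relu(pre_activation)

# 4. The result stays in on-chip memory (or is DMA'd back out) and becomes
#    the input activations of the next layer.
print(output.shape)  # (256,)
```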
NPUs leverage the inherent parallelism in neural network computations by performing a large number of multiply-accumulate (MAC) operations simultaneously across the array of PEs.
This lets them achieve much higher throughput and efficiency on AI inference workloads than general-purpose CPUs and GPUs. To further improve performance and power efficiency, NPUs often use reduced-precision arithmetic such as 8-bit or 16-bit integer quantization, which packs more computations into a given silicon area and moves more values per unit of memory bandwidth.
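As a rough illustration of integer quantization, the sketch below quantizes the weights and activations of a small layer to int8 with a single symmetric scale per tensor, accumulates the products in int32 (as NPU MAC arrays typically do), and rescales the result back to floating point. The scheme and sizes are illustrative assumptions, not a specific NPU's quantization recipe.

```python
import numpy as np

def quantize_symmetric(x, num_bits=8):
    """Map float values to signed integers with a single scale: value ~= scale * q."""
    qmax = 2 ** (num_bits - 1) - 1                      # 127 for int8
    scale = np.max(np.abs(x)) / qmax
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int8)
    return q, scale

# Float reference layer: y = W @ x
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 128)).astype(np.float32)
x = rng.standard_normal(128).astype(np.float32)
y_fp32 = W @ x

# Quantize weights and activations to int8.
W_q, w_scale = quantize_symmetric(W)
x_q, x_scale = quantize_symmetric(x)

# Integer MACs: int8 * int8 products accumulated in int32, as on an NPU,
# followed by a single rescale back to floating point.
acc_int32 = W_q.astype(np.int32) @ x_q.astype(np.int32)
y_int8 = acc_int32 * (w_scale * x_scale)

print("max abs error:", np.max(np.abs(y_fp32 - y_int8)))
```

The small error printed at the end is the price of the reduced precision; for most inference workloads it is negligible compared to the gains in density and bandwidth.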
Many NPUs employ a dataflow architecture where data flows through the PEs in a systolic manner. Each PE performs a small part of the overall computation and passes the result to its neighbor, making it possible for the NPU to keep data movement local and avoid expensive memory accesses.
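The sketch below simulates, cycle by cycle, a small output-stationary systolic array performing a matrix multiplication: each PE keeps one accumulator, multiplies the values arriving from its left and top neighbours, and forwards them onward, so every data transfer is strictly local. It is a functional model for building intuition under assumed conventions (output-stationary dataflow, skewed edge feeding), not a description of any real NPU's microarchitecture.

```python
import numpy as np

def systolic_matmul(A, B):
    """Cycle-by-cycle model of an output-stationary systolic array computing A @ B.

    PE (i, j) accumulates C[i, j]. A streams in from the left edge, B from the
    top edge, each row/column delayed (skewed) so matching elements meet at the
    right PE; values only ever move to an adjacent neighbour.
    """
    M, K = A.shape
    K2, N = B.shape
    assert K == K2

    acc = np.zeros((M, N), dtype=np.result_type(A, B))  # one accumulator per PE
    a_reg = np.zeros((M, N), dtype=A.dtype)              # values moving rightwards
    b_reg = np.zeros((M, N), dtype=B.dtype)              # values moving downwards

    # Enough cycles for the last skewed input to reach the far-corner PE.
    for t in range(M + N + K - 2):
        # Each PE passes its current values to its right / lower neighbour.
        a_reg = np.roll(a_reg, 1, axis=1)
        b_reg = np.roll(b_reg, 1, axis=0)

        # Feed skewed inputs at the array edges (zeros outside the valid range).
        for i in range(M):
            k = t - i                                    # row i is delayed by i cycles
            a_reg[i, 0] = A[i, k] if 0 <= k < K else 0
        for j in range(N):
            k = t - j                                    # column j is delayed by j cycles
            b_reg[0, j] = B[k, j] if 0 <= k < K else 0

        # Every PE performs one multiply-accumulate in parallel.
        acc += a_reg * b_reg

    return acc

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6)).astype(np.float32)
B = rng.standard_normal((6, 3)).astype(np.float32)
assert np.allclose(systolic_matmul(A, B), A @ B)
```

Skewing the inputs by one cycle per row or column is what lines up the matching elements of A and B at each PE; in silicon the same effect comes from pipeline registers rather than index arithmetic.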
In summary, NPUs achieve their high performance and efficiency on AI workloads through a combination of specialized hardware (large arrays of PEs, on-chip memory, DMA engines), parallel processing, reduced precision arithmetic, dataflow architectures and dedicated software stacks. As a result, they can greatly accelerate the execution of neural networks compared to general-purpose processors like CPUs and GPUs.