Summary: Researchers have created the first sort-in-memory hardware system that can perform complex, nonlinear sorting operations without using traditional comparators. The team demonstrated a fast, energy-efficient, and robust architecture built on memristors using a novel Digit Read mechanism and a Tree Node Skipping engine.
Compared with conventional ASIC-based sorters, benchmark tests revealed significant improvements in speed, power, and area efficiency. This development opens the door to high-performance, intelligent hardware for AI, big data, and edge computing.
Key Facts
- Comparator-free design: the new TNS and Digit Read schemes eliminate the comparator bottleneck of conventional sorters.
- Outstanding gains: up to 7.70× throughput and 160.4× energy efficiency compared with ASIC sorters.
- Versatile applications: validated on big data workloads, neural networks, and pathfinding.
Peking University Research
A research team led by Prof. Yang Yuchao of the School of Electronic and Computer Engineering, Peking University Shenzhen Graduate School, has created the first sort-in-memory hardware system specifically designed for complex, nonlinear sorting tasks.
The study, published in Nature Electronics as “A fast and reconfigurable sort-in-memory system based on memristors,” proposes a comparator-free architecture, overcoming one of the most challenging problems in processing-in-memory (PIM) technology.
Background
Although sorting is a fundamental computing task, accelerating it with conventional hardware is difficult because of its nonlinear nature. Memristor-based PIM designs have shown promise for linear operations, but they have long struggled with sorting.
Prof. Yang’s team eliminated the need for comparators by introducing a novel Digit Read mechanism and creating an algorithm–hardware co-design that reimagines how sorting can be carried out in memory.
Why It Matters
This work represents a major advance in the transition from linear matrix operations to nonlinear, high-complexity tasks such as sorting.
By proposing a robust and reconfigurable sorting framework, the team delivers a high-throughput, energy-efficient solution that meets the performance requirements of contemporary big data and AI applications.
Key Results
The study describes a comparator-free sorting system based on a one-transistor–one-resistor (1T1R) memristor array that uses a Digit Read mechanism in place of traditional compare-and-select logic, significantly improving computational efficiency.
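As a rough software analogue, a comparator-free Digit Read scheme can be pictured as ordering values by reading their digits (bits) from the most significant position down, so that sorted order emerges from digit reads alone rather than pairwise comparisons. The sketch below illustrates that idea in plain Python; it is a hypothetical rendering of the concept, not the paper’s hardware algorithm.

```python
def digit_read_sort(values, bit_width=8):
    """Sort non-negative integers by 'reading' bits from the most
    significant bit downward, with no pairwise comparisons.

    Illustrative analogue of a comparator-free Digit Read scheme:
    at each bit position, values are partitioned by the digit read,
    so ordering emerges entirely from digit reads.
    """
    def partition(vals, bit):
        if bit < 0 or len(vals) <= 1:
            return vals
        zeros = [v for v in vals if not (v >> bit) & 1]  # bit reads 0
        ones = [v for v in vals if (v >> bit) & 1]       # bit reads 1
        return partition(zeros, bit - 1) + partition(ones, bit - 1)

    return partition(list(values), bit_width - 1)
```

Note that no element is ever compared against another element; each value is only examined digit by digit, which is what removes the comparator from the critical path.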
Additionally, the team created the Tree Node Skipping (TNS) algorithm, which reuses traversal paths to eliminate redundant operations and speed up sorting. Three Cross-Array TNS (CA-TNS) strategies were developed to scale performance across a range of datasets and configurations.
The Multi-Bank strategy distributes large datasets across arrays for parallel processing, Bit-Slice partitions data bit-widths to support pipelined sorting, and Multi-Level exploits the multiple conductance states of memristors to improve intra-cell parallelism.
Together, these advances create a flexible, adaptable sorting engine that can handle a range of data widths and complexities.
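One way to understand tree node skipping is that digit-based sorting implicitly builds a binary radix tree over the values, and traversal can visit only populated branches while skipping empty subtrees outright. The snippet below is a speculative software illustration of that idea under this interpretation; the paper’s actual TNS engine operates in memristor hardware and its details are not reproduced here.

```python
def tns_sort(values, bit_width=8):
    """Sketch of a tree-node-skipping traversal: values are inserted
    into a binary radix trie keyed by their bits (MSB first), then a
    depth-first walk visits only populated children, skipping empty
    subtrees entirely. Hypothetical rendering of the TNS idea."""
    root = {}
    for v in values:
        node = root
        for bit in range(bit_width - 1, -1, -1):
            node = node.setdefault((v >> bit) & 1, {})
        node.setdefault('vals', []).append(v)  # duplicates share a leaf

    out = []
    def walk(node):
        out.extend(node.get('vals', []))
        for b in (0, 1):          # 0-branch before 1-branch => ascending
            if b in node:         # skip: empty subtrees are never visited
                walk(node[b])
    walk(root)
    return out
```

The cost of the walk scales with the number of populated nodes rather than the full tree, which is the sense in which skipping avoids redundant operations.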
Application Demonstrations
To evaluate real-world performance, the group fabricated a memristor chip and integrated it with FPGA and PCB technology into a complete, end-to-end demonstration system. In benchmark testing, it delivered up to 7.70× higher throughput, 160.4× higher energy efficiency, and 32.46× higher area efficiency than state-of-the-art ASIC-based sorting systems.
The system’s effectiveness was demonstrated in real-world scenarios, such as computing the shortest paths among 16 Beijing Metro stations with low power consumption and latency.
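Dijkstra’s algorithm repeatedly selects the node with the minimum tentative distance, and that selection step is exactly where a hardware sorter can be substituted for software comparison logic. The sketch below marks that step in a plain-Python implementation; the graph is a small made-up example, not the actual Beijing Metro data.

```python
def dijkstra(graph, start):
    """Shortest path distances via Dijkstra's algorithm. The frontier
    min-selection (the `min(...)` call) is the step a sort-in-memory
    engine would accelerate in hardware."""
    dist = {start: 0}
    visited = set()
    while len(visited) < len(graph):
        frontier = {n: d for n, d in dist.items() if n not in visited}
        if not frontier:
            break                      # remaining nodes are unreachable
        node = min(frontier, key=frontier.get)  # sorter-accelerated step
        visited.add(node)
        for nbr, w in graph[node].items():      # relax outgoing edges
            if dist[node] + w < dist.get(nbr, float('inf')):
                dist[nbr] = dist[node] + w
    return dist

# Illustrative weighted graph (hypothetical, not metro data)
graph = {
    'A': {'B': 2, 'C': 5},
    'B': {'A': 2, 'C': 1, 'D': 4},
    'C': {'A': 5, 'B': 1, 'D': 1},
    'D': {'B': 4, 'C': 1},
}
```

Because the min-selection dominates the inner loop on dense frontiers, offloading it to an in-memory sorter is where the latency and power savings reported above would come from.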
Integrating TNS with memristor-based matrix-vector multiplication in the PointNet++ model enabled run-time tunable sparsity in neural network inference, delivering 15× faster inference and 67.1% higher energy efficiency.
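Run-time tunable sparsity of this kind generally reduces to a top-k selection over activations, which is itself a sorting problem. The generic snippet below illustrates the principle; it is not the paper’s PointNet++ integration, and the function name and threshold scheme are assumptions for illustration.

```python
def prune_topk(activations, k):
    """Keep the k largest activations and zero the rest. The top-k
    selection (a sorting operation) is what an in-memory sorter would
    perform in situ; k can be changed at run time to tune sparsity."""
    order = sorted(range(len(activations)),
                   key=lambda i: activations[i], reverse=True)
    keep = set(order[:k])            # indices of the k largest values
    return [a if i in keep else 0.0 for i, a in enumerate(activations)]
```

Raising or lowering k at inference time trades accuracy for compute, which is the sense in which sparsity is "run-time tunable" rather than fixed at training time.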
These results demonstrate the system’s broad application in both conventional and AI-driven workloads.
Future Implications
This work redefines what is possible in processing-in-memory systems. By demonstrating a flexible, efficient, and scalable sorting system, Prof. Yang’s team has opened the door to next-generation intelligent hardware capable of supporting AI, real-time analytics, and edge computing. It pushes the limits of what memristor-based systems can achieve and lays the groundwork for accelerating future nonlinear computations.
About this research in AI and computational neuroscience
Author: Jiang Zhang
Source: Peking University
Contact: Jiang Zhang – Peking University
Image: The image is credited to Neuroscience News
Original Research: Open access.
“A fast and reconfigurable sort-in-memory system based on memristors” by Yang Yuchao et al. Nature Electronics
Abstract
A fast and reconfigurable sort-in-memory system based on memristors
Sorting is a fundamental component of contemporary computing systems. Hardware sorters are typically based on the von Neumann architecture, and their performance is constrained by CMOS memory and data-transfer speeds.
Memristors could be used to reduce these limitations, but current systems still rely on comparison operations, which results in limited sorting performance.
We present a fast and reconfigurable sort-in-memory system based on digit reads of one-transistor–one-resistor memristor arrays.
We create digit-read tree node skipping that supports a range of data types and quantities. We apply cross-array tree node skipping using multi-bank, bit-slice, and multi-level strategies.
We demonstrate experimentally that our comparator-free sort-in-memory system can improve throughput by a factor of 7.70, energy efficiency by a factor of 160.4, and area efficiency by a factor of 32.46 compared with conventional sorting systems.
We apply Dijkstra’s shortest path search and neural network inference with in situ pruning to illustrate the potential of the approach to solve practical sorting problems as well as its compatibility with other compute-in-memory schemes.