# CRAFFT: High Resolution FFT Accelerator in Spintronic Computational RAM

Hüsrev Cılasun, Salonik Resch, Zamshed Chowdhury, Erin Olson, Masoud Zabihi, Zhengyang Zhao, Thomas Peterson, Jian-Ping Wang, Sachin Sapatnekar, Ulya Karpuzcu

Department of Electrical and Computer Engineering

University of Minnesota






## Speaker Bio: Hüsrev Cılasun

- BSc from Istanbul Technical University, 2016
- Aselsan, Inc., 2016–2019
- PhD student at University of Minnesota since Fall 2019
- Adaptive, Technology-aware Architectures (ALTAI) Lab (PI: Ulya Karpuzcu)
  - Pushing traditional computing to its limits
  - Computing with post-CMOS devices and paradigms



### Motivation

- Ultra-high-resolution FFT (256K+ points)
  - Wireless communication, Wide-band spectrum analysis
  - Radar signal processing, sonar, echography
  - Frequency-hopping transmission detection
  - Telescope array imaging, High-res medical imaging
- Increased demand for
  - Memory access
  - Parallelism
  - Faster computation
  - Energy reduction
- Conventional hardware can't address these demands efficiently!



## **CRAFFT** Solution

- Spintronic Computational RAM (CRAM)
  - Seamless memory access
  - Built-in massive parallelism
  - Energy-efficient logic
- Non-volatile memory
- True in-memory computation
- Memory mode/logic mode



#### **CRAM MTJ Device**





#### CRAM MTJ Cell



#### Anti-Parallel (AP) High Resistance



### CRAM MTJ Cell Write



#### Parallel (P) Low resistance



### CRAM MTJ Cell Read









#### **CRAM** Gate Implementation





### Singleton's FFT in CRAM



**Computation:** Complex multiplication by $\omega$, then addition.
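The butterfly computation named above can be sketched in software. The following is a minimal sketch, assuming the radix-2, constant-geometry (fixed read/write pattern) formulation commonly associated with Singleton's algorithm; the function names are hypothetical, floating-point `complex` stands in for the fixed-point arithmetic, and the mapping onto CRAM subarrays is not modeled.

```python
import cmath


def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def constant_geometry_fft(samples):
    """Radix-2 decimation-in-frequency FFT with the same data flow in every stage.

    Assumption: this is an illustrative software model, not the CRAFFT mapping.
    """
    n = len(samples)
    stages = n.bit_length() - 1
    assert n == 1 << stages, "length must be a power of two"
    half = n // 2
    cur = [complex(v) for v in samples]
    for s in range(stages):
        nxt = [0j] * n
        for i in range(half):
            a, b = cur[i], cur[i + half]  # always read i and i + N/2
            # Twiddle factor omega for butterfly i at stage s.
            omega = cmath.exp(-2j * cmath.pi * ((i >> s) << s) / n)
            nxt[2 * i] = a + b                # addition
            nxt[2 * i + 1] = (a - b) * omega  # complex multiplication by omega
        cur = nxt
    # Results come out in bit-reversed order; undo that here.
    out = [0j] * n
    for j in range(n):
        out[bit_reverse(j, stages)] = cur[j]
    return out


spectrum = constant_geometry_fft([1, 0, 0, 0, 0, 0, 0, 0])
```

Every stage reads elements `i` and `i + N/2` and writes elements `2i` and `2i + 1`, so only the twiddle factor changes from stage to stage.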



### Singleton's FFT in CRAM



#### Data Transfer: Inter-subarray communication



### Singleton's FFT in CRAM



#### Twiddle Distribution: Factor Broadcast
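One way to see why broadcasting twiddle factors can pay off: in a radix-2 decimation-in-frequency FFT, the number of distinct twiddle factors halves at every stage, so each factor is shared by more and more butterflies. The sketch below counts this sharing. It assumes butterflies are indexed `0 .. N/2 - 1` in a constant-geometry ordering, where butterfly `i` at stage `s` uses exponent `(i >> s) << s`; the function name is hypothetical and the sketch says nothing about how CRAFFT actually routes the factors.

```python
def twiddle_sharing(n_points):
    """Per stage, return (distinct twiddle factors, butterflies sharing each one)."""
    stages = n_points.bit_length() - 1
    assert stages >= 1 and n_points == 1 << stages, "need a power of two >= 2"
    half = n_points // 2
    table = []
    for s in range(stages):
        # Exponent k of omega = exp(-2*pi*j*k/N) for each butterfly in this stage.
        exponents = {(i >> s) << s for i in range(half)}
        distinct = len(exponents)
        table.append((distinct, half // distinct))
    return table


for stage, (distinct, shared_by) in enumerate(twiddle_sharing(16)):
    print(f"stage {stage}: {distinct} distinct twiddles, each used by {shared_by} butterflies")
```

In the last stage a single factor serves all `N/2` butterflies, which is the extreme case for a broadcast.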



# **Evaluation Setup**

- In-house simulator
  - Functional verification
  - Energy and time extraction
- NVSim for peripherals
- Fixed-point and proof-of-concept floating-point



| Parameter         | Value                          |
|-------------------|--------------------------------|
| Total memory      | 17MB – 66MB<br>(1M FFT)        |
| MTJ Resistances   | $3.15\,\mathrm{k}\Omega$ – $76.39\,\mathrm{k}\Omega$ |
| Switching Time    | 3ns – 1ns                      |
| Switching Current | 40µA – 3µA                     |



#### **Evaluation Results**



# Conclusion

- Fixed-point
  - Up to 2.73× speedup
  - Significantly lower energy and energy-delay product (EDP)
- Proof-of-concept floating point
  - Up to 3.2× speedup
  - 1.93× more energy but 1.66× lower EDP
- Acceleration for ultra-high res FFT
- Efficient addressing of scalability demands
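The floating-point figures above are consistent with one another if they are read as describing the same configuration (an assumption; the slides do not say so explicitly). Since EDP is energy multiplied by delay, a quick check with hypothetical helper names:

```python
def edp_ratio(energy_ratio, speedup):
    """EDP of the accelerator relative to the baseline.

    energy_ratio: accelerator energy / baseline energy
    speedup:      baseline time / accelerator time
    """
    return energy_ratio / speedup


# Figures quoted for the proof-of-concept floating-point design.
relative_edp = edp_ratio(energy_ratio=1.93, speedup=3.2)
edp_improvement = 1.0 / relative_edp
print(f"EDP improvement: {edp_improvement:.2f}x")
```

Spending 1.93× the energy while finishing 3.2× sooner yields roughly the 1.66× EDP reduction quoted.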



# Questions?

