When Fast Fourier Transform Meets Transformer for Image Restoration


TLDR

  • SFHformer, an ECCV 2024 paper, fuses the Fast Fourier Transform (FFT) with a Transformer architecture to give a single framework evaluated on 10 image restoration tasks across 31 datasets.

Key Takeaways

  • Dual-domain hybrid structure: the spatial domain handles local modeling, while the frequency domain handles global modeling via an FFT inside the Transformer.
  • A dedicated positional coding and a frequency dynamic convolution extract features per frequency component, rather than passing all components through a generic shared head.
  • Benchmarked tasks include deraining, dehazing, deblurring, desnowing, denoising, super-resolution, low-light enhancement, and underwater enhancement; the authors claim state-of-the-art results on all of them.
  • Pretrained weights are released for ITS/OTS dehazing, LOLv2 low-light enhancement, and GoPro motion deblurring; the training code has been public since October 2024.
  • A follow-up work, SWFormer (multi-domain learning), was published in May 2025, with code at deng-ai-lab/SWFormer.
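
The dual-domain idea in the first takeaway can be illustrated with a small NumPy sketch. This is not the SFHformer implementation: the function names (spatial_branch, frequency_branch, dual_domain_block), the fixed kernel, and the plain per-frequency weights are all assumptions made for illustration, and the paper's positional coding and frequency dynamic convolution are not modeled here. It only shows why a small spatial kernel stays local while a pointwise multiplication in the FFT domain reaches every pixel.

```python
import numpy as np


def spatial_branch(x, kernel):
    """Local modeling (illustrative): zero-padded 'same' cross-correlation
    of a 2-D image with a small kernel."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros(x.shape, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + x.shape[0], j:j + x.shape[1]]
    return out


def frequency_branch(x, freq_weights):
    """Global modeling (illustrative): scale each frequency component of the
    real 2-D FFT, then transform back. freq_weights has shape (H, W // 2 + 1)."""
    spectrum = np.fft.rfft2(x)
    spectrum = spectrum * freq_weights
    return np.fft.irfft2(spectrum, s=x.shape)


def dual_domain_block(x, kernel, freq_weights):
    """Hypothetical fusion: residual sum of the input and both branches."""
    return x + spatial_branch(x, kernel) + frequency_branch(x, freq_weights)


# A single bright pixel makes the receptive field of each branch visible.
impulse = np.zeros((8, 8))
impulse[4, 4] = 1.0

box_kernel = np.ones((3, 3)) / 9.0      # assumed 3x3 averaging kernel
dc_only = np.zeros((8, 8 // 2 + 1))     # assumed weights: keep only the DC term
dc_only[0, 0] = 1.0

local_response = spatial_branch(impulse, box_kernel)
global_response = frequency_branch(impulse, dc_only)
```

In this sketch the spatial response is non-zero only in the 3x3 neighborhood of the impulse, whereas the frequency response spreads the impulse over the whole image, which is the sense in which the frequency branch is "global". A learned model would replace the fixed kernel and weights with trained parameters.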

Hacker News Comment Review

  • No substantive HN discussion yet.
