Conventional cameras generate large volumes of data that can be challenging to process in resource-constrained applications. A camera's output data stream typically scales with the number of pixels in the image, yet much of this captured data is redundant for many downstream computer vision algorithms. We propose a novel camera design, which we call SuperCam, that adaptively processes captured data by performing superpixel segmentation on the fly. We show that SuperCam outperforms current state-of-the-art superpixel algorithms in memory-constrained settings. We also evaluate how well SuperCam performs when the compressed data is used for downstream computer vision tasks. Our results demonstrate that the proposed design provides superior output for image segmentation, object detection, and monocular depth estimation when the available memory on the camera is limited. We posit that superpixel segmentation will play a crucial role as more computer vision inference models are deployed on edge devices, and that SuperCam would allow computer vision engineers to design more efficient systems for these applications.
We compare memory-restricted SNIC (we artificially reduce the image resolution so that the algorithm fits within each memory budget) against SuperCam images produced with the same amount of memory. Below are results for three memory budgets: 68 KB, 205 KB, and 615 KB.
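For reference, the resolution reduction described above can be sketched as follows. This is a hypothetical helper, not the exact procedure used in the paper: it assumes a raw 3-bytes-per-pixel RGB buffer and an illustrative 640x480 source, and picks the largest aspect-preserving resolution whose buffer fits the budget.

```python
import math

def fit_resolution(width, height, budget_bytes, bytes_per_pixel=3):
    """Return the largest (w, h) with the original aspect ratio whose
    raw image buffer fits within budget_bytes.

    Assumes an uncompressed buffer of bytes_per_pixel per pixel
    (3 for 8-bit RGB); the paper's actual memory model may differ.
    """
    scale = math.sqrt(budget_bytes / (width * height * bytes_per_pixel))
    scale = min(scale, 1.0)  # never upsample past the native resolution
    return max(1, int(width * scale)), max(1, int(height * scale))

# The three memory budgets used in the comparison above:
for budget_kb in (68, 205, 615):
    w, h = fit_resolution(640, 480, budget_kb * 1024)
    print(f"{budget_kb} KB -> {w}x{h}")
```

Scaling both dimensions by the square root of the byte ratio keeps the aspect ratio while shrinking the pixel count linearly with the budget.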
Figure: Interactive side-by-side comparisons of memory-restricted SNIC and SuperCam outputs at the 68 KB, 205 KB, and 615 KB budgets
We show the performance of SuperCam on three downstream tasks: image segmentation, object detection, and monocular depth estimation.
Comparison of the performance of Segment Anything Model v2 on the BSD500 dataset
Figure: Interactive side-by-side comparisons of segmentation results on SNIC and SuperCam inputs
Comparison of the performance of the YOLOv12 model on the COCO dataset
Figure: Interactive side-by-side comparisons of detection results on SNIC and SuperCam inputs
Comparison of the performance of the Depth Anything v2 model on the DIODE dataset
Figure: Interactive side-by-side comparisons of depth estimation results on SNIC and SuperCam inputs
We show hardware emulation results of our proposed SuperCam design on three downstream computer vision tasks: image segmentation with Segment Anything Model v2, object detection with the YOLOv12 model, and monocular depth estimation with the Depth Anything v2 model.
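To feed a superpixelated capture to an off-the-shelf downstream model, the compressed representation must be rendered back into a dense image. The page does not specify how this is done; below is a minimal sketch of one common rendering, filling each superpixel with its mean color, under the assumption that a per-pixel superpixel label map is available.

```python
import numpy as np

def superpixel_reconstruction(image, labels):
    """Render a superpixel segmentation as a dense image by filling
    each superpixel with its mean color.

    image:  (H, W, 3) array of pixel values
    labels: (H, W) integer array of superpixel ids

    The result has the same shape and dtype as `image` and can be
    passed to a downstream model (detector, depth estimator, etc.).
    """
    out = np.empty_like(image, dtype=np.float64)
    for sp in np.unique(labels):
        mask = labels == sp                      # pixels in this superpixel
        out[mask] = image[mask].mean(axis=0)     # fill with the mean color
    return out.astype(image.dtype)
```

Storing only one mean color plus a compact boundary encoding per superpixel is what makes the representation far smaller than the raw pixel stream.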
Figure: Hardware Emulation results of SuperCam
@inproceedings{cvpr2026supercam,
title={Computer Vision with a Superpixelation Camera},
author={Mahalingam, Sasidharan and Brown, Rachel and Ingle, Atul},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2026}
}