Quantization Tutorial

Skip Subscriptions, Set up Fast Local AI for Coding, Study, and Brainstorming

Learn how to run local AI models with LM Studio's user, power user, and developer modes, keeping data private and saving monthly fees.

IEEE

SearchQ: Search-Based Fine-Grained Quantization for Data-Free Model Compression

Abstract: The huge memory and computing costs of deep neural networks (DNNs) greatly hinder their deployment on resource-constrained devices with high efficiency. Quantization has emerged as an ...

IEEE

Multiplication-Free Lookup-Based CNN Accelerator Using Residual Vector Quantization and Its FPGA Implementation

Abstract: In this paper, a table lookup-based computing technique is proposed to perform convolutional neural network (CNN) inference without multiplication, and its FPGA implementation is ...

GitHub

APHQ-ViT: Post-Training Quantization with Average Perturbation Hessian Based Reconstruction for Vision Transformers

This repository contains the official PyTorch implementation for the CVPR 2025 paper "APHQ-ViT: Post-Training Quantization with Average Perturbation Hessian Based Reconstruction for Vision ...
