Pretrained Reversible Generation as Unsupervised Visual Representation Learning

1 Xi'an Jiaotong University 2 Shanghai Artificial Intelligence Laboratory
3 The Chinese University of Hong Kong 4 Nanjing University of Aeronautics and Astronautics
5 SenseTime Research
Accepted to ICCV 2025
*Equal contribution

Demo

Overview of the PRG as Unsupervised Visual Representation pipeline. Swiss-roll data is generated via (x, y) = (t cos t, t sin t), t ∈ [0, 3π], with a blue→red gradient as t increases.
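The Swiss-roll toy data in the figure can be regenerated directly from the stated formula. The snippet below is a small NumPy sketch; the function name, sample count, and seed are our own choices for illustration.

```python
# Minimal sketch of the Swiss-roll toy data from the overview figure:
# (x, y) = (t cos t, t sin t), t in [0, 3*pi], colored blue -> red as t grows.
import numpy as np

def make_swiss_roll(n_points: int = 2000, seed: int = 0):
    rng = np.random.default_rng(seed)
    t = np.sort(rng.uniform(0.0, 3.0 * np.pi, size=n_points))  # parameter along the spiral
    xy = np.stack([t * np.cos(t), t * np.sin(t)], axis=1)       # 2-D coordinates
    color = t / t.max()                                          # 0 (blue) -> 1 (red)
    return xy, color

if __name__ == "__main__":
    points, color = make_swiss_roll()
    print(points.shape, color.min(), color.max())  # (2000, 2) 0.0 1.0
```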

Abstract

Recent generative models based on score matching and flow matching have significantly advanced generation tasks, but their potential in discriminative tasks remains underexplored. Previous approaches, such as generative classifiers, have not fully leveraged the capabilities of these models for discriminative tasks due to their intricate designs. We propose Pretrained Reversible Generation (PRG), which extracts unsupervised representations by reversing the generative process of a pretrained continuous generation model. PRG effectively reuses unsupervised generative models, leveraging their high capacity to serve as robust and generalizable feature extractors for downstream tasks. This framework enables the flexible selection of feature hierarchies tailored to specific downstream tasks. Our method consistently outperforms prior approaches across multiple benchmarks, achieving state-of-the-art performance among generative model based methods, including 78% top-1 accuracy on ImageNet at a resolution of 64×64. Extensive ablation studies, including out-of-distribution evaluations, further validate the effectiveness of our approach.

How does PRG work?

  • PRG turns a pretrained continuous-time flow/diffusion generator upside down: running the model backward produces multi-level features that, after light fine-tuning, serve as an unsupervised representation extractor.
  • Stage 1: We pretrain a reversible diffusion/flow model without labels, learning the forward generative trajectory from raw data to latent space.
  • Stage 2: We run the model backward along this trajectory and jointly fine-tune it with a lightweight classifier, turning each timestep’s features into ready-to-use representations for downstream tasks (see the sketch after this list).
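Concretely, the two-stage recipe boils down to integrating the pretrained velocity field from the data endpoint toward the latent prior and reading out an intermediate state as the feature. The sketch below is a minimal, hypothetical PyTorch rendering of that idea, not the authors' released code: `velocity_model`, its `v(x, t)` signature, the Euler solver, the stopping time, and the flatten-then-linear head are all illustrative choices.

```python
# Hypothetical PRG-style feature extraction (Stage 2), assuming a pretrained
# velocity field v(x, t) that defines the generative ODE dx/dt = v(x, t).
import torch
import torch.nn as nn

def extract_features(velocity_model: nn.Module, x: torch.Tensor,
                     t_end: float = 0.5, num_steps: int = 10) -> torch.Tensor:
    """Integrate from t = 0 (data) toward the latent prior with Euler steps.

    Wrap this call in torch.no_grad() for a Frozen-backbone probe; keep gradients
    on to fine-tune the backbone jointly with the classifier (No-Frozen variant).
    """
    dt = t_end / num_steps
    t = torch.zeros(x.shape[0], device=x.device)
    for _ in range(num_steps):
        x = x + dt * velocity_model(x, t)  # one Euler step along the learned trajectory
        t = t + dt
    return x.flatten(1)                    # intermediate state used as the representation

class LightweightClassifier(nn.Module):
    """Small head trained on top of the features read out at the chosen timestep."""
    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return self.head(feats)
```

The stopping time `t_end` plays the role of the feature-hierarchy knob mentioned in the abstract: stopping earlier keeps lower-level detail, stopping later yields more abstract states.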

Figure: accuracy curves. Left panel: No-Frozen curves only. Right panel: No-Frozen + Frozen.

Why choose PRG?

PRG delivers three standout benefits that make modern pretrained generative models easy to reuse as representation learners:

  • Architecture-agnostic flexibility. PRG decouples the latent variable Z from the backbone, so U-Nets, Transformers, or any future network can plug in without retraining the representation.
  • Infinite-layer expressiveness. Built on continuous-time stochastic flows, PRG enjoys the “infinite-layer” capacity behind models like SD3 and DALL·E 3, yet keeps parameter count lean and supports joint generative-and-discriminative training.
  • Robustness & generalizability. Features stay stable across solvers (Euler, RK, etc.) and time steps, transfer effortlessly to new datasets and community diffusion/flow models, and fine-tune rapidly for fresh tasks (a toy solver comparison follows this list).
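The solver-robustness point can be made concrete with a toy check: integrate the same velocity field with Euler and with Heun (a second-order Runge-Kutta method) and measure how far the resulting features drift apart. Everything below (the toy linear field, step counts, function names) is an illustrative stand-in, not the paper's evaluation code.

```python
# Toy comparison of ODE solvers on the same reverse trajectory.
import torch

def euler_step(v, x, t, dt):
    return x + dt * v(x, t)

def heun_step(v, x, t, dt):
    # Heun / RK2: average the slope at the start and at the Euler-predicted endpoint.
    k1 = v(x, t)
    k2 = v(x + dt * k1, t + dt)
    return x + dt * 0.5 * (k1 + k2)

def integrate(v, x, t_end=0.5, num_steps=10, step_fn=euler_step):
    dt = t_end / num_steps
    t = torch.zeros(x.shape[0])
    for _ in range(num_steps):
        x = step_fn(v, x, t, dt)
        t = t + dt
    return x

if __name__ == "__main__":
    v = lambda x, t: -x                      # toy linear velocity field as a stand-in
    x0 = torch.randn(4, 8)
    f_euler = integrate(v, x0, step_fn=euler_step)
    f_heun = integrate(v, x0, step_fn=heun_step)
    print(torch.norm(f_euler - f_heun) / torch.norm(f_euler))  # small relative gap
```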

Dive into our paper for the full technical breakdown and experimental results!

Experiments

CIFAR-10

Method                          Param. (M)    Acc. (%)
Discriminative methods
WideResNet-28-10 [70]           36            96.3
ResNeXt-29-16×64d [66]          68            96.4
Generative methods
GLOW [19]                       N/A           84.0
Energy model [18]               N/A           92.9
SBGC [73]                       N/A           95.0
HybViT [67]                     43            96.0
DDAE [65]                       36            97.2
Our methods
PRG-GVP-onlyPretrain            42            54.10
PRG-GVP-S                       42            97.35
PRG-ICFM-S                      42            97.59
PRG-OTCFM-S                     42            97.65

Tiny-ImageNet

Method                          Param. (M)    Acc. (%)
Discriminative methods
WideResNet-28-10 [70]           36            69.3
Generative methods
HybViT [67]                     43            56.7
DDAE [65]                       40            69.4
Our methods
PRG-GVP-onlyPretrain            42            15.34
PRG-GVP-S                       42            70.98
PRG-ICFM-S                      42            71.12
PRG-OTCFM-S                     42            71.33

ImageNet

Method                          Param. (M)    Acc. (%)
Discriminative methods
ViT-L/16 (384²) [17]            307           76.5
ResNet-152 (224²) [23]          60            77.8
Swin-B (224²) [39]              88            83.5
Generative methods
HybViT (32²) [67]               43            53.5
DMSZC-DiTXL2 (256²) [34]        338           77.5
iGPT-L (48²) [10]               1362          72.6
Our methods
PRG-GVP-onlyPretrain (64²)      122           20.18
PRG-GVP-XL (64²)                122           77.84
PRG-ICFM-XL (64²)               122           78.12
PRG-OTCFM-XL (64²)              122           78.13

BibTeX

@article{xue2024pretrained,
  title={Pretrained Reversible Generation as Unsupervised Visual Representation Learning},
  author={Xue, Rongkun and Zhang, Jinouwen and Niu, Yazhe and Shen, Dazhong and Ma, Bingqi and Liu, Yu and Yang, Jing},
  journal={arXiv preprint arXiv:2412.01787},
  year={2024}
}