Pretrained Reversible Generation as Unsupervised Visual Representation Learning

1 Xi'an Jiaotong University 2 Shanghai Artificial Intelligence Laboratory
3 The Chinese University of Hong Kong 4 SenseTime Research
Accepted by ICCV 2025
*Equal contribution

Demo

PIP static schematic

Overview of the PRG as Unsupervised Visual Representation pipeline. Swiss-roll data is generated via (x, y) = (t cos t, t sin t), t ∈ [0, 3π], with a blue→red gradient as t increases.
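
The swiss-roll formula in the caption can be reproduced with a few lines of Python. This is a minimal sketch, not the authors' plotting code: the function name `swiss_roll`, the even spacing of t, and the linear blue-to-red RGB ramp are assumptions made for illustration.

```python
import math

def swiss_roll(n_points=200, t_max=3 * math.pi):
    """Sample (x, y) = (t cos t, t sin t) for t in [0, t_max]; needs n_points >= 2."""
    points = []
    for i in range(n_points):
        t = t_max * i / (n_points - 1)    # evenly spaced t, both endpoints included
        frac = t / t_max                  # 0 at the start of the roll, 1 at the end
        color = (frac, 0.0, 1.0 - frac)   # assumed linear RGB ramp: blue at t = 0, red at t = t_max
        points.append((t * math.cos(t), t * math.sin(t), color))
    return points

roll = swiss_roll()
```

Each entry of `roll` is an `(x, y, color)` tuple, so the points can be passed to any scatter-plot routine that accepts per-point RGB colors.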

Abstract

Recent generative models based on score matching and flow matching have significantly advanced generation tasks, but their potential in discriminative tasks remains underexplored. Previous approaches, such as generative classifiers, have not fully leveraged the capabilities of these models for discriminative tasks due to their intricate designs. We propose Pretrained Reversible Generation (PRG), which extracts unsupervised representations by reversing the generative process of a pretrained continuous generation model. PRG effectively reuses unsupervised generative models, leveraging their high capacity to serve as robust and generalizable feature extractors for downstream tasks. This framework enables the flexible selection of feature hierarchies tailored to specific downstream tasks. Our method consistently outperforms prior approaches across multiple benchmarks, achieving state-of-the-art performance among generative-model-based methods, including 78% top-1 accuracy on ImageNet at a resolution of 64×64. Extensive ablation studies, including out-of-distribution evaluations, further validate the effectiveness of our approach.

How can PRG work?

  • Stage 1: We pretrain a reversible diffusion/flow model without labels, learning the forward generative trajectory from raw data to latent space.
  • Stage 2: We run the model backward along this trajectory and jointly fine-tune it with a lightweight classifier, turning each timestep’s features into ready-to-use representations for downstream tasks.
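
The reversal step in the two stages above can be sketched on a toy problem. Everything here is an assumption made for illustration: `velocity` is a hypothetical linear stand-in for a pretrained velocity network, the time convention (t = 0 at the latent end, t = 1 at the data end) is chosen arbitrarily, and plain Euler integration replaces whatever solver the paper uses. The joint fine-tuning with a classifier is not shown; a classifier would simply consume `features`.

```python
import math

def velocity(x, t):
    """Hypothetical stand-in for a pretrained velocity network v_theta(x, t)."""
    return [0.5 * xi for xi in x]

def integrate(x, t_start, t_end, n_steps):
    """Euler-integrate dx/dt = velocity(x, t) and return every intermediate state."""
    dt = (t_end - t_start) / n_steps   # negative when running in reverse
    states = [list(x)]
    t = t_start
    for _ in range(n_steps):
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        t += dt
        states.append(list(x))
    return states

# Assumed convention: t = 0 is the latent end, t = 1 is the data end.
data_point = [1.0, -2.0]
reverse_states = integrate(data_point, 1.0, 0.0, n_steps=100)   # data -> latent
features = reverse_states[-1]   # an intermediate state could be chosen instead
regenerated = integrate(features, 0.0, 1.0, n_steps=100)[-1]    # latent -> data
```

Because the same velocity field drives both directions, `regenerated` lands close to `data_point`, up to Euler discretization error. Keeping the whole `reverse_states` list is what makes it possible to pick features from any point along the trajectory.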

[Figure: two panels. Left: No-Frozen curves only. Right: No-Frozen + Frozen.]

Why choose PRG?

PRG (Pretrained Reversible Generation) delivers three standout benefits that make it a drop-in upgrade for modern generative pipelines:

  • Architecture-agnostic flexibility. PRG decouples the latent variable Z from the backbone, so U-Nets, Transformers, or any future network can plug in without retraining the representation.
  • Infinite-layer expressiveness. Built on continuous-time stochastic flows, PRG enjoys the “infinite-layer” capacity behind models like SD3 and DALL·E 3, yet keeps parameter count lean and supports joint generative-and-discriminative training.
  • Robustness & generalizability. Features stay stable across solvers (Euler, RK, etc.) and time steps, transfer effortlessly to new datasets and community diffusion/flow models, and fine-tune rapidly for fresh tasks.
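
To make the solver point concrete, the sketch below integrates one toy ODE in reverse with two different solvers and compares the resulting features. It is an illustration under stated assumptions, not a reproduction of the paper's experiments: `velocity` is a hypothetical scalar field, and the helper names `euler_step`, `rk4_step`, and `reverse_feature` are invented here.

```python
import math

def velocity(x, t):
    """Hypothetical scalar velocity field standing in for a pretrained model."""
    return -x + math.sin(t)

def euler_step(x, t, dt):
    """One explicit Euler step."""
    return x + dt * velocity(x, t)

def rk4_step(x, t, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = velocity(x, t)
    k2 = velocity(x + 0.5 * dt * k1, t + 0.5 * dt)
    k3 = velocity(x + 0.5 * dt * k2, t + 0.5 * dt)
    k4 = velocity(x + dt * k3, t + dt)
    return x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def reverse_feature(x, step, n_steps=100, t_start=1.0, t_end=0.0):
    """Integrate from t_start down to t_end with the given one-step solver."""
    dt = (t_end - t_start) / n_steps   # negative: we run the flow in reverse
    t = t_start
    for _ in range(n_steps):
        x = step(x, t, dt)
        t += dt
    return x

feat_euler = reverse_feature(1.0, euler_step)
feat_rk4 = reverse_feature(1.0, rk4_step)
gap = abs(feat_euler - feat_rk4)   # small when both solvers track the same trajectory
```

On this toy field the two solvers agree to within the first-order Euler error, which shrinks as `n_steps` grows. Whether the same holds for a trained network depends on the smoothness of its learned velocity field.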

Dive into our paper for the full technical breakdown and experimental results!

BibTeX

@article{xue2024pretrained,
  title={Pretrained Reversible Generation as Unsupervised Visual Representation Learning},
  author={Xue, Rongkun and Zhang, Jinouwen and Niu, Yazhe and Shen, Dazhong and Ma, Bingqi and Liu, Yu and Yang, Jing},
  journal={arXiv preprint arXiv:2412.01787},
  year={2024}
}