Nickel and Diming Your GAN: A Dual-Method Approach to Enhancing GAN Efficiency via Knowledge Distillation

Abstract

In this paper, we address the challenge of compressing generative adversarial networks (GANs) for deployment in resource-constrained environments by proposing two novel methodologies: Distribution Matching for Efficient compression (DiME) and Network Interactive Compression via Knowledge Exchange and Learning (NICKEL).

DiME uses foundation models as embedding kernels for efficient distribution matching, leveraging maximum mean discrepancy to facilitate effective knowledge distillation. NICKEL, in turn, uses an interactive compression method that improves communication between the student generator and the discriminator, yielding a balanced and stable compression process.

Our comprehensive evaluation on the StyleGAN2 architecture with the FFHQ dataset shows the effectiveness of our approach, with NICKEL & DiME achieving FID scores of 10.45 and 15.93 at compression rates of 95.73% and 98.92%, respectively. Remarkably, our methods sustain generative quality even at an extreme compression rate of 99.69%, surpassing the previous state-of-the-art performance by a large margin. These findings not only demonstrate our methodologies' capacity to significantly lower GANs' computational demands but also pave the way for deploying high-quality GAN models in settings with limited resources. Our code will be released soon.
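As a rough illustration of the distribution matching idea behind DiME, the sketch below computes a squared maximum mean discrepancy (MMD) between two sets of embedded samples. It is not the authors' implementation: the Gaussian kernel, the bandwidth value, and the `toy_embed` function (a random projection standing in for a frozen foundation-model encoder) are all assumptions made so the example is self-contained.

```python
import numpy as np


def rbf_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel matrix between the rows of x and the rows of y."""
    sq_dists = (
        np.sum(x ** 2, axis=1)[:, None]
        + np.sum(y ** 2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def mmd2(x, y, bandwidth=4.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    k_xx = rbf_kernel(x, x, bandwidth)
    k_yy = rbf_kernel(y, y, bandwidth)
    k_xy = rbf_kernel(x, y, bandwidth)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()


def toy_embed(images, proj):
    """Hypothetical stand-in for a frozen pretrained encoder (assumption)."""
    return np.tanh(images @ proj)


rng = np.random.default_rng(0)
proj = rng.normal(size=(64, 16)) / 8.0  # fixed random projection, not a real model

# Toy "images": flattened 64-dimensional vectors instead of real generator outputs.
teacher = rng.normal(0.0, 1.0, size=(256, 64))
student_close = rng.normal(0.0, 1.0, size=(256, 64))  # same distribution as teacher
student_far = rng.normal(1.5, 1.0, size=(256, 64))    # shifted distribution

mmd_close = mmd2(toy_embed(teacher, proj), toy_embed(student_close, proj))
mmd_far = mmd2(toy_embed(teacher, proj), toy_embed(student_far, proj))
mmd_self = mmd2(toy_embed(teacher, proj), toy_embed(teacher, proj))
```

In this toy setting the student whose samples follow the teacher's distribution gets a smaller MMD than the shifted one. In a distillation setup, a quantity of this kind would serve as a loss term that pulls the student generator's outputs toward the teacher's in the embedding space.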

Experimental Results

Stability comparison of NICKEL & DiME and state-of-the-art compression methods

(a) Logits of D on ITGC

(b) Logits of D on ours

(c) Convergence of FID

In this figure, (a) shows the discriminator's logits for the pruned generator under ITGC. The green solid line represents the ideal equilibrium state. At a compression rate of 98.92% (blue dashed line), the imbalance is more severe than at a compression rate of 90.73% (red dashed line). (b) shows the discriminator's logits for the pruned generator under NICKEL & DiME. Our method mitigates the imbalance between the discriminator and the pruned generator. (c) shows FID convergence at a compression rate of 98.92%. NICKEL & DiME converges the most stably.

Visualization of images generated by compressed StyleGAN2 on FFHQ and LSUN-CAT

(a) Compression rate - 90.73%

(b) Compression rate - 90.73%

(c) Compression rate - 98.92%

(d) Compression rate - 99.69%

BibTeX

@article{yeo2024nickel,
      title={Nickel and Diming Your GAN: A Dual-Method Approach to Enhancing GAN Efficiency via Knowledge Distillation},
      author={Yeo, Sangyeop and Jang, Yoojin and Yoo, Jaejun},
      journal={arXiv preprint arXiv:2405.11614},
      year={2024}
}

Acknowledgments

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2022R1C1C1008496) and by Institute of Information & communications Technology Planning & Evaluation (IITP) grants funded by the Korea government (MSIT) (No. RS-2020-II201336, Artificial Intelligence graduate school support (UNIST); No. RS-2021-II212068, Artificial Intelligence Innovation Hub; No. RS-2022-II220959, (Part 2) Few-Shot Learning of Causal Inference in Vision and Language for Decision Making; No. RS-2022-II220264, Comprehensive Video Understanding and Generation with Knowledge-based Deep Logic Neural Network).