Pseudo Label-Guided Model Inversion Attack via Conditional Generative Adversarial Network
Paper • 2302.09814 • Published
This repository contains artifacts from a white-box gradient-based model inversion attack on the Adult Census Income dataset.
The attack uses a VAE-learned data manifold prior to keep reconstructed tabular records realistic. Per-class evaluation metrics:
| Metric | Class 0 (<=50K) | Class 1 (>50K) |
|---|---|---|
| Mean NN Distance | 1.63 | 1.57 |
| Feature Range Compliance | 94.5% | 76.1% |
| Membership Proxy | 63.0% | 98.5% |
| Mean Confidence | 99.4% | 99.96% |
Key Finding: The minority class (>50K) is significantly more vulnerable: 98.5% of reconstructed samples fall closer to the training data than the typical training-sample nearest-neighbor distance, indicating strong membership leakage.
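The Mean NN Distance and Membership Proxy numbers above can be computed with a short nearest-neighbor routine. The sketch below is a plausible reconstruction, not the repo's actual evaluation code: it assumes plain NumPy feature arrays and uses synthetic stand-in data so it runs anywhere; function names and shapes are illustrative.

```python
import numpy as np

def nn_distances(queries, reference, exclude_self=False):
    """Euclidean distance from each query row to its nearest reference row."""
    d = np.sqrt(((queries[:, None, :] - reference[None, :, :]) ** 2).sum(-1))
    if exclude_self:
        np.fill_diagonal(d, np.inf)  # leave-one-out within the same set
    return d.min(axis=1)

def membership_proxy(reconstructed, train):
    """Fraction of reconstructions whose NN distance to the training set is
    below the typical (median) leave-one-out training NN distance."""
    typical = np.median(nn_distances(train, train, exclude_self=True))
    return (nn_distances(reconstructed, train) < typical).mean()

# Synthetic stand-in for Adult features: reconstructions that are
# near-copies of training rows should score a high membership proxy.
rng = np.random.default_rng(0)
train = rng.normal(size=(200, 14))
recon = train[:50] + rng.normal(scale=0.1, size=(50, 14))
print(round(membership_proxy(recon, train), 2))
```

A high proxy value means most reconstructions sit unusually close to specific training records, which is the leakage signal reported for class 1.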
- `reconstructions/class_0_reconstructed.npy`: 200 reconstructed samples (class 0)
- `reconstructions/class_1_reconstructed.npy`: 200 reconstructed samples (class 1)
- `reconstructions/class_0_sample.csv`: 20 human-readable samples
- `reconstructions/class_1_sample.csv`: 20 human-readable samples
- `metrics.json`: full evaluation metrics
- `target_mlp.pt`: trained target model weights
- `vae_prior.pt`: trained VAE prior weights

This model repository was generated by ML Intern, an agent for machine learning research and development on the Hugging Face Hub.
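The attack implied by `target_mlp.pt` and `vae_prior.pt` can be sketched as gradient descent in the VAE latent space: optimize a latent code so the decoded record is classified into the target class with high confidence. The architectures, sizes, and hyperparameters below are illustrative assumptions (the real ones live in the checkpoint files), with tiny untrained stand-in networks so the snippet is self-contained.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
LATENT, FEATURES, CLASSES = 8, 14, 2  # assumed sizes, not from the repo

# Stand-ins for the VAE decoder (vae_prior.pt) and target MLP (target_mlp.pt)
decoder = nn.Sequential(nn.Linear(LATENT, 32), nn.ReLU(),
                        nn.Linear(32, FEATURES))
target = nn.Sequential(nn.Linear(FEATURES, 32), nn.ReLU(),
                       nn.Linear(32, CLASSES))

def invert(target_class, steps=200, lr=0.1):
    # Optimizing the latent code (not raw features) keeps candidate
    # records on the VAE-learned data manifold.
    z = torch.randn(1, LATENT, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        x = decoder(z)                       # candidate tabular record
        loss = nn.functional.cross_entropy(
            target(x), torch.tensor([target_class]))
        loss = loss + 1e-3 * z.pow(2).sum()  # keep z near the latent prior
        loss.backward()
        opt.step()
    return decoder(z).detach()

sample = invert(target_class=1)
print(sample.shape)  # torch.Size([1, 14])
```

Repeating this from many random latent initializations would yield sample sets like the 200-per-class `.npy` files above.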
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = 'shumaket/adult-census-model-inversion'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```

For non-causal architectures, replace `AutoModelForCausalLM` with the appropriate `AutoModel` class.