Amirabbas Afzali

I’m Amirabbas Afzali, a B.Sc. student in Electrical Engineering with a minor in Mathematics at Sharif University of Technology, specializing in Communication Systems. Last summer, I was a research intern at the MLBio Lab at EPFL, working with Prof. Maria Brbić on weak-to-strong generalization for preference alignment in large language models.

I’m broadly interested in reliable decision-making in machine learning systems, including trustworthy ML, optimization, and reinforcement learning, especially where these topics intersect with human-AI alignment.

My current research focuses on understanding how preferences, robustness, and feedback signals shape model behavior. My recent work spans several research areas, including:

Post-training techniques for LLMs, such as preference learning and alignment
Trustworthy and robust machine learning, with emphasis on adversarial robustness and safety
Offline and robust reinforcement learning

Lately, I’ve been especially interested in the following topics — feel free to reach out if they resonate:

(i) LLM safety and adversarial alignment

(ii) Steering vectors for test-time alignment

(iii) Certified robustness and model verification

Selected publications

NeurIPS

LORE: Lagrangian-Optimized Robust Embeddings for Visual Encoders

Borna Khodabandeh*, Amirabbas Afzali*, Amirhossein Afsharrad, Seyed Shahabeddin Mousavi, Sanjay Lall, Sajjad Amini, and Seyed-Mohsen Moosavi-Dezfooli

Advances in Neural Information Processing Systems, 2025

@inproceedings{khodabandeh2025lorelagrangianoptimizedrobustembeddings,
  title = {LORE: Lagrangian-Optimized Robust Embeddings for Visual Encoders},
  author = {Khodabandeh, Borna and Afzali, Amirabbas and Afsharrad, Amirhossein and Mousavi, Seyed Shahabeddin and Lall, Sanjay and Amini, Sajjad and Moosavi-Dezfooli, Seyed-Mohsen},
  year = {2025},
  booktitle = {Advances in Neural Information Processing Systems},
  archiveprefix = {arXiv},
  eprint = {2505.18884},
  primaryclass = {cs.LG},
  url = {https://arxiv.org/abs/2505.18884},
  equal_contribution = {Borna Khodabandeh and Amirabbas Afzali},
}

ICLR

Aligning Visual Contrastive learning models via Preference Optimization

Amirabbas Afzali*, Borna Khodabandeh*, Ali Rasekh, Mahyar JafariNodeh, Sepehr Kazemi, and Simon Gottschalk

International Conference on Learning Representations, 2025

@inproceedings{afzali2025aligningvisualcontrastivelearning,
  title = {Aligning Visual Contrastive learning models via Preference Optimization},
  author = {Afzali, Amirabbas and Khodabandeh, Borna and Rasekh, Ali and JafariNodeh, Mahyar and Kazemi, Sepehr and Gottschalk, Simon},
  year = {2025},
  booktitle = {International Conference on Learning Representations},
  archiveprefix = {arXiv},
  eprint = {2411.08923},
  primaryclass = {cs.CV},
  url = {https://arxiv.org/abs/2411.08923},
  equal_contribution = {Amirabbas Afzali and Borna Khodabandeh},
}

RLC

One Goal, Many Challenges: Robust Preference Optimization Amid Content-Aware and Multi-Source Noise

Amirabbas Afzali, Amirhossein Afsharrad, Seyed Shahabeddin Mousavi, and Sanjay Lall

Reinforcement Learning Conference, 2025

@article{afzali2025goalchallengesrobustpreference,
  title = {One Goal, Many Challenges: Robust Preference Optimization Amid Content-Aware and Multi-Source Noise},
  author = {Afzali, Amirabbas and Afsharrad, Amirhossein and Mousavi, Seyed Shahabeddin and Lall, Sanjay},
  year = {2025},
  journal = {Reinforcement Learning Conference},
  archiveprefix = {arXiv},
  eprint = {2503.12301},
  primaryclass = {cs.LG},
  url = {https://arxiv.org/abs/2503.12301},
}