|
Surgan Jandial
Master’s in Robotics (MSR) student at the Carnegie Mellon University Robotics Institute, advised by Professor Fernando De la Torre and Professor Andrea Bajcsy.
Interested in automatically diagnosing weaknesses and vulnerabilities in Vision-Language and Large Language Model (VLM/LLM) agents. In particular, I study whether knowledge of a model’s training data can help predict the downstream tasks or skills where it might fail, which I describe as identifying emergent weaknesses in foundation models.
Spent a wonderful Summer 2025 as a Research Intern at Microsoft (Computer Use Agents). Developed novel metrics for visual GUI grounding and red-teaming agent formulations to extract image-text data where GUI grounding models fail.
Previously, did Applied Science at Adobe, working on Knowledge Distillation, Data / Model Selection, and VLMs / LLMs (Retrieval, Fine-Tuning, Fairness). Before that, was a CS undergrad at IIT Hyderabad, advised by Professor Vineeth Balasubramanian.
Email  /
Patents  /
Preprints  /
Personal interests
|
    
Google Scholar  / 
LinkedIn  /
Twitter
|
Publications
|
|
GUI Grounders Do not (Truly) Understand the UI Elements they Click
Surgan Jandial, Yinheng Li, Justin Wagle, Kazuhito Koishida
Upcoming
Vision Language Models
Computer Use Agents
|
|
|
On the Fine-Grained Planning Abilities of VLM Web Agents
Surgan Jandial*, Oliver Wang*, Andrea Bajcsy, Fernando De la Torre
EMNLP, 2025
Vision Language Models
Web Agents
|
|
|
Thinking Fair and Slow: On the Efficacy of Structured Prompts for Debiasing Language Models
Surgan Jandial*, Shaz Furniturewala*, Abhinav Java, Pragyan Banerjee, Simra Shahid, Sumit Bhatia, Kokil Jaidka
EMNLP, 2024
Model Fairness
Large Language Models
|
|
|
All Should Be Equal in the Eyes of Language Models: Counterfactually Aware Fair Text Generation
Pragyan Banerjee*,
Abhinav Java*,
Surgan Jandial*,
Simra Shahid*,
Shaz Furniturewala,
Balaji Krishnamurthy, Sumit Bhatia
AAAI, 2024
Model Fairness
Large Language Models
|
|
|
Distilling the Undistillable: Learning from a Nasty Teacher
Surgan Jandial,
Yash Khasbage,
Arghya Pal,
Vineeth N Balasubramanian,
Balaji Krishnamurthy
ECCV, 2022
Knowledge Distillation
Model Stealing
Model Security
|
|
|
SAC: Semantic Attention Composition for Text-Conditioned Image Retrieval
Surgan Jandial*,
Pinkesh Badjatiya*,
Pranit Chawla*,
Ayush Chopra*,
Mausoom Sarkar, Balaji Krishnamurthy
WACV, 2022
Applications
Computer Vision
Vision Language Models
|
|
|
Retrospective Loss: Looking Back to Improve Training of Deep Neural Networks
Surgan Jandial*,
Ayush Chopra*,
Mausoom Sarkar,
Piyush Gupta,
Balaji Krishnamurthy,
Vineeth N Balasubramanian
KDD, 2020  
Efficient Model Training
|
|
|
SieveNet: A Unified Framework for Robust Image-Based Virtual Try-On
Surgan Jandial*,
Ayush Chopra*,
Kumar Ayush*,
Mayur Hemani,
Balaji Krishnamurthy,
Abhijeet Halwai
WACV, 2020
Also presented at Workshop on AI for Content Creation, CVPR 2020
Media Coverage: VentureBeat / Beebom / WWD
Applications
Computer Vision
|
|
Select Workshop Papers
|
|
|
Gatha: Relational Loss for enhancing text-based style transfer
Surgan Jandial,
Shripad Deshmukh,
Abhinav Java,
Simra Shahid, Balaji Krishnamurthy
6th Workshop on Computer Vision for Fashion, Art, and Design, CVPR 2023 (Oral)
Synthetic Data Generation
Vision Language Models
|
|
|
On Conditioning the Input Noise for Controlled Image Generation with Diffusion Models
Vedant Singh*,
Surgan Jandial*,
Ayush Chopra, Siddarth Ramesh, Balaji Krishnamurthy,
Vineeth N Balasubramanian
Workshop on AI for Content Creation, CVPR 2022, AK Tweeted :)
Synthetic Data Generation
|
|
Patents
|
- Issued Cloth Warping Using Multi-Scale Patch Adversarial Loss
Application granted on 06/08/2021. US Patent number 11080817
- Issued Accurately Generating Virtual Try-On Images Utilizing a Unified Neural Network Framework
Application granted on 08/03/2021. US Patent number 11030782
- Issued Text-Conditioned Image Search with Transformation, Aggregation, and Composition of Visio-Linguistic Features
Application granted on 08/08/2023. US Patent number 11720651
- Issued Model Training with Retrospective Loss
Application granted on 10/24/2023. US Patent number 11797823
- Filed Text-Conditioned Image Search Based on Dual-Disentangled Feature Composition
Filed at the US Patent Office on 1/28/2021
- Filed Regularizing Targets in Model Distillation Utilizing Past State Knowledge of Students
Filed at the US Patent Office on 8/9/2022
- Filed Diffusion Model Image Generation
Filed at the US Patent Office on 8/31/2022
- Filed Systems and Methods for Data Augmentation
Filed at the US Patent Office on 10/11/2022
- Filed Systems and Methods for Machine Learning Transferability
Filed at the US Patent Office on 3/3/2023
- Filed Form Structure Similarity Detection
Filed at the US Patent Office on 3/27/2023
- Filed Personalized Form Error Correction Propagation
Filed at the US Patent Office on 4/27/2023
- Filed Knowledge Distillation Using Contextual Semantic Noise
Filed at the US Patent Office on 2/22/2023
- Filed Systems and Methods for Generating Synthetic Tabular Data for Machine Learning and Other Applications
Filed at the US Patent Office on 4/3/2023
- Filed One-Shot Document Snippet Search
Filed at the US Patent Office on 6/30/2023
- Filed Generating Alternative Examples for Content
Filed at the US Patent Office on 11/3/2023
- Filed A Novel Method and Apparatus for Text-Guided Style Transfer
Internally approved at Adobe Inc. in June 2023 for filing
- Filed Mask-CLIPstyler: Localized text-based style transfer in images
Internally approved at Adobe Inc. in July 2024 for filing
|
|