Juhyoung Lee

Aug 2025

VQA with BLIP-2 + Phi-1.5 (Sub-3B Constraint)

Customized the BLIP-2 architecture by replacing its OPT language model with Phi-1.5 to fit a sub-3B parameter budget. Fine-tuned the Q-Former with multi-stage training. Ranked 43rd of 242 teams (top 18%).
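
As a rough illustration of why the language-model swap matters for the budget, the sketch below adds up approximate component sizes. All figures are assumptions taken from publicly reported model sizes (a roughly 1B ViT-g vision encoder, a roughly 188M Q-Former, OPT-2.7B, and the 1.3B Phi-1.5), not counts measured from this project's checkpoints, and the helper names are hypothetical.

```python
# Approximate parameter counts. These are assumptions based on publicly
# reported sizes, not values measured from the project's actual models.
VISION_ENCODER = 1_000_000_000   # assumed: ViT-g image encoder, roughly 1B
Q_FORMER = 188_000_000           # assumed: BLIP-2 Q-Former, roughly 188M
OPT_2_7B = 2_700_000_000         # assumed: smallest OPT variant used by BLIP-2
PHI_1_5 = 1_300_000_000          # Phi-1.5 is described as a 1.3B model

BUDGET = 3_000_000_000           # the sub-3B constraint from the description


def total_params(language_model: int) -> int:
    """Sum the vision encoder, Q-Former, and the chosen language model."""
    return VISION_ENCODER + Q_FORMER + language_model


def within_budget(language_model: int, budget: int = BUDGET) -> bool:
    """Return True when the full stack stays strictly under the budget."""
    return total_params(language_model) < budget


if __name__ == "__main__":
    for name, size in [("OPT-2.7B", OPT_2_7B), ("Phi-1.5", PHI_1_5)]:
        total = total_params(size)
        print(f"{name}: {total / 1e9:.3f}B, within budget: {within_budget(size)}")
```

Under these assumed sizes, the OPT-2.7B stack lands above 3B parameters while the Phi-1.5 stack stays below it, which is consistent with the stated motivation for the swap.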

Multimodal, VQA, BLIP-2, Phi-1.5, Q-Former, Vision-Language, Fine-tuning