AI Engineer @ Lamida Inc.

👋 Hey, I'm Mridul Sharma

Exploring multimodal language modeling, RL, and VLAs.

I am an AI Engineer at Lamida Inc., where I currently work on multimodal AI with a focus on video analysis.

My interests lie somewhere at the intersection of:

  • Multimodal Language Modeling (MLLMs)
  • RL/Planning
  • Vision Language Action Models
  • Human-AI Interaction

Featured Publications

Selected works from my research

Manifold Research Publications, 2025

Benchmarking the Generality of Vision-Language-Action Models

Our findings reveal significant insights into the current state of multimodal AI, highlighting both promising capabilities and critical limitations that inform future research directions. We release our complete benchmark suite, evaluation framework, and detailed analysis to accelerate progress in this field.

Guruprasad P, Chowdhury S, Sikka H, Sharma M, Lu H, Rivera S, Khurana A, Wang Y
AAAI-AIA, 2026

Confirmation bias: A challenge for scalable oversight

We conducted two studies examining the performance of simple oversight protocols where evaluators know that the model is correct most of the time, but not all of the time.

Recchia G, Mangat CS, Nyachhyon J, Sharma M, Canavan C, Epstein-Gross D, Abdulbari M
AACL-IJCNLP, 2025

Consolidating and Developing Benchmarking Datasets for the Nepali Natural Language Understanding Tasks

We introduce twelve new datasets, creating a new benchmark, the Nepali Language Understanding Evaluation (NLUE) benchmark, for evaluating the performance of models across a diverse set of Natural Language Understanding (NLU) tasks. The added tasks include single-sentence classification, similarity and paraphrase tasks, and Natural Language Inference (NLI) tasks. On evaluating models on the added tasks, we observe that existing models fall short in handling complex NLU tasks effectively.

Nyachhyon J, Sharma M, Thapa P, Bal BK