Publications

MultiNet v1.0: A Comprehensive Benchmark for Evaluating Multimodal Reasoning and Action Models Across Diverse Domains

Pranav Guruprasad, Sudipta Chowdhury, Harsh Sikka, Mridul Sharma, Helen Lu, Sean Rivera, Aryan Khurana, Yangyue Wang, Hangliang Ren (2025)

Multimodal reasoning and action models hold immense promise as general-purpose agents, yet the current evaluation landscape remains fragmented with domain-specific benchmarks that fail to capture true generalization capabilities. This critical gap prevents us from understanding where these sophisticated systems excel

An Open-Source Software Toolkit & Benchmark Suite for the Evaluation and Adaptation of Multimodal Action Models

Pranav Guruprasad, Yangyue Wang, Sudipta Chowdhury, Jaewoo Song, Harshvardhan Sikka (2025)

Recent innovations in multimodal action models represent a promising direction for developing general-purpose agentic systems, combining visual understanding, language comprehension, and action generation. We introduce MultiNet - a novel, fully open-source benchmark and surrounding software ecosystem designed to rigorously evaluate

Benchmarking Vision, Language, & Action Models in Procedurally Generated, Open Ended Action Environments

Pranav Guruprasad, Yangyue Wang, Sudipta Chowdhury, Harshvardhan Sikka, Paul Pu Liang (2025)

Vision-language-action (VLA) models represent an important step toward general-purpose robotic systems by integrating visual perception, language understanding, and action execution. However, systematic evaluation of these models, particularly their zero-shot generalization capabilities in procedurally out-of-distribution (OOD) environments, remains limited. In this

On Orbit Object Transportation With Spacecraft Swarms

Sidhdharth D. Sikka, Zehui Lu, Ayush Rai, Daniel DeLaurentis, Shaoshuai Mou (2025)

The expanding space economy has created an urgent need for reliable on-orbit transportation systems to support both commercial and scientific missions. This paper explores a cooperative approach to orbital transportation, where a swarm of spacecraft agents, each securely attached to

Refusal in LLMs is an Affine Function

Thomas Marshall, Adam Scherlis, Nora Belrose (2024)

We propose affine concept editing (ACE) as an approach for steering language models' behavior by intervening directly in activations. We begin with an affine decomposition of model activation vectors and show that prior methods for steering model behavior correspond to

Intelligent Digital Agents in the Era of Large Language Models

B Faught, H Lu, T Marshall, H Sikka, P Guruprasad, B Gauri (2024)

Below is the abstract for our recent paper, "Intelligent Digital Agents in the Era of Large Language Models". We’re growing our Research Team and pursuing new projects. If you’re interested in working together, join the conversation on

Jill Watson: A Virtual Teaching Assistant powered by ChatGPT

Karan Taneja, Pratyusha Maiti, Sandeep Kakar, Pranav Guruprasad, Sanjeev Rao, Ashok K. Goel (2024)

Conversational AI agents often require extensive datasets for training that are not publicly released, are limited to social chit-chat or handling a specific domain, and may not be easily extended to accommodate the latest advances in AI technologies. This paper

Does Transformer Interpretability Transfer to RNNs?

Gonçalo Paulo, Thomas Marshall, Nora Belrose (2024)

Recent advances in recurrent neural network architectures, such as Mamba and RWKV, have enabled RNNs to match or exceed the performance of equal-size transformers in terms of language modeling perplexity and downstream evaluations, suggesting that future systems may be built

PIILO: an open-source system for personally identifiable information labeling and obfuscation

L Holmes, S Crossley, H Sikka, W Morris (2023)

Purpose

This study aims to report on an automatic deidentification system for labeling and obfuscating personally identifiable information (PII) in student-generated text.

Design/methodology/approach

The authors evaluate the performance of their deidentification system on two data sets of student-generated

Designing a Communication Bridge between Communities: Participatory Design for a Question-Answering AI Agent

J Lee, V Nandan, H Sikka, S Rugaber, A Goel (2023)

How do we design an AI system that is intended to act as a communication bridge between two user communities with different mental models and vocabularies? Skillsync is an interactive environment that engages employers (companies) and training providers (colleges) in

Deidentifying Student Writing with Rules and Transformers

L Holmes, SA Crossley, W Morris, H Sikka, A Trumbore (2023)

As education increasingly takes place in technologically mediated settings, it has become easier to collect student data that would be valuable to researchers. However, much of this data is not available due to concerns surrounding the protection of student privacy.

Human-AI Interaction Design in Machine Teaching

K Taneja, H Sikka, A Goel (2022)

Machine Teaching (MT) is an interactive process where a human and a machine interact with the goal of training a machine learning model (ML) for a specified task. The human teacher communicates their task expertise and the machine student gathers

Reface: Real-time adversarial attacks on face recognition systems

S Hussain, T Huster, C Mesterharm, P Neekhara, K An, M Jere, H Sikka (2022)

Deep neural network based face recognition models have been shown to be vulnerable to adversarial examples. However, many of the past attacks require the adversary to solve an input-dependent optimization problem using gradient descent which makes the attack impractical in

Explanation as Question Answering based on a Task Model of the Agent's Design

A Goel, H Sikka, V Nandan, J Lee, M Lisle, S Rugaber (2022)

We describe a stance towards the generation of explanations in AI agents that is both human-centered and design-based. We collect questions about the working of an AI agent through participatory design by focus groups. We capture an agent's design through

A framework for interactive knowledge-aided machine teaching

K Taneja, H Sikka, A Goel (2022)

Machine Teaching (MT) is an interactive process where humans train a machine learning model by playing the role of a teacher. The process of designing an MT system involves decisions that can impact both efficiency of human teachers and performance

Agent Smith: Machine Teaching for Building Question Answering Agents

AK Goel, H Sikka, E Gregori (2022)

Building AI agents can be costly. Consider a question answering agent such as Jill Watson that automatically answers students' questions on the discussion forums of online classes based on their syllabi and other course materials. Training a Jill on the

A Genetic Algorithm Based Approach for Satellite Autonomy

S Sikka, H Sikka (2021)

Autonomous spacecraft maneuver planning using an evolutionary algorithmic approach is investigated. Simulated spacecraft were placed into four different initial orbits. Each was allowed a string of thirty delta-v impulse maneuvers in six cartesian directions, the positive and negative x, y

WeightScale: Interpreting Weight Change in Neural Networks

AM Agrawal, A Tendle, H Sikka, S Singh (2021)

Interpreting the learning dynamics of neural networks can provide useful insights into how networks learn and the development of better training and design approaches. We present an approach to interpret learning in neural networks by measuring relative weight change on

Investigating Learning in Deep Neural Networks using Layer-Wise Weight Change

AM Agrawal, A Tendle, H Sikka, S Singh, A Kayid (2021)

Understanding the per-layer learning dynamics of deep neural networks is of significant interest as it may provide insights into how neural networks learn and the potential for better training regimens. We investigate learning in Deep Convolutional Neural Networks (CNNs) by

Multimodal Modular Meta-Learning

H Sikka, A Tendle, A Kayid (2020)

Many real world prediction problems involve structured tasks across multiple modalities. We propose to extend previous work in modular meta learning to the multimodal setting. Specifically, we present an algorithmic approach to apply task aware modulation to a modular meta learning system that
