Manifold Research Group tackles ambitious, high-impact research problems that traditional institutions overlook: those too engineering-intensive for academia and too exploratory for industry. Inspired by coordinated research models such as ARPAs and FROs, we organize focused, cross-functional teams and a large asynchronous pool of research contributors to systematically pursue and deliver paradigm-shifting science and technology.
MultiNet: A Generalist Benchmark for Multimodal Action Models
MultiNet is a comprehensive benchmarking initiative for evaluating generalist models that can process a wide variety of modalities and autonomously take action. Your work will help establish the foundation for truly multimodal AI systems that can perceive, understand, and act in complex environments. More information on this project is available here.
The Role
OS Team members form the core of Manifold Research Group. As an OS Research Fellow, you'll actively drive our ambitious projects—from initial roadmapping and technical implementation to writing impactful papers.
In this project, you'll be working on:
- Designing and implementing comprehensive benchmarks for evaluating multimodal AI systems across diverse tasks and modalities
- Profiling and optimizing Large Language Models (LLMs), Vision Language Models (VLMs), and Vision Language Action Models (VLAs) to understand their multimodal capabilities and limitations
- Collecting, curating, and cleaning diverse multimodal datasets that challenge current AI systems and reveal capability gaps
- Creating large-scale action datasets by training expert RL agents in simulated environments and collaborating with partner organizations to collect novel robotics data from real-world deployments
- Pioneering new types of software action datasets that capture complex GUI interactions, API sequences, and multi-step digital tasks to expand the scope of actionable AI
- Developing evaluation frameworks that measure cross-modal understanding, reasoning, and action generation (a minimal example of this style of harness is sketched after this list)
- Contributing to the MultiNet Evaluation Benchmark and Framework with improvements for scaling to larger, more capable models
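To give a concrete (and deliberately simplified) flavor of the evaluation work, here is a minimal sketch of a harness that scores a model's predicted actions against reference actions. The Episode structure, the predict_action callable, and the exact-match metric are illustrative placeholders, not MultiNet's actual interfaces.

```python
# Illustrative sketch only: a toy harness that scores discrete action predictions
# against reference actions. Data structures and metric are placeholders, not
# MultiNet's real interfaces.
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Episode:
    observation: dict       # e.g. {"image": ..., "instruction": "pick up the red block"}
    reference_action: str   # ground-truth action label


def evaluate_exact_match(
    predict_action: Callable[[dict], str],
    episodes: Sequence[Episode],
) -> float:
    """Return the fraction of episodes where the prediction matches the reference."""
    correct = sum(
        predict_action(ep.observation) == ep.reference_action for ep in episodes
    )
    return correct / max(len(episodes), 1)


if __name__ == "__main__":
    # Trivial stand-in "model" that always answers "noop"; a real run would wrap a VLM or VLA.
    toy_model = lambda obs: "noop"
    data = [
        Episode({"instruction": "wait"}, "noop"),
        Episode({"instruction": "move left"}, "left"),
    ]
    print(f"Exact-match success rate: {evaluate_exact_match(toy_model, data):.2f}")
```

A real benchmark run involves far richer observations, continuous action spaces, and per-task metrics, but the overall control flow (iterate, predict, score, aggregate) is similar.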
Qualifications
Outstanding research emerges from driven, talented minds. For this project, we are looking for the following attributes:
- Demonstrated prior research experience, evidenced by published work in peer-reviewed conferences, journals, or recognized preprint platforms
- Hands-on experience with profiling and running experiments with large language models (LLMs), including performance analysis, ablation studies, and systematic evaluation
- A foundation in probability theory, linear algebra, and optimization methods, plus practical experience with parameter-efficient fine-tuning techniques such as LoRA/QLoRA (see the sketch after this list)
- Strong skills in data collection, curation, and cleaning, particularly for multimodal datasets combining text, images, and action sequences
- Interest in and a preliminary understanding of Vision Language Action Models (VLAs) and their architectures, training procedures, and evaluation methodologies
- Proficiency with Python and familiarity with Git and Linux, including working on cloud-hosted virtual machines
- Proficiency with deep learning frameworks (PyTorch, JAX, or TensorFlow), with experience in distributed computing environments
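As one small illustration of the parameter-efficient fine-tuning experience mentioned above, the sketch below attaches LoRA adapters to a small base language model using the Hugging Face peft library. The base model (facebook/opt-125m), the target module names, and the hyperparameters are placeholder choices for demonstration, not project defaults.

```python
# Illustrative sketch only: wrapping a small base model with LoRA adapters via
# the Hugging Face `peft` library. Model choice and hyperparameters are
# placeholders, not MultiNet defaults.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling applied to the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the adapter weights remain trainable
```

Because only the injected low-rank matrices are trained while the base weights stay frozen, memory and compute requirements stay modest; QLoRA goes further by also quantizing the frozen base weights.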
Expectations
There are a few key expectations and clarifications we need to emphasize regarding the OS Research Team:
- Contribute approximately 10 hours per week to ensure meaningful progress and deep engagement with our projects. Flexibility around life commitments is understood; clear, proactive communication helps us support each other.
- Be comfortable navigating the uncertainty of research with a high degree of autonomy.
- Our working language is English; strong proficiency is required to communicate technical concepts clearly and without misunderstanding.
- This is a volunteer effort; none of us receive compensation of any kind—including monetary payment, academic credit, or other formal incentives. Our commitment is driven entirely by shared passion for impactful research.
- More information on OS Research Team expectations is available here.
We look forward to seeing your application!