Research Log #011

Welcome to Manifold! We're excited to share our eleventh Research Log.

The Research Log is a weekly update on our progress in building next-generation intelligent systems. We also share what we've found interesting in the broader research community in the "Pulse of AI" section below.

Let's begin!

NEKO Project

The NEKO Project aims to build the first large scale, Open Source "Generalist" Model, trained on numerous modalities. You can learn more about it here.

Game Performance reaches a Promising High Score

The Control thrust has successfully trained the NEKO model on Atari game data. In the most recent tests, the model achieved a score of 317, which compares favorably with the state of the art (SoTA). Future work includes investigating the model's stability across training and training on more games. The team has also implemented a new data sampling method, which has sped up training significantly. Check out our work here:

GitHub - ManifoldRG/NEKO at sampling
In Progress Implementation of a Generalist Multimodal model capable of image, text, RL and Robotics tasks

Image/Text Grounding Dev Work is Underway

The Image/Text grounding implementation has begun, and text modality work is steadily underway! See our progress here:

GitHub - ManifoldRG/NEKO at add_text_modality
In Progress Implementation of a Generalist Multimodal model capable of image, text, RL and Robotics tasks

Pulse of AI

Last week, we mentioned the release of DALL·E 3 as a sign that OpenAI was pushing strongly towards multimodality. Sure enough, this week OpenAI announced GPT-4V, which "enables users to instruct GPT-4 to analyze image inputs provided by the user".

GPT-4V(ision) system card

Here are a few more interesting things we've been reading this week. Check them out!

Introducing OpenLM | LAION
GELLO: A General, Low-Cost, and Intuitive Teleoperation Framework for Robot Manipulators
Imitation learning from human demonstrations is a powerful framework to teach robots new skills. However, the performance of the learned policies is bottlenecked by the quality, scale, and variety of the demonstration data. In this paper, we aim to lower the barrier to collecting large and high-qual…

