Welcome to Log #14. Here we share weekly progress on Manifold’s research projects, along with the “Pulse of AI”: breakthroughs from the broader research community that we’re excited about!

The NEKO Project aims to build the first large-scale, open-source "Generalist" Model, trained on numerous modalities. You can learn more about it here.

  • Text Modality Updates: The text modality evaluation loop was updated to fix some issues. With these updates, the text modality can now be run completely separately from the other modalities. Check it out here.
  • Language + Vision Update: The Language + Vision paired objectives are now running! We are now in the process of evaluating them, and we’ll have code available soon!
  • Data Sampling for Control Modality: The team resolved two major issues with data sampling. Training can now sample from multiple Atari games, and a memory bottleneck was resolved using data compression, which further increases training speed. Check it out here.
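
For readers curious what the data sampling change might look like in practice, here is a minimal sketch. It is not the NEKO code: the class name `CompressedMultiGameBuffer`, the helper `make_fake_episode`, and the choice of `zlib` plus `pickle` are all assumptions made for illustration. It only shows the general idea of keeping episodes from several games compressed in memory and decompressing them when a batch is sampled.

```python
import pickle
import random
import zlib


class CompressedMultiGameBuffer:
    """Hypothetical toy buffer: stores episodes per game as zlib-compressed bytes."""

    def __init__(self, seed=0):
        self._episodes = {}  # game name -> list of compressed episode blobs
        self._rng = random.Random(seed)

    def add(self, game, episode):
        # Serialize and compress once at insert time to keep memory use low.
        blob = zlib.compress(pickle.dumps(episode))
        self._episodes.setdefault(game, []).append(blob)

    def sample(self, batch_size):
        """For each item, pick a game uniformly at random, then one of its episodes."""
        games = list(self._episodes)
        batch = []
        for _ in range(batch_size):
            game = self._rng.choice(games)
            blob = self._rng.choice(self._episodes[game])
            # Decompress only the episodes that are actually sampled.
            batch.append((game, pickle.loads(zlib.decompress(blob))))
        return batch

    def compressed_bytes(self):
        # Total size of all stored blobs, in bytes.
        return sum(len(b) for blobs in self._episodes.values() for b in blobs)


def make_fake_episode(length=100):
    # Stand-in for real Atari frames: highly repetitive, so it compresses well.
    return [[0] * 84 for _ in range(length)]


if __name__ == "__main__":
    buf = CompressedMultiGameBuffer(seed=0)
    for game in ("pong", "breakout"):
        buf.add(game, make_fake_episode())
    games_in_batch = {game for game, _ in buf.sample(8)}
    print("games sampled:", sorted(games_in_batch))
    print("compressed size in bytes:", buf.compressed_bytes())
```

Sampling the game first and the episode second gives every game equal weight no matter how many episodes it has. That is one possible design choice, not necessarily the one the team made.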

Pulse of AI

  • Fuyu-8B: A new multimodal model from Adept AI that accepts images and text. Unlike earlier models, it processes images with a decoder-only Transformer, so it can handle images of different sizes rather than relying on a fixed input size as CLIP does. The release blog post has more details. Also, shout out to Andrew Car for his amazing thread on the topic!
  • UniSim: A new interactive simulator for the real world. “UniSim allows effective training of RL agents purely in simulation, which can be directly transferred onto real robots. This can pave the way to training policies without expensive real world intervention.” More information can be found on their website.

See you next time!

-Santiago, Twitter: @snats_xyz

