The race towards Artificial General Intelligence (AGI) is intensifying, and Google DeepMind is making significant strides in this direction. After recently making Veo and Imagen 3 available on Google Cloud, the company has now introduced Genie 2, a large-scale foundation world model capable of generating a wide variety of playable 3D environments. The model is intended to facilitate the development of embodied AI agents by transforming a single image into an interactive virtual world that humans or AI agents can explore using standard keyboard and mouse controls. According to Google DeepMind, Genie 2 could enable future agents to be trained and evaluated in a limitless curriculum of novel worlds, paving the way for new creative workflows for prototyping interactive experiences.
The model, the successor to Genie 1, expands its predecessor's capabilities from 2D to 3D environments. Its features include simulating physical interactions, modelling complex animations, and creating environments with realistic physics, lighting, and object interactions. Trained on a large-scale video dataset, Genie 2 uses autoregressive latent diffusion to generate frames sequentially in response to user actions. It is a significant step towards more general AI agents because it addresses the shortage of sufficiently rich and diverse training environments.
Google DeepMind also showcased an AI agent named SIMA, which performed tasks in Genie 2-generated worlds by following natural-language instructions. The company believes that SIMA’s performance highlights the model’s ability to create unique testing environments in which agents can demonstrate generalisation to unseen tasks. Another key feature of Genie 2 is its capacity to generate new content on the fly while keeping a world consistent for up to a minute. The model can also generate diverse perspectives, such as first-person and isometric views, and simulate real-world environments from photos.
In addition to Genie 2, Google DeepMind has launched GenCast, a new AI model that provides faster and more accurate weather forecasts up to 15 days ahead while accounting for forecast uncertainty and risk. The company also recently released an experimental AI model, Gemini-Exp-1121, which rivals OpenAI’s GPT-4o, and Google is preparing to launch Gemini 2, which is expected to compete with OpenAI’s upcoming model, o1. In an exclusive interview, AI sceptic Gary Marcus stated that DeepMind is likely on a better path towards AGI than its competitors, although no company has yet found a definitive route to AGI.