Magma: A foundation model for multimodal AI agents
Presented by Jianwei Yang at Microsoft Research Forum, February 2025. “In this project we developed the first agentic foundation model, Magma, that can understand multimodal input and also take action...
Belief state transformers
Presented by John Langford at Microsoft Research Forum, February 2025. “That ability to condition on generation, rather than evaluate the generation, ends up being amazingly useful in terms of giving...
AutoGen v0.4: Reimagining the foundation of agentic AI for scale,...
Presented by Gagan Bansal at Microsoft Research Forum, February 2025. “When we released AutoGen, one of the first things that the developers absolutely loved about it was its simplicity and the many...
LLMs for safe low-level programming
Presented by Aseem Rastogi and Pantazis Deligiannis at Microsoft Research Forum, February 2025. “We created a tool called RustAssistant that leverages the power of state-of-the-art LLMs to help...
Efficiently generating long, high-quality, and dynamic videos using text prompts
The rapid development of AI has steadily advanced the field of text-to-video (T2V) generation, offering a rich and convenient video content creation experience and unlocking new possibilities in...
Navigating different cultures: A heart-focused journey
On the world’s vast canvas, cultures from different regions are like unique brushstrokes, each conveying distinctive perspectives and styles. At Microsoft Research Asia, women researchers from diverse...
Academic growth, cultural immersion, and global connections: Korean interns’...
At the intersection of academic growth, cultural immersion, and international collaboration, internships at Microsoft Research Asia provide students with more than just technical experience. This...
Can AI unlock the mysteries of the universe?
Astronomy, born from humanity’s innate curiosity about the stars, has long been a catalyst for revolutionary discoveries. As AI technology advances, intelligent agents powered by large language models...
WHAMM! Real-time world modelling of interactive environments.
Today we are making available an interactive real-time gameplay experience in Copilot Labs. Head over to this link (opens in new tab) to play an AI rendition of Quake II gameplay, powered by Muse. 5...
PIKE-RAG: Enabling industrial LLM applications with domain-specific data
A key challenge, and opportunity, of large language models (LLMs) is bridging the gap between their training data and the vast amount of unfamiliar information they encounter in real-world...