World As A Prompt
English translation of an Italian post originally published on Levysoft.it
In the latest Microsoft keynote, on May 20, 2024, while introducing the new Copilot+ PCs, Satya Nadella says at minute 4:14:
“This in turn will lead to a new category of devices that turn the world itself into a prompt”.
Satya Nadella
The concept of “World As A Prompt” struck me deeply because it represents a significant evolution in the field of artificial intelligence, especially with the introduction of natively multimodal LLMs like GPT-4o (Omni), which can accept any combination of text, audio, video, and images as input. This type of model is a qualitative leap compared to previous versions such as GPT-4, which was built primarily around text: it could accept images as well, but its output was text only, and its voice mode relied on separate models to transcribe speech into text and to turn the text reply back into audio.
Indeed, the ability to process audio (or other inputs such as video and images) directly, without first converting it to text, and then to respond in the same format (for example, audio in and audio out), significantly reduces response latency, making communication more effective and opening the door to much more natural interaction.
The idea that the “world becomes a prompt” suggests that technology can be increasingly integrated into daily life in a fluid and intuitive manner, similar to the way humans interact with each other.