Cellphones and smart glasses can now apply AI to what you see in real time
Can cameras make smarter virtual assistants? ChatGPT, Google, and Meta are rolling out AI that can “see” the world around you
The Ray-Ban Meta glasses serve as a hands-free camera, but the tech will soon allow users to ask Meta’s AI about what it sees in real time. This week, Meta, the parent company of Facebook and Instagram, announced the V11 software update to the Ray-Ban Meta glasses and tests of a Live AI feature. Meta’s announcement comes just days after ChatGPT began a slow rollout of a camera-assisted Advanced Voice Mode and Google gave developers access to Gemini 2.0 with Google Lens capabilities.
Early beta testers of the Ray-Ban Meta glasses can now try out the Live AI features first teased during Connect 2024. Live AI lets Meta’s AI see what the wearer sees, so it can answer questions about objects or places within the camera’s field of view. Meta says the capability enables uses like getting help and inspiration while cooking, gardening, or traveling to new places.
During a Live AI session, users will also be able to ask questions without the “Hey Meta” preface. And, while interrupting a human may be rude, Meta says the AI can be interrupted for follow-up questions or even to change the topic entirely.
The beta rollout of Meta’s Live AI comes as several platforms race to develop multimodal AI for real-time smart assistants. A multimodal AI can accept more than one type of input, such as combining voice and a camera feed to answer questions about objects.
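For readers curious what a multimodal request looks like for developers, here is a minimal sketch that pairs one image with one text question in a single call. It assumes the OpenAI Python SDK and a vision-capable model name ("gpt-4o" here); the filename and question are placeholders, and the consumer products above wrap this kind of call behind a voice interface and a live camera feed.

```python
# Minimal sketch of a multimodal request: one photo plus one text question.
# Assumes the OpenAI Python SDK and an OPENAI_API_KEY set in the environment.
import base64
from openai import OpenAI

client = OpenAI()

# Encode a local photo (placeholder filename) so it can travel in the request body.
with open("plant.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",  # assumed vision-capable model name
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What plant is this, and how often should I water it?"},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
```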
Last week, ChatGPT also began rolling out a feature that lets the app answer questions using the camera feed or a view of the smartphone screen. First teased in May, Advanced Voice Mode with video lets subscribers point their camera at an object and ask questions about it, and the update can also share the screen with the AI to ask about whatever is displayed at the time.
Earlier this month, Google launched Gemini 2.0, which can use Google Lens alongside Gemini Live, and began testing a glasses-based version of the assistant. Google first demonstrated the underlying Project Astra technology during a live event in May, including several demos that used a smartphone camera to ask the AI questions about objects. However, during that demo, Gemini suggested opening the back of a film camera to troubleshoot a stuck rewind lever, advice that would ruin any images on the film inside. That error drew criticism and skepticism over the AI’s ability to answer technical questions.
Multimodal AI could streamline interactions with smart assistants. Giving the AI access to a camera feed could save a lot of typing and even let users ask about something they don’t know the words for. But, as the flawed Gemini film advice illustrates, these early AI tools can be prone to errors.
While Meta’s Live AI is in testing, current owners of the smart shades can download the V11 update to get hands-free song identification with Shazam, a feature available in the US and Canada. Meta is also testing live translation, starting with spoken English, in the US and Canada.
For more information, browse the best camera glasses, or read up on the best generative AI software.