Last week, Google debuted Gemini 2.0. The family of AI models that powers Google’s chatbot of the same name comes with new capabilities, like the ability to pull information directly from services like Google Search and natively create images and audio to include in its responses. Google says the new models are built for the “new agentic era” we’re entering, in which AI can access the internet and use tools to get things done for users.
As of this week, Gemini Advanced subscribers have access to try a handful of new models: Gemini 2.0 Flash Experimental, Gemini 2.0 Experimental Advanced, and Gemini 1.5 Pro with Deep Research. These join the existing options of standard 1.5 Pro (for “complex tasks”) and 1.5 Flash (for “everyday help”). It makes sense that paying subscribers would get the chance to try new features early. But for a product that’s supposed to take some of the work out of intricate processes like in-depth research and, eventually, higher-stakes assignments like booking travel, Gemini is getting increasingly tricky to understand and use.
Welcome to Compiler, your weekly digest of Google’s goings-on. I spend my days as Google Editor reading and writing about what Google’s up to across Android, Pixel, and more, and sum it up right here in this column. This is the Google news you need to understand this week.
A model for every task
Gemini Advanced subscribers now have a total of five Gemini models to choose from. More complex workloads are more resource-intensive, so employing different models for different tasks makes sense: if a simpler Flash model can answer a given query just as well as a more capable Pro model can, running it through Flash instead saves a little computing power, a growing concern in the AI space.
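Picking the right model for each query is exactly the kind of decision software could make on the user’s behalf. Here’s a minimal, hypothetical sketch of what that routing might look like using Google’s google-generativeai Python library; the keyword heuristic, length threshold, and model choices are my own illustrative assumptions, not anything Google has described:

```python
# Hypothetical sketch: route a prompt to a cheaper or stronger Gemini model
# based on a crude complexity heuristic. The heuristic and threshold are
# illustrative assumptions, not Google's actual routing logic.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

def pick_model(prompt: str) -> str:
    # Toy rule: long or research-flavored prompts get the heavier Pro model;
    # everything else runs through the cheaper, faster Flash model.
    research_hints = ("research", "report", "compare", "sources")
    if len(prompt) > 500 or any(hint in prompt.lower() for hint in research_hints):
        return "gemini-1.5-pro"
    return "gemini-1.5-flash"

def ask(prompt: str) -> str:
    model = genai.GenerativeModel(pick_model(prompt))
    return model.generate_content(prompt).text

print(ask("What's a quick way to mute notifications on a Pixel?"))  # routes to Flash
```

A production router would presumably classify prompts with a lightweight model rather than keyword matching, but the principle is the same: the user types one prompt, and the system quietly picks the engine.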
But a drop-down menu that has users manually choosing from five different models for each query seems like an awfully opaque way to manage Gemini’s various capabilities. Learning the ins and outs of models with names like 1.5 Flash and 1.5 Pro with Deep Research is a big ask.
Gemini 1.5 Pro with Deep Research, for example, is the only one of the five that can carry out Gemini’s Deep Research function, which collates information from dozens or even hundreds of sources to create detailed reports. Gemini 2.0 Experimental Advanced, the newer, generally better model, still can’t do that. If you ask it to, it’ll do something, but it won’t let you know that your query would be better suited to 1.5 Pro with Deep Research.
Isn’t AI supposed to simplify our lives?
The theoretical appeal of natural-language AI interfaces is that you don’t need to know how they work to use them. A traditional application requires learning the nuances of its UI and where to find various functions before you can accomplish complicated tasks; with something like Gemini or ChatGPT, you shouldn’t need any specialized knowledge, only a reasonably well-formed query. Layering on a menu of abstract models to choose from for each input (is this query everyday help or a complex task?) is at odds with one of the most valuable characteristics of this type of application: approachability.
The option to manually pick which model your query runs through is a sensible perk for Advanced subscribers, but it shouldn’t be a requirement. To make Gemini easier to use, I’d like to see a future version that decides which model is best suited for your query automatically, without manual oversight. As it stands, Gemini won’t even let you know if you’ve used the wrong model for a given task. Isn’t AI supposed to simplify our lives?
Is Google Keep due for a glow-up?
Android 16 Developer Preview 2 packs an interesting change: it makes Google Keep a system application, meaning you can’t uninstall it without root access. At first blush, that might seem like more of an inconvenience than anything, but it likely means that Google has big plans for its note-taking app, including deeper system integrations — the ability to launch the app from the lock screen on Pixel phones, for example.
I’m excited about the possibility. I’ve used Keep for quick notes out of convenience for years, but I’ve never really liked it much. Compared to other apps I’ve used for note-taking — Evernote, Obsidian, Apple Notes — Keep’s always seemed a little barebones. You can search your notes and add labels, but there’s no robust categorization; you can’t create folders, and the app is still clinging to its original concept of notes represented as sticky note-style cards.
But if Keep does become a bigger focus for Google, picking up features like folders, some Gemini-powered AI categorization, and maybe a Quick Settings tile to open a new note on Android like Apple Notes has on iOS, I can see myself using it because I want to, and not just because it’s the note-taking app I happen to have installed.
Meanwhile…
Google’s Veo 2 video generator is looking wildly impressive. Google released a set of clips generated by Veo 2 this week, and for the most part, it’s very hard to tell they weren’t made by human hands. Veo 2 apparently has a better understanding of things like anatomy and physics than the original Veo did, which lets it create clips with markedly less AI wonkiness and fewer hallucinations. You can sign up for a waitlist to try Veo 2 yourself at labs.google/videofx.
Latest development
Google says Veo 2 AI can generate videos without all the hallucinations
Five fingers per hand is a big step for AI
Google’s new Whisk experiment is a tool for visual brainstorming. Whisk lets you generate images based on a user-defined “setting,” “scene,” and “style.” For each aspect, you can either upload an existing image or enter a text prompt, and you can refine output images with additional prompting. The results aren’t generally top-shelf quality, but Google positions Whisk more as a tool for ideation than for creating ready-to-use imagery. You can try Whisk right now at labs.google/fx/tools/whisk.
Full story
Google’s new Whisk AI lets you drop images in as prompts to make new images
The latest Google Labs creation is fun
Gemini’s fact-checkers are reportedly weighing in on subjects they don’t know about. According to reporting from TechCrunch, contract workers who rate Gemini’s responses are no longer allowed to skip prompts that fall outside their expertise, with guidance from Google reportedly reading, in part, “You should not skip prompts that require specialized domain knowledge.” That’s fairly troubling! Remember to keep double-checking information provided by AI before acting on it.
Latest development
New Google policy instructs Gemini’s fact-checkers to act outside their expertise
Google may undermine its accuracy claims