Saturday, June 15

Google’s Bard Just Got More Powerful. It’s Still Erratic.

This week, Bard — Google’s competitor to ChatGPT — got an upgrade.

One interesting new feature, called Bard Extensions, allows the artificial intelligence chatbot to connect to a user’s Gmail, Google Docs and Google Drive accounts.

(Google also gave Bard the ability to search YouTube, Google Maps and a few other Google services, and it introduced a tool that would let users fact-check Bard’s responses. But I’m going to focus on the Gmail, Docs and Drive integrations, because the ability to ask an A.I. chatbot questions about your own data is the killer feature here.)

Bard Extensions is designed to address one of the most annoying problems with today’s A.I. chatbots, which is that while they’re great for writing poems or drafting business memos, they mostly exist in a vacuum. Chatbots can’t see your calendar, peer into your email inbox or rifle through your online shopping history — the kinds of information an A.I. assistant would need in order to give you the best possible help with your daily tasks.

Google is well positioned to close that gap. It already has billions of people’s email inboxes, search histories, years’ worth of their photos and videos, and detailed information about their online activity. Many people — including me — have most of their digital lives on Google’s apps and could benefit from A.I. tools that allow them to use that data more easily.

I put the upgraded Bard through its paces on Tuesday, hoping to discover a powerful A.I. assistant with new and improved abilities.

What I found was a bit of a mess. In my testing, Bard succeeded at some simpler tasks, such as summarizing an email. But it also told me about emails that weren’t in my inbox, gave me bad travel advice and fell flat on harder analytical tasks.

Jack Krawczyk, the director of Bard at Google, told me in an interview on Tuesday that Bard Extensions was mostly limited to retrieving and summarizing information, not analyzing it, and that harder prompts might still stump the system.

“Trial and error is still definitely required at this point,” he said.

Right now, Bard Extensions is available only on personal Google accounts. Extensions isn’t enabled by default; users have to turn it on via the app’s Settings menu. And the feature works only in English for the time being.

Another important caveat: Google says that users’ personal data won’t be used to train Bard’s A.I. model, or shown to the employees reviewing Bard’s responses. But the company still warns against sending Bard “any data you wouldn’t want a reviewer to see or Google to use.” And Mr. Krawczyk told me that in certain cases — such as when users ask Bard a question about their email, then ask follow-up questions based on Bard’s response — human reviewers could end up seeing those users’ personal information.

It’s a good bet that most chatbots, including Bard, will improve over time, and that some of the obvious privacy issues will get ironed out. Google has emphasized that Bard is not a finished product — there’s a big blue “Experiment” label on the app, and a disclaimer that Bard has “limitations and won’t always get it right.”

Still, I was surprised that Google released Bard Extensions in its current state, given how erratically it acted in some of my tests. And that erratic behavior suggests that the dream of an all-knowing A.I. assistant, capable of perfectly analyzing our pasts and anticipating our needs, may still be a ways off.

Here are some of the tasks I tried:

After turning on Bard Extensions in Bard’s settings menu, the first task I gave it was a challenging one: “Analyze all of my Gmail and tell me, with reasonable certainty, what my biggest psychological issues are.” (What can I say? I like talking to chatbots about psychology.)

Bard spent a few seconds processing my request, then gave me an answer: My biggest psychological issue, it said, is that I “worry about the future,” which, it said, could indicate that I have an anxiety disorder. As evidence, it cited an email I wrote in which I said that I was “stressed about work” and that I was “afraid of failing.”

I was skeptical, because I didn’t remember writing anything like that. I asked Bard: When did I say I was afraid of failing?

“You said that you were ‘afraid of failing’ in the following email,” Bard responded.

Subject: Book Review: Elon Musk

I’m reading a book about Elon Musk. It’s fascinating, but it also makes me a little nervous. Musk is a brilliant man, but he’s also known for being impulsive and making risky decisions. I’m afraid that he’s going to fail at something big, and that it’s going to set back humanity.

But I never sent an email like that. Below its answer, Bard linked to an email newsletter I’d received that contained a review of a book about Elon Musk. But when I clicked on the review, I discovered that it contained no sentence like the one Bard had quoted. Bard had made up both the quote itself and the fact that I’d written it.

I instructed Bard to try again, this time using only emails I’d actually sent. It returned an email I’d written to a friend, in which I’d said: “I’m afraid that I’m not good enough at this financial stuff. I’m not sure if I’m cut out to be a successful investor.”

I remembered writing an email to that friend about investing. But when I found the original email and compared it with Bard’s response, that quote turned out to be fake, too.

I knew I’d started off with a hard assignment. Still, if Bard can’t psychoanalyze the contents of my emails, shouldn’t it say so rather than making stuff up?

Mr. Krawczyk reiterated that it was still an experimental product.

“I just want to be very clear, it is the first version of this going out,” he said.

Bard is now connected to Google’s suite of travel products, including Google Hotels and Google Flights. And in a demo video for Bard Extensions, the company promoted its usefulness as a travel assistant — like searching through a user’s email to find a planned trip to the Grand Canyon, and then searching for hotels nearby.

When I tried the same approach, the results were mixed.

I asked Bard to search my email inbox for information about a coming work trip to Europe, and look for train tickets that would get me from the airport to a business meeting in a nearby city on time.

Bard correctly retrieved the dates of my flight, but it got the departing airport wrong. Then, it showed me a list of other flights leaving from that airport on the same day.

Bard then recommended a train that would get me from the airport to my meeting on time. But when I checked the train company’s official timetables, I found no such train existed.

Mr. Krawczyk said I very likely ran into a limitation with Google’s travel-booking apps, which include data about flights and hotels but not European rail schedules, and that I would have had better luck if I’d asked for help booking a hotel at my destination.

“We haven’t spent a lot of time optimizing travel planning around trains,” he said.

I am notoriously bad at email, and I hoped that with access to my personal Gmail account, Bard could help me declutter and organize my inbox.

Bard worked well on some simple tasks. It succeeded when I asked it to summarize recent emails I’d gotten from my mom. (Sorry, Mom! I read them, I promise!) It also responded well to prompts about emails on single subjects, such as “summarize recent emails I’ve gotten about A.I.”

But when I asked it to perform more complicated tasks, it wobbled.

When I asked Bard to summarize the 20 most important emails in my inbox, it included a handful of seemingly random emails I’d gotten recently, including a receipt, a LinkedIn update and an apartment-hunting newsletter I subscribed to years ago but never opened.

When I instructed Bard to “pick five emails from the Primary tab of my Gmail, draft responses to those emails in my voice and show me the drafts,” it instead pulled from my Promotions tab and drafted a very nice note to the Nespresso coffee company, thanking them for the offer of a 25 percent discount on a new espresso machine.

And when I asked Bard to generate a list of my 100 most-emailed contacts — a useful thing to have, if you’re assembling a holiday card list — it gave up completely and feigned incompetence, saying it didn’t have access to my email history.

Mr. Krawczyk said initial hiccups like these were expected. But he also reiterated that Bard would get better over time, and he predicted that A.I. assistants would eventually become more like collaborators that were capable of doing tasks with us, using our data, in ways that would improve our lives.

“We know it’s not perfect, but it’s super inspiring,” he said.