Some of the ways Big Tech companies feed your personal data to AI feel like privacy violations — or even theft
Your email is just the beginning. Meta, the owner of Facebook, took a billion Instagram posts from public accounts to train its AI and did not ask for permission. Microsoft uses your conversations with Bing to train the AI bot to answer questions better, and you can’t stop it.
Increasingly, tech companies are taking your conversations, photos, and documents to teach their AIs how to write, draw, and pretend to be human. You’re probably familiar with them using your data to target you with ads. But now they’re using it to create lucrative new technologies that could boost the economy — and make Big Tech even bigger.
We don’t yet understand the risk this behavior poses to your privacy, reputation or work. But you can’t do much about it.
Sometimes companies handle your data with care. However, their behavior is often out of sync with common expectations about what happens to your information, including content you consider private.
Zoom raised alarm last month when it announced that it could use the private content of video chats to improve its AI products before reversing course. Earlier this summer, Google updated its privacy policy to say it can use any “public information” to train its AI. (Google says this isn’t a new policy, it just wanted to clarify that it applies to its Bard chatbot.)
If you’re using Big Tech’s exciting new AI products, you might be forced to agree to help make their AI smarter through “data donation.” (That’s Google’s actual term for it.)
Getting lost in the data: Most people have no way to make truly informed decisions about how their data is used. That can feel like a privacy violation, or even like theft.
“AI represents an unprecedented generational leap,” said Nicholas Piachaud, director of the open source nonprofit Mozilla Foundation. “This is the right time to step back and think: What is at stake here? Are we willing to give our privacy and personal data to these big companies? Or should privacy be the default?”
It’s not new for tech companies to use your data to train AI products. Netflix uses the content you watch and your ratings to make recommendations. Facebook uses everything you like and comment on to train its AI how to curate your news feed and show you ads.
However, generative AI is different. Today’s AI arms race demands enormous amounts of data. Elon Musk, owner of Twitter and CEO of Tesla, recently boasted to his biographer that he has access to 160 billion frames of video per day captured from cameras built into people’s cars. He wants to use everyone’s Tesla to advance his AI ambitions.
“Everyone acts as if it is the manifest destiny of technological tools to be built with human data,” said Winters. “With the increasing use of AI tools, there is a skewed incentive to collect as much data as possible in advance.”
All this brings real privacy risks. Training AI to learn things about the world means it will also learn intimate things about individuals.
Some tech companies even admit it in their fine print. When you use Google’s new AI writing coach for Docs, it warns: “Do not include personal, confidential, or sensitive information.”
The actual AI training process can itself be invasive. Sometimes it involves other people looking at the data. Humans are reviewing our back-and-forth conversations with Google’s new search engine and the Bard chatbot, to name just two.
What’s even worse for your privacy is that AI sometimes leaks data back out. Generative AI systems, which are notoriously difficult to control, can regurgitate personal information in response to a new, sometimes unforeseen, prompt.
It has even happened to a tech company. Samsung employees reportedly leaked company secrets by entering them into ChatGPT on three occasions. The company later banned the use of AI chatbots in the workplace. Apple, Spotify, Verizon and many banks have done the same.
Big Tech companies tell me they work hard to stop leaks. Microsoft says it de-identifies user data entered into Bing chat. Google says it automatically removes personally identifiable information from training data. Meta says it will train its generative AI not to reveal personal information, so it might share a celebrity’s birthday but not a regular person’s.
Okay, but how effective are these measures? It’s one of those questions that companies won’t give a straight answer to. “While our filters are industry-leading, we continue to improve them,” Google said. And how often do they leak? “We believe it is very limited,” it said.
It’s nice to know Google’s AI only leaks our information occasionally. “It’s hard for them to say frankly ‘we don’t have any sensitive data,’” Winters said.
Perhaps privacy isn’t even the right word for this mess. It’s also about control. Who could have imagined that a vacation photo they posted in 2009 would be used by a megacorporation in 2023 to teach AI to create art, put a photographer out of work, or help police identify someone’s face?
There’s a fine line between “making the product better” and theft, and tech companies act as if they get to draw it themselves.
Which uses of our data are off-limits, and which aren’t? Much of the answer is wrapped up in lawsuits, investigations and, hopefully, some new laws. But meanwhile, Big Tech is making its own rules.
I asked Google, Meta, and Microsoft to tell me exactly when they’re pulling user data from products core to modern life to make their next-generation AI products smarter. Getting an answer is like chasing a squirrel through a funhouse.
They told me they have never used nonpublic user information in their largest AI models without permission. But those very carefully chosen words leave out plenty of cases where they are, in fact, building lucrative AI businesses with our digital lives.
Not all AI uses of data are the same, or even problematic. But as users, we practically need a degree in computer science to understand what’s going on.
Google is a great example. It tells me its “underlying” AI models — the software behind things like Bard, the chatbot that answers anything — come mostly from “publicly available data from the internet.” Our private Gmail did not contribute to that, the company said.
However, Google still uses Gmail to train other AI products, like Smart Compose, Gmail’s writing assistant (which finishes sentences for you), and its new creativity coach Duet AI. Google argues that this is fundamentally different because it uses data from a product only to improve that same product.
There’s probably no way to create something like Smart Compose without looking at your email. But that doesn’t mean Google should just enable the feature by default. In Europe, where data laws are stronger, Smart Compose is disabled by default. Nor should your data be the price of using Google’s latest and greatest products, even the ones it calls “experimental,” like Bard and Duet AI.
Facebook owner Meta also told me that it doesn’t train its largest AI model, called Llama 2, on user data. But it has trained other AI, like an image recognition system called SEER, on people’s public Instagram posts.
And Meta wouldn’t give me a straight answer about how it uses our personal data to train its generative AI products. After I pressed, the company said it would “not train our generative AI models on people’s messages with their friends and family.” At least it agreed to draw some kind of red line.
Microsoft updated its services agreement this summer with broad language about user data, and it did not make any assurances to me about restricting the use of our data to train the AI in consumer-facing programs like Outlook and Word. Mozilla even launched a campaign calling on the software giant to clarify. “If nine privacy experts can’t understand what Microsoft does with your data, what chance does the average person have?” Mozilla said.
It doesn’t have to be this way. Microsoft offers plenty of guarantees to its lucrative business customers, including those who chat with the enterprise version of Bing, about keeping their data private. “The data always stays within the customer’s tenant and is never used for other purposes,” a spokesperson said.
Why do companies get more privacy than the rest of us?