DevTools Radio

GitHub Will Use Copilot Interaction Data from Free, Pro, and Pro+ Users to Train AI Models

April 6, 2026 · Episode 0 · 3:20

Host A: Welcome back to DevTools Radio — I'm your host, and we've got a story today that's been blowing up in the developer community. GitHub just announced that starting April 24th, it's going to use interaction data from Copilot Free, Pro, and Pro+ users to train its AI models.

Host B: And the key word there is *interaction data* — because we're not just talking about code sitting in your repos. We're talking about code you're actively typing, chat inputs, inline suggestions you accept or reject, file names, repo structure — basically everything happening in a live Copilot session.

Host A: Exactly. And here's the part that's gotten people really riled up — you're opted in by default. You have to go manually turn it off if you don't want your data used.

Host B: The classic "we've already said yes on your behalf" move. Developers in GitHub's own community forum are calling it a dark pattern, and honestly, it's hard to argue with that framing when the notification email didn't even include a direct link to the settings page to opt out.

Host A: Right, one developer flagged that specifically. And separately, someone else pointed out the opt-out toggle isn't even available in GitHub's mobile app, so good luck if that's your primary way of managing settings.

Host B: Okay, but let's steelman GitHub for a second — they do argue this data makes the product better. They've been testing with Microsoft employee data and they're seeing improved suggestion acceptance rates across multiple languages. So the training loop is apparently working.

Host A: Sure, and they do give 30 days' notice, which is more than a lot of companies do. But the scope concerns are real. Private repo code is fair game if you're actively working in it with Copilot open. GitHub draws a line between code "at rest" and code "in session" — and only the latter gets collected.

Host B: That distinction matters a lot for enterprise developers, though. If I'm an individual contributor using a personal Pro license but I'm coding on my company's proprietary project, my employer never agreed to let that code influence a shared AI model. That's a pretty significant gray area.

Host A: GitHub actually does address this in the FAQ — they say if your account is a member of a paid organization, your interaction data is excluded from training regardless of your personal subscription tier. So there is a carve-out there.

Host B: That's reassuring in theory, but it shifts a lot of responsibility onto the organization to have set things up correctly. And there's a broader philosophical point being made on Reddit — even if you opt out, using Copilot at all is essentially teaching the model what "good code" looks like in your domain. Your architecture decisions, your naming conventions — that all shapes the model for competitors too.

Host A: One commenter put it pretty bluntly: when your competitor uses the same tool, they benefit from patterns your team essentially donated to the training set. There are also GDPR questions being raised — specifically whether GitHub's "legitimate interest" basis for processing personal data holds up under EU law.

Host B: So the bottom line for our listeners — if you're on Copilot Free, Pro, or Pro+, head to your Copilot settings and look for "Allow GitHub to use my data for AI model training" and make a conscious choice. Don't just let the default decide for you.

Host A: Well said. This is one of those stories that's going to keep developing, especially on the legal and compliance side. We'll be watching it closely here at DevTools Radio.

Host B: As always, thanks for tuning in — stay curious, stay informed, and maybe go check your settings after this.

Host A: We'll see you next time.
