Starting April 24, GitHub Copilot will begin harvesting developer interactions, including code snippets, prompts, and outputs, to train its AI models under a new policy that enables data collection by default and exempts only business customers.
The change affects all Copilot Free, Pro, and Pro+ users, whose coding activity will feed Microsoft's machine learning systems unless they manually disable the setting. Copilot Business and Enterprise subscribers remain protected by existing contract terms that prohibit such data collection.
GitHub frames the move as necessary for improving model performance through real-world interaction data rather than just publicly available code repositories. The company says it matches "established industry practices," a reference to U.S.-style opt-out norms rather than Europe's stricter opt-in requirements.
Developers must navigate to their privacy settings and disable "Allow GitHub to use my data for AI model training" before the April 24 deadline to avoid automatic enrollment. Those who previously opted out of product improvement data collection will have their preferences preserved under the new system.
Interaction data includes everything developers type into Copilot chat interfaces, along with generated code suggestions and contextual information about their projects. This material may be shared within Microsoft's corporate family of GitHub affiliates but, according to company statements, won't go to third-party AI providers.
The policy shift has generated immediate backlash on technical forums where developers question why handing over proprietary work should be treated as a default condition of service. Hacker News commenters noted the setting appears enabled by default in current interfaces, creating friction for those wanting to protect their intellectual property.
Business customers paying premium rates receive contractual guarantees against their coding patterns being mined for AI training, a distinction that highlights how enterprise negotiations produce different privacy outcomes than individual developer agreements.
GitHub employees will also contribute their own interaction data to model training alongside external user inputs, creating what the company calls a more diverse dataset for improving coding assistance across different programming contexts and workflows.