
How to Protect Proprietary Code When Using GitHub Copilot

GitHub Copilot sends your code to Microsoft servers. For most teams that is acceptable. For teams with proprietary algorithms, client contracts, or regulated data, it requires a protection layer. Here is a practical guide to using Copilot safely.

What Copilot Actually Sends to Microsoft

GitHub Copilot sends context to Microsoft's Azure OpenAI Service to generate completions. That context includes the current file, surrounding files in your workspace, and recent edit history. Microsoft's terms allow this data to be used to improve their models unless you have an enterprise agreement with telemetry disabled.

For most developers building standard web applications, this is an acceptable trade. For teams building in regulated industries, working on novel algorithms, or bound by client NDAs, the risk profile is different.

The Three Risk Categories

Category 1: Regulated Data in Code

If your codebase contains references to patient identifiers, account numbers, or PII data structures, even in schema definitions or variable names, sending that code to Copilot creates HIPAA, PCI-DSS, or GDPR exposure.

The risk is not that Copilot leaks your specific patient list. The risk is that identifiers referencing regulated data patterns leave your network perimeter without documented controls.
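Before enabling Copilot on a codebase, it is worth auditing for identifier patterns that suggest regulated data. A hedged sketch; the pattern list and the `src/` path are illustrative, not exhaustive, and should be adapted to your compliance regime:

```shell
#!/bin/sh
# Hedged sketch: flag files whose identifiers suggest regulated data.
# The pattern list below is an invented example, not a complete rule set.
PATTERNS='patient_id|mrn|ssn|account_number|card_number|date_of_birth'

if grep -rEin "$PATTERNS" src/; then
  echo 'Review the files above before granting Copilot workspace access' >&2
fi
```

Even a crude grep like this gives you a documented first pass at which directories belong in an exclusion list.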

Category 2: Proprietary Algorithms

Quantitative trading strategies, fraud detection models, recommendation engines, and similar proprietary algorithms have real competitive value. If those algorithms live in your codebase and Copilot has access to your workspace, the logic is in transit to third-party infrastructure.

Category 3: Client Confidentiality

Enterprise contractors and consulting firms often have explicit confidentiality clauses covering client code. Using Copilot on that code without client consent may breach contract terms. Several large consulting firms discovered this in 2025 when clients began auditing AI tool usage.

Option 1: Disable Copilot for Sensitive Files

The simplest approach is selective exclusion. Many Copilot setups support a .copilotignore file that works like .gitignore (support varies by IDE and extension; GitHub's officially documented mechanism is repository-level content exclusion settings):

```
# .copilotignore
src/algorithms/
src/models/risk-engine/
config/credentials/
*.env
```

This tells Copilot not to index those directories. The limitation: developers still have Copilot available and can manually paste sensitive code into prompts. You cannot enforce exclusion through tooling alone.

When to use this approach

This works best when developer judgment is an acceptable enforcement mechanism: small teams with a strong security culture and low regulatory exposure.

Option 2: Enterprise Agreement with Telemetry Disabled

GitHub Copilot for Business and Copilot Enterprise include options to disable telemetry and data retention. With these settings:

- Prompts are not retained after completion generation
- Code is not used for model training
- Audit logs are available for enterprise review

This reduces risk but does not eliminate transit exposure. The code still leaves your network and is processed on Microsoft infrastructure, even if it is not retained.

When to use this approach

Organizations with Microsoft enterprise agreements already in place, where the primary concern is data retention rather than transit.

Option 3: Proxy-Level Protection with Pretense

Pretense operates as a local proxy between your IDE and any AI API, including the OpenAI-compatible API that Copilot extensions use. It intercepts every request, mutates proprietary identifiers before transit, and reverses the mutation in the response.
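To make the mutate-and-reverse idea concrete, here is a toy illustration using sed. This is not Pretense's actual implementation; the identifiers (`calc_risk_score`, `client_ledger`) and synthetic tokens (`fn_a1`, `var_b2`) are invented for the example:

```shell
#!/bin/sh
# Toy illustration of the mutate/reverse concept, not Pretense's real code.

# Forward pass: proprietary names become synthetic tokens before transit.
echo 'def calc_risk_score(client_ledger): ...' \
  | sed -e 's/calc_risk_score/fn_a1/g' -e 's/client_ledger/var_b2/g'
# → def fn_a1(var_b2): ...

# Reverse pass: the model's response is mapped back before the IDE sees it.
echo 'fn_a1 iterates over var_b2 and returns a float' \
  | sed -e 's/fn_a1/calc_risk_score/g' -e 's/var_b2/client_ledger/g'
# → calc_risk_score iterates over client_ledger and returns a float
```

The key property is that the mapping is deterministic and reversible within a session, so completions stay coherent even when the model refers back to earlier identifiers.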

Setup for Copilot-compatible extensions:

```bash
# Install Pretense

# Initialize in your project
pretense init

# Start the proxy
pretense start --port 9339

# Configure your Copilot-compatible extension to use the local proxy.
# In VS Code settings:
# "github.copilot.advanced": { "serverUrl": "http://localhost:9339" }
```

With this configuration, the code that leaves your network contains only synthetic identifiers. Your proprietary function names, variable names, and class names never reach Microsoft servers.

What the developer experiences

Nothing changes. Copilot completions still appear inline. The suggestions reference synthetic names in the response, but Pretense reverses them before the IDE receives the completion. The developer sees completions with their real identifier names.

When to use this approach

Teams with proprietary algorithms, regulated data in codebases, or explicit client confidentiality requirements. Also appropriate for any team that wants an audit trail of what code was sent to AI APIs.

Enforcement at the CI/CD Layer

Developer-side protection works until a developer bypasses it. Enterprise enforcement requires pushing protection to the pipeline.

Pretense includes a GitHub Action that scans pull requests for unprotected AI API calls:

```yaml
# .github/workflows/ai-security.yml
on: pull_request
jobs:
  ai-security:
    runs-on: ubuntu-latest
    steps:
      - uses: pretense/scan-action@v1
        with:
          fail-on: critical
          report: pr-comment
```

This blocks merges when proprietary identifiers are detected in outbound API calls, regardless of which AI tool generated them.
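The same gate can be approximated locally before code ever reaches CI. A hedged sketch; the protected identifier list and the `origin/main` diff base are assumptions to adapt, and a real setup would share the list with the CI scan configuration:

```shell
#!/bin/sh
# Hedged sketch of a local pre-push gate. The identifier list is an
# invented example; keep it in sync with whatever the CI scan enforces.
PROTECTED='calc_risk_score|RiskEngine|client_ledger'

# Fail if any protected identifier appears in the outgoing diff.
if git diff origin/main...HEAD | grep -E "$PROTECTED"; then
  echo 'Blocked: protected identifiers found in outbound changes' >&2
  exit 1
fi
echo 'Clean: no protected identifiers in diff'
```

Running the check in a pre-push hook gives developers fast feedback instead of a failed PR check.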

Practical Recommendation

The right approach depends on your regulatory environment and risk tolerance:

- Low regulatory exposure, small team: .copilotignore plus developer education
- Enterprise with Microsoft contracts: Copilot for Business with telemetry disabled
- Regulated industry or proprietary algorithms: Pretense proxy plus CI enforcement

The last category is the only approach that gives you both protection and an audit trail. If your SOC2 auditor or your client asks to see evidence of AI tool controls, a proxy log is a demonstrable artifact. An employee training record is not.

[See Pretense in action](/demo) or [start a free trial](/trial).
