My take on Copilot’s risks, mapped against the OWASP Top 10 for LLM Applications — and where I actually land on it.
Tools like GitHub Copilot are reshaping how engineers write code — an LLM quietly auto-completing functions, suggesting tests, and answering questions in the editor. The productivity story is real. But every new dev tool deserves a security review, so here’s mine, written more for engineers and security leads than for marketing decks.
Why Copilot stands out
Code completions
Copilot’s core feature is context-aware code suggestions. They’re surprisingly good, and they expose engineers to patterns they’d otherwise have to go searching for.
Copilot Chat
An IDE-native conversational interface. Generate code, write docs, build unit tests, ask why a function does what it does — all in the editor, in plain English.
CLI assistance
Copilot extends suggestions and completions to the terminal, which is sometimes weirdly delightful and sometimes wrong in confident ways.
Security and privacy concerns
Sensitive information disclosure
Anything you type into a prompt is potentially leaving your machine. The well-publicized Samsung ChatGPT incident is the canonical example: secrets, internal references, proprietary code — all easy to leak by accident.
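One practical control is a pre-prompt filter that blocks text matching known secret formats before it ever leaves the machine. Here’s a minimal sketch — the patterns and the `contains_secret` helper are illustrative assumptions, not an exhaustive scanner; a real deployment would use a dedicated secret-scanning tool:

```python
import re

# Illustrative patterns only — a real filter would cover far more formats.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                          # AWS access key ID shape
    re.compile(r"ghp_[A-Za-z0-9]{36}"),                       # GitHub PAT shape
    re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),  # PEM private key header
]

def contains_secret(text: str) -> bool:
    """Return True if the text matches any known secret pattern."""
    return any(p.search(text) for p in SECRET_PATTERNS)

prompt = "Why does this fail? key = 'AKIA" + "ABCDEFGHIJKLMNOP'"
if contains_secret(prompt):
    print("blocked: prompt appears to contain a secret")
```

The same check works as a pre-commit hook or an editor-side guard; the point is that the gate runs locally, before any network call.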
Overreliance
Accepting LLM-generated code without scrutiny creates real, measurable risk: vulnerabilities, insecure patterns, and propagated misinformation. Strong reviews and validation are still non-negotiable.
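“Validation” can be automated, not just cultural. As a sketch of what a pre-merge gate might look for, here’s a naive static check over generated code — a real pipeline would run a proper SAST tool, and the `flag_risky_calls` helper below is a hypothetical stand-in:

```python
import ast

def flag_risky_calls(source: str) -> list[str]:
    """Naive static check: flag calls to eval/exec in generated code.
    A sketch only — real pipelines should use a dedicated SAST tool."""
    findings = []
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in {"eval", "exec"}:
                findings.append(f"line {node.lineno}: call to {node.func.id}")
    return findings

generated = "def run(cmd):\n    return eval(cmd)\n"
print(flag_risky_calls(generated))
```

Even a check this crude catches the pattern where a plausible-looking suggestion quietly introduces dynamic code execution.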
Supply chain & model theft
LLMs can become part of your supply chain — both as a new path for malicious code and as a target for IP theft. Treat LLM-suggested dependencies the same way you’d treat a random GitHub fork: cautiously.
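“Cautiously” can be made concrete with an allowlist gate: before installing an LLM-suggested dependency, check it against packages your organization has actually vetted. A minimal sketch — the allowlist contents and the `vet_dependency` helper are assumptions; a real gate would pull from an internal package registry:

```python
# Illustrative allowlist — a real one would come from a vetted internal registry.
APPROVED_PACKAGES = {"requests", "numpy", "pydantic"}

def vet_dependency(name: str) -> bool:
    """Only allow installs of packages that have been explicitly vetted."""
    return name.lower() in APPROVED_PACKAGES

for suggested in ["requests", "reqeusts"]:  # the second is a typosquat-style name
    status = "ok to install" if vet_dependency(suggested) else "needs review"
    print(f"{suggested}: {status}")
```

This also guards against hallucinated package names: a dependency an LLM invented will never appear on the allowlist, so it fails closed.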
The framework: OWASP Top 10 for LLM Applications
I scored Copilot against the OWASP Top 10 for LLM Applications. Quick verdicts here, with rationale and recommended controls in the long-form version.
- LLM01 — Prompt Injection: moderate risk, mostly mitigated by guardrails and prompt hygiene.
- LLM02 — Insecure Output Handling: handled the same way you’d handle any third-party output: review, SAST, validation.
- LLM03 — Training Data Poisoning: low direct risk to consumers, but worth tracking provider posture.
- LLM04 — Model DoS: not really a consumer-facing risk for Copilot.
- LLM05 — Supply Chain Vulnerabilities: meaningful — treat suggested dependencies as untrusted.
- LLM06 — Sensitive Information Disclosure: the headline risk for most enterprises.
- LLM07 — Insecure Plugin Design: applies more to extensions than to core Copilot usage.
- LLM08 — Excessive Agency: increasingly relevant as Copilot Chat takes on more of the dev environment.
- LLM09 — Overreliance: the cultural risk — fixed with reviews and education, not config.
- LLM10 — Model Theft: not really applicable to a hosted product like Copilot.
Where I land
Copilot is safe to use in most enterprise environments when it’s wrapped in the right controls: clear data-handling policies, strong code review, secret scanning, and ongoing developer education. The biggest risks aren’t strictly technical — they’re cultural (overreliance) and behavioral (sensitive prompts).
The right question isn’t “is Copilot safe?” — it’s “are we set up to benefit from it without being burned by it?”
