Nightshade
AI companies scrape public code to train their models — usually without asking the people who wrote it.
Nightshade gives developers a way to fight back. It subtly rewrites your source code so that it still compiles and behaves exactly as before, but becomes "poisoned", low-quality training material for any AI that scrapes it. Your code keeps working; the scraper gets noise.
Nightshade applies eight obfuscation strategies through a weighted entropy pipeline that works over a lexer and an AST. The strategies include misleading identifier renames, plausible dead-code injection, comment poisoning, string encoding, control-flow flattening, and a steganographic watermark that can prove ownership. Built for Java first, with Python, JavaScript, and TypeScript support, it ships as a CLI, a GitHub Action, and a pre-commit hook. The name is a homage to the Nightshade image-poisoning research project; this tool applies the same idea to source code. Co-created with Saif-ur-Rehman.
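To make one of the listed strategies concrete, here is a minimal sketch of string encoding in Java. It is an illustration only, not Nightshade's actual implementation: the class name and the helpers `encode`, `encodeLiteral`, and `decode` are hypothetical, and Base64 is an assumed encoding chosen for simplicity. The idea is that a literal in the source is replaced by an expression that rebuilds the same value at runtime, so behavior is unchanged while the readable text disappears from the file.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch of a string-encoding transform; not Nightshade's API.
class StringEncodingSketch {

    // Encodes a literal's value as Base64 (assumed encoding, chosen for simplicity).
    static String encode(String value) {
        return Base64.getEncoder()
                .encodeToString(value.getBytes(StandardCharsets.UTF_8));
    }

    // Builds the source text that would replace the literal in the rewritten file.
    // Base64 output contains no quotes or backslashes, so no escaping is needed.
    static String encodeLiteral(String value) {
        return "decode(\"" + encode(value) + "\")";
    }

    // Runtime counterpart that the rewritten code would call.
    static String decode(String b64) {
        return new String(Base64.getDecoder().decode(b64), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Connection refused";
        // The expression that would appear in place of the literal.
        System.out.println(encodeLiteral(original));
        // The round trip must reproduce the original value exactly.
        System.out.println(decode(encode(original)).equals(original));
    }
}
```

A real tool would apply this rewrite to string-literal nodes in the AST rather than to raw text, so that strings inside comments or annotations are handled deliberately.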
Version 3.5.0 is released with a JUnit 5 test suite and a hardened release pipeline: SLSA provenance, Sigstore signing, an SBOM, and CodeQL scanning.
This project taught me compiler-level engineering: lexing, AST manipulation, and what "the code must behave identically" actually demands from a verification step.
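As a rough illustration of what such a verification step can involve, the sketch below compares an ordinary loop with a control-flow-flattened version of the same computation and runs both on seeded random inputs. Everything here is assumed for the example: the class, the method names, the input range, and the use of random differential testing are not taken from Nightshade. Agreement on sampled inputs is evidence of equivalence, not a proof of it.

```java
import java.util.OptionalInt;
import java.util.Random;
import java.util.function.IntUnaryOperator;

// Hypothetical sketch of a behavioral-equivalence check; not Nightshade's verifier.
class EquivalenceCheckSketch {

    // Straightforward version: sum of 1..n (0 when n <= 0).
    static int sumTo(int n) {
        int total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // Same computation with the loop flattened into a state machine:
    // state 0 = test the condition, 1 = loop body, 2 = increment, 3 = exit.
    static int sumToFlattened(int n) {
        int total = 0, i = 1, state = 0;
        while (state != 3) {
            switch (state) {
                case 0: state = (i <= n) ? 1 : 3; break;
                case 1: total += i; state = 2; break;
                case 2: i++; state = 0; break;
                default: throw new IllegalStateException("unknown state " + state);
            }
        }
        return total;
    }

    // Returns the first sampled input on which a and b disagree, if any.
    static OptionalInt firstMismatch(IntUnaryOperator a, IntUnaryOperator b,
                                     int trials, long seed) {
        Random rng = new Random(seed); // seeded so failures are reproducible
        for (int t = 0; t < trials; t++) {
            int input = rng.nextInt(2001) - 1000; // inputs in [-1000, 1000]
            if (a.applyAsInt(input) != b.applyAsInt(input)) {
                return OptionalInt.of(input);
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        OptionalInt mismatch = firstMismatch(
                EquivalenceCheckSketch::sumTo,
                EquivalenceCheckSketch::sumToFlattened, 1000, 42L);
        System.out.println(mismatch.isPresent()
                ? "mismatch at input " + mismatch.getAsInt()
                : "no mismatch found");
    }
}
```

In practice a check like this would sit alongside the project's existing test suite, since code with side effects, exceptions, or concurrency needs more than output comparison.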