“This is in fact a security exploit proof-of-concept; untrusted user input is being treated as instruction. Sound familiar? That’s SQL injection in a nutshell.”
– Tech writer
Donald Papp, on how text-based AI interfaces like
GPT-3 are vulnerable to “prompt injection attacks”—just like SQL databases. Contextualizing experiments by
Simon Wilkinson and
Riley Goodside, Papp explains how hackers are duping natural language processing systems with sneaky prompts (e.g. GPT-3 made to
claim responsibility for the 1986 space shuttle
Challenger disaster).