T-1000: The Payload Is a Sentence

I had this idea stuck in my head for a while: how little code does it actually take to turn a single prompt into a working weapon? So I built it. It’s called T-1000, and it’s 16 lines of bash.

The script does one thing - a curl call to a model - and from there it cascades. The model writes a small Python agent from scratch (no SDK, just urllib) with three tools: write_file, run_python, http_get. That agent then follows instructions to write a server.py, launch it, and verify it’s up. What you’re left with is a backdoor file browser exposing your filesystem over HTTP - /ls and /cat, no auth, your privileges.

Dependencies: curl, jq, python3. That’s it.

Why I bothered

It’s a proof of concept, but it makes three points I think are worth sitting with.

The engineering moat is gone. A tool-use agent harness used to be a weekend project. Now the models are good enough that it fits in a few lines and costs about four seconds and half a cent. The hard part - building the harness - stopped being hard.

Static analysis has nothing to grab onto. There are no suspicious imports. No obfuscated strings. No weird syscalls. The dangerous code doesn’t exist on disk until it runs, and it’s different every run. Most of our security tooling is built to read code that sits still. This code doesn’t sit still - it gets written, executed, and forgotten in the same breath.

The real payload is the English. Swap one paragraph in the prompt and the same script becomes a keylogger, an exfiltrator, a port scanner, a cron-persister. The capability now lives in a sentence, not in the code. That’s the part that should make you uncomfortable.

What I’m not saying

This isn’t a tool. It’s an argument. The point isn’t “look, malware” - it’s that the assumptions underneath code review and static scanning quietly stopped holding. If the harmful artifact is a paragraph of natural language that generates throwaway code at runtime, then the defenses that matter move down the stack: sandboxing, egress controls, watching what a process does rather than what it contains.

I don’t have a tidy conclusion. I built it because the idea wouldn’t leave me alone, and putting it in 16 lines made the shift concrete in a way a blog post never could. The code is here if you want to read it. It won’t take long.