Agent security is just security

Suddenly I have been hearing the term Landlock more in (agent) security circles. To me this is a bit weird because while Landlock is absolutely a useful Linux security tool, it has been a bit obscure, and that’s for good reason. It feels to me a lot like how the weird prevalence of the word “delve” became a clear tipoff that an LLM was doing the writing, not a human.

Here’s my opinion: Agentic LLM AI security is just security.

We do not need to reinvent any fundamental technologies for this. Most uses of agents one hears about provide the ability to execute arbitrary code as a feature. It’s how OpenCode, Claude Code, Cursor, OpenClaw and many more work.

Let me especially emphasize this, since OpenClaw is popular for some reason right now: you should absolutely not give any LLM tool blanket read and write access to your full user account on your computer. There are many issues with that, but everyone using an LLM needs to understand just how dangerous prompt injection can be. This post is just one of many examples. Even global read access is dangerous, because an attacker could exfiltrate your browser cookies or other files.

Let’s go back to Landlock – one prominent place I’ve seen it mentioned is the project nono.sh, which pitches itself as a new sandbox for agents. It’s not the only one, but it does lean heavily on Landlock on Linux. Let’s dig into this blog post from the author. First of all, I’m glad they are working on agentic security. We both agree: unsandboxed OpenClaw (and other tools!) is a bad idea.

Here’s where we disagree:

With AI agents, the core issue is access without boundaries. We give agents our full filesystem permissions because that’s how Unix works. We give them network access because they need to call APIs. We give them access to our SSH keys, our cloud credentials, our shell history, our browser cookies – not because they need any of that, but because we haven’t built the tooling to say “you can have this, but not that.”

No. We have had usable tooling for “you can have this, but not that” for well over a decade. Docker kicked off a revolution for a reason: docker run <app> is “reasonably completely isolated” from the host system. Since then, of course, there have been many OCI runtime implementations, from podman to apple/container on macOS and more.

If you want to provide the app some credentials, you can just use bind mounts, like docker|podman|ctr -v ~/.config/somecred.json:/etc/cred.json:ro. Notice the :ro there, which makes it read-only. Yes, it’s that straightforward to have “this but not that”.
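
Putting that together, here is a minimal sketch of running an agent this way (the image name and credential path are placeholders, not any specific tool’s actual names):

```bash
# Run an agent CLI inside a container instead of directly on the host.
# Only the current project directory and a single read-only credential are
# shared in; everything else on the host stays invisible to the agent.
# (On SELinux hosts you may also want :Z on the volume flags.)
podman run --rm -it \
  --workdir /workspace \
  -v "$PWD":/workspace \
  -v ~/.config/somecred.json:/etc/cred.json:ro \
  my-agent-image
```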

Other tools like Flatpak on Linux have leveraged Linux kernel namespacing in a similar way to streamline running GUI apps isolated from the host. For a decade.
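
To make that concrete, Flatpak exposes the “this but not that” knobs directly from the command line (the application ID here is just an example):

```bash
# Show what an installed Flatpak app is allowed to access...
flatpak info --show-permissions org.mozilla.firefox
# ...and tighten it, e.g. revoke home directory access for just this app.
flatpak override --user --nofilesystem=home org.mozilla.firefox
```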

Far more sophisticated tooling has been built on top of these container runtimes since then, from transparently backing them with virtual machines to Kubernetes and similar projects, which are all about running containers at scale with a lot of built-up security knowledge.

That doesn’t need reinventing. It’s generic workload technology, and agentic AI is just another workload from the perspective of kernel/host level isolation. There absolutely are some new, novel risks and issues of course: but again the core principle here is we don’t need to reinvent anything from the kernel level up.

Security here really needs to start from defaulting to full isolation (from the host and other apps), and then only allow-listing in what is needed. That’s again how docker run worked from the start. Also on this topic, Flatpak portals are a cool technology for dynamic resource access on a single host system.
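
Going back to the container case, a sketch of that default-deny posture looks something like this (the image name is again a placeholder; you would selectively relax things like networking if the agent genuinely needs to call APIs):

```bash
# Default deny, then allowlist: no network, read-only root filesystem,
# all capabilities dropped, and only the project directory writable.
podman run --rm -it \
  --network none \
  --read-only \
  --cap-drop all \
  --tmpfs /tmp \
  --workdir /workspace \
  -v "$PWD":/workspace \
  my-agent-image
```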

So why do I think Landlock is obscure? Basically because most workloads should already be isolated per the above, and Landlock has heavy overlap with the wide variety of Linux kernel security mechanisms already in use in containers.

The primary pitch of Landlock is more for an application to further isolate itself – it’s at its best as a complement to coarse-grained isolation techniques like virtualization or containers. One way to think of it is that container runtimes often don’t grant the privileges needed for an application to spawn its own sub-containers (for kernel attack surface reasons), but Landlock is absolutely a reasonable thing for an app to use to e.g. disable networking in a sub-process that doesn’t need it.

Of course the challenge is that not every app is easy to run in a container or virtual machine. Some workloads are most convenient with that “ambient access” to all of your data (like an IDE or just a file browser).

But giving that ambient access by default to agentic AI is a terrible idea. So don’t do it: use (OCI) containers and allowlist in what you need.

(There are other things nono is doing here that I find dubious/duplicative; for example, I don’t see the need for a new filesystem snapshotting system when we have both git and OCI.)

But I’m not specifically trying to pick on nono – just in the last two weeks I had to point out similar problems in two different projects I saw go by that were also pitched for AI security. One used bubblewrap, but with insufficient sandboxing, and the other was also trying to use Landlock.

On the other hand, I do think the credential problem (which nono and others are trying to address in different ways) is somewhat specific to agentic AI, and likely does need new tooling. When deploying a typical containerized app, usually one just provisions a few relatively static credentials. In contrast, developer/user agentic AI is often a lot more freeform and dynamic, and while it’s hard to get most apps to leak credentials without completely compromising them, it’s much easier with agentic AI and prompt injection. I have thoughts on credentials, and absolutely more work here is needed.

It’s great that people want to work on FOSS security, and AI could certainly use more people thinking about security. But I don’t think we need “next generation” security here: we should build on top of the “previous generation”. I actually use plain separate Unix users for isolation for some things, which works quite well! Running OpenShell in a secondary user account where one only logs into a select few things (i.e. not your email and online banking) is much more reasonable, although clearly a lot of care is still needed. Landlock is a fine technology but is just not there as a replacement for other sandboxing techniques. So just use containers and virtual machines because these are proven technologies. And if you take one message away from this: absolutely don’t wire up an LLM via OpenShell or a similar tool to your complete digital life with no sandboxing.

LLMs and core software: human driven

It’s clear LLMs are one of the biggest changes in technology ever. The rate of progress is astounding: recently, due to a configuration mistake, I accidentally used Claude Sonnet 3.5 (released ~2 years ago) instead of Opus 4.6 for a task, looked at the output, and thought “what is this garbage?”

But daily now: Opus 4.6 is able to generate reasonable PoC level Rust code for complex tasks for me. It’s not perfect – it’s a combination of exhausting and exhilarating to find the 10% absolutely bonkers/broken code that still makes it past subagents.

So yes I use LLMs every day, but I will be clear: if I could push a button to “un-invent” them I absolutely would because I think the long term issues in larger society (not being able to trust any media, and many of the things from Dario’s recent blog etc.) will outweigh the benefits.

But since we can’t un-invent them: here’s my opinion on how they should be used. As a baseline, I agree with a lot from this doc from Oxide about LLMs. What I want to talk about is especially around some of the norms/tools that I see as important for LLM use, following principles similar to those.

On framing: there’s “core” software vs “bespoke”. An entirely new capability, of course, is for e.g. a nontechnical restaurant owner to use an LLM to generate (“vibe code”) a website (hopefully excepting online ordering and payments!). I’m not overly concerned about this.

Whereas “core” software is what organizations/businesses provide and maintain for others. I work for a company (Red Hat) that produces a lot of this. I am sure no one would really want to run an operating system, cluster filesystem, web browser, monitoring system, etc. that was primarily “vibe coded”.

And while I respect people and groups that are trying to entirely ban LLM use, I don’t think that’s viable for at least my space.

Hence the subject of this blog is my perspective on how LLMs should be used for “core” software: not vibe coding, but using LLMs responsibly and intelligently – and always under human control and review.

Agents should amplify and be controlled by humans

I think most of the industry would agree we can’t give responsibility to LLMs. That means they must be overseen by humans. If they’re overseen by a human, then I think they should be amplifying what that human thinks/does as a baseline – intersected with the constraints of the task of course.

On “amplification”: everyone using an LLM to generate content should inject their own system prompt (e.g. AGENTS.md) or equivalent. Here’s mine – notice I turn off all the emoji etc. and try hard to tune down bulleted lists, because that’s not my style. This is a truly baseline thing to do.
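
As a purely illustrative sketch of the kind of thing that goes in such a file (these rules are examples, not my actual contents):

```bash
# Drop a personal style baseline into a project's AGENTS.md
# (the rules below are illustrative examples only).
cat > AGENTS.md <<'EOF'
- No emoji in code, commit messages, comments, or replies.
- Prefer plain prose over bulleted lists in explanations.
- Match the existing style of the repository; do not reformat unrelated code.
EOF
```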

Now, most LLM-generated content targeted at core software is still going to need review, but ensuring that the baseline matches what the human would do themselves helps keep things aligned.

Pull request reviews

Let’s focus on a very classic problem: pull request reviews. Many projects have wired up a flow such that when a PR comes in, it gets reviewed by a model automatically. Many projects and tools pitch this. We use one on some of my projects.

But I want to get away from this because in my experience these reviews are a combination of:

  • Extremely insightful and correct things (there’s some amazing fine-tuning and tool use that must have happened to find some issues pointed out by some of these)
  • Annoying nitpicks that no one cares about (not handling spaces in a filename in a shell script used for tests)
  • Broken stuff like getting confused by things that happened after its training cutoff (e.g. Gemini especially seems to get confused by referencing the current date, and also is unaware of newer Rust features, etc)

In practice, we just want the first of course.

How I think it should work:

  • A pull request comes in
  • It gets auto-assigned to a human on the team for review
  • A human contributing to that project is running their own agents (wherever: could be local or in the cloud) using their own configuration (but of course still honoring the project’s default development setup and the project’s AGENTS.md etc)
  • A new containerized/sandboxed agent may be spawned automatically, or perhaps the human needs to click a button to do so – or perhaps the human sees the PR come in and thinks “this one needs a deeper review, didn’t we hit a perf issue with the database before?” and adds that to a prompt for the agent.
  • The agent prepares a draft review that only the human can see.
  • The human reviews/edits the draft PR review, and has the opportunity to remove confabulations, add their own content etc. And to send the agent back to look more closely at some code (i.e. this part can be a loop)
  • When the human is happy they click the “submit review” button.
  • Goal: it is 100% clear what parts are LLM generated vs human generated for the reader.

I wrote this agent skill to try to make this work well, and if you search you can see it in action in a few places, though I haven’t truly tried to scale this up.
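
As a very rough sketch of that draft-review loop, assuming the GitHub CLI plus some local agent command in non-interactive mode (claude -p here is just an example, and the PR number and prompt are placeholders):

```bash
# Pull the diff, have a local agent draft a review that only I can see,
# edit the draft (remove confabulations, add my own context), then submit
# it under my own account. PR number, prompt, and agent command are examples.
gh pr diff 1234 > /tmp/pr.diff
claude -p "Draft a code review of this diff; focus on correctness and perf." \
  < /tmp/pr.diff > /tmp/draft-review.md
# Human edit pass before anything leaves my machine.
$EDITOR /tmp/draft-review.md
gh pr review 1234 --comment --body-file /tmp/draft-review.md
```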

I think the above matches the vision of LLMs amplifying humans.

Code Generation

There’s no doubt that LLMs can be amazing code generators, and I use them every day for that. But for any “core” software I work on, I absolutely review all of the output – not just superficially, and changes to core algorithms very closely.

At least in my experience, the reality is that there’s still a percentage of the time when the agent decides to reimplement base64 encoding for no reason, or disables the tests claiming “the environment didn’t support it”, etc.

And to me it’s still a baseline requirement for “core” software that a second human review the change before merge (per above!), with their own customized LLM assisting them (ideally a different model, etc.).

FOSS vs closed

Of course, my position here is biased a bit by working on FOSS – I still very much believe in that, and working in a FOSS context can be quite different than working in a “closed environment” where a company/organization may reasonably want to (and be able to) apply uniform rules across a codebase.

While for sure LLMs allow organizations to create their own Linux kernel filesystems or bespoke Kubernetes forks or virtual machine runtimes or whatever – it’s not clear to me that doing so is a good idea for most of them. I think shared (FOSS) infrastructure that is productized by various companies, provided as a service, and maintained by human experts in that problem domain still makes sense. And how we develop that matters a lot.

Thoughts on agentic AI coding as of Oct 2025

Sandboxed, reviewed parallel agents make sense

For coding and software engineering, I’ve used and experimented with various frontends (FOSS and proprietary) to multiple foundation models (mostly proprietary) trying to keep up with the state of the art. I’ve come to strongly believe in a few things:

  • Agentic AI for coding needs strongly sandboxed, reproducible environments
  • It makes sense to run multiple agents at once
  • AI output definitely needs human review

Why human review is necessary

Prompt injection is a serious risk at scale

All AI is at risk of prompt injection to some degree, but it’s particularly dangerous with agentic coding. The state of the art today can at best mitigate it. I don’t think it’s a reason to avoid AI, but it’s one of the top reasons to use AI thoughtfully and carefully for products that have any level of criticality.

OpenAI’s Codex documentation has a simple and good example of this.

Disabling the tests and claiming success

Beyond that, I’ve experienced multiple times different models happily disabling the tests or adding a println!("TODO add testing here") and claiming success. At least this one is easier to mitigate with a second agent doing code review before it gets to human review.

Sandboxing

The “can I do X” prompting model that various interfaces default to is seriously flawed. Anthropic has a recent blog post on Claude Code changes in this area.

My take here is that sandboxing is only part of the problem; the other part is ensuring the agent has a reproducible environment, and especially one that can be run in IaaS environments. I think devcontainers are a good fit.
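
For example, with the reference devcontainers CLI, a sketch of running work inside the project’s declared environment rather than on the host (this assumes the repository already carries a .devcontainer/ configuration):

```bash
# Build and start the project's devcontainer, then run work inside it
# rather than on the host. Requires a .devcontainer/ config in the repo.
devcontainer up --workspace-folder .
devcontainer exec --workspace-folder . cargo test
```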

I don’t agree with this statement from Anthropic’s blog:

without the overhead of spinning up and managing a container.

I don’t think this is overhead for most projects, and where it does feel like overhead, we should be working to mitigate it.

Running code as separate login users

In fact, one thing I think we should popularize more on Linux is the concept of running multiple unprivileged login users. Personally, the tasks I work on often involve building containers or launching local VMs, and isolating that works really well under a fully separate “user” identity. An experiment I did was basically useradd ai and running delegated tasks there instead. To log in, I added %wheel ALL=NOPASSWD: /usr/bin/machinectl shell ai@ to /etc/sudoers.d/ai-login so that my regular human user could easily get a shell in the ai user’s context.
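
Spelled out as commands, that experiment is roughly the following (a sketch of what the paragraph above describes, with -m added to create a home directory):

```bash
# One-time setup: a dedicated unprivileged user for delegated agent tasks.
sudo useradd -m ai
# Let members of wheel jump into that user's context without a password prompt.
echo '%wheel ALL=NOPASSWD: /usr/bin/machinectl shell ai@' \
  | sudo tee /etc/sudoers.d/ai-login

# From my regular user, get a login shell as the "ai" user.
sudo machinectl shell ai@
```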

I haven’t truly “operationalized” this one as juggling separate git repository clones was a bit painful, but I think I could automate it more. I’m interested in hearing from folks who are doing something similar.

Parallel, IaaS-ready agents…with review

I’m today often running 2-3 agents in parallel on different tasks (with different levels of success, but that’s its own story).

It makes total sense to support delegating some of these agents off my local system and into cloud infrastructure.

In looking around in this space, there’s quite a lot of stuff. One of them is Ona (formerly Gitpod). I gave it a quick try and I like where they’re going, but more on this below.

Github Copilot can also do something similar to this, but what I don’t like about it is that it pushes a model where all of one’s interaction is in the PR. That’s going to be seriously noisy for some repositories, and interaction with LLMs can feel too “personal” sometimes to have permanently recorded.

Credentials should be on demand and fine grained for tasks

To me, a huge flaw with Ona – and one shared with other things like Langchain Open-SWE – is that they ask for broad authorization to act as me (my own forge identity) in order to do their work:

Sorry but: no way I’m clicking OK on that button. I need a strong and clearly delineated barrier between tooling/AI agents acting “as me” and my ability to approve and push code or even do basic things like edit existing pull requests.

Github’s Copilot gets this more right because its bot runs as a distinct identity. I haven’t dug into what it’s authorized to do. I may play with it more, but I also want to use agents outside of Github, and I’m not a fan of deepening dependence on a single proprietary forge either.

So I think a key thing agent frontends should help with here is granting fine-grained, ephemeral credentials for dedicated write access while an agent is working on a task. This “credential handling” should be a clearly distinct component. (This goes beyond git forges, of course; it also applies to issue trackers or other data sources that may be in context.)

Conclusion

There’s so much out there on this, I can barely keep track while trying to do my real job. I’m sure I’m not alone – but I’m interested in others’ thoughts on this!