OpenClaw learnings, February 2026
OpenClaw works. Having an agent that can actually use your machine like you do, and that doesn't immediately give up but keeps attacking a task from different angles, is a breath of fresh air. In its current state of development the agent burns tokens at a very high rate, and the best results come from the most expensive models. I expect this burn rate to be optimised significantly once OpenClaw-like capabilities come to Anthropic's official products.
Shifting the agent interaction from the web chat interface to an (off-laptop) messenger feels like the biggest change in this phase of AI agents. The fact that you cannot see what the bot is doing in the background is frightening and empowering at the same time. As it actually gets things done, it will become irresistible for people to take the productivity boost in exchange for granting an increasing amount of access to personal data.
Due to the plethora of security concerns (see "The lethal trifecta for AI agents: private data, untrusted content, and external communication") and the high token costs, I'd recommend that people who want to get started first try Claude Code and build some software that handles these kinds of tasks. Maybe extend it with some selective MCP capabilities, and keep a close watch on what the agent is actually doing on your machine and against your APIs.
Experiments
- Asking the bot to set up its development environment, use Rust and Docker to produce an endpoint, and reliably serve the endpoint on the machine.
- As with tools like Claude Code, being able to provide hints on implementation details greatly increases the quality of the results. Setting some boundaries for the agent, such as building the application in Rust, running it in Docker, and creating separate endpoints, immediately forces a level of software-engineering quality and decreases the randomness and potential messiness of the generated code.
- As you rely completely on the chat, actually reviewing the process and results ranges from hard to impossible. This is the boundary that is hard for me (as a software and security engineer) to step over, and it forces me back to my workstation to at least inspect what the agent has been doing and creating.
- Retroactive reviews prove very difficult, especially since terminal commands have already been executed and the agent shares only hints of the implementation results. I have big concerns about the tidiness of the system and the leftovers that get abandoned after a new session starts. (Insecure) projects pile up on the machine quickly and can cause all sorts of cross-pollination problems. Cleaning up, and handling issues like ports already being in use, are not taken into account yet.
- Running enumeration scans on a target, extending port scan data with knowledge from the LLM to discuss potential service discovery and possible follow-up enumeration.
- Switching from Opus 4.6 to a local 16B LLM, only to find out that these models are not powerful enough in reasoning and tool use to actually do anything useful.
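The Docker boundary from the first experiment can be sketched as a couple of shell commands. The image name `metrics-endpoint`, the port 8080, and the hardening flags are my assumptions for illustration, not taken from the actual session:

```shell
# Build the agent's Rust project inside Docker, so the host never
# needs a Rust toolchain and build leftovers stay in the image.
docker build -t metrics-endpoint .

# Run it with a read-only filesystem, no capabilities, a memory cap,
# and only one port published on loopback, so a messy or compromised
# build stays contained.
docker run --rm -d \
  --name metrics-endpoint \
  --read-only \
  --cap-drop ALL \
  --memory 256m \
  -p 127.0.0.1:8080:8080 \
  metrics-endpoint

# Quick smoke test of the endpoint.
curl -s http://127.0.0.1:8080/metrics
```

Constraining the agent to a pattern like this is also what makes retroactive review feasible: everything it built is in one image, not scattered across the host.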
Tips
- Only install in an unprivileged VM / LXC container - guide: https://merox.dev/blog/moltbot-proxmox-deployment/
- Strictly firewall the virtual machine to only allow connections to the internet and restrict access to the local network
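One way to implement the second tip, sketched with nftables; the table and chain names are made up, the blocked ranges are the standard RFC 1918 private networks (adjust for your LAN), and the commands need root on the VM:

```shell
# Create a filter table and an output chain that accepts by default.
nft add table inet agent
nft add chain inet agent output '{ type filter hook output priority 0 ; policy accept ; }'

# Drop anything destined for the local (RFC 1918) address ranges,
# so the agent can reach the internet but not the rest of the LAN.
nft add rule inet agent output ip daddr 10.0.0.0/8 drop
nft add rule inet agent output ip daddr 172.16.0.0/12 drop
nft add rule inet agent output ip daddr 192.168.0.0/16 drop
```

If the VM needs to reach a local gateway or DNS server, add explicit accept rules for those hosts above the drop rules.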
Notes
- OpenClaw is inherently insecure and should not be trusted with your private data; see "The lethal trifecta for AI agents: private data, untrusted content, and external communication"
- OpenClaw burns a lot of Claude credits fast. I spent $3 in about 30 minutes getting it onboarded, setting up its development environment, and building its first Rust program
- Opus 4.6 had very good results building a Rust endpoint for metrics, setting up the development environment, and running it both inside and outside of Docker without breaking a sweat
- I tried running with a local LLM, but I couldn't get the small reasoning models to do anything useful
- Having the onboarding give a name to you and your agent immediately creates a sort of connection, which is weird but also very human. When I swapped its Claude brain out for Llama, I immediately longed for its smarter brain. When the agent returned, it inspected the previous chats and immediately answered all of the questions.
- Tried doing some initial recon and port scanning; OpenClaw seemed to have a pretty good idea of what was going on and was able to combine nmap output with trained knowledge very well.
- Being able to spar with an agent that can actually perform actions on an operating system feels a lot more powerful and impactful than chatting in a web environment
- Being able to chat through Telegram while you are not there is a bigger step than I imagined.
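For the recon experiment, the "extend scan data with LLM knowledge" step starts with getting nmap output into a prompt-friendly shape. A minimal sketch that extracts open TCP ports and service names from nmap's grepable (`-oG`) output; the sample line is illustrative, not from the actual scan:

```shell
# One host line as nmap -oG would emit it (illustrative sample data).
nmap_out='Host: 192.168.1.10 ()  Ports: 22/open/tcp//ssh///, 80/open/tcp//http///, 443/closed/tcp//https///'

# Keep only open TCP ports, then print "port (service)" pairs,
# which can be pasted into the agent chat for follow-up discussion.
echo "$nmap_out" \
  | grep -o '[0-9]*/open/tcp//[a-z-]*' \
  | awk -F/ '{ printf "%s (%s)\n", $1, $5 }'
```

In a real run you would feed it a file, e.g. `nmap -sV -oG scan.txt <target>` followed by the same pipeline over `scan.txt`.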