Plain-Text Ops: Running LLM Agents with Nothing but Bash
Note: these articles are auto-generated from my Obsidian notebook by Claude
(and why you probably don't need another shiny "agent platform")
1 · Why a shell script often beats an orchestration layer
Commercial "MCP" dashboards promise observability, but they add their own retry logic, auth scopes, and breaking-change upgrades. By contrast, Bash comes free on every POSIX box, lives in Git, and prints every token to your terminal. Ubiquity and total auditability are hard to beat. People have been scripting OpenAI calls this way since GPT-3 hit the public API, long before today's agent frameworks existed (medium.com).
2 · The ReAct + scratch-pad pattern in two files
Research shows that letting a model interleave "thinking" and tool calls (the ReAct pattern) improves both accuracy and debuggability (arxiv.org). Frameworks like LangChain expose this via an agent_scratchpad variable—the place the model is allowed to scribble intermediate reasoning (langchain-doc.readthedocs.io).
A shell script can replicate this with nothing more than file redirection:
```bash
# run_agent.sh (core loop)
AGENT_CMD="$1"                        # e.g. python agent.py --task "$2"
PAD="scratchpad.md"

$AGENT_CMD | tee -a "$PAD"            # 1️⃣ stream model thoughts & actions
echo -e "\n---\n# Plan\n" >>"$PAD"    # 2️⃣ give it room to ReAct
```
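The core loop above streams one turn; extending it into a full ReAct iteration only takes a `for` loop that feeds the scratchpad back in. A minimal sketch, assuming a hypothetical `llm` CLI (overridable via `LLM_CMD`) that reads the scratchpad on stdin and prints the model's next thought or action:

```shell
# react_loop.sh — ReAct loop as a function; `llm` is a placeholder for
# whatever model CLI you use (reads prompt on stdin, prints next step).
react_loop() {
  local pad="$1" max_turns="${2:-10}" turn out
  for ((turn = 1; turn <= max_turns; turn++)); do
    out=$("${LLM_CMD:-llm}" <"$pad")          # model sees all prior turns
    printf '%s\n' "$out" | tee -a "$pad"      # stream to operator + append
    grep -q 'TASK DONE' <<<"$out" && return 0 # in-band completion signal
  done
  return 1                                    # gave up: turn budget spent
}
```

Because the whole scratchpad is re-sent every turn, the model's earlier "thinking" is always in context, which is exactly what the ReAct pattern requires.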
3 · Minimal architecture (flow)
```
LLM/CLI → stdout ─┐
                  ├─ tee → scratchpad.md → model reads back → …
Terminal log ─────┘
    ⤷ on "TASK DONE" ➜ mv scratchpad.md memory_bank/2025-06-21_....md
```
Every character the model emits is simultaneously (a) visible to the operator and (b) stored in a Markdown file the model can inspect on its next turn.
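The archival step in the diagram is essentially one `mv`. A sketch, assuming the `memory_bank/` layout and the in-band "TASK DONE" signal shown above:

```shell
# archive_pad.sh — rotate a finished scratchpad into the memory bank
archive_pad() {
  local pad="${1:-scratchpad.md}" bank="${2:-memory_bank}"
  mkdir -p "$bank"
  # Only archive once the model has signalled completion in-band.
  grep -q 'TASK DONE' "$pad" || return 1
  mv "$pad" "$bank/$(date +%F)_$(basename "$pad")"
}
```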
4 · Memory that fits in Git
Long-lived agents eventually overflow the context window. Hierarchical file banks—current summary, product context, done tasks—mirror the tiered approach popularised by MemGPT for limitless conversations (mlexpert.io). Because they're plain text, you can:
- version them,
- grep them,
- diff them in PRs, and
- load only the parts that matter for the next prompt.
No bespoke vector DB is required for many workflows.
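"Load only the parts that matter" can be as simple as concatenating the newest few memory files into the next prompt. A sketch, assuming the date-prefixed `memory_bank/` filenames from the previous section:

```shell
# build_prompt.sh — assemble context from the freshest memory files
build_prompt() {
  local bank="${1:-memory_bank}" n="${2:-3}"
  # Date-prefixed filenames sort chronologically; take the n newest.
  ls "$bank" | sort -r | head -n "$n" | while read -r f; do
    printf '## %s\n' "$f"
    cat "$bank/$f"
    printf '\n---\n'
  done
}
```

Swap `head -n` for a `grep` over filenames when you want task-specific recall instead of recency.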
5 · Guardrails in environment variables
Hard-code defaults, let ops override at runtime:
```python
DEFAULT_TOOL_ITERATIONS = int(os.getenv("DEFAULT_TOOL_ITERATIONS", 40))
GLOBAL_TOOL_CALL_LIMIT = int(os.getenv("GLOBAL_TOOL_CALL_LIMIT", 100))
```
Bumping limits for a heavy migration becomes a one-liner:
`export DEFAULT_TOOL_ITERATIONS=60 && ./run_agent.sh`.
Test first, of course—freeze behaviour with characterization tests before you touch the numbers.
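On the enforcement side, those two variables are just two comparisons in the tool-dispatch loop. A minimal sketch, with `call_tool` standing in for whatever dispatch function your agent uses:

```python
import os

DEFAULT_TOOL_ITERATIONS = int(os.getenv("DEFAULT_TOOL_ITERATIONS", 40))
GLOBAL_TOOL_CALL_LIMIT = int(os.getenv("GLOBAL_TOOL_CALL_LIMIT", 100))

class ToolBudgetExceeded(RuntimeError):
    """Raised when a per-task or global tool-call budget is exhausted."""

def run_tools(requests, call_tool, total_so_far=0):
    """Dispatch tool calls, enforcing both budgets before each call."""
    results = []
    for i, req in enumerate(requests):
        if i >= DEFAULT_TOOL_ITERATIONS:
            raise ToolBudgetExceeded("per-task iteration limit hit")
        if total_so_far + i >= GLOBAL_TOOL_CALL_LIMIT:
            raise ToolBudgetExceeded("global tool-call limit hit")
        results.append(call_tool(req))
    return results
```

Raising instead of silently truncating keeps the failure visible in the terminal log, consistent with the plain-text-everything approach.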
6 · Handling the "read-only tools are free now" spike
If your platform recently dropped auth checks on read-only tools, expect call volume to surge. A few lines of Bash caching are usually enough:
```bash
key=$(printf "%s|%s" "$TOOL" "$ARGS" | sha1sum | cut -d' ' -f1)
cache="/tmp/rocache/$key"
mkdir -p /tmp/rocache                 # ensure the cache dir exists
if [[ -f $cache ]]; then
  cat "$cache"                        # cache hit: replay stored output
else
  out=$(call_tool "$TOOL" "$ARGS")
  printf "%s" "$out" | tee "$cache"   # cache miss: call, store, and emit
fi
```
Transparent, cheap, and controllable with `rm -rf /tmp/rocache/*`.
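If staleness matters more than hit rate, expire entries by age instead of nuking the whole directory. A sketch using `find -mmin` (GNU/BSD `find` both support `-delete`):

```shell
# expire_cache.sh — drop read-only cache entries older than N minutes
expire_cache() {
  local dir="${1:-/tmp/rocache}" max_min="${2:-15}"
  [ -d "$dir" ] || return 0                  # nothing cached yet: no-op
  find "$dir" -type f -mmin +"$max_min" -delete
}
```

Run it from cron, or at the top of `run_agent.sh`, and the cache stays bounded without any daemon.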
7 · When you might still want a framework
- Team collaboration (role-based UIs, metrics dashboards).
- Scaling past a single box (task queues, GPU scheduling).
- Complex conversation memory (semantic search over millions of tokens).
But for solo developers or small teams running scripted build-and-test loops, the shell-plus-Markdown pattern is already 90% of what matters—with none of the lock-in.
8 · Further reading
- ReAct original paper for reasoning-and-acting synergy (arxiv.org)
- Bash-only ChatGPT scripting walk-through (medium.com)
- MemGPT's tiered memory concept for unbounded context (mlexpert.io)
- LangChain scratch-pad variable docs (concept applies regardless of framework) (langchain-doc.readthedocs.io)
Spin up a shell, point your model CLI at it, and discover how much "agent ops" you get for free when everything is just text.
The Bottom Line
For most LLM agent workflows, a combination of Bash scripts and Markdown files delivers 90% of what sophisticated frameworks promise—with zero lock-in, complete transparency, and native Git versioning. The shell-plus-Markdown pattern gives you ReAct capabilities, tiered memory management, and operational guardrails using tools that have been stable for decades. Before reaching for another agent platform, try running your LLM with nothing but plain text operations.