I’m staying open-minded to the new AI stuff, so I gave Anthropic’s new Agent Skills a shot this morning. Like with MCP, I found it pretty hard to get through the “benefits-oriented” prose of the blog post to just figure out what the heck it is.
What a skill is
A basic skill is a Markdown file with a YAML header that gives the LLM instructions for how to use some command-line tools. It’s not much different from a custom prompt, and the file format is the same as a blog post’s: a YAML frontmatter block followed by Markdown.
The point of the YAML header is that LLMs still handle context poorly: if you add a ton of extra guidance along with each request, they’ll be expensive and inaccurate. With the YAML header, the LLM can browse the available skills by name and description and load only the one it needs. In other words: part of this is a workaround for an inherent, or at least as-yet-unfixed, problem.
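To make that mechanism concrete, here’s a minimal sketch (my own illustration, not Anthropic’s implementation) of how an agent could index skills cheaply: parse only the YAML frontmatter of each SKILL.md and hand the model just the names and descriptions, deferring the full body until a skill is actually chosen.

```python
import os
import re

def parse_frontmatter(text):
    """Extract the YAML header between the leading '---' fences.

    A deliberately tiny parser that handles flat `key: value` pairs only.
    """
    match = re.match(r"---\n(.*?)\n---\n", text, re.DOTALL)
    if not match:
        return {}
    meta = {}
    for line in match.group(1).splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta

def index_skills(root=".claude/skills"):
    """Return (name, description) pairs -- the only text the model
    would see until it decides to load a skill's full body."""
    index = []
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry, "SKILL.md")
        if os.path.isfile(path):
            with open(path) as f:
                meta = parse_frontmatter(f.read())
            if "name" in meta and "description" in meta:
                index.append((meta["name"], meta["description"]))
    return index
```

The context saving falls out of the two-phase lookup: a directory of twenty skills costs the model twenty one-line descriptions, not twenty full instruction files.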
Skills can also include code and run it. When you’re using Claude Code locally, that code runs on your local machine: if you tell it to use grep, it’ll use your system grep. If it’s remote, it runs in the ‘Agent virtual machine’.
My first-run experience
I asked Claude to create a skill that uses ast-grep to find function calls in a codebase, because Claude’s own searching can be lackluster and ast-grep is awesome.
Results were mixed: it created this skill, but messed up two critical elements:
- It put the skill in `.claude/skills/ast-grep.md`, when it should have been in `.claude/skills/ast-grep/SKILL.md`
- It didn’t add the YAML header with the name and description of the skill
So, the skill wasn’t functional. I debugged these issues and got it working, but still: it could have been better.
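For reference, here’s roughly what the corrected `.claude/skills/ast-grep/SKILL.md` could look like. The exact wording of the instructions is mine, and the sample pattern is illustrative; the `name`/`description` frontmatter and the directory layout are the parts that matter:

```markdown
---
name: ast-grep
description: Search the codebase for function calls structurally with ast-grep instead of plain-text grep.
---

# ast-grep

Prefer `ast-grep` over text search when looking for call sites.
It matches on the AST, so it ignores comments, strings, and formatting.
For example, to find every call to `fetch` in a TypeScript tree:

    ast-grep run --pattern 'fetch($$$ARGS)' --lang ts src/
```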
Open questions
Skills seem useful.
My first-run experience was oddly bad: it’s surprising that Anthropic doesn’t have some really strong prompting for creating skills that tells Claude Code where the skill file should go.
The approach to dependencies in Skills is very haphazard. When skills run in the code execution tool (when you use them from the API or online), they have a minimal environment with Python, a few packages, and no ability to install extra dependencies. When they run on your local machine, they can go crazy and just use any binary in your path willy-nilly. It seems really hard for a “skill” to be a reusable unit if it has to target these two very different environments.
Dependencies and sandboxing are hard, of course - a pedant would want Skills to come with Nix flakes or Docker containers or whatever, but that would dramatically limit their audience.
I fear the security implications of skills. The skills you install via Claude Code have very broad access to your system. The Security Considerations that Anthropic lists are wildly optimistic: are people who use Skills really going to audit all the code in them? Will they resist the urge to install skills from arbitrary sources on the internet and the odd GitHub repo? It evokes the general feeling of LLM coding, in which automation makes you comfortable 95% of the time, and then 5% of the time you need your full alertness to make sure that the skill you’re copying from GitHub doesn’t have a crypto wallet drainer. Seems tricky.
Simon Willison’s writeup is very good and I recommend it - he’s much more immersed in this stuff than I am.