Make Qwen Code Actually Use Your Skills

Jun 3, 2026

Andrey Pozdnyakov and Justin Noel

In part 1 of our series Make AI Code Your Way we walked through how project-scoped SKILL.md files solved most of our "this code does not look like ours" problem on Claude Code. Two skills, a couple of TRIGGER clauses, and a one-line table at the bottom of CLAUDE.md were enough to make Claude reliably reach for the right skill mid-plan.

In this installment, we take the same skills, conventions and expectations to a different agent: Qwen Code CLI running against Qwen3-Coder-80B-A3B hosted on our own inference hardware.

The skills format is a de facto standard, and identical skills should work across coding agents. However, the behavior was not identical in Qwen Code CLI. It took a little extra effort to get Qwen3-Coder-80B-A3B to call our skills spontaneously.

Running Qwen Locally

We use Anthropic's frontier models Sonnet and Opus quite a bit, but for some projects we require locally hosted models. Here’s why Qwen3-Coder-80B-A3B is the local model we reach for the most:

Privacy

Some work can’t leave our network. A local model wins by default for any code we can’t send to a third party.

Cost

Frontier models earn their price when the problem is hard. But for the 60% of coding work that’s routine, a local model running on hardware we already own is much cheaper.

Qwen3-Coder-80B-A3B is lightning fast and relatively accurate on our RTX 6000 GPUs. But fast and accurate in the abstract is not the same as fast and accurate on a Qt/QML codebase that uses an aggressively dependency-injected, SOLID-shaped architecture. (Read more about that in Part 1.)

Two things had to be true for Qwen to be productive on our project. First, the skills had to be written for it. Second, the skills had to actually get invoked.

Smaller Models Need More Hand-Holding

While Claude Sonnet or Opus could often infer our conventions from one or two existing classes, Qwen couldn’t. We had to spell out the conventions. That’s why our Qwen-facing skills are noticeably longer than the Claude-facing originals. There are more worked examples, more explicit "here is the wrong way, here is the right way" tables, and more discussion of edge cases.

It is said that free isn’t free if your time is worth money. We had to invest time in improving and verifying our skills and managing the model’s context. In the end, that time is an investment, because these skills and techniques can be reused on all Qt projects. These patterns helped us a lot:

Toolkit documentation via internal RAG

We run an internal RAG server that indexes Qt and QML documentation. Qwen calls into it through MCP for every Qt API question instead of answering from memory. Without it, Qwen hallucinates Qt signatures and enum values at an intolerable rate. But with it, accuracy on Qt APIs is vastly better.

Verbose, opinionated skills

Anywhere Claude would "follow the existing pattern," the Qwen-facing version needed specific examples. The skill body grew, while the variance in the generated code shrank. It turns out the improved skill is also backwards compatible with Claude, so it is a win all around.

What’s the lesson here? A local model is not a drop-in replacement for a frontier model. It’s a different tool with different failure modes. We discovered that skills are a powerful way to close the gap, but only if we’re willing to invest in them more heavily than we do for the frontier model.

Qwen Rarely Invoked Skills On Its Own

What nearly sank our Qwen rollout was not code quality — RAG and careful skill content fixed many of the issues with Qt/QML code generation we saw. Instead, it was the fact that Qwen almost never loaded those skills on its own.

With the same project, the same skills and the same CLAUDE.md-style quick-reference (a QWEN.md), Qwen Code CLI was visibly worse than Claude Code at deciding "this prompt matches a skill, I should load it."

Explicit /skill invocation worked. Direct prompts sometimes worked. Plan execution almost never worked. A plan step that said "add a new QML component and test" would just start writing code without the skill loaded into context.

This may improve in a few versions since Qwen Code ships fast. But we needed something that worked now.

A Brief Word on Hooks

Qwen Code recently added a hooks system very similar to Claude Code's. A hook is a helper (in our case, a shell command) the agent runs at a specific session lifecycle point: before a prompt is submitted, after a tool call, on session start, and so on. The hook can place the contents of that shell command's stdout into context, and the agent folds that output into its next turn. It is a clean escape hatch for "I want to inject some text right here, every single time."

UserPromptSubmit + skill_reminder.json

We hooked into UserPromptSubmit. On every prompt, we injected the contents of a short skill_reminder.json file that tells Qwen which skills exist and when to use each one. It’s essentially the same idea as the QWEN.md table, but instead of relying on QWEN.md to stay salient through a long session, we force the reminder into the top of every user turn.

The hook configuration

In ~/.qwen/settings.json (or the project-level equivalent):

{
  "hooks": {
    "UserPromptSubmit": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "cat ./.qwen/skill_reminder.json"
          }
        ]
      }
    ]
  }
}
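
If a settings file already exists, the hook entry has to be merged into it rather than pasted over it. The following is a minimal Python sketch of that merge, assuming the settings layout shown above. The helper name add_prompt_hook and the someOtherSetting key are ours, for illustration only; neither is part of Qwen Code.

```python
import json


def add_prompt_hook(settings: dict, command: str) -> dict:
    """Return a copy of settings with a UserPromptSubmit command hook added.

    Assumes the layout shown above. Existing keys and hooks are preserved,
    and the same command is never registered twice.
    """
    merged = json.loads(json.dumps(settings))  # simple deep copy of JSON data
    groups = merged.setdefault("hooks", {}).setdefault("UserPromptSubmit", [])
    for group in groups:
        for hook in group.get("hooks", []):
            if hook.get("type") == "command" and hook.get("command") == command:
                return merged  # already registered, nothing to do
    groups.append({"hooks": [{"type": "command", "command": command}]})
    return merged


if __name__ == "__main__":
    # Hypothetical stand-in for whatever is already in the settings file.
    existing = {"someOtherSetting": True}
    updated = add_prompt_hook(existing, "cat ./.qwen/skill_reminder.json")
    print(json.dumps(updated, indent=2))
```

Running the merge twice with the same command leaves a single hook entry, so the script is safe to rerun when settings are regenerated.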

The command should write the reminder text to stdout in a specific JSON format:

{
  "continue": true,
  "hookSpecificOutput": {
    "additionalContext": "Reminder text"
  }
}
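
Because additionalContext is a JSON string, any line breaks in the reminder have to be escaped as \n, which is easy to get wrong in a hand-edited file. One way to sidestep that is to generate the output from plain text. This is a minimal sketch assuming the output shape shown above; build_hook_output is a hypothetical helper name, not a Qwen Code API.

```python
import json


def build_hook_output(reminder_text: str) -> str:
    """Wrap plain reminder text in the hook output shape shown above."""
    payload = {
        "continue": True,
        "hookSpecificOutput": {"additionalContext": reminder_text},
    }
    # json.dumps escapes newlines and quotes, so the result is valid JSON.
    return json.dumps(payload, indent=2)


if __name__ == "__main__":
    text = "Project Skill Reminder.\nInvoke the matching skill BEFORE writing code."
    print(build_hook_output(text))
```

Redirecting the script's output to ./.qwen/skill_reminder.json would produce a file that the cat command in the hook configuration can emit unchanged.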

The skill_reminder.json file

{
  "continue": true,
  "hookSpecificOutput": {
    "additionalContext": "Project Skill Reminder. This project ships Qt/QML skills that encode our architecture and test conventions. Invoke the matching skill BEFORE writing code.\n\nSkill: creating-qt-classes. Invoke when: adding a new C++ class, interface (I<Name>.h), mock (Mock<Name>.h), service, or exposing a C++ type to QML. Do not invoke for modifications to existing classes or build-only changes.\n\nSkill: writing-qml-and-tests. Invoke when: adding a new QML component, tst_*.qml test, QML mock of a C++ singleton, or objectName conventions for test discoverability.\nRules: For any Qt/QML/C++ work, check the skill list above first. If a skill matches, invoke it before writing code or running tool calls. For Qt API questions, use the MCP Qt docs tool; do not answer from memory. If documentation does not confirm an API exists, say so; do not guess."
  }
}

Two points are worth highlighting. First, this is the same content as the QWEN.md table. We are not telling Qwen anything new. Second, the file is intentionally short. A long reminder injected into every turn would burn tokens and crowd out user content. The whole point of the hook is a reminder that stays short and shows up reliably.
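
Since the file is injected on every turn, it may be worth checking it before committing changes. The sketch below is one possible check, assuming the output shape used in this post: it confirms the file parses as JSON, has the expected fields, and stays under a size budget. The check_reminder name and the 1500-character default are arbitrary choices for this example, not Qwen Code limits.

```python
import json


def check_reminder(raw: str, max_chars: int = 1500) -> list[str]:
    """Return a list of problems found in a hook output file; empty means OK.

    max_chars is an arbitrary budget for this sketch, not a Qwen Code limit.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as err:
        # Raw line breaks inside a JSON string land here.
        return [f"not valid JSON: {err.msg} (line {err.lineno})"]
    if not isinstance(data, dict):
        return ["top level must be a JSON object"]

    problems = []
    if data.get("continue") is not True:
        problems.append('"continue" should be true')
    hook_output = data.get("hookSpecificOutput")
    context = hook_output.get("additionalContext") if isinstance(hook_output, dict) else None
    if not isinstance(context, str) or not context.strip():
        problems.append("hookSpecificOutput.additionalContext must be a non-empty string")
    elif len(context) > max_chars:
        problems.append(f"reminder is {len(context)} characters; budget is {max_chars}")
    return problems


if __name__ == "__main__":
    with open("./.qwen/skill_reminder.json", encoding="utf-8") as handle:
        for problem in check_reminder(handle.read()):
            print(problem)
```

A character count is only a rough proxy for tokens, but it is enough to catch a reminder that has quietly grown into a second agent file.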

Here’s What Changed with the Hook Enabled

Qwen immediately began invoking skills on its own for direct prompts and — crucially — during plan execution. The table was not new information. It was information that was now right next to the prompt instead of several thousand tokens up the context window. Two key differences we noticed:

Plan steps that previously generated code from memory now started with the matching skill load, then wrote code that matched our architecture.

Direct prompts hit the right skill on the first turn instead of needing a follow-up reminder. The number of times we had to Ctrl-C and remind Qwen "use the skill" dropped to near zero on the projects where we deployed the hook.

This Applies Far Beyond Qwen

The lesson here is not Qwen-specific. It is about how reliably an agent will follow instructions that live somewhere other than the active turn. Frontier models will tolerate a lot of distance between an instruction and the moment it matters. Smaller, faster, cheaper models tolerate a lot less. Hooks let you trade a tiny bit of context window for guaranteed proximity, and for important workflow rules, we’ve discovered that trade is almost always worth it.

If you’re running any small or self-hosted model in an agent harness, you should look at every "the model should remember to do X" rule and question whether it would be more reliable as a UserPromptSubmit injection than as a paragraph in an agent file.

In our experience, the answer is usually yes.

Building a Reliable Second Tier

Qwen3-Coder-80B-A3B is a real workhorse on hardware we already own. It produces good code on our Qt/QML projects once two things are true: the skills are written verbosely enough for it, and the skills actually get invoked. Project-specific skills, which we covered in part 1, handle the first. A UserPromptSubmit hook injecting a small skill_reminder.json file handles the second.

Together they’ve made Qwen a reliable second tier in our agent stack, capable of handling the routine 60% of our coding load while we save the frontier models for the hardest problems. In part 3 we’ll cover in detail when to use which model.

Disclaimer: Behavior described here was observed in our own Qt/QML evaluation harness running Qwen Code CLI against a self-hosted Qwen3-Coder-80B-A3B. Your mileage on a different stack may vary.

About the authors

Andrey Pozdnyakov

Andrey is a Qt software developer at ICS with 20+ years of experience building high-performance user interfaces and data visualization systems across industries from agriculture to energy. He also specializes in AI-driven development, using LLMs like Qwen and Claude to create precise, model-aware coding systems that generate production-quality code on demand.

Justin Noel

Justin is a Sr. Software Architect at Integrated Computer Solutions (ICS), with over 20 years of experience in developing and maintaining complex software systems. His strong leadership and technical skills enable him to successfully guide teams and solve challenging engineering problems. As a technical trainer, he has taught Qt/QML, UX/UI and embedded courses as well as presented at numerous international conferences. Justin earned his B.S. in Computer Technology from Northeastern University.
