How Domain Experts Can Build AI Agent Skills
Using a medical monitor review workflow as an example
AI tools are already very good at language, summarization, reasoning, and pattern recognition. But many experts still use them like generic chatbots: copy data into an AI tool, ask a question, adjust the prompt, and repeat.
That works for experimentation, but it does not scale well for recurring professional work. The real opportunity is not just asking AI questions. The real opportunity is turning expert workflows into reusable AI agent skills.
What Is an AI Agent Skill?
An AI agent skill is a reusable instruction framework that tells an AI agent how to perform a specific type of work.
A skill is not just a prompt. It can include review logic, evaluation criteria, expected inputs, output structure, escalation rules, domain-specific reasoning patterns, validation steps, and retrieval of supporting knowledge.
In simple terms, a skill captures how an expert approaches recurring work.
Simple Formula
Stored Skill + User Data + Optional Retrieved Knowledge = Structured AI-Assisted Output
Why Generic Prompting Breaks Down
Many professionals start with a simple prompt such as:
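> "Review this adverse event data and tell me if anything looks concerning."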
Sometimes the result is useful. Sometimes it is vague, inconsistent, or misses important findings. The user then adds more instructions: focus on operational risks, check compliance issues, summarize key findings, provide recommendations, use bullet points, and so on.
Over time, prompts become longer, harder to manage, inconsistent, difficult to reuse, and difficult to share across teams.
The problem is not the AI model itself. The problem is that the workflow exists only inside scattered prompts and human memory.
The Shift: From Prompting to Skills
Instead of writing the same instructions repeatedly, experts can define reusable skills.
The workflow changes from:

Long Ad-Hoc Prompt + User Data

to:

Reusable Skill + User Data
For example, instead of explaining the entire medical review process each time, a user can simply say:
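> "Apply the medical monitor review skill to the attached AE, lab, and vital signs listings."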
The user provides the data. The skill provides the expertise structure. The AI combines both.
Medical Monitor Review as an Example
A medical monitor often reviews adverse event listings, laboratory listings, vital signs, coding consistency, eligibility concerns, and protocol deviations.
The review process usually follows recurring patterns:
- Identify serious adverse events
- Identify Grade 3 or Grade 4 events
- Review liver toxicity
- Assess dose-response relationships
- Identify stopping-criteria events
- Evaluate clinically significant trends
- Determine whether escalation is needed
This is an ideal candidate for an AI agent skill because the workflow is structured, the review criteria are repeatable, the outputs are relatively standardized, and human oversight remains essential.
Step 1 — Identify the Repeatable Workflow
The first step is not AI. The first step is understanding your own workflow.
Ask yourself:
- What tasks do I perform repeatedly?
- What patterns do I consistently look for?
- What outputs do I repeatedly generate?
- What judgment criteria do I apply?
- What findings usually require escalation?
For medical monitor review, the workflow may be:
Input
AE listing, lab listing, vital signs listing, study context.
Review
Serious events, Grade 3/4 findings, ALT increases, safety signals, eligibility issues, coding consistency.
Output
Executive summary, key findings, trends, follow-up actions, human review recommendation.
Step 2 — Define the Required Input Data
Next, define what data the skill expects. A skill cannot reliably review data that is inconsistent, incomplete, or poorly structured.
AE Listing
- Subject ID
- Treatment arm or dose
- AE term
- CTCAE grade
- Seriousness flag
- Relationship to study drug
- Action taken
- Outcome
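For example, a single AE listing row with these fields might look like this (all values are synthetic and purely illustrative):

| Subject ID | Dose | AE Term | CTCAE Grade | Serious | Related | Action Taken | Outcome |
|---|---|---|---|---|---|---|---|
| 1001 | 50 mg | ALT increased | 3 | No | Possibly | Drug interrupted | Recovering |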
Lab Listing
- Subject ID
- Visit
- Analyte
- Result
- Units
- Reference range
- CTCAE grade
Vital Signs Listing
- Subject ID
- Visit
- Blood pressure
- Heart rate
- Temperature
- Oxygen saturation
- Weight
- Baseline values
Step 3 — Define the Review Logic
This is where expertise becomes reusable. The goal is to teach the AI what matters, what patterns to look for, and how to prioritize findings.
For medical monitor review, focus areas may include:
- Serious adverse events
- Grade 3 or Grade 4 abnormalities
- Liver toxicity
- Dose-response trends
- Stopping criteria
- Skin reactions
- Protocol deviations
- Coding inconsistencies
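Inside a skill file, this logic becomes explicit, checkable rules. A minimal sketch follows; the thresholds are placeholders, and real values come from the protocol and its stopping criteria:

```
- Flag every AE marked as serious, regardless of grade.
- Flag every Grade 3 or Grade 4 AE or lab abnormality.
- Flag ALT or AST above 3x the upper limit of normal,
  and note any concurrent bilirubin elevation.
- Compare AE frequency across dose levels and describe
  any apparent dose-response trend.
- Check each flagged finding against the protocol's
  stopping criteria before drafting the summary.
```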
Step 4 — Define the Output Structure
Experts often underestimate how important output structure is. A reusable skill should produce consistent outputs.
Recommended Output
- Executive Summary
- Key Safety Findings
- Trends and Patterns
- Potential Safety Signals
- Missing Information
- Recommended Follow-up Actions
- Human Review Recommendation
This makes reviews easier to read, compare, validate, and eventually automate.
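Written into the skill file, this can be a fixed template whose sections mirror the list above (the wording is only a sketch):

```
## Executive Summary
Two to four sentences: overall safety picture, most urgent finding.

## Key Safety Findings
- Subject ID | finding | grade/seriousness | why it matters

## Trends and Patterns
## Potential Safety Signals
## Missing Information
## Recommended Follow-up Actions

## Human Review Recommendation
Required / Recommended / Routine, with a one-line rationale.
```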
Step 5 — Write the Skill File
The skill can be written as a simple Markdown file. It does not need to be complicated at first.
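A minimal first version might look like this, combining the inputs, review logic, and output structure from Steps 2 through 4 (every line is a starting point to edit, not a prescription):

```
# Skill: Medical Monitor Review

## Purpose
Review AE, lab, and vital signs listings for safety findings
that need medical monitor attention.

## Expected Inputs
- AE listing: subject ID, arm/dose, AE term, CTCAE grade,
  seriousness, relatedness, action taken, outcome
- Lab listing: subject ID, visit, analyte, result, units,
  reference range, CTCAE grade
- Vital signs listing: subject ID, visit, blood pressure, heart
  rate, temperature, oxygen saturation, weight, baseline values

## Review Logic
1. Flag all serious adverse events.
2. Flag all Grade 3 or Grade 4 events and lab abnormalities.
3. Check for liver toxicity patterns (ALT, AST, bilirubin).
4. Look for dose-response trends across treatment arms.
5. Check flagged findings against protocol stopping criteria.
6. Note coding inconsistencies and possible eligibility issues.

## Output Structure
Executive Summary, Key Safety Findings, Trends and Patterns,
Potential Safety Signals, Missing Information, Recommended
Follow-up Actions, Human Review Recommendation.

## Escalation
If any stopping criterion may be met, state that first and
recommend immediate human review.
```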
This file becomes reusable. You can improve it over time instead of rewriting prompts repeatedly.
Step 6 — Test with Sample Data
Next, test the skill using synthetic data, sample listings, or de-identified datasets.
The process is simple: give the model the skill file, attach a sample dataset, and ask it to apply the skill.
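For example, a test request might be as simple as:

> "Apply the Medical Monitor Review skill below to the attached synthetic AE and lab listings. Follow the skill's output structure exactly."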
Then review what the AI missed, what it overstated, whether the output was clinically useful, and whether the structure helped the review.
Where Can You Test and Run a Skill?
A skill does not need a complex platform on day one. You can start testing it anywhere you can provide three things: the skill instructions, the input data, and a clear request to apply the skill.
1. Directly in an LLM Chat
The simplest test is to paste the skill file into ChatGPT, Claude, or another LLM, then provide a sample dataset and ask the model to apply the skill. This is useful for early design and quick iteration.
2. In Claude Cowork
Claude Cowork is Claude's agentic workspace inside Claude Desktop. It brings Claude Code-style agent capabilities to knowledge work beyond coding, so Claude can work across files, instructions, and multi-step tasks without requiring a terminal.
This makes Cowork a natural place to create and test skills. A domain expert can define a skill, attach sample files or data, run the workflow, inspect the output, and refine the instructions until the skill behaves more consistently.
3. Inside an Agent Platform
Eventually, the skill can run inside an agent system that manages users, files, retrieval, permissions, routing, history, and repeated execution. This is where a reusable skill becomes part of a real workflow rather than a one-off prompt.
This is also where i80agent is heading: a place where domain knowledge, retrieval, and reusable expert skills can be tested together and eventually used in real workflows.
Step 7 — Refine Through Real Usage
The first version will not be perfect. That is expected.
Over time, new rules get added, edge cases get captured, escalation logic improves, outputs become more useful, and the workflow becomes more standardized.
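A refinement can be as small as one rule appended to the review logic, for example (illustrative):

```
7. Flag any AE that led to study drug discontinuation,
   even if graded 1 or 2.
```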
This is how expertise gradually becomes operationalized.
Skills Do Not Replace Experts
This point is important: the goal is not to replace experts.
The goal is to reduce repetitive work, improve consistency, accelerate review, identify patterns earlier, structure workflows, and support decision-making.
The expert still reviews findings, validates conclusions, handles ambiguity, and makes final decisions.
This Applies Far Beyond Medical Review
The same process works in many domains.
Legal Review Skill
Identify contract risks, review clauses, flag missing protections, and summarize negotiation concerns.
HR Candidate Evaluation Skill
Summarize interview notes, compare competencies, identify gaps, and generate structured recommendations.
Financial Review Skill
Identify unusual variances, summarize trends, assess operational impact, and generate follow-up questions.
Operations Incident Review Skill
Summarize events, identify root causes, assess severity, and recommend mitigation steps.
Retrieval + Skills
One important lesson from building domain-specific AI systems is that skills alone are not enough. A skill often still depends on trusted knowledge.
For example, a medical monitor review skill may need protocol rules, study context, stopping criteria, coding standards, eligibility criteria, and historical decisions.
This is where retrieval becomes important: the skill defines how to review, while retrieval supplies the trusted knowledge the review depends on.
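One lightweight way to connect the two is to list the knowledge the skill depends on directly in the skill file, so a retrieval layer knows what to fetch before the review starts. The section below is an illustrative sketch, not a required format:

```
## Knowledge Sources (retrieve before review)
- Protocol: stopping criteria, dose-modification rules
- Study context: indication, population, visit schedule
- Coding standards: dictionary and version in use
- Eligibility criteria
- Prior medical monitor decisions for this study
```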
Final Thoughts
Many professionals already have highly structured expertise. They just do not always think of it as something that can be formalized.
But once recurring workflows are made explicit, structured, and testable, they can evolve into reusable AI agent skills.
This is one of the most practical paths toward domain-specific AI: not replacing experts, but helping experts scale their expertise more consistently and efficiently.
I believe people in many different domains can do this. Medical reviewers, legal teams, HR leaders, finance teams, operations managers, educators, consultants, and creative professionals all have recurring workflows that depend on judgment and experience. If those workflows can be described clearly, they can probably become skills that improve consistency, reduce repetitive work, and make expert work more efficient.