#servicenow #ai #mcp #sndevio #project

Fourteen tools for a ServiceNow MCP

Why I split the sndev.io MCP into 14 small tools across three layers instead of building one giant search — and what each layer is actually for.

The first version of sndev.io had one tool called search. It took a query string. It searched everything. I was very proud of it for about three days.

Then I actually watched an agent use it.

The agent was trying to find out whether GlideRecord.addJoinQuery existed on the client side. Instead of asking a specific question, it typed "join" into my single-box search and got back a wall of results — encoded query operators, REST endpoints that happened to mention joins, two script includes with “Join” in the name, and somewhere on page three, the method it was actually looking for. The agent then did the thing agents do when they’re drowning: picked the first plausible result and hallucinated the rest.

I sat there with the logs open thinking this is my fault.

One big search is a trap

The problem wasn’t the search. The spec data was fine. The problem was the tool name.

When an agent has one tool called search, it will use it for everything. And when a tool can return anything, the agent has no prior about what it’s getting back. It can’t narrow. It can’t plan. It’s just throwing keywords at a black box.

Tool names are prompts. I’d written one that said “guess what you need” and then acted surprised when the agent guessed.

So I split it.

The ten search tools

There are now ten search tools, one per spec domain. search_scripting goes against the 240+ scripting API classes. search_rest hits the 160+ REST groups. search_schema covers 4,000+ table schemas. search_script_includes searches 4,600+ script includes by name, method, or scope. There’s one for the Fluent TypeScript DSL, one for plugins, one for encoded query syntax, one for real-world code patterns, one for product-to-API mappings, and one for release metadata.

Ten tools sounds like a lot until you realize an agent picking between them is doing almost no work. The names tell it exactly what each one holds. When the agent wants to know if a method exists, it reaches for search_scripting. When it wants a table’s fields, it reaches for search_schema. The decision happens at the tool-selection layer, which is where LLMs are already good.

Each one takes either a query string for keyword search or a code string for something much more interesting.

The code parameter is the whole trick

The code parameter lets the agent send a JavaScript expression that runs inside a QuickJS sandbox against the spec object. Not a query DSL. Not a filter language. Actual JS.

So an agent that wants every server-scope method with “attachment” in the name can write a filter. An agent that wants to know which tables extend task can traverse the hierarchy. An agent that wants every script include in the global scope with a method named log can express exactly that, in six lines, and get back exactly that.
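To make that concrete, here’s a sketch of the kind of expression an agent can send — run here against a made-up miniature of the spec object, since the real spec’s field names may differ:

```javascript
// Hypothetical miniature of the spec object; the real sndev.io spec
// layout and field names may differ.
const spec = {
  classes: {
    GlideRecord: {
      scope: "server",
      methods: ["addQuery", "addJoinQuery", "getAttachments"],
    },
    GlideSysAttachment: {
      scope: "server",
      methods: ["write", "copy", "getContent"],
    },
    GlideForm: {
      scope: "client",
      methods: ["setValue", "getValue"],
    },
  },
};

// "Every server-scope method with 'attachment' in the name,"
// expressed as plain JavaScript instead of guessed keywords.
const hits = Object.entries(spec.classes)
  .filter(([, cls]) => cls.scope === "server")
  .flatMap(([name, cls]) =>
    cls.methods
      .filter((m) => m.toLowerCase().includes("attachment"))
      .map((m) => `${name}.${m}`)
  );

console.log(hits); // → ["GlideRecord.getAttachments"]
```

The point isn’t this particular filter — it’s that anything expressible over the spec object is expressible in the `code` parameter, with no query DSL in between.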

The first time I saw an agent write a five-line Object.entries().filter() chain against spec.classes instead of guessing keywords, I almost closed the laptop and went for a walk. It was working the way I’d hoped but hadn’t believed.

Keyword search is still there for when the agent doesn’t know what it’s looking for yet. code is for when it does. Both matter.

The four retrieval tools

Search tools return summaries. Sometimes an agent needs the whole thing.

get_script_include returns full source by api_name (global.VAUtils, say). get_table_dictionary pulls sys_dictionary metadata for a table, with release-specific lookups — so you can ask what incident looked like in San Diego versus Zurich. get_rest_operation returns a full REST operation by sys_id. get_release_changelog does the cross-release diff.

These are the “I’ve found what I want, now give me everything” tools. Splitting them off from search mattered because the shape of the response is different. Search returns lists. Retrieval returns one big object. Jamming both into one tool would’ve meant building a union type the agent had to reason about. Two tools, two shapes, no confusion.
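If it helps to picture the difference, here are the two shapes side by side. The field names are invented for illustration, not the real response format:

```javascript
// Search tools always return a list of summaries (illustrative fields).
const searchResult = [
  { api_name: "global.VAUtils", match: "method: log" },
  { api_name: "global.TaskUtils", match: "method: log" },
];

// Retrieval tools return one complete object (illustrative fields).
const retrievalResult = {
  api_name: "global.VAUtils",
  scope: "global",
  source: "var VAUtils = Class.create(); /* full source here */",
};

console.log(Array.isArray(searchResult));    // true
console.log(Array.isArray(retrievalResult)); // false
```

An agent that knows a tool always returns a list can plan a loop over it; an agent that knows a tool returns one object can read it directly. That prior is the payoff of the split.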

The three live proxy tools

The last layer is the one I was most nervous about shipping: query_table, execute_script, and discover_schema. These don’t hit the bundled spec — they proxy to a real ServiceNow instance through https://sndev.io/live. Credentials pass through per request and are never stored.

This is the layer that turns the MCP from a reference library into something that can actually do work. The agent can look up what a table should look like via search_schema, then check what it actually looks like on your instance via discover_schema, then run a script against it via execute_script. Reference, verification, execution. All three in the same conversation.
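From the agent’s side, that reference-verification-execution loop might look like the sketch below. `callTool`, the tool arguments, and the return shapes are all stand-ins for the real MCP client API, not its actual signatures:

```javascript
// Hypothetical agent workflow across the three layers. `callTool` is a
// stand-in for whatever MCP client function actually invokes a tool.
async function checkAndRun(callTool) {
  // 1. Reference: what should the table look like, per the bundled spec?
  const specView = await callTool("search_schema", { query: "incident" });

  // 2. Verification: what does it actually look like on this instance?
  const liveView = await callTool("discover_schema", { table: "incident" });

  // 3. Execution: act on the table the agent just verified.
  const result = await callTool("execute_script", {
    script:
      "var gr = new GlideRecord('incident'); gr.query(); gs.info(gr.getRowCount());",
  });

  return { specView, liveView, result };
}
```

Each step is a separate tool call, so the agent can stop, reconsider, or ask the user between layers — which is exactly what primitives buy you over a baked-in workflow.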

I kept it to three because anything more felt like reinventing the CLI. The CLI is where opinionated workflows belong. The MCP’s job is to give the agent primitives.

Ten plus four plus three

Fourteen tools, three layers, one mental model: search to find, retrieve to read, proxy to act.

I still think about the day I watched that agent throw "join" at my one-tool search and drown. It wasn’t a bug. It was me designing for myself instead of for the thing using the tool. The rewrite was annoying. Watching an agent actually reach for the right tool on the first try, though — that was worth the whole weekend I spent renaming things.

-D