PRSM

iliad_llm_inference

Active

Tool of AXIS Toolbox — Agentic Commerce Codebase Intelligence

declared in 0.5.3

AXIS-hosted LLM chat completion, served by node-llama-cpp with a small GGUF model loaded in-process. Inference runs fully in-process: there is no upstream provider and no per-call API fee.

Input: two shapes are accepted, `prompt` (a single string) or `messages` (a chat-style array of {role, content}).

Sampling controls: `max_tokens` (≤2048), `temperature` (0-2), `top_k`, `top_p`, `seed` (for reproducibility), and `stop` (string[]).

Configuration: the operator sets AXIS_LLM_MODEL_PATH to point at a Phi-3-mini, TinyLlama, or Llama-3.2-1B GGUF. If it is missing, the tool returns a `_not_configured: true` envelope.

Engineer mode (X-Agent-Mode: engineer; Constrained Inference, $0.10): pass a `json_schema`. Decoding is grammar-constrained to the schema, and the output is also validated against it. The response carries a `structured` block with `valid`, `parsed`, and `schema_errors`, giving guaranteed-valid structured output. Requires Authorization: Bearer <api_key>.
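The input rules and sampling limits described above can be checked on the client before a call is made. The sketch below is illustrative only: `build_arguments` is a hypothetical helper, not part of AXIS. It assumes that omitting both `prompt` and `messages` is an error and that `max_tokens` must be at least 1, neither of which the description states explicitly.

```python
from typing import Any, Optional

MAX_TOKENS_CAP = 2048  # hard cap stated in the tool description
ALLOWED_ROLES = {"system", "user", "assistant"}


def build_arguments(
    prompt: Optional[str] = None,
    messages: Optional[list] = None,
    **options: Any,
) -> dict:
    """Assemble a tool arguments object (hypothetical helper)."""
    # The tool accepts `prompt` or `messages`, not both.
    # Assumption: supplying neither is treated as an error as well.
    if (prompt is None) == (messages is None):
        raise ValueError("supply exactly one of prompt or messages")

    # `system` is documented for prompt mode only.
    if messages is not None and "system" in options:
        raise ValueError("use a role=system message instead of `system`")

    for message in messages or []:
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unknown role: {message.get('role')!r}")
        if not isinstance(message.get("content"), str):
            raise ValueError("message content must be a string")

    max_tokens = options.get("max_tokens")
    # Assumption: a lower bound of 1; only the upper cap is documented.
    if max_tokens is not None and not 1 <= max_tokens <= MAX_TOKENS_CAP:
        raise ValueError("max_tokens must be between 1 and 2048")

    temperature = options.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be in [0, 2]")

    top_p = options.get("top_p")
    if top_p is not None and not 0 < top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")

    top_k = options.get("top_k")
    if top_k is not None and not (isinstance(top_k, int) and top_k > 0):
        raise ValueError("top_k must be a positive integer")

    args = {"prompt": prompt} if prompt is not None else {"messages": messages}
    args.update(options)
    return args
```

For example, `build_arguments(prompt="Summarize this repo", max_tokens=256, seed=7)` returns a dictionary ready to send as the tool's arguments, while passing both `prompt` and `messages` raises `ValueError`.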

Parameters schema

{
  "type": "object",
  "properties": {
    "seed": {
      "type": "number",
      "description": "Optional seed for reproducible output."
    },
    "stop": {
      "type": "array",
      "description": "Stop sequences. Generation halts when any string in the array is produced."
    },
    "top_k": {
      "type": "number",
      "description": "Top-k sampling (positive integer). Defaults 40."
    },
    "top_p": {
      "type": "number",
      "description": "Top-p nucleus sampling in (0, 1]. Defaults 0.95."
    },
    "prompt": {
      "type": "string",
      "description": "Single-prompt completion input. Use either this OR messages, not both."
    },
    "system": {
      "type": "string",
      "description": "Optional system prompt (prompt mode only). For messages mode, use role=system entries."
    },
    "messages": {
      "type": "array",
      "description": "Chat-style input. Array of {role: system|user|assistant, content: string}."
    },
    "max_tokens": {
      "type": "number",
      "description": "Max tokens to generate. Defaults 512, hard cap 2048."
    },
    "json_schema": {
      "type": "object",
      "description": "Engineer mode (required): a JSON Schema. Decoding is grammar-constrained to it and the output is validated against it; returns a `structured` block."
    },
    "temperature": {
      "type": "number",
      "description": "Sampling temperature in [0, 2]. Defaults 0.7."
    }
  }
}
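A caller of this tool has to handle two documented response shapes: the `_not_configured: true` envelope and, in Engineer mode, the `structured` block. The following is a minimal sketch under stated assumptions. The helper names `engineer_headers` and `interpret_result` are hypothetical, and the sketch assumes `_not_configured` and `structured` appear at the top level of the returned object, which the description does not confirm.

```python
from typing import Any


def engineer_headers(api_key: str) -> dict:
    """Headers the description names for Engineer mode (hypothetical helper)."""
    return {
        "Authorization": f"Bearer {api_key}",
        "X-Agent-Mode": "engineer",
    }


def interpret_result(result: dict) -> Any:
    """Return parsed structured output, or the raw result in plain mode.

    Assumption: `_not_configured` and `structured` are top-level keys.
    """
    # The operator has not pointed AXIS_LLM_MODEL_PATH at a GGUF model.
    if result.get("_not_configured"):
        raise RuntimeError("model not configured on the server")

    structured = result.get("structured")
    if structured is None:
        # No json_schema was sent, so there is nothing to unwrap.
        return result

    # Engineer mode: check the validation verdict before using the output.
    if not structured.get("valid"):
        errors = structured.get("schema_errors")
        raise ValueError(f"output failed schema validation: {errors}")
    return structured["parsed"]
```

Checking `valid` before reading `parsed` keeps the caller safe even though decoding is grammar-constrained, since the tool reports `schema_errors` as a separate field.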

What this tool wraps: 0 endpoints

No endpoints wrapped at confidence ≥ 0.50.

Parent server

AXIS Toolbox — Agentic Commerce Codebase Intelligence

https://github.com/lastmanupinc-hub/Toolbox

1/7 registries