# LLM Prompt Shape Inspector
The LLM Prompt Shape Inspector is an advanced prompt engineering tool that helps you analyze and optimize prompts for large language models. Using principles from mechanical engineering and vector space semantics, it treats prompts as geometric shapes with precise boundaries, enabling you to build more reliable, deterministic prompts.
The tool provides the following key features:
**Heat map.** Displays your prompt with color-coding that highlights edge tokens and ambiguous, high-polysemy words.
**Controls.** Located in the sidebar, these sliders let you fine-tune the analysis (gain, normalisation, and score thresholds).
**Detailed results.** Displayed in convenient tabs, this section shows each word with its score, color-coded to indicate whether it is above the threshold.
**Contractor.** Produces an enhanced version of your prompt using four engineering principles: it inserts {definition} placeholders after ambiguous words, marks critical edge tokens with *, and repeats key constraints at the start and end.

**Metrics.** Automatically calculates important metrics for your prompt:
If your prompt exceeds the polysemy budget, helpful warnings and suggestions appear to guide you toward a more stable prompt.
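The budget logic can be sketched in a few lines. This is a minimal illustration, not the tool's actual implementation: the per-word stress scores and the `sigma_max` limit are made-up values standing in for the real WordNet-variance computation described later.

```python
def check_polysemy_budget(word_scores, sigma_max=1.5):
    """Sum per-word polysemy stress and report whether the budget is exceeded.

    word_scores: mapping word -> polysemy stress sigma(t) (illustrative values).
    Returns (total, warnings), where warnings names the worst offenders.
    """
    total = sum(word_scores.values())
    warnings = []
    if total > sigma_max:
        # Suggest sense-locking the most ambiguous words first.
        offenders = sorted(word_scores, key=word_scores.get, reverse=True)[:3]
        warnings = [f"'{w}' is ambiguous; add a {{definition}} after it" for w in offenders]
    return total, warnings

total, warnings = check_polysemy_budget({"bank": 0.9, "report": 0.3, "lead": 0.7})
print(total, warnings)  # total 1.9 exceeds the 1.5 budget, so warnings appear
```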
| Layer | What happens | Why it matters to engineers |
|---|---|---|
| Tokenisation | Uses the same encoder as the chosen embed-model (text-embedding-3-small), so the token boundaries you see are exactly what the model will see. | No surprises when you copy-paste the optimised prompt into production. |
| Edge-Finder | Computes cosine similarity between every token vector and a set of constraint vectors you supply (e.g. context: cybersecurity). High-similarity tokens are the "walls" that keep generation on-topic. | Lets you check that your actual constraints (brand names, legal phrases, etc.) are receiving enough signal strength. |
| Polysemy-Stress | Looks up to four WordNet senses per word, embeds each gloss, and measures variance. High variance ⇒ the word is ambiguous. | Surfaces the words most likely to cause drift ("bank", "port", "lead", etc.). |
| Occlusion Drift | Drops each token in turn, re-embeds, and measures the vector shift. | A rough proxy for how much that token steers meaning; helps spot "hidden load-bearers". |
| Contractor / Enhanced Contractor | Inserts {definition} placeholders after high-polysemy words and * after critical edge tokens, plus a constraints recap. | Gives you a copy-ready scaffold: fill the braces, keep the asterisks, and you have a tighter prompt. |
| UI/UX | Heat-map with adjustable gain/normalise; word-group vs. token view; poly-budget meter and warnings. | Engineers can tweak thresholds until the signal–noise balance looks right, then copy the prompt with one click. |
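The Occlusion Drift layer is the easiest to sketch end-to-end. The snippet below uses a deterministic hash-based `toy_embed` as a stand-in for the real text-embedding-3-small call, so the numbers are illustrative; only the drop-one-token-and-measure-shift loop mirrors the layer described above.

```python
import hashlib
import math

def toy_embed(tokens):
    """Deterministic stand-in for a real embedding model (illustrative only):
    each token hashes to a small vector; the prompt embedding is their mean."""
    dim = 8
    vecs = []
    for tok in tokens:
        h = hashlib.sha256(tok.encode()).digest()
        vecs.append([b / 255.0 for b in h[:dim]])
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def occlusion_drift(tokens):
    """Drop each token in turn, re-embed, and measure the vector shift."""
    base = toy_embed(tokens)
    drifts = {}
    for i, tok in enumerate(tokens):
        reduced = tokens[:i] + tokens[i + 1:]
        shifted = toy_embed(reduced)
        drifts[tok] = math.dist(base, shifted)  # Euclidean shift from the full prompt
    return drifts

drifts = occlusion_drift(["summarise", "the", "bank", "report"])
# Tokens with the largest drift are the "hidden load-bearers".
print(max(drifts, key=drifts.get))
```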
**Total extra time per prompt:** ~2 minutes once you are familiar.
- Replace {definition} placeholders with clarifying information.
- Pay attention to tokens marked with *.
- Monitor your polysemy budget.
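A quick sanity check for the first two points might look like the following sketch. The regexes and the sample prompt are assumptions for illustration; the real placeholder and marker syntax is exactly as shown in the Contractor output.

```python
import re

def scan_prompt(prompt):
    """Find unfilled {definition} placeholders and *-marked edge tokens
    so they can be reviewed (or fail a CI check) before the prompt ships."""
    placeholders = [m.start() for m in re.finditer(r"\{definition\}", prompt)]
    edge_tokens = re.findall(r"([\w-]+)\*", prompt)  # word immediately before a '*'
    return placeholders, edge_tokens

prompt = "Audit the bank {definition} ledger* for PCI-DSS* compliance."
placeholders, edges = scan_prompt(prompt)
print(len(placeholders), edges)  # 1 ['ledger', 'PCI-DSS']
```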
1. **Start with explicit constraints.** Enter your constraint phrases (e.g. context: PCI-DSS, target language: PowerShell) in the right-hand "Constraint phrases" box first. The Edge-Finder heat map instantly tells you whether those words are present and weighted.
2. **Iterate the thresholds; don't accept the defaults.**
3. **Fill the {definition} blanks immediately.** (Leftover {definition} placeholders can be flagged automatically in test pipelines.)
4. **Use the heat-map as a diff tool.**
5. **Link output quality to the Poly-budget.**
6. **Automate where possible.**
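Using the heat-map as a diff tool can also be automated. The sketch below compares per-token edge scores between two prompt revisions; the score values and the 0.1 significance threshold are illustrative assumptions, not part of the tool.

```python
def score_diff(old_scores, new_scores):
    """Compare per-token edge scores between two prompt revisions; tokens whose
    score dropped sharply may have lost their constraining power."""
    report = {}
    for tok in set(old_scores) | set(new_scores):
        delta = new_scores.get(tok, 0.0) - old_scores.get(tok, 0.0)
        if abs(delta) > 0.1:  # illustrative significance threshold
            report[tok] = round(delta, 3)
    return report

old = {"PowerShell": 0.8, "PCI-DSS": 0.7, "script": 0.2}
new = {"PowerShell": 0.3, "PCI-DSS": 0.7, "code": 0.2}
print(sorted(score_diff(old, new).items()))
# PowerShell lost edge strength; 'script' was removed, 'code' was added.
```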
The app is built on a geometric understanding of prompts:
| Engineering analogue | NLP counterpart | Why it matters |
|---|---|---|
| Design envelope for an aircraft | Concept envelope of an idea | The aircraft must never leave its aerodynamic envelope; a prompt should stay inside its intended semantic envelope. |
| Finite-element mesh with nodes & boundary conditions | Embedding cloud with edge tokens & constraints | The mesh nodes with the highest strain are the places where failure begins; the tokens that most strongly constrain meaning ("edge tokens") are where an LLM will "tear" into ambiguity if you're not explicit. |
| Tolerance stack-up in manufacturing | Polysemy stack-up in a prompt | Ambiguous words add dimensional variance. Past a threshold the stack-up makes the output drift off-spec. |
| Technique | Mechanism | Why it works |
|---|---|---|
| Sense-locking | Immediately follow any high-σ word with a micro-definition, synonym, or role marker: "Bank (financial institution) ledger compliance report …" | Reduces σ(t) by collapsing the LLM's attention onto the desired centroid; narrows 𝒮 early in the forward pass. |
| Edge reinforcement | Repeat or paraphrase the most critical constraints at least twice, once near the front and once near the end (positional bias). | Ensures high-e(t) tokens dominate global attention heads even if the middle expands creatively. |
| Dimensional dropout | Deliberately omit low-information adjectives/adverbs that merely add orthogonal variance. | Shrinks 𝒮 volume, making generation more deterministic and cheaper to steer during an agent loop. |
| Polysemy budget | Compute a quick heuristic: total σ ≤ σ_max (tunable). Flag the user or auto-expand definitions when the budget is exceeded. | Gives a measurable "tolerance stack-up" limit analogous to mechanical design. |
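Sense-locking is mechanical enough to script. Below is a minimal sketch of the transformation (the `senses` glossary is supplied by the prompt author; the function name and its word-splitting heuristic are assumptions, not the tool's code):

```python
def sense_lock(prompt, senses):
    """Insert a micro-definition right after each high-ambiguity word.

    senses: mapping of lowercase word -> short gloss chosen by the author."""
    out = []
    for word in prompt.split():
        bare = word.strip(".,;:")          # ignore trailing punctuation
        if bare.lower() in senses:
            word = word.replace(bare, f"{bare} ({senses[bare.lower()]})")
        out.append(word)
    return " ".join(out)

locked = sense_lock(
    "Summarise the bank ledger compliance report.",
    {"bank": "financial institution", "lead": "to guide"},
)
print(locked)
# Summarise the bank (financial institution) ledger compliance report.
```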
- **Token embeddings → point cloud.** Each token t is a vector vᵗ ∈ ℝᴰ.
- **Idea = bounded region 𝒮.** 𝒮 = {x | gᵢ(x) ≤ 0, i = 1..k}, where each gᵢ encodes a linguistic or factual constraint.
- **Edge nodes.** Define the edge score e(t) = max_j |g_j(vᵗ)|. Tokens with high e(t) exert the strongest bounding force.
- **Polysemy stress.** For token t, let {vᵗ₁, vᵗ₂, …, vᵗ_N} be the sense-cluster centroids. The polysemy stress is σ(t) = var({vᵗₛ}).
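A worked toy example makes these definitions concrete. Here the constraints gⱼ are assumed to be linear functionals gⱼ(x) = a·x − b in 2-D, which is one possible instance of the constraint family above; all vectors are made up for illustration.

```python
def edge_score(v, constraints):
    """e(t) = max_j |g_j(v)|, with each constraint given as (a, b)
    encoding the linear functional g_j(x) = a.x - b (illustrative choice)."""
    return max(
        abs(sum(ai * xi for ai, xi in zip(a, v)) - b) for a, b in constraints
    )

def polysemy_stress(sense_vecs):
    """sigma(t): total variance of the sense-cluster centroids around their mean."""
    dim = len(sense_vecs[0])
    mean = [sum(v[d] for v in sense_vecs) / len(sense_vecs) for d in range(dim)]
    return sum(
        sum((v[d] - mean[d]) ** 2 for d in range(dim)) for v in sense_vecs
    ) / len(sense_vecs)

constraints = [((1.0, 0.0), 0.5), ((0.0, 1.0), 0.2)]
print(edge_score((0.9, 0.1), constraints))        # max(|0.9-0.5|, |0.1-0.2|) = 0.4
print(polysemy_stress([(0.0, 0.0), (1.0, 1.0)]))  # widely spread senses -> high sigma
```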
| Symptom | Likely geometric cause | Fast fix |
|---|---|---|
| Output veers into unintended domain | 𝒮 not convex; missing edge tokens on that side | Add concrete examples that live on the missing face of the hull. |
| Repetition / looping | Hull too thin in some dimensions; model stuck in local minimum | Introduce orthogonal but relevant descriptors to widen 𝒮 slightly. |
| Hallucinated entities | Internal walk escaped the hull via ambiguous connector word | Replace connector with a univocal relationship phrase or split the prompt into two stages. |
To run the app locally, set your OpenAI API key (`OPENAI_API_KEY`) and launch it with `streamlit run app.py`.
By instrumenting your prompts with polysemy-stress meters and edge-coverage heat maps, you'll convert the abstract philosophy of "the shape of ideas" into a concrete engineering control system that makes your LLM workflows steadier, safer, and more scalable.
The tool now uses a more targeted approach to sense-locking.
The tool now offers adaptive thresholds that automatically adjust based on each prompt's unique characteristics:
This ensures more reliable identification of edge tokens and polysemous words across heterogeneous prompts without requiring manual threshold tuning for each new prompt.
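One common way to make a threshold adapt to each prompt's score distribution is mean-plus-k-standard-deviations; the sketch below assumes that rule and an illustrative k = 1.0, which may differ from the tool's actual heuristic.

```python
import statistics

def adaptive_threshold(scores, k=1.0):
    """Pick a per-prompt cutoff as mean + k * stdev of this prompt's scores,
    instead of one fixed global threshold (k = 1.0 is an illustrative default)."""
    mean = statistics.fmean(scores)
    spread = statistics.pstdev(scores)
    return mean + k * spread

# The same rule adapts to prompts with very different score distributions:
flat = [0.2, 0.25, 0.3, 0.22]     # uniform scores -> low cutoff
spiky = [0.1, 0.1, 0.9, 0.1]      # one dominant token -> high cutoff
print(adaptive_threshold(flat), adaptive_threshold(spiky))
```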