Apr 25, 2026 · 1 min read · Tutorials & How-To

Six hard-won lessons from testing AI systems daily

Jaskaran Singh, a developer who transitioned from building apps to testing AI, shares six practical lessons about how AI models actually work — covering tokens, context windows, and temperature settings.

Dev.to #llm · Jaskaran Singh

Composite score: 4.8 out of 10

Weights applied: Novelty 25%, Impact 43%, Credibility 12%, Depth 20%

Why it matters

Understanding token budgets, context window limits, and temperature settings helps AI/coding practitioners diagnose subtle model failures — like forgotten instructions or erratic outputs — before they cause real problems in production tools.

Summary (our read of the original)

Jaskaran Singh transitioned from five years of consumer app development into AI testing, where his daily work involves probing models for failure modes that look fine on the surface. His Dev.to post frames six foundational AI concepts not as textbook definitions but as lessons learned through firsthand surprises and quiet failures.

The first two lessons center on tokens and context windows. Singh explains that AI models don't read words; they process fragments called tokens, and every token counts against a model's maximum budget. A word like "hamburger" may split into several tokens, and punctuation and spaces consume tokens too. When a long conversation exhausts that budget, the oldest content is silently dropped, which Singh discovered when his own conversations started producing off-target answers. He recommends OpenAI's Tokenizer Playground as a hands-on way to see how real text breaks into tokens.

The context window lesson came from a monitoring tool he built to track Canada's immigration website for unannounced program openings. After enough check cycles, the tool's original instructions were pushed out of the context window entirely, so it re-alerted on old updates and missed new ones. The fix was to feed the model only what it needed for each task rather than accumulating the full history.
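The failure and the fix described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Singh's code: `estimate_tokens` is a crude stand-in for a real tokenizer (roughly four characters per token), and `naive_window`, `per_task_prompt`, the budget of 100, and the message texts are all hypothetical.

```python
from collections import deque


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: assume about 4 characters per token."""
    return max(1, (len(text) + 3) // 4)


def naive_window(messages: list[str], budget: int) -> list[str]:
    """Keep the newest messages that fit the budget; older ones drop off silently."""
    kept = deque()
    used = 0
    for msg in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(msg)
        if used + cost > budget:
            break  # everything older than this point is lost
        kept.appendleft(msg)
        used += cost
    return list(kept)


def per_task_prompt(instructions: str, latest_input: str) -> list[str]:
    """Rebuild the prompt for every check: the instructions plus only the current input."""
    return [instructions, latest_input]


# Hypothetical instructions and history for a page-monitoring tool.
INSTRUCTIONS = "Alert only when a new program opening appears on the page."
history = [INSTRUCTIONS] + [
    f"Check {i}: page snapshot with no changes." for i in range(1, 31)
]

window = naive_window(history, budget=100)
fresh = per_task_prompt(INSTRUCTIONS, "Check 31: new snapshot.")

print("instructions survive accumulated history:", INSTRUCTIONS in window)
print("instructions survive per-task prompt:", INSTRUCTIONS in fresh)
```

In the accumulated history the instructions are the oldest message, so they are the first thing to fall outside the budget. Rebuilding the prompt per task keeps them in view no matter how many checks have run.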

The third lesson covers temperature, the setting that governs how adventurous a model's word choices are. Singh tested the same prompt at low and high temperature settings: the low setting produced a clear, immediately usable answer, while the high setting produced something more interesting but less predictable. He notes that higher temperature suits creative or brainstorming tasks, while factual or precise tasks call for lower settings. The source text is truncated before the remaining three lessons are described.
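To show why low temperature gives more predictable output, here is a minimal sketch of how temperature is commonly applied: the model's raw scores (logits) are divided by the temperature before being turned into probabilities with a softmax. The candidate words and their scores below are made up for illustration, and `softmax_with_temperature` is a hypothetical helper, not part of any particular API.

```python
import math


def softmax_with_temperature(logits: dict[str, float], temperature: float) -> dict[str, float]:
    """Convert raw scores to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


# Made-up scores for three candidate next words.
toy_logits = {"reliable": 2.0, "sturdy": 1.0, "whimsical": 0.0}

low = softmax_with_temperature(toy_logits, temperature=0.2)
high = softmax_with_temperature(toy_logits, temperature=1.5)

for name, dist in (("low", low), ("high", high)):
    print(name, {tok: round(p, 3) for tok, p in dist.items()})
```

At the low setting nearly all of the probability lands on the top-scoring word, which matches the consistent answers Singh saw. At the high setting the distribution flattens, so less likely words are sampled more often.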

Key facts

01 Author Jaskaran Singh spent five years building consumer apps before transitioning to AI testing.
02 AI models break text into tokens (for example, 'hamburger' may be three tokens), and every token counts against a model's maximum budget.
03 When a conversation fills the context window, the oldest content is silently erased with no warning.
04 Singh built a tool to monitor Canada's immigration website for unannounced program openings; it malfunctioned when its original instructions were pushed out of the context window.
05 The fix for context overflow was to give the AI only what it needs per task, not the full accumulated history.
06 Temperature controls how adventurous a model's word choices are: low settings favor consistency, while high settings favor variety but increase unpredictability.
07 The article references OpenAI's Tokenizer Playground and Anthropic's model overview as further reading resources.

Topics

#prompt-engineering #llm-fundamentals #context-window #tokenization #developer-guide

Methodology

Summary and scoring are generated automatically from the original article. We always link back to the publisher and never republish images or paywalled content. Last processed Apr 25, 2026 · 21:38 UTC.
