tensor_sec

Scanner Arsenal

Four Frameworks.
One Verdict.

Each instrument probes a distinct stratum of vulnerability. United, they constitute a complete adversarial autopsy of your model's constitution.

01 · NVIDIA Garak

Red-Team Probing

Jailbreaks, Prompt Injection, Model Extraction

Probe-based adversarial evaluation inside an isolated container with no network egress. Tests the full spectrum of known attack vectors against language model safety at the inference layer.

danpromptinjectgcgencodingxssreplayknownbadsignatures

High CoverageThreat Score

02 · Giskard LLM

OWASP Alignment

Bias, Toxicity, Data Leakage, Hallucination

Structured vulnerability detection calibrated to the OWASP LLM Top 10. Returns a full ScanReport normalised to the unified finding schema with severity classifications.

SycophancyHarmfulContentDataLeakPromptInjectionOutputFormat

OWASP AlignedThreat Score

03 · Promptfoo

Eval Harness

Adversarial Prompts, PII Leakage, Consistency

Dynamic adversarial evaluation harness with custom assertion testing and cross-version model comparison. YAML configs generated on the fly per scan depth selection.

PII LeakageJailbreak ResistanceToxic OutputConsistency

Eval HarnessThreat Score

04 · Agentic Radar

Agentic Surface

Tool Misuse, Privilege Escalation, Memory Poisoning

The only scanner that penetrates the agentic attack surface. Analyses multi-agent pipelines, tool call graphs, and memory for uncontrolled delegation and injection via tool output.

Privilege EscalationTool MisuseMemory PoisoningLangChainCrewAIAutoGen

Critical SurfaceThreat Score

Security Pipeline

Seven Stages.
Complete Coverage.

From serialization forensics to isolated microVM execution — a systematic chain of custody for every model that enters our scanner.

Serialization Threat Detection

Before any model weights are loaded, tensor_sec inspects Pickle, H5, Safetensors, and other serialized formats for embedded malicious payloads. Pickle files are subjected to static AST analysis to flag arbitrary code execution patterns. Models containing known malicious signatures are quarantined immediately.

PickleH5 / HDF5SafetensorsONNXClamAVAST AnalysisStatic Signature Scan

Format Prioritization & Source Verification

Models are re-serialized into SafeTensors or ONNX where possible, eliminating executable metadata entirely. For models sourced from Hugging Face, tensor_sec cross-references the repository's built-in scan results, commit history, and organization trust tier before proceeding. Community-flagged models are escalated for deeper scrutiny.

SafeTensors PriorityONNX ConversionHF Scan APICommit ProvenanceTrust Tiers

III

Full-Precision Behavioral Scan

The full-precision model (FP16/BF16) is loaded and subjected to a complete behavioral evaluation using all three scanning engines in parallel. Garak tests jailbreak resistance and prompt injection; Giskard evaluates bias and robustness across OWASP LLM Top 10 categories; Promptfoo runs alignment evaluations. This establishes the security baseline against which all subsequent scans are compared.

Garak — Jailbreak/InjectionGiskard — Bias/RobustnessPromptfoo — AlignmentFP16 / BF16Security Baseline

Quantization-Aware Threat Analysis

Prior to quantization, tensor_sec performs a statistical analysis of the full-precision model's weight distribution. Models exhibiting long-tailed distributions or anomalously high-magnitude weights are flagged as elevated risk — these characteristics make models significantly more susceptible to quantization poisoning attacks, where adversarial signals dormant at full precision activate destructively under lower bit-width arithmetic.

Weight Distribution AnalysisLong-Tail DetectionHigh-Magnitude FlaggingQuantization Poisoning RiskStatistical Profiling

Post-Quantization Evaluation

The same behavioral test suite (Garak, Giskard, Promptfoo) is executed against each quantized variant — GGUF Q4_K_M, Q8_0, FP4. Results are compared probe-by-probe against the full-precision baseline. Probes that pass at FP16 but fail at INT4 isolate attacks specifically activated by the quantization process — a distinct and under-studied attack vector. The delta is surfaced as the Quantization Security Score.

GGUF Q4_K_MQ8_0FP4Delta ComparisonQuant-Activated AttacksSecurity Score

Isolated Deep Analysis

Models flagged by any prior stage undergo a full deep-analysis pass inside Firecracker microVMs or gVisor sandboxed containers — providing kernel-level isolation that prevents any host system compromise during execution. This environment enables safe dynamic analysis of potentially malicious models, including runtime behavior tracing and memory forensics, without risk of lateral movement into the broader infrastructure.

Firecracker microVMgVisorKernel IsolationDynamic AnalysisMemory ForensicsRuntime Tracing

VII

pgvector Security Intelligence Database

All scan results are indexed into a queryable PostgreSQL database backed by the pgvector extension. Each model's findings are stored as both structured records and dense vector embeddings, enabling semantic similarity search across the corpus — allowing users to identify which models share vulnerability profiles, discover previously unknown risks by proximity, and query the full scan history for any model ID in plain language.

PostgreSQL + pgvectorSemantic SearchVulnerability ProfilesModel HistoryRisk SimilarityPublic Query API

Intelligence Feed

Recent Vulnerabilities

        
          Live · Updated April 2026

Critical · 9.3 Dec 2025

CVE-2025-68664

LangChain LangGrinch — Serialization Injection

LangChain Core's dumps() and dumpd() functions fail to escape user-controlled dictionaries containing the reserved 'lc' key, causing them to be interpreted as legitimate serialized LangChain objects. An attacker can exfiltrate API keys, environment secrets, and trigger arbitrary code execution with no authentication required.

langchain-core <0.3.81langchain <1.2.5

Critical · 9.3 Mar 2026

CVE-2026-33017

Langflow — Unauthenticated RCE

A critical flaw in Langflow's unauthenticated endpoints allows arbitrary code execution in developer environments, sharing the same root cause as CVE-2025-3248. Active in-the-wild exploitation was detected within 20 hours of public disclosure. Used to exfiltrate sensitive data from AI development pipelines.

langflowunauthenticated endpoints

Critical · 10.0 Sep 2025

CVE-2025-59528

Flowise — JavaScript Code Injection

Arbitrary JavaScript code injection in Flowise (CVSS 10.0) allows full compromise of AI workflow environments. First in-the-wild exploitation observed April 2026 despite patch availability since September 2025, targeting thousands of exposed Flowise instances used to orchestrate LLM pipelines.

flowisecode injection · CWE-94

Critical · 9.7 2024

CVE-2024-34359

llama-cpp-python — SSTI via GGUF Chat Template

Server-side template injection via Jinja2 templates embedded in GGUF model metadata. Attackers can distribute poisoned .gguf models on Hugging Face that execute arbitrary code upon loading. Over 6,000 models on the Hub use the affected llama_cpp_python / Jinja2 / GGUF stack, enabling broad supply chain attacks.

llama-cpp-python <0.2.72GGUF supply chain

High · 8.6 Feb 2026

CVE-2026-22778

vLLM — Remote Code Execution via Video URL

A two-stage exploit chain in vLLM allows RCE on any deployment serving a video model: an initial PIL error message leaks memory addresses (bypassing ASLR), followed by a heap overflow triggered by a malicious video URL submitted to the API. Enables full server takeover, data exfiltration, and lateral movement.

vllm <0.14.1video model API

High · 7.5 Nov 2025

CVE-2025-62164

vLLM — RCE via Malformed Prompt Embeddings

Any API user can trigger denial-of-service and remote code execution by supplying malformed precomputed prompt embedding vectors directly to the vLLM server process. The vulnerability lives in the embedding ingestion pipeline, highlighting how the AI inference layer — not just the model — is a primary attack surface.

vllmprompt embeddings · CWE-502

Quantization Analysis

Security Delta
at Every Precision

A model at full precision may conceal safety regressions that only manifest upon quantization — the invisible wound made visible.

          I
          Serialization forensics on Pickle, H5, SafeTensors
        

          II
          Statistical weight distribution analysis flags high-risk architectures
        

          III
          FP16 baseline established, INT8 and INT4 variants produced
        

          IV
          Quantization Security Delta surfaced in dashboard
        

Delta Report · mistral-7b-v0.3 LIVE

FP16 — Full Precision92 / 100

Baseline — stable security posture

INT8 — Quantized87 / 100

▾ 5 pts — minor regression, acceptable

INT4 — Quantized64 / 100

▾ 28 pts — critical safety regression

Quantization Delta: −28 points

INT4 variant fails 12 probes passing at full precision. Jailbreak resistance and prompt injection defenses severely degraded. Deployment at INT4 not recommended.

Newly Failing Probes at INT4

dan.jailbreak

promptinject.direct

encoding.base64

continuation.toxic

Security Intelligence

pgvector
Model Database

Every scan result is indexed into a queryable PostgreSQL corpus backed by pgvector. Semantic similarity search reveals vulnerability profiles shared across models — risks discoverable only by proximity.

Example Query

            SELECT model_id, score, findings

            FROM scan_results

            WHERE embedding <-> $query_vec < 0.3

            ORDER BY score DESC LIMIT 10;

Natural Language Query Vector Similarity Public REST API Model History

scan_results schema · PostgreSQL + pgvector

model_idVARCHAR(255)HF ID or hash

model_sourceENUMhf | api | upload

scan_depthENUMquick | std | deep

quant_levelsTEXT[]fp16, int8, int4

security_scoreFLOAT0 – 100

quant_deltaFLOATfp16 minus int4

findingsJSONBnormalized CVE-style

weight_profileJSONBdistribution stats

embeddingVECTOR(1536)pgvector

scanned_atTIMESTAMPTZ—

The embedding column stores a 1536-dimension vector derived from the model's full finding set. Users can query with a natural-language description of a vulnerability and receive the closest-matching models by cosine distance — enabling discovery of unknown risks shared across model families.

Four Frameworks.
One Verdict.

Red-Team Probing

OWASP Alignment

Eval Harness

Agentic Surface

Seven Stages.
Complete Coverage.

Recent Vulnerabilities

Security Delta
at Every Precision

pgvector
Model Database

Begin Your Assessment

TENSOR_SEC

Four Frameworks.One Verdict.

Red-Team Probing

OWASP Alignment

Eval Harness

Agentic Surface

Seven Stages.Complete Coverage.

Recent Vulnerabilities

Security Deltaat Every Precision

pgvectorModel Database

Begin Your Assessment

Four Frameworks.
One Verdict.

Seven Stages.
Complete Coverage.

Security Delta
at Every Precision

pgvector
Model Database