How Pinata Works

A technical deep dive into Pinata's detection engine, scoring algorithm, and AI-powered features.

Architecture Overview

Pinata is a static analysis tool that scans source code for security vulnerabilities and test coverage gaps. It uses pattern matching against a curated database of detection rules, then scores results and optionally generates tests.

Source Code → Scanner → Pattern Matcher → Gaps
Gaps → Scorer → Pinata Score
Gaps → AI Service → Tests + Explanations

Scanning Pipeline

When you run pinata analyze, the following steps execute:

1. File Discovery: Recursively walks the directory, filtering by language (TypeScript, Python, JavaScript) and respecting .pinataignore patterns.
2. Category Loading: Loads 45 detection categories from YAML definitions. Each category contains patterns, severity levels, and test templates.
3. Pattern Matching: For each file, runs all applicable patterns (filtered by language). Patterns are regex-based, with negative patterns to reduce false positives.
4. Scoring: Calculates the Pinata Score based on gap count, severity weights, and domain coverage.
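
The file-discovery step can be sketched roughly as follows. This is an illustration only, not Pinata's actual implementation: the function names, the extension-to-language map, and the use of fnmatch-style globs for .pinataignore entries are all assumptions.

```python
import fnmatch
import os

# Assumed mapping from file extension to language; the real table may differ.
LANGUAGE_BY_EXTENSION = {
    ".ts": "typescript",
    ".tsx": "typescript",
    ".js": "javascript",
    ".py": "python",
}


def load_ignore_patterns(root):
    """Read glob patterns from .pinataignore, skipping blanks and comments."""
    path = os.path.join(root, ".pinataignore")
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as handle:
        lines = [line.strip() for line in handle]
    return [line for line in lines if line and not line.startswith("#")]


def discover_files(root):
    """Yield (relative_path, language) for every supported, non-ignored file."""
    patterns = load_ignore_patterns(root)
    for directory, _subdirs, filenames in os.walk(root):
        for filename in sorted(filenames):
            relative = os.path.relpath(os.path.join(directory, filename), root)
            relative = relative.replace(os.sep, "/")
            language = LANGUAGE_BY_EXTENSION.get(os.path.splitext(filename)[1])
            if language is None:
                continue  # unsupported file type
            if any(fnmatch.fnmatch(relative, pattern) for pattern in patterns):
                continue  # excluded by .pinataignore
            yield relative, language
```

Returning the language alongside each path lets the later pattern-matching step select only the patterns that apply to that file.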

Pattern Definition Format

Each detection category is defined in YAML with the following structure:

id: sql-injection
version: 1
name: SQL Injection
description: |
  Detects SQL queries built with string concatenation...
domain: security
priority: P0
severity: critical
applicableLanguages:
  - python
  - typescript

detectionPatterns:
  - id: ts-template-literal-query
    type: regex
    language: typescript
    pattern: "(query|execute).*`.*\\$\\{"
    confidence: high
    description: Detects template literals in SQL queries
    negativePattern: "parameterized|prepared"

testTemplates:
  - id: jest-sql-injection
    language: typescript
    framework: jest
    template: |
      describe('SQL Injection', () => {
        it('uses parameterized queries', () => {
          // Test code...
        });
      });
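
To show how a detection pattern and its negativePattern might interact, here is a small sketch using the two regexes from the definition above. It assumes matching happens line by line and that the negative pattern is checked against the same line; the source does not specify either detail, and `find_gaps` is a hypothetical name.

```python
import re

# Regexes copied from the ts-template-literal-query pattern above,
# after YAML unescaping of the double-quoted strings.
PATTERN = re.compile(r"(query|execute).*`.*\$\{")
NEGATIVE_PATTERN = re.compile(r"parameterized|prepared")


def find_gaps(source):
    """Return 1-based line numbers that match PATTERN but not NEGATIVE_PATTERN."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if PATTERN.search(line) and not NEGATIVE_PATTERN.search(line):
            hits.append(number)
    return hits


sample = "\n".join([
    "db.query(`SELECT * FROM users WHERE id = ${id}`);",     # flagged
    "db.query('SELECT * FROM users WHERE id = $1', [id]);",  # no template literal
    "prepared.execute(`SELECT ${column} FROM t`);",          # suppressed by negative pattern
])
print(find_gaps(sample))  # [1]
```

The third sample line shows the trade-off of a keyword-based negative pattern: it suppresses the finding even though the template literal still interpolates a value.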

Confidence Levels

Each pattern has a confidence level (high, medium, or low) that affects filtering and scoring.

By default, Pinata only reports high confidence findings. Use --confidence medium or --confidence low to see more.
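
One plausible reading of the --confidence flag is a minimum threshold: medium shows medium and high findings, low shows everything. The sketch below assumes that behavior; the finding dictionaries and the `filter_by_confidence` helper are hypothetical.

```python
# Assumed ordering of confidence levels, lowest to highest.
CONFIDENCE_RANK = {"low": 0, "medium": 1, "high": 2}


def filter_by_confidence(findings, threshold="high"):
    """Keep findings whose confidence is at or above the threshold."""
    minimum = CONFIDENCE_RANK[threshold]
    return [f for f in findings if CONFIDENCE_RANK[f["confidence"]] >= minimum]


findings = [
    {"id": "ts-template-literal-query", "confidence": "high"},
    {"id": "example-medium-pattern", "confidence": "medium"},
    {"id": "example-low-pattern", "confidence": "low"},
]
print(len(filter_by_confidence(findings)))            # 1
print(len(filter_by_confidence(findings, "medium")))  # 2
print(len(filter_by_confidence(findings, "low")))     # 3
```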

Scoring Algorithm

The Pinata Score (0-100) represents your codebase's security health. Higher is better.

Score = 100 - Σ(gap_weight × severity_multiplier × confidence_factor)
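
As a worked example of the formula, the sketch below computes a score for a small list of gaps. The multiplier and factor values are placeholders chosen for illustration, not Pinata's real numbers, and clamping the result to the 0-100 range is an assumption based on the stated score range.

```python
# Hypothetical values: the real multipliers and factors are not given here.
SEVERITY_MULTIPLIER = {"critical": 4.0, "high": 3.0, "medium": 2.0, "low": 1.0}
CONFIDENCE_FACTOR = {"high": 1.0, "medium": 0.5, "low": 0.25}


def pinata_score(gaps, gap_weight=1.0):
    """Apply Score = 100 - sum(weight * severity * confidence), clamped to 0-100."""
    penalty = sum(
        gap_weight
        * SEVERITY_MULTIPLIER[gap["severity"]]
        * CONFIDENCE_FACTOR[gap["confidence"]]
        for gap in gaps
    )
    return max(0.0, min(100.0, 100.0 - penalty))


gaps = [
    {"severity": "critical", "confidence": "high"},  # 1.0 * 4.0 * 1.0 = 4.0
    {"severity": "medium", "confidence": "medium"},  # 1.0 * 2.0 * 0.5 = 1.0
]
print(pinata_score(gaps))  # 95.0
```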

Severity Multipliers

Domain Coverage

The score also considers which risk domains have been scanned. If your codebase has no database code, the Data domain won't penalize you for missing data validation patterns.

Diminishing Returns

After 10 gaps of the same category, additional gaps have reduced impact. This prevents a single repeated issue from dominating the score.
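
The exact damping rule is not specified. As one possible interpretation, the sketch below halves the weight of each additional gap beyond the tenth in a category; the geometric decay, the `decay` value, and the function name are assumptions.

```python
from collections import Counter


def damped_weights(categories, threshold=10, decay=0.5):
    """Return one weight per gap; gaps past `threshold` in a category are scaled down."""
    seen = Counter()
    weights = []
    for category in categories:
        seen[category] += 1
        overflow = max(0, seen[category] - threshold)
        weights.append(decay ** overflow)  # 1.0 until the threshold is exceeded
    return weights


weights = damped_weights(["sql-injection"] * 12)
print(weights[9], weights[10], weights[11])  # 1.0 0.5 0.25
```

Under this rule, twelve gaps in one category contribute a total weight of 10.75 rather than 12.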

AI Features

Pinata integrates with LLMs (Anthropic Claude, OpenAI GPT) for enhanced analysis:

🧠 Natural Language Explanations - Understand vulnerabilities in plain English with remediation guidance.
🧪 Test Generation - AI fills template variables intelligently based on your actual code context.
💡 Pattern Suggestions - Submit vulnerable code samples and get new detection patterns.
📊 Risk Assessment - AI evaluates real-world exploitability of findings.

AI Service Architecture

┌─────────────────────────────────────────────────┐
│                  AI Service                      │
├─────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐               │
│  │  Explainer  │  │  Generator  │               │
│  └──────┬──────┘  └──────┬──────┘               │
│         │                │                       │
│         ▼                ▼                       │
│  ┌─────────────────────────────────────┐        │
│  │         Provider Abstraction         │        │
│  │    (Anthropic / OpenAI / Mock)       │        │
│  └─────────────────────────────────────┘        │
└─────────────────────────────────────────────────┘

The AI service abstracts away provider differences. You can switch between Anthropic and OpenAI by changing your API key configuration.
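
The layering in the diagram could look something like the sketch below. The `Provider` interface, its `complete` method, and the prompt wording are hypothetical; only the component names (Explainer, Mock provider) come from the diagram.

```python
from typing import Protocol


class Provider(Protocol):
    """Hypothetical interface that every backend (Anthropic, OpenAI, Mock) implements."""

    def complete(self, prompt: str) -> str: ...


class MockProvider:
    """Offline stand-in that returns a canned response, useful in tests."""

    def complete(self, prompt: str) -> str:
        return f"[mock] {prompt[:40]}"


class Explainer:
    """Builds a prompt for a finding and delegates to whichever provider it was given."""

    def __init__(self, provider: Provider) -> None:
        self.provider = provider

    def explain(self, category: str, snippet: str) -> str:
        prompt = f"Explain the {category} risk in: {snippet}"
        return self.provider.complete(prompt)


explainer = Explainer(MockProvider())
print(explainer.explain("sql-injection", "db.query(`... ${id}`)"))
```

Because the Explainer depends only on the interface, swapping providers does not require changes to the components above the abstraction layer.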

Performance

Pinata is designed for speed.

Benchmarks

On a typical codebase (10,000 files, 500K LOC), Pinata completes a scan in under 10 seconds. AI features add latency that depends on provider response times.

Extensibility

Pinata is designed for customization:

Custom Categories

Add your own detection categories by creating YAML files in your project:

# .pinata/categories/my-company-auth.yml
id: my-company-auth
name: MyCompany Auth Standards
domain: security
severity: high
detectionPatterns:
  - id: legacy-auth-function
    pattern: "legacyAuthenticate\\("
    confidence: high
    description: Legacy auth function deprecated

Output Formats

Integrate with any system using standard output formats.

Security Model

Pinata is designed with security in mind.

AI Privacy

When using AI features, code snippets are sent to your configured AI provider (Anthropic or OpenAI). Use the --no-ai flag to disable all AI features in sensitive environments.