Smart Filtering

The --smart flag activates Claude Vision integration. Instead of relying solely on rule-based exclusion patterns, Claude analyzes each image semantically and decides what is worth studying.

How Smart Filtering works

Smart Filtering is an OCR-first pipeline. EasyOCR runs first to detect text regions and bounding boxes, then Claude decides which of those regions to keep or discard. The output is always image occlusion cards.

Image → EasyOCR → Claude Vision (keep/skip each region) → Image occlusion card

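This means Claude only chooses among the text regions that EasyOCR has already detected, and every card produced is an image occlusion card.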

This is different from Smart Generation, a generation-first pipeline in which Claude sees the full content and creates cards from scratch (including cloze and basic card types). To use generation mode with images, add --generate (or -gen) alongside --smart. For PDFs, add --page instead.

| | Smart Filtering | Smart Generation |
| --- | --- | --- |
| Pipeline | OCR first, Claude filters | Claude generates from scratch |
| Input to Claude | Image + OCR text regions | Full image, page render, or extracted text |
| Card types | Image occlusion only | Image occlusion, cloze, and basic |
| Best for | Labeled diagrams and figures | Any content |
| How to enable | --smart | --smart --generate (images) or --smart --page (PDFs) |

What Claude does

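For each image, Claude reviews the text regions that EasyOCR detected, decides which ones to keep and which to skip, and corrects OCR errors in the text it keeps.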

Pipeline comparison

Without --smart:

Image → EasyOCR → Rule-based filter (exclude.exact / exclude.regex) → Anki card

With --smart:

Image → EasyOCR → Claude Vision (semantic curation + OCR correction) → Anki card

OCR handles precise bounding box coordinates (its strength). Claude handles semantic understanding (its strength).
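The two pipelines can be contrasted with a short Python sketch. This is an illustration under stated assumptions, not Niobium's actual code: the `Region` class, `rule_based_filter`, `apply_decisions`, and the shape of the keep/skip decisions are all hypothetical. In both paths the bounding boxes come from OCR unchanged; only the selection step (and, in smart mode, the text) differs.

```python
import re
from dataclasses import dataclass


@dataclass
class Region:
    """One OCR-detected text region (hypothetical structure)."""
    text: str
    box: tuple  # (x, y, width, height) in pixels


def rule_based_filter(regions, exact=(), regex=()):
    """Without --smart: drop regions matching exclude.exact or exclude.regex."""
    exact_set = set(exact)
    patterns = [re.compile(p) for p in regex]
    return [
        r for r in regions
        if r.text not in exact_set
        and not any(p.search(r.text) for p in patterns)
    ]


def apply_decisions(regions, decisions):
    """With --smart: keep regions marked "keep", using corrected text if given.

    `decisions` maps a region index to {"action": ..., "text": ...}.
    This response shape, and skipping unmentioned regions, are assumptions.
    """
    kept = []
    for i, region in enumerate(regions):
        decision = decisions.get(i, {"action": "skip"})
        if decision["action"] == "keep":
            kept.append(Region(decision.get("text", region.text), region.box))
    return kept


regions = [
    Region("Figure 3", (10, 10, 80, 20)),
    Region("Mitochondrla", (120, 60, 110, 20)),  # OCR misread
    Region("Nucleus", (40, 140, 70, 20)),
]

by_rules = rule_based_filter(regions, regex=[r"^Figure \d+$"])
by_model = apply_decisions(regions, {
    1: {"action": "keep", "text": "Mitochondria"},
    2: {"action": "keep"},
})
```

In this example the rule-based path removes the caption but keeps the misread label as-is, while the smart path keeps the same two regions and replaces the misread text.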

Usage

Add --smart to any existing Niobium command:

# Single image
niobium -i /path/to/anatomy.png -apkg ./output --smart

# Directory of images
niobium -dir /path/to/slides -deck Pharmacology --smart

# PDF (processes extracted images)
niobium -pin /path/to/lecture.pdf -apkg ./output --smart

Without --smart, Niobium works exactly as before: pure OCR with rule-based filtering.

API key setup

An Anthropic API key is required. Provide it in one of two ways:

Environment variable (recommended):

export ANTHROPIC_API_KEY=sk-ant-...

Config file:

llm:
  api_key: sk-ant-...

The config file value takes priority over the environment variable. If no key is found, Niobium falls back to rule-based filtering with a warning.
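The lookup order described above could be sketched as follows. This is a minimal illustration, not Niobium's implementation: `resolve_api_key` is a hypothetical helper, and the config is assumed to be an already-parsed dictionary.

```python
import os
import warnings


def resolve_api_key(config, env=None):
    """Return the API key, preferring the config file over the environment.

    Returns None (with a warning) when no key is found, which is the point
    at which the caller would fall back to rule-based filtering.
    """
    env = os.environ if env is None else env
    llm_section = config.get("llm") or {}
    key = llm_section.get("api_key") or env.get("ANTHROPIC_API_KEY")
    if not key:
        warnings.warn("No Anthropic API key found; using rule-based filtering.")
        return None
    return key


# The config value wins when both are set.
both = resolve_api_key(
    {"llm": {"api_key": "sk-ant-from-config"}},
    env={"ANTHROPIC_API_KEY": "sk-ant-from-env"},
)

# With api_key: null in the config, the environment variable is used.
env_only = resolve_api_key(
    {"llm": {"api_key": None}},
    env={"ANTHROPIC_API_KEY": "sk-ant-from-env"},
)
```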

Configuration

The llm section in your config file controls smart mode:

llm:
  api_key: null
  model: claude-sonnet-4-6
  max_tokens: 1024
  temperature: 0.2
  instructions: null

| Key | Default | Description |
| --- | --- | --- |
| api_key | null | Anthropic API key |
| model | "claude-sonnet-4-6" | Claude model identifier |
| max_tokens | 1024 | Maximum response tokens |
| temperature | 0.2 | Response variability (lower = more consistent) |
| instructions | null | Custom prompt addition (see below) |

Custom instructions

The instructions field is the most powerful configuration option. It appends text to the built-in system prompt, letting you steer Claude’s decisions for a specific study context.

Pharmacology:

instructions: >-
  I'm studying pharmacology. Prioritize drug names, drug classes, mechanisms
  of action, receptor types, and side effects. Add hints about drug
  interactions and clinical indications.

Histology:

instructions: >-
  These are histology slides. Occlude tissue types, cell types, staining
  characteristics, and structural features. Add hints about how to
  distinguish similar-looking tissues.

Pathology:

instructions: >-
  Focus on pathological findings. Occlude disease names, morphological
  descriptions, and diagnostic features. Add hints about epidemiology
  and clinical presentation.

USMLE Step 1:

instructions: >-
  I'm preparing for USMLE Step 1. Add high-yield clinical correlations
  and First Aid-style memory aids in the hints.

Text-heavy slides:

instructions: >-
  These images contain mostly text paragraphs. Occlude only the most
  important medical terms, numerical values, and key facts. Skip filler
  words and context sentences.

Set instructions to null (or remove it) to use the default general-purpose behaviour.
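How an appended instruction might combine with the built-in prompt can be sketched like this. The built-in prompt text is not documented here, so `BASE_PROMPT` is a stand-in and `build_system_prompt` is a hypothetical helper; treating a blank string like null is also an assumption.

```python
# Stand-in for the built-in system prompt (the real text is not shown here).
BASE_PROMPT = "Decide which OCR text regions are worth occluding."


def build_system_prompt(base_prompt, instructions=None):
    """Append custom instructions to the base prompt.

    None (YAML null) or a blank string leaves the base prompt unchanged,
    which corresponds to the default general-purpose behaviour.
    """
    if instructions is None or not instructions.strip():
        return base_prompt
    return f"{base_prompt}\n\n{instructions.strip()}"


default_prompt = build_system_prompt(BASE_PROMPT)
custom_prompt = build_system_prompt(
    BASE_PROMPT,
    "These are histology slides. Occlude tissue types and cell types.",
)
```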

Cost

Claude Sonnet processes each image for approximately $0.005-$0.01, depending on image size and number of text regions. A batch of 50 images costs approximately $0.25-$0.50.
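The batch figure is simply the per-image range multiplied by the number of images. A small sketch of that arithmetic, with `estimate_cost` as a hypothetical helper and the per-image bounds taken from the approximate figures above:

```python
def estimate_cost(n_images, low_per_image=0.005, high_per_image=0.01):
    """Return the (low, high) estimated cost in dollars for a batch."""
    return n_images * low_per_image, n_images * high_per_image


low, high = estimate_cost(50)
summary = f"${low:.2f}-${high:.2f}"
```

Actual spend depends on image size and region count, so this gives only a rough range.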

Fallback behaviour

If anything goes wrong during a Smart mode run (API error, network timeout, malformed response), Niobium automatically falls back to rule-based filtering for that image and continues processing. You always get your cards.
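The per-image fallback can be pictured as a try/except around the smart step. The sketch below is an assumption about the control flow, not Niobium's code: `filter_with_fallback`, `process_batch`, and the stub functions are hypothetical, and regions are simplified to plain strings.

```python
import logging

logger = logging.getLogger("niobium_sketch")


def filter_with_fallback(image, regions, smart_filter, rule_filter):
    """Try the smart filter; on any failure use the rule-based filter."""
    try:
        return smart_filter(image, regions)
    except Exception as exc:  # API error, timeout, malformed response, ...
        logger.warning("Smart filtering failed for %s (%s); using rules", image, exc)
        return rule_filter(regions)


def process_batch(images, ocr, smart_filter, rule_filter):
    """Process every image; one failure never stops the rest of the batch."""
    return {
        image: filter_with_fallback(image, ocr(image), smart_filter, rule_filter)
        for image in images
    }


# Stubs standing in for EasyOCR, the Claude call, and the rule-based filter.
def fake_ocr(image):
    return ["Figure 1", "Aorta", "Scale bar"]


def flaky_smart_filter(image, regions):
    if image == "b.png":
        raise TimeoutError("network timeout")
    return ["Aorta"]


def fake_rule_filter(regions):
    return [r for r in regions if not r.startswith("Figure")]


cards = process_batch(
    ["a.png", "b.png"], fake_ocr, flaky_smart_filter, fake_rule_filter
)
```

Here `a.png` gets the smart result, while `b.png` hits the simulated timeout and still produces regions via the rule-based stub.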

Caching

Claude responses are cached in ~/.config/niobium/cache.db so the same image is never sent to the API twice. The cache key is derived from the image content, the OCR text list, the model name, and the instructions string. Changing any of these causes a fresh API call. Use --no-cache to skip the cache for a single run. See Caching for details.
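A cache key built from those four inputs might look like the sketch below. The hashing scheme is an assumption (SHA-256 over a JSON payload); the real key derivation and the layout of cache.db are not specified here, and `cache_key` is a hypothetical name.

```python
import hashlib
import json


def cache_key(image_bytes, ocr_texts, model, instructions):
    """Derive a deterministic key from everything that affects the response."""
    payload = json.dumps(
        {
            "image": hashlib.sha256(image_bytes).hexdigest(),
            "ocr": list(ocr_texts),
            "model": model,
            "instructions": instructions,
        },
        sort_keys=True,
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


image = b"\x89PNG...example bytes"
texts = ["Aorta", "Left ventricle"]

key_a = cache_key(image, texts, "claude-sonnet-4-6", None)
key_b = cache_key(image, texts, "claude-sonnet-4-6", None)
key_c = cache_key(image, texts, "claude-sonnet-4-6", "I'm studying pharmacology.")
```

Identical inputs give identical keys (`key_a == key_b`), so the cached response is reused; changing the instructions yields a different key (`key_c`) and therefore a fresh API call.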