The --smart flag activates Claude Vision integration. Instead of relying solely on rule-based exclusion patterns, Claude analyzes each image semantically and decides what is worth studying.
How Smart Filtering works
Smart Filtering is an OCR-first pipeline. EasyOCR runs first to detect text regions and bounding boxes, then Claude decides which of those regions to keep or discard. The output is always image occlusion cards.
Image → EasyOCR → Claude Vision (keep/skip each region) → Image occlusion card
This means:
Claude only sees text regions that OCR already detected — if OCR misses something, it won’t appear in the card
The card type is always image occlusion (rectangular regions hidden on the original image)
Claude’s role is curation: correcting OCR errors, filtering noise, and adding study hints
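The curation step can be pictured as applying a list of keep/skip decisions to the regions OCR found. The following is a minimal sketch, not Niobium's actual code: the `Region` class, the `curate` function, and the shape of the decision dictionaries are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Region:
    text: str                        # text as read by OCR
    box: Tuple[int, int, int, int]   # (x, y, width, height) in pixels


def curate(regions: List[Region], decisions: List[Dict]) -> List[Region]:
    """Keep only regions marked 'keep', using corrected text when provided.

    Bounding boxes always come from OCR; the model never creates new regions.
    """
    kept = []
    for region, decision in zip(regions, decisions):
        if decision["action"] != "keep":
            continue
        kept.append(Region(decision.get("text", region.text), region.box))
    return kept


# Hypothetical OCR output and model decisions for one image.
regions = [
    Region("Glcmerulus", (40, 60, 120, 22)),
    Region("Fig. 1", (5, 5, 40, 12)),
]
decisions = [
    {"action": "keep", "text": "Glomerulus"},  # OCR error corrected
    {"action": "skip"},                        # figure label discarded
]
occlusions = curate(regions, decisions)
print(occlusions)
```

Because the loop only iterates over regions that OCR produced, anything OCR missed cannot end up on the card, which matches the limitation described above.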
This is different from Smart Generation, which is a generation-first pipeline where Claude sees the full content and creates cards from scratch (including cloze and basic card types). To use generation mode with images, add --generate (or -gen). For PDFs, use --page.
| | Smart Filtering | Smart Generation |
|---|---|---|
| Pipeline | OCR first, Claude filters | Claude generates from scratch |
| Input to Claude | Image + OCR text regions | Full image, page render, or extracted text |
| Card types | Image occlusion only | Image occlusion, cloze, and basic |
| Best for | Labeled diagrams and figures | Any content |
| How to enable | --smart | --smart --generate (images) or --smart --page (PDFs) |
What Claude does
For each image, Claude:
Decides what to occlude: key terms, anatomical labels, drug names, disease names, important numerical values
Decides what to skip: figure labels (A, B, Fig. 1), publisher info, copyright notices, page numbers, OCR noise
Corrects OCR errors: compares garbled OCR text against what it actually sees in the image (e.g., “Glcmerulus” → “Glomerulus”)
Generates study hints: clinical correlations, functional notes, alternative names, mnemonics — added to the Back Extra field
Describes the image: a one-line context description appears at the top of Back Extra
Pipeline comparison
Without --smart:
Image → EasyOCR → Rule-based filter (exclude.exact / exclude.regex) → Anki card
With --smart:
Image → EasyOCR → Claude Vision (semantic curation + OCR correction) → Anki card
OCR handles precise bounding box coordinates (its strength). Claude handles semantic understanding (its strength).
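For contrast, the rule-based path can be approximated in a few lines. This is a sketch under assumptions: the function name `rule_based_filter` is hypothetical, and the matching semantics (case-sensitive equality for exact entries, `re.search` for regex entries) are guesses rather than Niobium's documented behaviour.

```python
import re


def rule_based_filter(texts, exact=(), patterns=()):
    """Drop OCR strings that match an exact entry or any regex pattern."""
    exact_set = set(exact)
    compiled = [re.compile(p) for p in patterns]
    return [
        t for t in texts
        if t not in exact_set and not any(c.search(t) for c in compiled)
    ]


ocr_texts = ["Glomerulus", "Fig. 1", "A", "© 2020 Publisher"]
kept = rule_based_filter(
    ocr_texts,
    exact=["A"],                          # stands in for exclude.exact
    patterns=[r"^Fig\.\s*\d+$", r"©"],    # stands in for exclude.regex
)
print(kept)
```

Rules like these cannot tell a meaningful one-letter label from noise, which is the gap semantic curation is meant to close.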
Usage
Add --smart to any existing Niobium command:
# Single image
niobium -i /path/to/anatomy.png -apkg ./output --smart
# Directory of images
niobium -dir /path/to/slides -deck Pharmacology --smart
# PDF (processes extracted images)
niobium -pin /path/to/lecture.pdf -apkg ./output --smart
Without --smart, Niobium works exactly as before: pure OCR with rule-based filtering.
API key setup
An Anthropic API key is required. Provide it in one of two ways:
Environment variable (recommended):
export ANTHROPIC_API_KEY=sk-ant-...
Config file:
llm:
  api_key: sk-ant-...
The config file value takes priority over the environment variable. If no key is found, Niobium falls back to rule-based filtering with a warning.
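The lookup order described here (config file first, then environment variable, then no key) can be sketched as follows. The function name `resolve_api_key` and the dictionary representation of the parsed config are assumptions for illustration only.

```python
import os


def resolve_api_key(config, env=None):
    """Return the API key, preferring the config file over the environment.

    Returns None when neither source provides a key, in which case the
    caller would fall back to rule-based filtering.
    """
    env = os.environ if env is None else env
    config_key = (config.get("llm") or {}).get("api_key")
    return config_key or env.get("ANTHROPIC_API_KEY")


# Hypothetical parsed config and environment values.
from_config = resolve_api_key({"llm": {"api_key": "cfg-key"}}, {"ANTHROPIC_API_KEY": "env-key"})
from_env = resolve_api_key({"llm": {"api_key": None}}, {"ANTHROPIC_API_KEY": "env-key"})
missing = resolve_api_key({}, {})
print(from_config, from_env, missing)
```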
Configuration
The llm section in your config file controls smart mode:
llm:
  api_key: null
  model: claude-sonnet-4-6
  max_tokens: 1024
  temperature: 0.2
  instructions: null
| Key | Default | Description |
|---|---|---|
| api_key | null | Anthropic API key |
| model | "claude-sonnet-4-6" | Claude model identifier |
| max_tokens | 1024 | Maximum response tokens |
| temperature | 0.2 | Response variability (lower = more consistent) |
| instructions | null | Custom prompt addition (see below) |
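One way to picture how these defaults interact with a user's config is a simple dictionary merge, where any key the user sets overrides the default. This is only a sketch: `DEFAULTS` mirrors the table above, while `resolve_llm_settings` is a hypothetical helper, not a Niobium function.

```python
# Defaults copied from the table above.
DEFAULTS = {
    "api_key": None,
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "temperature": 0.2,
    "instructions": None,
}


def resolve_llm_settings(user_config):
    """Overlay the user's llm section on top of the defaults."""
    overrides = (user_config or {}).get("llm") or {}
    return {**DEFAULTS, **overrides}


# A config that only changes the temperature keeps every other default.
settings = resolve_llm_settings({"llm": {"temperature": 0.0}})
print(settings)
```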
Custom instructions
The instructions field is the most powerful configuration option. It appends text to the built-in system prompt, letting you steer Claude’s decisions for a specific study context.
Pharmacology:
instructions: >-
  I'm studying pharmacology. Prioritize drug names, drug classes, mechanisms
  of action, receptor types, and side effects. Add hints about drug
  interactions and clinical indications.
Histology:
instructions: >-
  These are histology slides. Occlude tissue types, cell types, staining
  characteristics, and structural features. Add hints about how to
  distinguish similar-looking tissues.
Pathology:
instructions: >-
  Focus on pathological findings. Occlude disease names, morphological
  descriptions, and diagnostic features. Add hints about epidemiology
  and clinical presentation.
USMLE Step 1:
instructions: >-
  I'm preparing for USMLE Step 1. Add high-yield clinical correlations
  and First Aid-style memory aids in the hints.
Text-heavy slides:
instructions: >-
  These images contain mostly text paragraphs. Occlude only the most
  important medical terms, numerical values, and key facts. Skip filler
  words and context sentences.
Set instructions to null (or remove it) to use the default general-purpose behaviour.
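Since the instructions text is appended to the built-in system prompt, the composition can be sketched as below. The prompt text in `BASE_PROMPT` is a placeholder, not Niobium's real prompt, and `build_system_prompt` is a hypothetical name; the separator between the two parts is also an assumption.

```python
# Placeholder standing in for the built-in system prompt.
BASE_PROMPT = "You curate OCR text regions for image occlusion flashcards."


def build_system_prompt(base, instructions):
    """Append custom instructions to the base prompt; null or blank leaves it unchanged."""
    if not instructions or not instructions.strip():
        return base
    return f"{base}\n\n{instructions.strip()}"


default_prompt = build_system_prompt(BASE_PROMPT, None)
custom_prompt = build_system_prompt(BASE_PROMPT, "These are histology slides.")
print(custom_prompt)
```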
Cost
Claude Sonnet processes each image for approximately $0.01, depending on image size and the number of text regions. A batch of 50 images costs approximately $0.50.
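As a rough planning aid, the batch estimate is just the per-image figure multiplied by the number of images. The sketch below assumes the figures are in US dollars and uses the approximate per-image cost quoted above; `estimate_cost` is a hypothetical helper, and real costs vary with image size and region count.

```python
def estimate_cost(num_images, cents_per_image=1):
    """Estimate batch cost in dollars from an approximate per-image cost in cents."""
    return num_images * cents_per_image / 100


batch_cost = estimate_cost(50)
print(f"Estimated cost for 50 images: {batch_cost:.2f}")
```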
Fallback behaviour
If anything goes wrong during a Smart mode run (API error, network timeout, malformed response), Niobium automatically falls back to rule-based filtering for that image and continues processing. You always get your cards.
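The per-image fallback amounts to wrapping the smart path in an error handler and continuing with the rule-based path on failure. This is a minimal sketch with hypothetical names (`process_image`, `flaky_smart`, `rules`); the filter functions are stand-ins that return strings rather than real cards.

```python
def process_image(image, smart_filter, rule_filter, warn=print):
    """Try the smart path; on any error, warn and use the rule-based path."""
    try:
        return smart_filter(image)
    except Exception as exc:
        warn(f"Smart mode failed for {image}: {exc}; using rule-based filtering")
        return rule_filter(image)


def flaky_smart(image):
    # Stand-in that simulates a network timeout for one image.
    if image == "b.png":
        raise TimeoutError("network timeout")
    return f"smart:{image}"


def rules(image):
    # Stand-in for the rule-based filter.
    return f"rules:{image}"


warnings = []
results = [
    process_image(name, flaky_smart, rules, warn=warnings.append)
    for name in ["a.png", "b.png", "c.png"]
]
print(results)
```

The failure affects only the image that triggered it; the remaining images in the batch still go through the smart path.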
Caching
Claude responses are cached in ~/.config/niobium/cache.db so the same image is never sent to the API twice. The cache key is derived from the image content, the OCR text list, the model name, and the instructions string. Changing any of these causes a fresh API call. Use --no-cache to skip the cache for a single run. See Caching for details.
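A cache key with these properties can be built by hashing the four inputs together. The sketch below is an assumption about how such a key might be derived, not Niobium's actual scheme: the function name `cache_key`, the choice of SHA-256, and the JSON serialization are all illustrative.

```python
import hashlib
import json


def cache_key(image_bytes, ocr_texts, model, instructions):
    """Derive a deterministic key from image content, OCR texts, model, and instructions."""
    digest = hashlib.sha256()
    # Hash the image separately so its length cannot blur into the metadata.
    digest.update(hashlib.sha256(image_bytes).digest())
    metadata = json.dumps(
        {"ocr": list(ocr_texts), "model": model, "instructions": instructions},
        sort_keys=True,
        ensure_ascii=False,
    )
    digest.update(metadata.encode("utf-8"))
    return digest.hexdigest()


# Hypothetical inputs: identical inputs reuse a key, any change yields a new one.
key_a = cache_key(b"image-bytes", ["Glomerulus"], "claude-sonnet-4-6", None)
key_b = cache_key(b"image-bytes", ["Glomerulus"], "claude-sonnet-4-6", None)
key_c = cache_key(b"image-bytes", ["Glomerulus"], "claude-sonnet-4-6", "Histology slides.")
print(key_a == key_b, key_a == key_c)
```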