Image Pipeline
The image pipeline finds relevant, freely licensed images for generated articles. It searches five image APIs in parallel, then uses a two-phase AI scoring system (a text pre-filter followed by vision-model scoring) to pick the best match.
Flow
Article Title + TLDR
         |
         v
+-------------------+
|  Parallel Search  |  5 APIs queried concurrently
|  (per query term) |  Max 3 results per API
+--------+----------+
         |
         v
+-------------------+
|  Deduplicate      |  Remove duplicate URLs
|  + Diversify      |  Max 3 per source, cap at 15
+--------+----------+
         |
         v
+-------------------+
|  Phase 1: Text    |  Score descriptions/tags
|  Pre-Filter       |  using text model
+--------+----------+
         |
         v
+-------------------+
|  Phase 2: Vision  |  Send actual images to
|  Model Scoring    |  Llama Vision for scoring
+--------+----------+
         |
         v
Top-scored ImageResult
(URL, Thumbnail, Attribution)
Image Sources
All sources provide freely licensed images. The searcher queries each API in parallel for every search term.
| Source | API Key Required | License | Notes |
|---|---|---|---|
| Unsplash | Yes (free) | Unsplash License | High-quality photos, landscape orientation |
| Pexels | Yes (free) | Pexels License | Good variety, landscape orientation |
| Pixabay | Yes (free) | Pixabay License | Large library, horizontal photos |
| Openverse | No | CC-licensed | Aggregates Flickr, Wikimedia, and other open sources |
| Wikimedia Commons | No | Various CC/PD | Uses MediaWiki API with thumbnail generation |
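
The fan-out described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual code: the `Candidate` type, `make_stub_searcher`, `SEARCHERS`, and `search_all` are hypothetical names, the searchers are stubs that return placeholder URLs instead of calling the real APIs, and skipping a source that raises is an assumption about error handling.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

MAX_RESULTS_PER_API = 3  # per-API limit shown in the flow diagram


@dataclass(frozen=True)
class Candidate:
    """Hypothetical shape of one search hit."""
    url: str
    source: str
    description: str
    query: str


def make_stub_searcher(source):
    """Return a fake searcher; a real one would call the source's HTTP API."""
    def search(query, limit):
        return [
            Candidate(
                url=f"https://example.invalid/{source}/{query}/{i}",
                source=source,
                description=f"{query} photo {i}",
                query=query,
            )
            for i in range(limit)
        ]
    return search


# One searcher per source listed in the table above (stubbed here).
SEARCHERS = {
    name: make_stub_searcher(name)
    for name in ("unsplash", "pexels", "pixabay", "openverse", "wikimedia")
}


def search_all(queries):
    """Run every (source, query) pair concurrently and flatten the results."""
    jobs = [(fn, q) for q in queries for fn in SEARCHERS.values()]
    candidates = []
    with ThreadPoolExecutor(max_workers=len(jobs) or 1) as pool:
        futures = [pool.submit(fn, q, MAX_RESULTS_PER_API) for fn, q in jobs]
        for future in futures:
            try:
                candidates.extend(future.result())
            except Exception:
                # Assumption: one failing source should not sink the search.
                continue
    return candidates
```

With two query terms, this yields up to 5 x 3 x 2 = 30 raw candidates before deduplication.
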
Two-Phase Scoring
Phase 1: Text Pre-Filter
The text model (same as the critic model) scores image candidates based on their metadata:
- Image description or alt text
- Tags from the source API
- Source attribution
- The search query that found it
Each candidate receives a relevance score from 1 to 10. Candidates are sorted by score in descending order, and the top 5 proceed to Phase 2.
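
The sort-and-cut step might look like the sketch below. The names `clamp_score` and `prefilter` are hypothetical, `score_fn` stands in for the call to the text model, and clamping unusable model replies to the minimum score is an assumption rather than documented behavior.

```python
TOP_N_FOR_VISION = 5  # number of candidates forwarded to Phase 2


def clamp_score(raw):
    """Coerce a model reply to an int in 1..10; unusable replies get 1."""
    try:
        value = int(round(float(raw)))
    except (TypeError, ValueError, OverflowError):
        return 1
    return max(1, min(10, value))


def prefilter(candidates, score_fn, top_n=TOP_N_FOR_VISION):
    """Score each candidate, sort by score descending, keep the top_n.

    Returns a list of (score, candidate) pairs. The sort is stable, so
    candidates with equal scores keep their original search order.
    """
    scored = [(clamp_score(score_fn(c)), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]
```
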
Phase 2: Vision Model
The vision model (meta/llama-3.2-90b-vision-instruct) receives the actual image thumbnail along with the article title and summary. It scores each image on specificity:
| Score | Meaning |
|---|---|
| 9-10 | Shows the exact brand, product, or place mentioned |
| 7-8 | Closely related (correct type but not the specific one) |
| 5-6 | Generic but relevant category |
| 1-4 | Wrong or irrelevant |
If all vision API calls fail, the pipeline falls back to the text-phase scores with no penalty.
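
A minimal sketch of the fallback, assuming Phase 1 hands over `(text_score, candidate)` pairs. `pick_best`, `score_band`, and `vision_score_fn` are hypothetical names, and dropping only the candidates whose individual vision call failed (when others succeed) is an assumption; the text above only specifies the case where every call fails.

```python
def score_band(score):
    """Map a 1..10 vision score to the meaning given in the table above."""
    if score >= 9:
        return "exact match"
    if score >= 7:
        return "closely related"
    if score >= 5:
        return "generic but relevant"
    return "wrong or irrelevant"


def pick_best(text_scored, vision_score_fn):
    """Return (score, candidate, phase), or None if there are no candidates."""
    if not text_scored:
        return None
    vision_scored = []
    for _text_score, candidate in text_scored:
        try:
            vision_scored.append((vision_score_fn(candidate), candidate))
        except Exception:
            continue  # this call failed; others may still succeed
    if vision_scored:
        score, candidate = max(vision_scored, key=lambda pair: pair[0])
        return score, candidate, "vision"
    # Every vision call failed: reuse the text scores unchanged (no penalty).
    score, candidate = max(text_scored, key=lambda pair: pair[0])
    return score, candidate, "text"
```

Returning the phase alongside the score lets a caller log which scoring path produced the selected image.
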
Selection Process
- Search all 5 APIs in parallel with the article’s query terms
- Deduplicate by URL
- Enforce source diversity: max 3 images per source, cap total at 15
- Phase 1 text scoring narrows to top 5 candidates
- Phase 2 vision scoring picks the best match
- The highest-scored candidate becomes the article’s image
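
The deduplication and diversity steps in the list above could be implemented along these lines. This is a sketch under assumptions: candidates are plain dicts with `url` and `source` keys, the function name is hypothetical, and first-seen order decides which images survive the per-source limit.

```python
MAX_PER_SOURCE = 3  # source-diversity limit
MAX_TOTAL = 15      # overall cap before scoring


def dedupe_and_diversify(candidates):
    """Drop repeated URLs, then keep at most 3 per source and 15 overall."""
    seen_urls = set()
    per_source = {}
    kept = []
    for candidate in candidates:
        url, source = candidate["url"], candidate["source"]
        if url in seen_urls:
            continue  # same image returned for another query or by another API
        seen_urls.add(url)
        if per_source.get(source, 0) >= MAX_PER_SOURCE:
            continue  # this source already has its share
        per_source[source] = per_source.get(source, 0) + 1
        kept.append(candidate)
        if len(kept) == MAX_TOTAL:
            break
    return kept
```

With five sources and three images each, the per-source limit already implies at most 15 candidates, so the total cap acts as a safeguard if more sources are added.
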
Stored Fields
The selected image is stored on the generated_articles record:
| Field | Description |
|---|---|
| image_url | Full-size image URL for the article page |
| image_thumbnail | Smaller image URL for homepage cards |
| image_attribution | Credit line (e.g. “Photo by X on Unsplash”) |
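
As an illustration of how the selected result maps onto these columns, the sketch below assumes an `ImageResult` with `url`, `thumbnail`, and `attribution` attributes (mirroring the flow diagram). The `to_article_fields` helper is hypothetical, and how the values are actually written to the generated_articles record is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageResult:
    """Assumed shape of the pipeline's output."""
    url: str
    thumbnail: str
    attribution: str


def to_article_fields(result):
    """Map an ImageResult to the three stored column values."""
    return {
        "image_url": result.url,
        "image_thumbnail": result.thumbnail,
        "image_attribution": result.attribution,
    }
```
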