Image Pipeline

The image pipeline finds relevant, freely licensed images for generated articles. It queries five image APIs in parallel, then uses a two-phase AI scoring system to pick the best match.

Flow

  Article Title + TLDR
          |
          v
  +-------------------+
  |  Parallel Search  |  5 APIs queried concurrently
  |  (per query term) |  Max 3 results per API
  +--------+----------+
           |
           v
  +-------------------+
  |   Deduplicate     |  Remove duplicate URLs
  |   + Diversify     |  Max 3 per source, cap at 15
  +--------+----------+
           |
           v
  +-------------------+
  |  Phase 1: Text    |  Score descriptions/tags
  |  Pre-Filter       |  using text model
  +--------+----------+
           |
           v
  +-------------------+
  |  Phase 2: Vision  |  Send actual images to
  |  Model Scoring    |  Llama Vision for scoring
  +--------+----------+
           |
           v
  Top-scored ImageResult
  (URL, Thumbnail, Attribution)

Image Sources

All sources provide freely licensed images. The searcher queries each API in parallel for every search term.

Source	API Key Required	License	Notes
Unsplash	Yes (free)	Unsplash License	High-quality photos, landscape orientation
Pexels	Yes (free)	Pexels License	Good variety, landscape orientation
Pixabay	Yes (free)	Pixabay License	Large library, horizontal photos
Openverse	No	CC-licensed	Aggregates Flickr, Wikimedia, and other open sources
Wikimedia Commons	No	Various CC/PD	Uses MediaWiki API with thumbnail generation
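
The concurrent search can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual code: `Candidate`, `search_all`, and the shape of the per-source search functions are hypothetical names, and skipping a source that raises an error is an assumption rather than documented behavior.

```python
import asyncio
from dataclasses import dataclass

MAX_RESULTS_PER_API = 3  # "Max 3 results per API" from the flow diagram


@dataclass
class Candidate:
    # Hypothetical container for one search hit.
    url: str
    source: str
    description: str = ""


async def search_all(sources, term):
    """Query every source concurrently for a single search term.

    `sources` maps a source name to an async function that takes the term
    and returns a list of Candidate objects (an assumed interface).
    """
    tasks = [search(term) for search in sources.values()]
    # return_exceptions=True keeps one failing API from cancelling the rest.
    outcomes = await asyncio.gather(*tasks, return_exceptions=True)

    candidates = []
    for outcome in outcomes:
        if isinstance(outcome, BaseException):
            continue  # assumption: a failed source simply contributes nothing
        candidates.extend(outcome[:MAX_RESULTS_PER_API])
    return candidates
```

Calling `asyncio.run(search_all(sources, "solar panels"))` once per query term and concatenating the results would feed the deduplication step.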

Two-Phase Scoring

Phase 1: Text Pre-Filter

The text model (same as the critic model) scores image candidates based on their metadata:

Image description or alt text
Tags from the source API
Source attribution
The search query that found it

Each candidate receives a relevance score from 1 to 10. Candidates are sorted by score in descending order, and the top 5 proceed to Phase 2.
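
The sort-and-truncate step might look like the sketch below. It assumes the text model's scores have already been collected as `(candidate, score)` pairs; `top_candidates` is a hypothetical helper name, and the tie-breaking rule (earlier candidates win) is simply a consequence of Python's stable sort, not something the pipeline specifies.

```python
def top_candidates(scored, keep=5):
    """Return the `keep` highest-scoring (candidate, score) pairs.

    `scored` is a list of (candidate, score) pairs with scores in 1..10.
    sorted() is stable, so candidates with equal scores keep their
    original relative order.
    """
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return ranked[:keep]
```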

Phase 2: Vision Model

The vision model (meta/llama-3.2-90b-vision-instruct) receives the actual image thumbnail along with the article title and summary. It scores each image on specificity:

Score	Meaning
9-10	Shows the exact brand, product, or place mentioned
7-8	Closely related (correct type but not the specific one)
5-6	Generic but relevant category
1-4	Wrong or irrelevant

If all vision API calls fail, the pipeline falls back to the text-phase scores with no penalty.
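
A rough sketch of the fallback rule and the score bands follows. The function names and the dict-based inputs are hypothetical. How the pipeline treats a partial failure (some vision calls succeed, others fail) is not stated here; this sketch assumes it ranks only the images that received a vision score.

```python
def pick_best(text_scores, vision_scores):
    """Pick the winning image URL.

    text_scores:   dict of url -> Phase 1 score
    vision_scores: dict of url -> Phase 2 score, or None if that call failed
    """
    usable = {url: s for url, s in vision_scores.items() if s is not None}
    if not usable:
        # Every vision call failed: fall back to text scores, unpenalized.
        usable = text_scores
    if not usable:
        return None  # no candidates at all
    return max(usable, key=usable.get)


def band(score):
    """Map a 1-10 vision score to the band in the table above."""
    if score >= 9:
        return "exact match"
    if score >= 7:
        return "closely related"
    if score >= 5:
        return "generic but relevant"
    return "wrong or irrelevant"
```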

Selection Process

1. Search all 5 APIs in parallel with the article’s query terms
2. Deduplicate by URL
3. Enforce source diversity: max 3 images per source, cap total at 15
4. Phase 1 text scoring narrows to top 5 candidates
5. Phase 2 vision scoring picks the best match
6. The highest-scored candidate becomes the article’s image
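
The deduplication and diversity limits can be illustrated with a single pass over the search results. This is a sketch under assumptions: candidates are represented as dicts with `url` and `source` keys, and `dedupe_and_diversify` is a hypothetical name. Keeping the first occurrence of each URL is also an assumption.

```python
def dedupe_and_diversify(candidates, per_source=3, total=15):
    """Drop duplicate URLs, then cap results per source and overall."""
    seen_urls = set()
    source_counts = {}
    kept = []
    for cand in candidates:
        url, source = cand["url"], cand["source"]
        if url in seen_urls:
            continue  # duplicate URL: keep only the first occurrence
        seen_urls.add(url)
        if source_counts.get(source, 0) >= per_source:
            continue  # this source already contributed its maximum
        source_counts[source] = source_counts.get(source, 0) + 1
        kept.append(cand)
        if len(kept) == total:
            break  # overall cap reached
    return kept
```

With five sources and at most 3 images each, the per-source limit already bounds the list at 15, so the overall cap only matters if more sources are added.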

Stored Fields

The selected image is stored on the `generated_articles` record:

Field	Description
`image_url`	Full-size image URL for the article page
`image_thumbnail`	Smaller image URL for homepage cards
`image_attribution`	Credit line (e.g. “Photo by X on Unsplash”)
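
As an illustration of how the selected result could map onto these columns, here is a small sketch. The `ImageResult` attribute names and the `to_record_fields` helper are assumptions based on the flow diagram's "(URL, Thumbnail, Attribution)" label; only the three column names come from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageResult:
    # Attribute names are assumed; the diagram lists URL, Thumbnail, Attribution.
    url: str
    thumbnail: str
    attribution: str

    def to_record_fields(self):
        """Return the values keyed by the generated_articles column names."""
        return {
            "image_url": self.url,
            "image_thumbnail": self.thumbnail,
            "image_attribution": self.attribution,
        }
```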
