Image Pipeline

The image pipeline finds relevant, freely licensed images for generated articles. It searches five image APIs in parallel, then uses a two-phase AI scoring system (a text pre-filter followed by vision scoring) to pick the best match.

Flow

  Article Title + TLDR
          |
          v
  +-------------------+
  |  Parallel Search   |  5 APIs queried concurrently
  |  (per query term)  |  Max 3 results per API
  +--------+----------+
           |
           v
  +-------------------+
  |   Deduplicate     |  Remove duplicate URLs
  |   + Diversify     |  Max 3 per source, cap at 15
  +--------+----------+
           |
           v
  +-------------------+
  |  Phase 1: Text    |  Score descriptions/tags
  |  Pre-Filter       |  using text model
  +--------+----------+
           |
           v
  +-------------------+
  |  Phase 2: Vision  |  Send actual images to
  |  Model Scoring    |  Llama Vision for scoring
  +--------+----------+
           |
           v
  Top-scored ImageResult
  (URL, Thumbnail, Attribution)

Image Sources

All sources provide freely licensed images. The searcher queries every API in parallel for each search term.

| Source | API Key Required | License | Notes |
| --- | --- | --- | --- |
| Unsplash | Yes (free) | Unsplash License | High-quality photos, landscape orientation |
| Pexels | Yes (free) | Pexels License | Good variety, landscape orientation |
| Pixabay | Yes (free) | Pixabay License | Large library, horizontal photos |
| Openverse | No | CC-licensed | Aggregates Flickr, Wikimedia, and other open sources |
| Wikimedia Commons | No | Various CC/PD | Uses MediaWiki API with thumbnail generation |
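
The parallel fan-out described above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the `Searcher` signature, the `search_all` name, and the choice to skip a failing source are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

MAX_RESULTS_PER_API = 3  # per-API limit taken from the flow diagram

# Hypothetical searcher signature: (query, limit) -> list of candidate dicts.
Searcher = Callable[[str, int], list[dict]]


def search_all(searchers: dict[str, Searcher], queries: list[str]) -> list[dict]:
    """Run every (source, query) pair concurrently and flatten the results."""

    def run(source: str, query: str) -> list[dict]:
        try:
            results = searchers[source](query, MAX_RESULTS_PER_API)
        except Exception:
            # Assumption: a failing source is skipped instead of aborting the search.
            return []
        # Tag each candidate with where it came from and which query found it.
        return [
            {**r, "source": source, "query": query}
            for r in results[:MAX_RESULTS_PER_API]
        ]

    jobs = [(source, query) for query in queries for source in searchers]
    if not jobs:
        return []
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        batches = pool.map(lambda job: run(*job), jobs)
        return [candidate for batch in batches for candidate in batch]
```

Threads fit this kind of work because each call spends most of its time waiting on the network; an async HTTP client would be an equally reasonable choice.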

Two-Phase Scoring

Phase 1: Text Pre-Filter

The text model (same as the critic model) scores image candidates based on their metadata:

  • Image description or alt text
  • Tags from the source API
  • Source attribution
  • The search query that found it

Each candidate receives a relevance score from 1 to 10. Candidates are sorted by score in descending order, and the top 5 proceed to Phase 2.
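
A minimal sketch of the sort-and-cut step, assuming the model call is wrapped in a hypothetical `score_text` callable that returns a number for one candidate. Clamping the score into the 1 to 10 range is an added safeguard for the example, not something the pipeline is known to do.

```python
from typing import Callable

TOP_N = 5  # number of candidates passed on to Phase 2


def prefilter(
    candidates: list[dict],
    score_text: Callable[[dict], float],  # hypothetical wrapper around the text model
    top_n: int = TOP_N,
) -> list[dict]:
    """Score each candidate's metadata and keep the highest-scoring top_n."""
    scored = []
    for candidate in candidates:
        # Assumption: out-of-range model output is clamped to the 1-10 scale.
        score = min(10.0, max(1.0, float(score_text(candidate))))
        scored.append({**candidate, "text_score": score})
    # Highest relevance first; sort is stable, so ties keep their input order.
    scored.sort(key=lambda c: c["text_score"], reverse=True)
    return scored[:top_n]
```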

Phase 2: Vision Model

The vision model (meta/llama-3.2-90b-vision-instruct) receives the actual image thumbnail along with the article title and summary. It scores each image on specificity:

| Score | Meaning |
| --- | --- |
| 9-10 | Shows the exact brand, product, or place mentioned |
| 7-8 | Closely related (correct type but not the specific one) |
| 5-6 | Generic but relevant category |
| 1-4 | Wrong or irrelevant |

If all vision API calls fail, the pipeline falls back to the text-phase scores and uses them unchanged, with no penalty applied.
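
The fallback can be sketched like this. `score_vision` is a hypothetical wrapper around the vision model call, and candidates are assumed to carry a `text_score` from Phase 1. The text above only specifies what happens when every call fails; dropping individual candidates whose call fails is an assumption of this sketch.

```python
from typing import Callable, Optional


def pick_best(
    candidates: list[dict],
    score_vision: Callable[[dict], float],  # hypothetical vision-model wrapper
) -> Optional[dict]:
    """Return the top candidate by vision score, or by text score if vision fails."""
    vision_scored = []
    for candidate in candidates:
        try:
            vision_scored.append({**candidate, "final_score": float(score_vision(candidate))})
        except Exception:
            # Assumption: a single failed call only removes that candidate.
            continue
    if not vision_scored:
        # Every vision call failed: reuse the Phase 1 scores unchanged.
        vision_scored = [{**c, "final_score": c["text_score"]} for c in candidates]
    return max(vision_scored, key=lambda c: c["final_score"], default=None)
```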

Selection Process

  1. Search all 5 APIs in parallel with the article’s query terms
  2. Deduplicate by URL
  3. Enforce source diversity: max 3 images per source, cap total at 15
  4. Phase 1 text scoring narrows to top 5 candidates
  5. Phase 2 vision scoring rates each of those candidates from its actual thumbnail
  6. The highest-scored candidate becomes the article’s image
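
Steps 2 and 3 can be illustrated with a single pass over the search results. This is a sketch under assumptions: candidates are dicts with `url` and `source` keys, and the first-seen candidates are the ones kept when a limit is reached. The real pipeline may order or prioritise them differently.

```python
from collections import Counter

MAX_PER_SOURCE = 3   # diversity limit from step 3
MAX_TOTAL = 15       # overall cap from step 3


def dedupe_and_diversify(candidates: list[dict]) -> list[dict]:
    """Drop repeated URLs, then enforce the per-source and total limits."""
    seen_urls: set[str] = set()
    per_source: Counter = Counter()
    kept: list[dict] = []
    for candidate in candidates:
        if candidate["url"] in seen_urls:
            continue  # duplicate URL
        seen_urls.add(candidate["url"])
        if per_source[candidate["source"]] >= MAX_PER_SOURCE:
            continue  # this source already has its share
        per_source[candidate["source"]] += 1
        kept.append(candidate)
        if len(kept) == MAX_TOTAL:
            break
    return kept
```

With five sources and three images each, the per-source limit already yields at most 15 candidates, so the total cap acts mainly as a guard if more sources are added.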

Stored Fields

The selected image is stored on the generated_articles record:

| Field | Description |
| --- | --- |
| image_url | Full-size image URL for the article page |
| image_thumbnail | Smaller image URL for homepage cards |
| image_attribution | Credit line (e.g. “Photo by X on Unsplash”) |
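
As a rough illustration of how the selected result might map onto these columns, the sketch below models `ImageResult` with the three fields named in the flow diagram. The Python attribute names and the `to_article_fields` helper are hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageResult:
    # Field set follows the flow diagram: URL, Thumbnail, Attribution.
    url: str
    thumbnail: str
    attribution: str


def to_article_fields(image: ImageResult) -> dict[str, str]:
    """Map an ImageResult onto the generated_articles column names."""
    return {
        "image_url": image.url,
        "image_thumbnail": image.thumbnail,
        "image_attribution": image.attribution,
    }
```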