Jun 29, 2026 · 5 min read · Dev Soufiane

Setting Up RecipeScrape Locally: A Contributor's Guide

How to clone, configure, and run both the scraper and frontend on your machine — plus how to add a new recipe source site.

RecipeScrape is open source and contributions are welcome — especially new source sites, better normalizers, and frontend improvements. Here's how to get everything running on your machine.

The Two Repos

Clone both projects side by side:

bash

git clone https://github.com/your-org/recipescrape-scraper
git clone https://github.com/your-org/recipescrape-web

Scraper Setup (Python + uv)

The scraper uses Python 3.12 and uv (the fast Rust-based package manager):

bash

cd recipescrape-scraper
uv sync
cp .env.example .env  # add your DATABASE_URL
uv run python -m scrape_engine --sources allrecipes --limit 5

The --limit 5 flag restricts the run to 5 recipes — useful for testing a new spider without scraping thousands of rows. Omit it for a full run.

Frontend Setup (Next.js + Bun)

bash

cd recipescrape-web
bun install
cp .env.example .env.local  # add DATABASE_URL (same DB or a local copy)
bun run dev

The frontend starts on http://localhost:3000, and the API is available at http://localhost:3000/api/v1/. Endpoints behave as they do in production, including pagination, full-text search, and filters.
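To poke at the local API from a script, a minimal standard-library sketch might look like the following. The `recipes` path and the `q` and `page` parameter names are assumptions for illustration only; check the routes in recipescrape-web for the real ones.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "http://localhost:3000/api/v1/"  # local dev server started by `bun run dev`


def build_url(path: str, **params: object) -> str:
    """Join a path and optional query parameters onto the local API base URL."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = API_BASE + path.lstrip("/")
    return f"{url}?{query}" if query else url


def fetch_json(url: str, timeout: float = 10.0) -> object:
    """GET a URL and decode the JSON response body."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Hypothetical endpoint and parameter names; adjust to the real routes.
SEARCH_URL = build_url("recipes", q="lentil soup", page=1)
```

With the dev server running, `fetch_json(SEARCH_URL)` would return the decoded response, assuming such a route exists.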

Adding a New Source Site

The most common contribution is a new recipe source. Create a spider file in spiders/:

python

# spiders/my_new_site.py
from urllib.parse import urljoin

from .base import BaseScraper


class MyNewSiteScraper(BaseScraper):
    name = "mynewsite"
    base_url = "https://www.mynewsite.com"

    def get_recipe_urls(self) -> list[str]:
        # Crawl sitemap or category pages and return individual recipe URLs.
        soup = self.fetch_soup(f"{self.base_url}/recipes/")
        links = soup.select("article a[href*='/recipe/']")
        # urljoin makes relative hrefs absolute; dict.fromkeys drops
        # duplicates while keeping the original order.
        return list(dict.fromkeys(urljoin(self.base_url, a["href"]) for a in links))

That's the whole spider. The BaseScraper parent class handles rate limiting, parsing via recipe-scrapers, normalization, and database upsert. To finish the contribution, add a test fixture (saved HTML) and a YAML source profile.

Running Tests

bash

cd recipescrape-scraper
uv run pytest tests/ -v

Tests use saved HTML fixtures so they're fast and don't hit the network. A passing test suite means your spider extracts the correct number of recipes and the normalizer produces valid output.
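A fixture-based test for a new spider could look roughly like the sketch below. It is self-contained for illustration: `FIXTURE_HTML` stands in for a saved HTML file, and `extract_recipe_urls` is a hypothetical, simplified stand-in for the spider's link extraction, written with the standard library instead of the project's `fetch_soup` helper. The actual test layout in the repo may differ.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Stand-in for a saved listing page; a real test would load the saved file instead.
FIXTURE_HTML = """
<article><a href="/recipe/lentil-soup">Lentil soup</a></article>
<article><a href="/recipe/shakshuka">Shakshuka</a></article>
<article><a href="/about">About</a></article>
"""


class _LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in the document."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_recipe_urls(html: str, base_url: str) -> list[str]:
    """Hypothetical helper: absolute URLs of links whose href contains /recipe/."""
    collector = _LinkCollector()
    collector.feed(html)
    return [urljoin(base_url, h) for h in collector.hrefs if "/recipe/" in h]


def test_extracts_recipe_urls_from_fixture():
    urls = extract_recipe_urls(FIXTURE_HTML, "https://www.mynewsite.com")
    # The /about link is filtered out; only the two recipe links remain.
    assert urls == [
        "https://www.mynewsite.com/recipe/lentil-soup",
        "https://www.mynewsite.com/recipe/shakshuka",
    ]
```

Because the HTML is embedded (or read from disk), the test never touches the network, which is what keeps the suite fast.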

What Makes a Good Contribution?

New source sites covering cuisines or regions not yet represented
Better normalizers that handle edge cases (fractional quantities like "1/2 cup", alternative nutrition schemas)
Frontend improvements — accessibility, performance, dark mode, mobile layout
API features — GraphQL wrapper, ingredient substitution endpoint, batch detail lookup
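As an illustration of the normalizer edge case mentioned in the list above, here is a minimal sketch of parsing fractional quantities such as "1/2 cup". The `parse_quantity` function is hypothetical and not part of the project's normalizer; it only shows one way the standard-library `fractions` module could handle simple fractions and mixed numbers.

```python
from fractions import Fraction


def parse_quantity(text: str) -> tuple[Fraction, str]:
    """Split an ingredient amount like "1 1/2 cups" into (quantity, rest).

    Leading numeric tokens are summed, which covers whole numbers ("2"),
    simple fractions ("1/2"), decimals ("0.5"), and mixed numbers ("1 1/2").
    Raises ValueError when the text does not start with a number.
    """
    tokens = text.split()
    quantity = Fraction(0)
    consumed = 0
    for token in tokens:
        try:
            quantity += Fraction(token)
        except (ValueError, ZeroDivisionError):
            break  # first non-numeric token starts the unit/ingredient part
        consumed += 1
    if consumed == 0:
        raise ValueError(f"no leading quantity in {text!r}")
    return quantity, " ".join(tokens[consumed:])
```

Keeping the value as a `Fraction` avoids floating-point rounding until the amount is actually stored or displayed. Unicode fractions such as "½" and ranges such as "1-2" are not handled here.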

Check the repo issues for the current priorities, or open a new one to discuss your idea before building.