Setting Up RecipeScrape Locally: A Contributor's Guide
How to clone, configure, and run both the scraper and frontend on your machine — plus how to add a new recipe source site.
RecipeScrape is open source, and contributions are welcome, especially new source sites, better normalizers, and frontend improvements. This guide walks through getting both projects running locally.
The Two Repos
Clone both projects side by side:
git clone https://github.com/your-org/recipescrape-scraper
git clone https://github.com/your-org/recipescrape-web
Scraper Setup (Python + uv)
The scraper uses Python 3.12 and uv (the fast Rust-based package manager):
cd recipescrape-scraper
uv sync
cp .env.example .env # add your DATABASE_URL
uv run python -m scrape_engine --sources allrecipes --limit 5
The --limit 5 flag restricts the run to 5 recipes, which is useful for testing a new spider without scraping thousands of recipes. Omit it for a full run.
Frontend Setup (Next.js + Bun)
cd recipescrape-web
bun install
cp .env.example .env.local # add DATABASE_URL (same DB or a local copy)
bun run dev
The frontend starts on http://localhost:3000, and the API is available at http://localhost:3000/api/v1/. All endpoints behave as they do in production, including pagination, full-text search, and filters.
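For a quick smoke test of the local API from Python, something like the sketch below may help. Only the base URL comes from this guide; the recipes endpoint and the q and page parameters are hypothetical placeholders, so check the frontend's API routes for the real names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "http://localhost:3000/api/v1/"  # base URL from this guide


def build_api_url(endpoint: str, **params: object) -> str:
    """Join an endpoint and query parameters onto the local API base URL."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = API_BASE + endpoint.lstrip("/")
    return f"{url}?{query}" if query else url


def fetch_json(url: str, timeout: float = 10.0) -> object:
    """GET a URL and decode the JSON body (requires the dev server to be running)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Hypothetical endpoint and parameter names, used only for illustration.
print(build_api_url("recipes", q="lemon tart", page=2))
# http://localhost:3000/api/v1/recipes?q=lemon+tart&page=2
```

With bun run dev running, passing the built URL to fetch_json should return the decoded response, provided the endpoint exists under that name.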
Adding a New Source Site
The most common contribution is a new recipe source. Create a spider file in spiders/:
# spiders/my_new_site.py
from urllib.parse import urljoin

from .base import BaseScraper


class MyNewSiteScraper(BaseScraper):
    name = "mynewsite"
    base_url = "https://www.mynewsite.com"

    def get_recipe_urls(self) -> list[str]:
        # Crawl sitemap or category pages and
        # return a list of individual recipe URLs.
        soup = self.fetch_soup(f"{self.base_url}/recipes/")
        links = soup.select("article a[href*='/recipe/']")
        # Resolve relative hrefs against base_url so every entry is a full URL.
        return [urljoin(self.base_url, a["href"]) for a in links]

That's all the code a spider needs. The BaseScraper parent class handles rate limiting, parsing via recipe-scrapers, normalization, and the database upsert. Add a test fixture (saved HTML) and a YAML source profile, and you're done.
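The spider above crawls a category page. For sites that publish a sitemap, the other option mentioned in the code comment, URL discovery could look roughly like the sketch below. It uses only the standard library and takes the sitemap XML as a string; recipe_urls_from_sitemap is a hypothetical helper, not part of BaseScraper, and the "/recipe/" marker is an assumption about the site's URL scheme.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def recipe_urls_from_sitemap(xml_text: str, marker: str = "/recipe/") -> list[str]:
    """Return unique <loc> URLs containing `marker`, in document order."""
    root = ET.fromstring(xml_text)
    seen: set[str] = set()
    urls: list[str] = []
    for loc in root.findall(".//sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        if marker in url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


# A tiny inline sitemap standing in for a downloaded one.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.mynewsite.com/recipe/lemon-tart</loc></url>
  <url><loc>https://www.mynewsite.com/about</loc></url>
  <url><loc>https://www.mynewsite.com/recipe/lemon-tart</loc></url>
</urlset>"""

print(recipe_urls_from_sitemap(sample))
# ['https://www.mynewsite.com/recipe/lemon-tart']
```

Inside a real spider, get_recipe_urls would fetch the sitemap text first and then pass it to a helper like this one.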
Running Tests
cd recipescrape-scraper
uv run pytest tests/ -v
Tests use saved HTML fixtures, so they're fast and don't hit the network. A passing test suite means your spider extracts the correct number of recipes and the normalizer produces valid output.
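As an illustration of the fixture idea, here is a self-contained sketch of a pytest-style test that checks link extraction against saved HTML. The repository's real tests may be organized differently: the inline FIXTURE_HTML string stands in for a fixture file, and RecipeLinkParser and extract_recipe_links are hypothetical helpers built on the standard library rather than on the project's own parsing code.

```python
from html.parser import HTMLParser

# Stand-in for a saved HTML fixture. A real test would read this text from a
# file under tests/ (the exact fixture layout is an assumption).
FIXTURE_HTML = """
<article><a href="/recipe/lemon-tart">Lemon tart</a></article>
<article><a href="/recipe/miso-soup">Miso soup</a></article>
<article><a href="/about">About us</a></article>
"""


class RecipeLinkParser(HTMLParser):
    """Collect href values of <a> tags whose target contains '/recipe/'."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if "/recipe/" in href:
                self.links.append(href)


def extract_recipe_links(html: str) -> list[str]:
    """Return recipe links found in the given HTML, in document order."""
    parser = RecipeLinkParser()
    parser.feed(html)
    return parser.links


def test_fixture_yields_expected_recipe_count():
    # No network access: the test only parses the saved HTML.
    links = extract_recipe_links(FIXTURE_HTML)
    assert len(links) == 2
    assert all("/recipe/" in link for link in links)
```

Because the HTML is frozen, a failing count usually points to a changed selector rather than a change on the source site.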
What Makes a Good Contribution?
- New source sites covering cuisines or regions not yet represented
- Better normalizers that handle edge cases (fractional quantities like "1/2 cup", alternative nutrition schemas)
- Frontend improvements — accessibility, performance, dark mode, mobile layout
- API features — GraphQL wrapper, ingredient substitution endpoint, batch detail lookup
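To give a sense of the normalizer edge cases mentioned in the list above, here is a minimal sketch of parsing fractional quantities such as "1/2 cup" with the standard library's Fraction type. parse_quantity is a hypothetical helper, not the project's normalizer, and it deliberately ignores Unicode fractions like "½" and ranges like "1-2".

```python
from fractions import Fraction


def parse_quantity(text: str) -> tuple[Fraction, str]:
    """Split a quantity like "1 1/2 cups" into (amount, unit)."""
    tokens = text.split()
    amount = Fraction(0)
    consumed = 0
    for token in tokens:
        try:
            # Fraction accepts "2", "1/2", and "0.25".
            amount += Fraction(token)
        except (ValueError, ZeroDivisionError):
            break  # first non-numeric token starts the unit
        consumed += 1
    if consumed == 0:
        raise ValueError(f"no numeric quantity found in {text!r}")
    return amount, " ".join(tokens[consumed:])


print(parse_quantity("1/2 cup"))     # (Fraction(1, 2), 'cup')
print(parse_quantity("1 1/2 cups"))  # (Fraction(3, 2), 'cups')
```

Keeping amounts as exact fractions avoids rounding surprises when recipes are scaled; converting to float can wait until display time.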
Check the repo issues for the current priorities, or open a new one to discuss your idea before building.