Donor Scraper for Fairfax Cryobank


Repository: tgys/cryobankscraper (GitHub). Scrapes donors from Fairfax Cryobank and displays them in a gallery.


cryobankscraper

C++ scraper for Fairfax Cryobank donor profile pages. It fetches donor URLs, extracts photos from each profile, and writes a JSON gallery you can browse in the Qt app.

Build

Requires Qt 6 (Core, Gui, Network, Widgets, Concurrent) and CMake 3.20+.

cmake -S . -B build
cmake --build build

Binaries:

build/scrapeff — default CLI scraper (Chromium-style HTTP, parallel workers)

build/scrape_requests — same scraper with Python requests-compatible HTTP (single-threaded)

build/gallery_qt — desktop gallery; runs a scrape on launch if gallery_data.json is missing

How it scrapes

Build a URL list in the working directory:

Reads base_urls.txt (default: https://fairfaxcryobank.com/search/donorprofile.aspx?number=).

For Fairfax profiles, also harvests links from the “meet our newest donors” listing page.

Generates candidate URLs for donor IDs 0 … 9999 (override with SCRAPEFF_FAIRFAX_MAX_ID), trying both padded (0428) and unpadded (428) number= values.

Merges any extra URLs from target_urls.txt.

Writes the final list to target_urls.txt.
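The merge step above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the function name `merge_unique` is hypothetical, and it assumes that duplicates are dropped and that first-seen order is kept, neither of which the README states.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: combine generated URLs with extra URLs (for example,
// lines read from target_urls.txt), skipping blanks and duplicates while
// preserving the order in which each URL was first seen.
std::vector<std::string> merge_unique(const std::vector<std::string>& generated,
                                      const std::vector<std::string>& extra) {
    std::vector<std::string> out;
    std::unordered_set<std::string> seen;
    auto append = [&](const std::vector<std::string>& list) {
        for (const std::string& url : list) {
            if (!url.empty() && seen.insert(url).second) {
                out.push_back(url);
            }
        }
    };
    append(generated);
    append(extra);
    return out;
}
```

Keeping insertion order makes the written list stable across runs, which is convenient when the same file is both an input and an output.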

Fetch each profile page over HTTPS (retries, optional pacing via SCRAPEFF_HTTP_SPACING_MS).
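A retry loop with pacing might look like the sketch below. The README does not say how many attempts are made or where the delay is applied, so the name `fetch_with_retries`, the `max_attempts` parameter, and the choice to sleep only between attempts are all assumptions. The actual HTTP call is abstracted as a callback.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper: call `attempt` until it reports success or
// `max_attempts` is reached, sleeping `spacing` before every retry.
// `attempt` stands in for the real HTTPS request.
bool fetch_with_retries(const std::function<bool()>& attempt,
                        int max_attempts,
                        std::chrono::milliseconds spacing) {
    for (int i = 0; i < max_attempts; ++i) {
        if (i > 0 && spacing.count() > 0) {
            std::this_thread::sleep_for(spacing);  // pace repeated requests
        }
        if (attempt()) {
            return true;
        }
    }
    return false;
}
```

A spacing of zero disables the delay, which matches the README's description of pacing as optional.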

Parse HTML with Gumbo: find images inside div.main → div.foto (or #main / #foto).

Download each image and verify it decodes as a real image. One gallery tile per donor profile ID is kept.
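The "one tile per donor profile ID" rule could be implemented along these lines. This is a sketch under assumptions: the struct fields borrow the key names that gallery_data.json uses, the string types are guessed, and keeping the first tile seen (after stripping leading zeros so that 0428 and 428 count as the same donor) is an illustrative choice rather than the project's documented behavior.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Field names follow the gallery_data.json keys; the types are assumed.
struct GalleryTile {
    std::string page;
    std::string src;
    std::string did;
    std::string profile_number;
};

// Hypothetical normalization: "0428" and "428" map to the same key.
std::string normalize_id(const std::string& number) {
    const auto first = number.find_first_not_of('0');
    if (first == std::string::npos) {
        return number.empty() ? "" : "0";
    }
    return number.substr(first);
}

// Keep the first tile seen for each normalized profile number.
std::vector<GalleryTile> one_tile_per_profile(const std::vector<GalleryTile>& tiles) {
    std::vector<GalleryTile> kept;
    std::unordered_set<std::string> seen;
    for (const GalleryTile& tile : tiles) {
        if (seen.insert(normalize_id(tile.profile_number)).second) {
            kept.push_back(tile);
        }
    }
    return kept;
}
```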

Write outputs to the working directory:

gallery_data.json — main gallery data (page, src, did, profile_number)

gallery.html — static HTML gallery

image_urls.txt, fetch_failed.txt, profile_urls_before_image_extract.txt

childhood_photo_did_counts.json — counts per ChildhoodPhoto.ashx?did= id

Donors with a real photo use ChildhoodPhoto.ashx?did=…. Donors without one show a shared placeholder (search-temp-01.jpg).
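To make the distinction between real photos and the placeholder concrete, here is a small sketch that classifies an image URL and tallies how often each did value occurs, in the spirit of childhood_photo_did_counts.json. The function names are hypothetical, and treating the did value as "everything up to the next &" is an assumption about the URL shape.

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// Return the did value if `src` looks like a ChildhoodPhoto URL.
std::optional<std::string> childhood_photo_did(const std::string& src) {
    static const std::string marker = "ChildhoodPhoto.ashx?did=";
    const auto pos = src.find(marker);
    if (pos == std::string::npos) {
        return std::nullopt;
    }
    const auto start = pos + marker.size();
    const auto end = src.find('&', start);
    const auto len = (end == std::string::npos) ? std::string::npos : end - start;
    std::string did = src.substr(start, len);
    if (did.empty()) {
        return std::nullopt;
    }
    return did;
}

// True when `src` points at the shared placeholder image.
bool is_placeholder(const std::string& src) {
    return src.find("search-temp-01.jpg") != std::string::npos;
}

// Count occurrences of each did across a list of image URLs.
std::map<std::string, int> count_dids(const std::vector<std::string>& srcs) {
    std::map<std::string, int> counts;
    for (const std::string& src : srcs) {
        if (const auto did = childhood_photo_did(src)) {
            ++counts[*did];
        }
    }
    return counts;
}
```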

Gallery

gallery_qt loads gallery_data.json (or scrapes first). By default it shows only donors with profile pics (real ChildhoodPhoto URLs), sorted by donor id low → high. Use Filter → All donors to include placeholders.
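The default view described above amounts to a filter followed by a sort. The sketch below shows one way to express that without Qt. The `Tile` struct, the integer `profile_number`, and the decision to sort the "All donors" view the same way are assumptions made for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Minimal stand-in for a gallery entry; the real app likely carries more fields.
struct Tile {
    std::string src;
    int profile_number;
};

// Return the tiles to display. When `all_donors` is false, tiles whose src is
// not a ChildhoodPhoto URL (such as the placeholder) are dropped. The result is
// sorted by profile number, ascending.
std::vector<Tile> visible_tiles(std::vector<Tile> tiles, bool all_donors) {
    if (!all_donors) {
        tiles.erase(
            std::remove_if(tiles.begin(), tiles.end(),
                           [](const Tile& t) {
                               return t.src.find("ChildhoodPhoto.ashx?did=") ==
                                      std::string::npos;
                           }),
            tiles.end());
    }
    std::stable_sort(tiles.begin(), tiles.end(),
                     [](const Tile& a, const Tile& b) {
                         return a.profile_number < b.profile_number;
                     });
    return tiles;
}
```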

Useful environment variables

| Variable | Purpose |
| --- | --- |
| SCRAPEFF_THREADS | Parallel workers for scrapeff (default 4) |
| SCRAPEFF_FAIRFAX_MAX_ID | Upper bound for brute-force donor IDs (default 10000) |
| SCRAPEFF_FAIRFAX_SKIP_BRUTE_IDS=1 | Only scrape URLs found on the listing page |
| SCRAPEFF_TARGET_URLS_ONLY=1 | Use target_urls.txt only, skip generation |
| SCRAPEFF_HTTP_PROFILE=urllib | Python-requests-style client (or use scrape_requests) |

Run the CLI from the directory where you want output files written (target_urls.txt, gallery_data.json, etc.).
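As an illustration of how settings like these are typically read, the sketch below parses a few of the variables with their documented defaults. The `ScrapeConfig` struct and the helper names are hypothetical, and falling back to the default on a malformed value is an assumption. The README does not describe how the project handles bad input.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>

// Read an integer variable; use `fallback` when it is unset, empty,
// not a whole number, or out of range for int.
int env_int(const char* name, int fallback) {
    const char* raw = std::getenv(name);
    if (raw == nullptr || *raw == '\0') {
        return fallback;
    }
    errno = 0;
    char* end = nullptr;
    const long value = std::strtol(raw, &end, 10);
    const bool bad = end == raw || *end != '\0' || errno == ERANGE ||
                     value < INT_MIN || value > INT_MAX;
    return bad ? fallback : static_cast<int>(value);
}

// A flag counts as set only when its value is exactly "1".
bool env_flag(const char* name) {
    const char* raw = std::getenv(name);
    return raw != nullptr && std::string(raw) == "1";
}

// Hypothetical bundle of the settings listed in the table.
struct ScrapeConfig {
    int threads;
    int max_id;
    bool skip_brute_ids;
    bool target_urls_only;
};

ScrapeConfig load_config() {
    return ScrapeConfig{
        env_int("SCRAPEFF_THREADS", 4),
        env_int("SCRAPEFF_FAIRFAX_MAX_ID", 10000),
        env_flag("SCRAPEFF_FAIRFAX_SKIP_BRUTE_IDS"),
        env_flag("SCRAPEFF_TARGET_URLS_ONLY"),
    };
}
```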
