I kept seeing people ask "Which model can I run on my GPU?" and "Will model X fit on my GPU?" That's why I built a filter on whichllmmodel that lets you search for models by what will actually fit on your hardware (8GB, 16GB, 24GB, etc.) at a given quantization level.
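For anyone curious about the arithmetic behind a "will it fit" check, here is a rough sketch in Python. It is not necessarily how the filter on whichllmmodel works. It assumes weight memory is roughly parameter count times bits per weight divided by 8, plus a flat 20% allowance for KV cache and runtime buffers. The 20% figure, the function names, and the model names are all hypothetical placeholders, and real usage varies with context length and runtime.

```python
def estimate_vram_gb(params_billion, bits_per_weight, overhead_fraction=0.2):
    """Rough VRAM estimate in GB (1 GB = 1e9 bytes).

    overhead_fraction is an assumed flat allowance for KV cache and
    runtime buffers, not a measured value.
    """
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * (1 + overhead_fraction) / 1e9


def fits(params_billion, bits_per_weight, vram_gb, overhead_fraction=0.2):
    """True if the estimated footprint is within the given VRAM budget."""
    return estimate_vram_gb(params_billion, bits_per_weight, overhead_fraction) <= vram_gb


def filter_models(models, vram_gb, bits_per_weight):
    """Return the names of models (name -> billions of parameters) that fit."""
    return [name for name, params in models.items()
            if fits(params, bits_per_weight, vram_gb)]


if __name__ == "__main__":
    # Hypothetical catalogue: names and sizes are placeholders.
    catalogue = {"model-7b": 7, "model-13b": 13, "model-70b": 70}
    for vram in (8, 16, 24):
        print(f"{vram}GB at 4-bit:", filter_models(catalogue, vram, 4))
```

Under these assumptions a 7B model at 4-bit needs about 4.2 GB, so it fits on an 8GB card, while a 70B model at 4-bit needs about 42 GB and does not fit on a 24GB card. Treat the numbers as a first-pass estimate only.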