About a month ago, I heard about petals. Petals is basically a library that lets you run LLMs by loading the weights onto a network of computers that are all running petals. Then it splits up inference allowing you to run larger LLMs with consumer GPUs or just a Google colab.Earlier I was submitting a sequence to an AI protein folding server and had to wait almost 2 hours. This gave me the idea of trying to modify petals for biology models. Rewriting petals completely for different architectures (genome language models, protein language models) would be difficult, so I decided to start off by just rewriting for a biology tuned llama.If u wanna try it out, theres a fully setup Google colab linked in the readme, it might not actually run though since it needs a certain amount of people on the network.