Hardware LLM Taalas Reaches >14,000 TPS on Llama 3.1 8B

nullbio | 1 pts | 1 comment

Products | Taalas


Products<br>Taalas HC1 Technology Demonstrator<br>Runs the Llama 3.1 8B model<br>TSMC 6nm | 815 mm² | 53B transistors<br>2.5 kW server

Instantaneous Inference<br>HC1 demonstrates the power of Taalas hardcore model silicon technology, delivering 17k tokens per second per user on the Llama 3.1 8B model.
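To put the per-user throughput figure in more familiar terms, the short sketch below converts tokens per second into average time per token and the time to stream a response. The function names and the 1,000-token response length are assumptions chosen for illustration, and the calculation ignores prompt processing and time to first token, so treat it as back-of-the-envelope arithmetic rather than a measured result.

```python
def per_token_latency_us(tokens_per_second: float) -> float:
    """Average time per generated token, in microseconds."""
    return 1_000_000 / tokens_per_second


def generation_time_ms(num_tokens: int, tokens_per_second: float) -> float:
    """Time to stream num_tokens at a steady decode rate, in milliseconds."""
    return 1_000 * num_tokens / tokens_per_second


# Figure quoted on the product page: "17k tokens per second per user".
HC1_TPS = 17_000
# Assumption for illustration: a 1,000-token response.
OUTPUT_TOKENS = 1_000

print(f"{per_token_latency_us(HC1_TPS):.1f} us per token")            # 58.8 us per token
print(f"{generation_time_ms(OUTPUT_TOKENS, HC1_TPS):.1f} ms total")   # 58.8 ms total
```

At the quoted rate, a 1,000-token answer would stream in roughly 59 ms once decoding starts, if the rate holds steady for the whole response.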

Source: Model Llama 3.1 8B, Nvidia Baseline (H200), B200 measured by Taalas | Groq, Sambanova, Cerebras performance from Artificial Analysis | Taalas Performance run by Taalas labs | Input sequence length 1k/1k
