The engine runs Mixtral 8x7B at 20+ tokens per second and Mistral at 50+ tokens per second. It supports a dozen different architectures and handles models up to 100B parameters, with plans for further expansion.
Key takeaways:
- Truffle-1 is an AI inference engine designed to run open-source models at home, using only 60 watts.
- It is built to be used as a home server and connects via BLE, Wi-Fi, and USB-C.
- Truffle-1 is capable of running Mixtral 8x7B at 20+ tokens/s, and Mistral at 50+ tokens/s.
- It supports a dozen different architectures and models up to 100B parameters, with more on the way.
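The power and throughput figures above imply a per-token energy cost, which is a useful way to compare inference hardware. A minimal sketch, assuming a sustained 60 W draw during generation (the source does not break down idle vs. load power):

```python
POWER_W = 60.0  # assumed sustained draw from the quoted 60-watt figure

def joules_per_token(tokens_per_sec: float, power_w: float = POWER_W) -> float:
    """Energy consumed per generated token, in joules (power / throughput)."""
    return power_w / tokens_per_sec

# Mixtral 8x7B at 20 tok/s -> 3.0 J per token
print(joules_per_token(20.0))
# Mistral at 50 tok/s -> 1.2 J per token
print(joules_per_token(50.0))
```

At these rates, generating a 1,000-token response with Mistral would cost roughly 1.2 kJ, on the order of what a phone charger draws in a minute.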