The article also provides a guide to using the API, either from Python or via curl, along with instructions for starting the server using either a basic command or Uvicorn. The API supports any model from sbert.net and uses shared-key authentication, which is suitable for local use cases. On first use, the server downloads the model data to a cache — roughly 400 MB, depending on the model. The article also notes that the API depends on sentence_transformers, fastapi, and uvicorn.
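As a sketch of the workflow described above — the module name, port, endpoint path, and header layout are assumptions modeled on OpenAI's embedding API, not taken from the article; only the MODEL variable and the use of Uvicorn come from the source:

```shell
# Start the server with Uvicorn, selecting a sbert.net model via the
# MODEL environment variable (the "main:app" module path is an assumption).
MODEL=all-MiniLM-L6-v2 uvicorn main:app --host 127.0.0.1 --port 8000

# Query it with curl; the /v1/embeddings path and Bearer token mirror
# OpenAI's embedding API, for which this server is a drop-in replacement.
curl http://127.0.0.1:8000/v1/embeddings \
  -H "Authorization: Bearer $SHARED_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": "hello world"}'
```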
Key takeaways:
- The Private Embedding Server API is a drop-in replacement for OpenAI's embedding API, designed with privacy in mind to prevent data leakage.
- The server supports modern AI use cases such as classification, clustering, semantic search, and recommendations.
- The server can be started with either a direct command or Uvicorn, and supports any model from sbert.net, selected via the MODEL environment variable.
- The server uses shared-key authentication and downloads the model data to a cache on first use; the cache can be cleared by removing the torch cache directory.
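A minimal Python client for such a server might look like the following sketch. The URL, endpoint path, key, and response shape are assumptions modeled on OpenAI's embedding API rather than details from the article; the `cosine_similarity` helper illustrates how the returned vectors could drive the semantic-search use case mentioned above.

```python
import json
import math
import urllib.request


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def get_embedding(text,
                  url="http://127.0.0.1:8000/v1/embeddings",
                  key="dev-shared-key"):
    """Request an embedding from the local server.

    The endpoint, Bearer-token header, and JSON payload/response shape
    are assumptions mirroring OpenAI's embedding API.
    """
    req = urllib.request.Request(
        url,
        data=json.dumps({"input": text}).encode(),
        headers={"Authorization": f"Bearer {key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["data"][0]["embedding"]


# Example usage (requires the server to be running):
#   query_vec = get_embedding("How do I clear the model cache?")
#   doc_vec = get_embedding("Remove the torch cache directory.")
#   score = cosine_similarity(query_vec, doc_vec)
```

Ranking documents by `cosine_similarity` against a query embedding is the standard building block behind the semantic-search and recommendation use cases the server is aimed at.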