The article also provides a guide to using the API, either from Python or via curl, along with instructions for starting the server using either a basic command or Uvicorn. The API supports any model from sbert.net and uses shared-key authentication, which is suitable for local use cases. On first use, the server downloads the model data to a cache — roughly 400 MB, depending on the model. The article also notes that the API depends on sentence_transformers, fastapi, and uvicorn.
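As a sketch of the workflow described above — the module name, port, endpoint path, and header layout are assumptions modeled on OpenAI's embedding API, not taken from the article; only the MODEL variable and the use of Uvicorn come from the source:

```shell
# Start the server with Uvicorn, selecting a sbert.net model via the
# MODEL environment variable (the "main:app" module path is an assumption).
MODEL=all-MiniLM-L6-v2 uvicorn main:app --host 127.0.0.1 --port 8000

# Query it with curl; the /v1/embeddings path and Bearer token mirror
# OpenAI's embedding API, for which this server is a drop-in replacement.
curl http://127.0.0.1:8000/v1/embeddings \
  -H "Authorization: Bearer $SHARED_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": "hello world"}'
```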
Key takeaways:
- The Private Embedding Server API is a drop-in replacement for OpenAI's embedding API, designed with privacy in mind to prevent data leakage.
- The server supports modern AI use cases such as classification, clustering, semantic search, and recommendations.
- The server can be started with either a direct command or Uvicorn, and supports any model from sbert.net, selected via the MODEL environment variable.
- The server uses shared-key authentication and downloads the model data to a cache on first use; the cache can be cleared by removing the torch cache directory.
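A minimal Python client for such a server might look like the following sketch. The URL, endpoint path, key, and response shape are assumptions modeled on OpenAI's embedding API rather than details from the article; the `cosine_similarity` helper illustrates how the returned vectors could drive the semantic-search use case mentioned above.

```python
import json
import math
import urllib.request


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def get_embedding(text,
                  url="http://127.0.0.1:8000/v1/embeddings",
                  key="dev-shared-key"):
    """Request an embedding from the local server.

    The endpoint, Bearer-token header, and JSON payload/response shape
    are assumptions mirroring OpenAI's embedding API.
    """
    req = urllib.request.Request(
        url,
        data=json.dumps({"input": text}).encode(),
        headers={"Authorization": f"Bearer {key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["data"][0]["embedding"]


# Example usage (requires the server to be running):
#   query_vec = get_embedding("How do I clear the model cache?")
#   doc_vec = get_embedding("Remove the torch cache directory.")
#   score = cosine_similarity(query_vec, doc_vec)
```

Ranking documents by `cosine_similarity` against a query embedding is the standard building block behind the semantic-search and recommendation use cases the server is aimed at.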