Spaces:
Sleeping
Sleeping
## Introduction | |
The API follows and extends OpenAI API standard, and supports | |
both normal and streaming responses. | |
The API is divided in two logical blocks: | |
- High-level API, abstracting all the complexity of a RAG (Retrieval Augmented Generation) pipeline implementation: | |
- Ingestion of documents: internally managing document parsing, splitting, metadata extraction, | |
embedding generation and storage. | |
- Chat & Completions using context from ingested documents: abstracting the retrieval of context, the prompt | |
engineering and the response generation. | |
- Low-level API, allowing advanced users to implement their own complex pipelines: | |
- Embeddings generation: based on a piece of text. | |
- Contextual chunks retrieval: given a query, returns the most relevant chunks of text from the ingested | |
documents. | |
## API | |
As explained in the introduction, the API contains high level APIs (ingestion and chat/completions) and low level APIs | |
(embeddings and chunk retrieval). In this section the different specific API calls are explained. | |