---
title: CTP Slack Bot
emoji: 🦥
colorFrom: red
colorTo: green
sdk: docker
pinned: false
license: mit
short_description: Spring 2025 CTP Slack Bot RAG system
app_port: 8080
---
# CTP Slack Bot
## _Modus Operandi_ in a Nutshell
* Intelligently responds to Slack messages (when mentioned) based on a repository of data.
* Periodically checks for new content to add to its repository.
## How to Run the Application
Configure the application first, either through environment variables or through a `.env` file based on the template, `.env.template`.
Obtaining the values requires setting up API tokens/secrets with:
* Slack: for `slack_bot_token` and `slack_app_token`
* MongoDB: for `mongodb_uri`
* OpenAI: for `openai_api_key`
* Google Drive: for `google_project_id`, `google_client_id`, `google_client_email`, `google_private_key_id`, and `google_private_key`
  * For Google Drive, set up a service account. It’s the only supported authentication type.
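Before starting the application, it can help to confirm that every value listed above is actually defined. The following is a minimal sketch and not part of the project: the helper `missing_settings` is hypothetical, and treating the setting names as case-insensitive environment variable names is an assumption (check `.env.template` for the exact spelling and for any settings not listed here).

```python
import os

# Setting names are taken from the list above. How they map onto
# environment variable names (e.g. upper-case) is an assumption.
REQUIRED_SETTINGS = (
    "slack_bot_token",
    "slack_app_token",
    "mongodb_uri",
    "openai_api_key",
    "google_project_id",
    "google_client_id",
    "google_client_email",
    "google_private_key_id",
    "google_private_key",
)


def missing_settings(environ=None):
    """Return the required setting names that are unset or blank."""
    env = os.environ if environ is None else environ
    # Compare case-insensitively, since the exact casing is an assumption.
    present = {key.lower() for key, value in env.items() if value.strip()}
    return [name for name in REQUIRED_SETTINGS if name not in present]
```

Calling `missing_settings()` with no arguments inspects the current process environment; an empty list means all of the settings above have non-blank values.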
### Normally
Just run the Docker image. 😉
Build it with:
```sh
docker build . -t ctp-slack-bot
```
Run it with:
```sh
docker run --volume ./logs:/data --env-file=.env -p 8080:8080 --name my-ctp-slack-bot-instance ctp-slack-bot
```
### For Development
Development usually requires rapid iteration. That means a change in the code ought to be reflected as soon as possible in the behavior of the application.
First, make sure you have a Python virtual environment created with the Python `venv` module and that it’s activated. Then install the dependencies from `pyproject.toml` within the environment using:
```sh
pip3 install -e .
```
Make a copy of `.env.template` as `.env` and define the environment variables. (You can also define them by other means, but this has the least friction.) This file should not be committed and is excluded by `.gitignore`!
If `localhost` port `8080` is free, running the following will make the application available on that port:
```sh
scripts/run-dev.sh
```
If everything is working, visiting http://localhost:8080/health returns HTTP status 200 (OK) and a payload containing the health status of the individual components.
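The health check can also be scripted, for example in a smoke test. The sketch below uses only the Python standard library; the function name `fetch_health` is hypothetical, and it assumes the payload is JSON, which this README does not state.

```python
import json
import urllib.request


def fetch_health(url="http://localhost:8080/health", timeout=5.0):
    """Return (ok, payload) for the health endpoint.

    ok is True only for an HTTP 200 response. payload is the decoded JSON
    body, or None if the request failed or the body was not valid JSON.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            ok = response.status == 200
            body = response.read()
    except OSError:
        # URLError, HTTPError, and socket timeouts all derive from OSError.
        return False, None
    try:
        return ok, json.loads(body)
    except ValueError:
        # Assumption did not hold: the body is not JSON.
        return ok, None
```

With the development server running, `ok, payload = fetch_health()` should give `ok == True`; if the server is down or unreachable, the call returns `(False, None)` instead of raising.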
## Tech Stack
* Hugging Face Spaces for hosting
* OpenAI for embeddings and language models
* Google Drive for reference data (i.e., the material to be incorporated into the bot’s knowledge base)
* MongoDB for data persistence
* Docker for containerization
* Python
  * Slack Bolt client for interfacing with Slack
  * See `pyproject.toml` for additional Python packages.
## General Project Structure
Not every file or folder is listed, but the important stuff is here.
* `src/`
  * `ctp_slack_bot/`
    * `core/`: fundamental components like configuration (using pydantic), logging setup (loguru), and custom exceptions
      * `config.py`: application settings model
    * `db/`: data connection and interface logic
      * `repositories/`: data collection/table interface logic
        * `mongo_db_vectorized_repository_base.py`: base implementation of a repository corresponding to a MongoDB collection with a search index
        * `vectorized_chunk_repository.py`: repository interface for `VectorizedChunk`s
    * `models/`: data models
    * `mime_type_handlers`: parsers for converting bytes of different MIME types to `Chunk`s
    * `services/`: business logic
      * `answer_retrieval_service.py`: obtains an answer to a question from a language model using relevant context
      * `application_health_service.py`: collects the health status of the application components
      * `content_ingestion_service.py`: converts content into chunks and stores them into the database
      * `context_retrieval_service.py`: queries for relevant context from the database to answer a question
      * `embeddings_model_service.py`: converts text to embeddings
      * `event_brokerage_service.py`: brokers events between decoupled components
      * `google_drive_service.py`: interfaces with Google Drive
      * `language_model_service.py`: answers questions using relevant context
      * `question_dispatch_service.py`: listens for questions and retrieves relevant context to get answers
      * `task_service.py`: runs periodic background tasks
      * `slack_service.py`: handles events from Slack and sends back responses
      * `vectorization_service.py`: converts chunks into chunks with embeddings
    * `tasks/`: scheduled tasks to run in the background
    * `utils/`: reusable utilities
    * `app.py`: application entry point
    * `containers.py`: the dependency injection container
* `tests/`: unit tests
* `scripts/`: utility scripts for development, deployment, etc.
  * `run-dev.sh`: script to run the application locally
* `notebooks/`: Jupyter notebooks for exploration and model development
* `.env`: local environment variables for development purposes (to be created for local use only from `.env.template`)
* `Dockerfile`: Docker container build definition
* `pyproject.toml`: project definition and dependencies
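
To make the division of labor among the services above more concrete, here is a toy, self-contained sketch of the chunk, embed, and retrieve flow. It is not the project's code: every function name is hypothetical, the bag-of-words `embed` is a stand-in for OpenAI embeddings, and the in-memory list is a stand-in for the MongoDB collection with a search index.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40):
    """Split text into chunks of at most max_words words (ingestion step)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding: token counts. A real system would call an embeddings model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_context(question, vectorized_chunks, top_k=2):
    """Return the top_k chunk texts most similar to the question (retrieval step)."""
    query = embed(question)
    ranked = sorted(
        vectorized_chunks,
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_k]]


# Vectorization step: pair each chunk with its embedding.
chunks = chunk_text("Office hours are on Friday. The capstone demo is in May.", max_words=5)
store = [(chunk, embed(chunk)) for chunk in chunks]
context = retrieve_context("When are office hours?", store, top_k=1)
```

In a full pipeline, the retrieved `context` would be passed to a language model together with the question; that step is omitted here because it requires an API key.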