---
title: CTP Slack Bot
emoji: 🦥
colorFrom: red
colorTo: green
sdk: docker
pinned: false
license: mit
short_description: Spring 2025 CTP Slack Bot RAG system
app_port: 8080
---
# CTP Slack Bot
## _Modus Operandi_ in a Nutshell
* Intelligently responds to Slack messages (when mentioned) based on a repository of data.
* Periodically checks for new content to add to its repository.
## How to Run the Application
Configure the application first, either through environment variables or through an `.env` file based on the template, `.env.template`.
Obtaining the values requires setting up API tokens/secrets with:
* Slack: for `slack_bot_token` and `slack_app_token`
* MongoDB: for `mongodb_uri`
* OpenAI: for `openai_api_key`
* Google Drive: for `google_project_id`, `google_client_id`, `google_client_email`, `google_private_key_id`, and `google_private_key`
    * For Google Drive, set up a service account. It’s the only supported authentication type.
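
Before starting the bot, it can help to confirm that every value listed above is defined. The following is a minimal sketch, not part of the project: `REQUIRED_SETTINGS` and `missing_settings` are hypothetical names, the case-insensitive matching is an assumption about how the application reads its settings, and only the process environment is inspected (an `.env` file is not read).

```python
import os

# Setting names copied from the list above. Matching them case-insensitively
# against the environment is an assumption, not confirmed project behavior.
REQUIRED_SETTINGS = [
    "slack_bot_token",
    "slack_app_token",
    "mongodb_uri",
    "openai_api_key",
    "google_project_id",
    "google_client_id",
    "google_client_email",
    "google_private_key_id",
    "google_private_key",
]


def missing_settings(environ=None):
    """Return the required setting names that are unset or blank."""
    env = os.environ if environ is None else environ
    # Collect the lower-cased names of all variables with a non-blank value.
    present = {key.lower() for key, value in env.items() if value.strip()}
    return [name for name in REQUIRED_SETTINGS if name not in present]


# Report against the current process environment.
print("Missing settings:", missing_settings() or "none")
```

Passing a plain dictionary instead of relying on `os.environ` makes the check easy to try without touching real secrets.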
### Normally
Just run the Docker image. 😉
Build it with:
```sh
docker build . -t ctp-slack-bot
```
Run it with the following command, which publishes port `8080` (the `app_port` declared in the front matter):
```sh
docker run --volume ./logs:/data --env-file=.env -p 8080:8080 --name my-ctp-slack-bot-instance ctp-slack-bot
```
### For Development
Development usually requires rapid iteration. That means a change in the code ought to be reflected as soon as possible in the behavior of the application.
First, make sure you are set up with a Python virtual environment created by the Python `venv` module and that it’s activated. Then install dependencies from `pyproject.toml` within the environment using:
```sh
pip3 install -e .
```
Make a copy of `.env.template` as `.env` and define the environment variables. (You can also define them by other means, but this has the least friction.) This file should not be committed and is excluded by `.gitignore`!
If `localhost` port `8080` is free, running the following will make the application available on that port:
```sh
scripts/run-dev.sh
```
If everything is working, visiting http://localhost:8080/health returns HTTP status `200 OK` and a payload containing the health status of individual components.
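
The same check can be scripted. This is a minimal sketch using only the Python standard library; `check_health` and `interpret_response` are hypothetical helper names, and treating the payload as JSON is an assumption (the raw text is returned if it does not parse).

```python
import json
import urllib.error
import urllib.request


def interpret_response(status, body):
    """Return (ok, payload): ok is True only for HTTP 200."""
    try:
        payload = json.loads(body)  # assumption: the payload is JSON
    except json.JSONDecodeError:
        payload = body  # fall back to the raw text
    return status == 200, payload


def check_health(url="http://localhost:8080/health", timeout=5.0):
    """Query the health endpoint; return (ok, payload or error text)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
            body = response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as error:
        # The server answered, but with a non-success status code.
        status = error.code
        body = error.read().decode("utf-8", errors="replace")
    except OSError as error:
        # Connection refused, timeout, DNS failure, and similar.
        return False, str(error)
    return interpret_response(status, body)
```

Calling `check_health()` while `scripts/run-dev.sh` is running should yield `True` as the first element; if the application is not up, the second element carries the connection error text.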
## Tech Stack
* Hugging Face Spaces for hosting
* OpenAI for embeddings and language models
* Google Drive for reference data (i.e., the material to be incorporated into the bot’s knowledge base)
* MongoDB for data persistence
* Docker for containerization
* Python
    * Slack Bolt client for interfacing with Slack
    * See `pyproject.toml` for additional Python packages.
## General Project Structure
Not every file or folder is listed, but the important stuff is here.
* `src/`
    * `ctp_slack_bot/`
        * `core/`: fundamental components like configuration (using pydantic), logging setup (loguru), and custom exceptions
            * `config.py`: application settings model
        * `db/`: data connection and interface logic
            * `repositories/`: data collection/table interface logic
                * `mongo_db_vectorized_repository_base.py`: base implementation of a repository corresponding to a MongoDB collection with a search index
                * `vectorized_chunk_repository.py`: repository interface for `VectorizedChunk`s
        * `models/`: data models
        * `mime_type_handlers`: parsers for converting bytes of different MIME types to `Chunk`s
        * `services/`: business logic
            * `answer_retrieval_service.py`: obtains an answer to a question from a language model using relevant context
            * `application_health_service.py`: collects the health status of the application components
            * `content_ingestion_service.py`: converts content into chunks and stores them into the database
            * `context_retrieval_service.py`: queries for relevant context from the database to answer a question
            * `embeddings_model_service.py`: converts text to embeddings
            * `event_brokerage_service.py`: brokers events between decoupled components
            * `google_drive_service.py`: interfaces with Google Drive
            * `language_model_service.py`: answers questions using relevant context
            * `question_dispatch_service.py`: listens for questions and retrieves relevant context to get answers
            * `task_service.py`: runs periodic background tasks
            * `slack_service.py`: handles events from Slack and sends back responses
            * `vectorization_service.py`: converts chunks into chunks with embeddings
        * `tasks/`: scheduled tasks to run in the background
        * `utils/`: reusable utilities
        * `app.py`: application entry point
        * `containers.py`: the dependency injection container
* `tests/`: unit tests
* `scripts/`: utility scripts for development, deployment, etc.
    * `run-dev.sh`: script to run the application locally
* `notebooks/`: Jupyter notebooks for exploration and model development
* `.env`: local environment variables for development purposes (to be created for local use only from `.env.template`)
* `Dockerfile`: Docker container build definition
* `pyproject.toml`: project definition and dependencies
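
Taken together, the ingestion, vectorization, and context retrieval services listed under `services/` describe a retrieval-augmented flow: content is split into chunks, each chunk gets an embedding, and a question is matched against stored chunks. The sketch below illustrates that flow only in spirit. Every name in it is hypothetical, the word-count "embedding" is a toy stand-in for the OpenAI embeddings, and an in-memory list stands in for the MongoDB search index.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class ToyVectorizedChunk:
    """Hypothetical stand-in; the project's real models live in `models/`."""
    text: str
    embedding: Counter


def chunk_text(text, max_words=40):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding: lower-cased word counts (not a real embeddings model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity of two sparse count vectors; 0.0 if either is empty."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ingest(documents, max_words=40):
    """Chunk and vectorize documents into an in-memory store."""
    return [
        ToyVectorizedChunk(chunk, embed(chunk))
        for document in documents
        for chunk in chunk_text(document, max_words)
    ]


def retrieve_context(question, store, top_k=2):
    """Return the texts of the top_k chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(
        store,
        key=lambda chunk: cosine_similarity(query, chunk.embedding),
        reverse=True,
    )
    return [chunk.text for chunk in ranked[:top_k]]
```

In the real application the retrieved context would then be handed to a language model to produce the answer; that step is omitted here because it depends on an external API.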