	---
	title: CTP Slack Bot
	emoji: 🦥
	colorFrom: red
	colorTo: green
	sdk: docker
	pinned: false
	license: mit
	short_description: Spring 2025 CTP Slack Bot RAG system
	app_port: 8080
	---


	# CTP Slack Bot

	## _Modus Operandi_ in a Nutshell

	* Intelligently responds to Slack messages (when mentioned) based on a repository of data.
	* Periodically checks for new content to add to its repository.

	## How to Run the Application

	The application must be configured before it will run. Configuration is supplied through environment variables or through a `.env` file based on the template, `.env.template`.

	Obtaining the values requires setting up API tokens/secrets with:

	* Slack: for `slack_bot_token` and `slack_app_token`
	* MongoDB: for `mongodb_uri`
	* OpenAI: for `openai_api_key`
	* Google Drive: for `google_project_id`, `google_client_id`, `google_client_email`, `google_private_key_id`, and `google_private_key`
	  * For Google Drive, set up a service account. It’s the only supported authentication type.
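
Before launching, it can help to confirm that every setting named above is actually defined. The following is a minimal sketch and not part of the project: `REQUIRED_SETTINGS` simply repeats the names from the list, while `parse_env_text` and `missing_settings` are hypothetical helpers. It assumes plain single-line `KEY=value` entries and compares names case-insensitively, since the casing used in `.env.template` is not shown here.

```python
import os

# Setting names copied from the list above.
REQUIRED_SETTINGS = (
    "slack_bot_token",
    "slack_app_token",
    "mongodb_uri",
    "openai_api_key",
    "google_project_id",
    "google_client_id",
    "google_client_email",
    "google_private_key_id",
    "google_private_key",
)


def parse_env_text(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_settings(values):
    """Return the required names that are absent or empty (case-insensitive)."""
    defined = {key.lower() for key, value in values.items() if value}
    return [name for name in REQUIRED_SETTINGS if name not in defined]


if __name__ == "__main__":
    # Assumption: a `.env` file, if present, sits in the current directory.
    # Values from the process environment take precedence over the file.
    values = {}
    if os.path.exists(".env"):
        with open(".env", encoding="utf-8") as handle:
            values.update(parse_env_text(handle.read()))
    values.update(os.environ)
    missing = missing_settings(values)
    print("Missing settings:", ", ".join(missing) if missing else "none")
```

Multi-line values (for example, a private key spread over several lines) are outside what this simple parser handles.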

	### Normally

	Just run the Docker image. 😉

	Build it with:

	```sh
	docker build . -t ctp-slack-bot
	```

	Run it with:

	```sh
	docker run --volume ./logs:/data --env-file=.env -p 8080:8080 --name my-ctp-slack-bot-instance ctp-slack-bot
	```

	### For Development

	Development usually requires rapid iteration. That means a change in the code ought to be reflected as soon as possible in the behavior of the application.

	First, make sure you are set up with a Python virtual environment created by the Python `venv` module and that it’s activated. Then install dependencies from `pyproject.toml` within the environment using:

	```sh
	pip3 install -e .
	```

	Make a copy of `.env.template` as `.env` and define the environment variables. (You can also define them by other means, but this has the least friction.) This file should not be committed and is excluded by `.gitignore`!

	If `localhost` port `8080` is free, running the following will make the application available on that port:

	```sh
	scripts/run-dev.sh
	```

	If everything is working, visiting http://localhost:8080/health returns HTTP status `200 OK` and a payload containing the health status of the individual components.
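
The same check can be scripted. This is a minimal sketch using only the Python standard library; `check_health` is a hypothetical helper, not something the project provides, and it returns the body as raw text because the payload format is not specified here.

```python
import urllib.error
import urllib.request


def check_health(url="http://localhost:8080/health", timeout=5.0):
    """Return (status_code, body_text) for a GET request to the health endpoint."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as error:
        # A non-2xx response still carries a status code and a body.
        return error.code, error.read().decode("utf-8")


if __name__ == "__main__":
    try:
        status, body = check_health()
    except OSError as error:
        # URLError (connection refused, timeout) is a subclass of OSError.
        print(f"Application unreachable: {error}")
    else:
        print(status, body)
```

A connection error usually means the application is not running or the port is taken by something else.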

	## Tech Stack

	* Hugging Face Spaces for hosting
	* OpenAI for embeddings and language models
	* Google Drive for reference data (i.e., the material to be incorporated into the bot’s knowledge base)
	* MongoDB for data persistence
	* Docker for containerization
	* Python
	  * Slack Bolt client for interfacing with Slack
	  * See `pyproject.toml` for additional Python packages.

	## General Project Structure

	Not every file or folder is listed, but the important stuff is here.

	* `src/`
	  * `ctp_slack_bot/`
	    * `core/`: fundamental components like configuration (using pydantic), logging setup (loguru), and custom exceptions
	      * `config.py`: application settings model
	    * `db/`: data connection and interface logic
	      * `repositories/`: data collection/table interface logic
	        * `mongo_db_vectorized_repository_base.py`: base implementation of a repository corresponding to a MongoDB collection with a search index
	        * `vectorized_chunk_repository.py`: repository interface for `VectorizedChunk`s
	    * `models/`: data models
	    * `mime_type_handlers`: parsers for converting bytes of different MIME types to `Chunk`s
	    * `services/`: business logic
	      * `answer_retrieval_service.py`: obtains an answer to a question from a language model using relevant context
	      * `application_health_service.py`: collects the health status of the application components
	      * `content_ingestion_service.py`: converts content into chunks and stores them into the database
	      * `context_retrieval_service.py`: queries for relevant context from the database to answer a question
	      * `embeddings_model_service.py`: converts text to embeddings
	      * `event_brokerage_service.py`: brokers events between decoupled components
	      * `google_drive_service.py`: interfaces with Google Drive
	      * `language_model_service.py`: answers questions using relevant context
	      * `question_dispatch_service.py`: listens for questions and retrieves relevant context to get answers
	      * `task_service.py`: runs periodic background tasks
	      * `slack_service.py`: handles events from Slack and sends back responses
	      * `vectorization_service.py`: converts chunks into chunks with embeddings
	    * `tasks/`: scheduled tasks to run in the background
	    * `utils/`: reusable utilities
	    * `app.py`: application entry point
	    * `containers.py`: the dependency injection container
	* `tests/`: unit tests
	* `scripts/`: utility scripts for development, deployment, etc.
	  * `run-dev.sh`: script to run the application locally
	* `notebooks/`: Jupyter notebooks for exploration and model development
	* `.env`: local environment variables for development purposes (to be created for local use only from `.env.template`)
	* `Dockerfile`: Docker container build definition
	* `pyproject.toml`: project definition and dependencies
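
To illustrate how the ingestion, vectorization, and context retrieval services described above might fit together, here is a toy, in-memory sketch. It is not the project’s implementation: per the tech stack, the real application uses OpenAI for embeddings and MongoDB for storage and search, whereas this sketch substitutes word counts and a Python list. The `Chunk` and `VectorizedChunk` names come from the structure listing, but their fields, every function name, and the sample text are assumptions made up for illustration.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str  # assumed field; the real model may differ


@dataclass(frozen=True)
class VectorizedChunk:
    text: str
    embedding: Counter  # toy embedding: word -> count


def chunk_content(content, max_words=8):
    """Ingestion step: split content into chunks of at most max_words words."""
    words = content.split()
    return [Chunk(" ".join(words[i:i + max_words]))
            for i in range(0, len(words), max_words)]


def embed(text):
    """Toy stand-in for an embeddings model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def vectorize(chunks):
    """Vectorization step: attach an embedding to each chunk."""
    return [VectorizedChunk(chunk.text, embed(chunk.text)) for chunk in chunks]


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve_context(question, store, top_k=1):
    """Retrieval step: return the top_k chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(store,
                    key=lambda chunk: cosine_similarity(query, chunk.embedding),
                    reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    # Made-up sample content standing in for a document from Google Drive.
    store = vectorize(chunk_content(
        "Office hours are held on Tuesdays at noon. "
        "Final project demos take place in the last week."
    ))
    for match in retrieve_context("When are office hours?", store):
        print(match.text)
```

In a full pipeline, the retrieved chunks would be passed along with the question to a language model to produce the answer.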