LiKenun committed
Commit 3da2136 · 1 Parent(s): 4c8a84c

Update documentation and environment variables configuration

Files changed (3)
  1. .env.template +0 -3
  2. README.md +41 -43
  3. src/ctp_slack_bot/core/config.py +8 -4
.env.template CHANGED
@@ -1,8 +1,5 @@
  # Copy this file and modify. Do not save or commit the secrets!
  
- # Application Configuration
- DEBUG=TRUE
-
  # APScheduler Configuration
  SCHEDULER_TIMEZONE=UTC
  
README.md CHANGED
@@ -10,47 +10,13 @@ short_description: Spring 2025 CTP Slack Bot RAG system
  ---
  
  
-
-
  # CTP Slack Bot
  
  ## _Modus Operandi_ in a Nutshell
  
- * Intelligently responds to Slack messages based on a repository of data.
+ * Intelligently responds to Slack messages (when mentioned) based on a repository of data.
  * Periodically checks for new content to add to its repository.
  
- ## Tech Stack
-
- * Hugging Face Spaces for hosting and serverless API
- * Google Drive for reference data (i.e., the material to be incorporated into the bot’s knowledge base)
- * MongoDB for data persistence
- * Docker for containerization
- * Python
-   * FastAPI for serving HTTP requests
-   * httpx for making HTTP requests
-   * APScheduler for running periodic tasks in the background
-   * See `pyproject.toml` for additional Python packages.
-
- ## General Project Structure
-
- * `src/`
-   * `ctp_slack_bot/`
-     * `api/`: FastAPI application structure
-       * `routes.py`: API endpoint definitions
-     * `core/`: fundamental components like configuration (using pydantic), logging setup (loguru), and custom exceptions
-     * `db/`: database connection
-       * `repositories/`: repository pattern implementation
-     * `models/`: Pydantic models for data validation and serialization
-     * `services/`: business logic
-     * `tasks/`: background scheduled jobs
-     * `utils/`: reusable utilities
- * `tests/`: unit tests
- * `scripts/`: utility scripts for development, deployment, etc.
-   * `run-dev.sh`: script to run the application locally
- * `notebooks/`: Jupyter notebooks for exploration and model development
- * `.env`: local environment variables for development purposes (to be created for local use only from `.env.template`)
- * `Dockerfile`: Docker container build definition
-
  ## How to Run the Application
  
  ### Normally
@@ -66,7 +32,7 @@ docker build . -t ctp-slack-bot
  Run it with:
  
  ```sh
- docker run --env-file=.env -p 8000:8000 --name my-ctp-slack-bot-instance ctp-slack-bot
+ docker run --volume ./logs:/app/logs/ --env-file=.env -p 8000:8000 --name my-ctp-slack-bot-instance ctp-slack-bot
  ```
  
  ### For Development
@@ -87,13 +53,45 @@ If `localhost` port `8000` is free, running the following will make the applicat
  scripts/run-dev.sh
  ```
  
- You can check that it’s reachable by visiting [http://localhost:8000/health](http://localhost:8000/health).
+ ## Tech Stack
  
- ```text
- $ curl http://localhost:8000/health
- {"status":"healthy"}
- ```
+ * Hugging Face Spaces for hosting
+ * OpenAI for embeddings and language models
+ * Google Drive for reference data (i.e., the material to be incorporated into the bot’s knowledge base)
+ * MongoDB for data persistence
+ * Docker for containerization
+ * Python
+   * Slack Bolt client for interfacing with Slack
+   * See `pyproject.toml` for additional Python packages.
  
- In debug mode (`DEBUG=true`), [http://localhost:8000/env](http://localhost:8000/env) will pretty-print the non-sensitive environment variables as JSON.
+ ## General Project Structure
  
- Uvicorn will restart the application automatically when any source files are changed.
+ * `src/`
+   * `ctp_slack_bot/`
+     * `core/`: fundamental components like configuration (using pydantic), logging setup (loguru), and custom exceptions
+     * `db/`: database connection
+       * `repositories/`: repository pattern implementation
+     * `models/`: Pydantic models for data validation and serialization
+     * `services/`: business logic
+       * `answer_retrieval_service.py`: obtains an answer to a question from a language model using relevant context
+       * `content_ingestion_service.py`: converts content into chunks and stores them into the database
+       * `context_retrieval_service.py`: queries for relevant context from the database to answer a question
+       * `embeddings_model_service.py`: converts text to embeddings
+       * `event_brokerage_service.py`: brokers events between decoupled components
+       * `language_model_service.py`: answers questions using relevant context
+       * `question_dispatch_service.py`: listens for questions and retrieves relevant context to get answers
+       * `schedule_service.py`: runs background jobs
+       * `slack_service.py`: handles events from Slack and sends back responses
+       * `vector_database_service.py`: stores and queries chunks
+       * `vectorization_service.py`: converts chunks into chunks with embeddings
+     * `tasks/`: background scheduled jobs
+     * `utils/`: reusable utilities
+     * `app.py`: application entry point
+     * `containers.py`: the dependency injection container
+ * `tests/`: unit tests
+ * `scripts/`: utility scripts for development, deployment, etc.
+   * `run-dev.sh`: script to run the application locally
+ * `notebooks/`: Jupyter notebooks for exploration and model development
+ * `.env`: local environment variables for development purposes (to be created for local use only from `.env.template`)
+ * `Dockerfile`: Docker container build definition
+ * `pyproject.toml`: project definition and dependencies
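
The service descriptions added to the README suggest a retrieval-augmented flow: content is split into chunks, chunks are embedded and stored, and a question is answered from the most similar chunks. The following is a minimal standard-library sketch of that flow, not the repository's code. Every name in it (`chunk_text`, `embed`, `InMemoryVectorStore`) is hypothetical, the bag-of-words "embedding" is a stand-in for the OpenAI embeddings, and an in-memory list stands in for MongoDB.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 8) -> list[str]:
    """Split text into chunks of at most max_words words (hypothetical chunker)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a stand-in for a real embeddings model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Stand-in for a vector database: stores (chunk, embedding) pairs in a list."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, Counter]] = []

    def ingest(self, content: str) -> None:
        # Ingestion: chunk the content, embed each chunk, store both together.
        for chunk in chunk_text(content):
            self._rows.append((chunk, embed(chunk)))

    def search(self, question: str, k: int = 1) -> list[str]:
        # Retrieval: rank stored chunks by similarity to the embedded question.
        query = embed(question)
        ranked = sorted(self._rows, key=lambda row: cosine_similarity(query, row[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


store = InMemoryVectorStore()
store.ingest("Office hours are held every Tuesday at noon.")
store.ingest("Project demos are scheduled for the final week.")
context = store.search("When are office hours?")
print(context)
```

In a real deployment the retrieved `context` would be passed, together with the question, to a language model; that step is omitted here because it depends on an external API.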
src/ctp_slack_bot/core/config.py CHANGED
@@ -1,17 +1,21 @@
+ from loguru import logger
  from pydantic import Field, MongoDsn, NonNegativeFloat, NonNegativeInt, PositiveInt, SecretStr
  from pydantic_settings import BaseSettings, SettingsConfigDict
  from types import MappingProxyType
  from typing import Literal, Mapping, Optional, Self
  
- class Settings(BaseSettings): # TODO: Strong guarantees of validity, because garbage in = garbage out, and settings flow into all the nooks and crannies
+ class Settings(BaseSettings):
      """
      Application settings loaded from environment variables.
      """
  
-     # Application Configuration
-     DEBUG: bool = False
+     def __init__(self: Self, **data) -> None:
+         super().__init__(**data)
+         logger.debug("Created {}", self.__class__.__name__)
+         if self.__pydantic_extra__:
+             logger.warning("Extra unrecognized environment variables were provided: {}", ", ".join(self.__pydantic_extra__))
  
-     # Logging Configuration
+     # Logging Configuration ― not actually used to configure Loguru, but defined to prevent warnings about “unknown” environment variables
      LOG_LEVEL: Literal["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"] = Field(default_factory=lambda data: "DEBUG" if data.get("DEBUG", False) else "INFO")
      LOG_FORMAT: Literal["text", "json"] = "json"
  
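
The new `__init__` warns when `__pydantic_extra__` is non-empty. As a rough illustration of that pattern, here is a small sketch using a plain pydantic `BaseModel` and the standard `logging` module instead of `BaseSettings` and Loguru. It assumes the model is configured with `extra="allow"`, which this hunk does not show; `SketchSettings` and the logger name are hypothetical.

```python
import logging

from pydantic import BaseModel, ConfigDict

logger = logging.getLogger("settings_sketch")  # hypothetical logger name


class SketchSettings(BaseModel):
    """Hypothetical settings model, not the project's Settings class."""

    # Assumption: extra="allow" keeps unknown keys instead of ignoring or rejecting them.
    model_config = ConfigDict(extra="allow")

    LOG_FORMAT: str = "json"

    def __init__(self, **data) -> None:
        super().__init__(**data)
        # __pydantic_extra__ maps every key that matched no declared field to its value.
        if self.__pydantic_extra__:
            logger.warning("Unrecognized settings: %s", ", ".join(self.__pydantic_extra__))


settings = SketchSettings(LOG_FORMAT="text", DEBUG="TRUE")
unknown_keys = sorted(settings.__pydantic_extra__)
print(unknown_keys)
```

Declaring a field such as `LOG_FORMAT` moves it out of the extras, which is the reason the commit gives for keeping the logging fields defined even though they no longer configure Loguru.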