LeiPricingManager/03062025_V2_UMAP_Embedding_Classifier

Text Classification

binary-classification

multi-class-classification

LeiPricingManager committed on Mar 6

Commit de0e968 · verified · 1 Parent(s): 129c7f2

Upload README.md with huggingface_hub

Files changed (1)

README.md +31 -0

README.md ADDED

	@@ -0,0 +1,31 @@

+# 03062025_V2_UMAP_Embedding_Classifier
+This repository contains two final AutoGluon TabularPredictor models (binary and multi-class) built using UMAP-reduced embeddings from the [Alibaba-NLP/gte-large-en-v1.5](https://huggingface.co/Alibaba-NLP/gte-large-en-v1.5) model.
+## Key Details
+- **UMAP for Binary Classification**: `n_components` = 11, the best value found by Optuna tuning.
+- **UMAP for Multi-class Classification**: `n_components` = 43, the best value found by Optuna tuning.
+- **Data**: 112 technical questions, each labeled with a tier from 0 to 4.
+- **Performance Metrics**:
+  - **Binary**: Accuracy ≈95.65%, F1 ≈0.97, ROC AUC ≈0.91.
+  - **Multi-class**: Accuracy ≈56.52%, F1 ≈0.59, ROC AUC ≈0.74.
+## Usage
+1. **Loading the Models**:
+   ```python
+   from autogluon.tabular import TabularPredictor
+   binary_predictor = TabularPredictor.load("binary_final_model")
+   multi_predictor = TabularPredictor.load("multiclass_final_model")
+   ```
+2. **Preprocessing**: Generate embeddings for your input text with the Alibaba-NLP/gte-large-en-v1.5 model, then apply the UMAP transformation using the provided reducer files (`umap_reducer_binary.joblib` for the binary model, `umap_reducer_multi.joblib` for the multi-class model).
+3. **Prediction**: Call `predict()` for class labels and `predict_proba()` for class probabilities.
+## License
+This project is licensed under the Apache-2.0 License.
+## Contact
+For questions or collaboration, please contact LeiPricingManager.
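
The three Usage steps in the README above (load, preprocess, predict) can be chained roughly as follows. This is a minimal sketch, not the author's pipeline. It assumes the embeddings are produced through `sentence-transformers` with default pooling (the README does not say how they were generated), that `umap-learn` is installed so the reducer can be unpickled, and that the predictor's feature columns are in the same order as the UMAP components. The function names `name_columns`, `predict_tiers`, and `load_binary_artifacts` are hypothetical helpers introduced only for illustration.

```python
"""Sketch of the README's Usage steps: embed, reduce with UMAP, then predict."""


def name_columns(reduced_rows, feature_names):
    """Pair each reduced vector with the predictor's feature names.

    Raises ValueError when the UMAP output width differs from the number of
    features the predictor expects (the README reports 11 components for the
    binary model and 43 for the multi-class model).
    """
    records = []
    for row in reduced_rows:
        values = list(row)
        if len(values) != len(feature_names):
            raise ValueError(
                f"expected {len(feature_names)} components, got {len(values)}"
            )
        records.append(dict(zip(feature_names, values)))
    return records


def predict_tiers(texts, encoder, reducer, predictor):
    """Run preprocessing and prediction for a list of input strings."""
    import pandas as pd  # lazy import keeps name_columns dependency-free

    embeddings = encoder.encode(texts)       # text -> dense embeddings
    reduced = reducer.transform(embeddings)  # fitted UMAP projection
    # ASSUMPTION: predictor.features() lists the training columns in the
    # same order as the UMAP components; the README does not name them.
    frame = pd.DataFrame(name_columns(reduced, predictor.features()))
    return predictor.predict(frame), predictor.predict_proba(frame)


def load_binary_artifacts():
    """Load the encoder, reducer, and predictor named in the README."""
    import joblib
    from autogluon.tabular import TabularPredictor
    from sentence_transformers import SentenceTransformer

    # ASSUMPTION: loading via sentence-transformers. The gte-large-en-v1.5
    # checkpoint ships custom model code, hence trust_remote_code=True.
    encoder = SentenceTransformer(
        "Alibaba-NLP/gte-large-en-v1.5", trust_remote_code=True
    )
    reducer = joblib.load("umap_reducer_binary.joblib")
    predictor = TabularPredictor.load("binary_final_model")
    return encoder, reducer, predictor
```

With the artifacts downloaded into the working directory, `predict_tiers(["some question"], *load_binary_artifacts())` would return the predicted labels and the class probabilities. For the multi-class model, the same flow applies with `umap_reducer_multi.joblib` and `multiclass_final_model` swapped in. The explicit width check in `name_columns` is there because passing a vector reduced by the wrong reducer would otherwise fail later with a less obvious error.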