Update README.md
Browse filesUpdated Readme (Key Features, SDK Link, added Disclaimer)
README.md
CHANGED
@@ -1,34 +1,34 @@
|
|
1 |
# Rhesis AI - Ship Gen AI applications that deliver value, not surprises!
|
2 |
|
3 |
-
Rhesis AI
|
4 |
|
5 |
### Key Features:
|
6 |
|
7 |
-
1. **
|
8 |
-
|
9 |
|
10 |
-
2. **
|
11 |
-
|
12 |
|
13 |
-
3. **Domain-
|
14 |
-
|
15 |
|
16 |
-
4. **
|
17 |
-
|
18 |
|
19 |
-
5. **
|
20 |
-
|
21 |
|
22 |
-
6. **
|
23 |
-
|
24 |
|
25 |
### Example Use Cases:
|
26 |
|
27 |
- **AI Financial Advisor**:
|
28 |
-
Evaluate the reliability and accuracy of financial guidance provided by
|
29 |
|
30 |
- **AI Claim Processing**:
|
31 |
-
Test for and eliminate biases in
|
32 |
|
33 |
- **AI Sales Advisor**:
|
34 |
Validate the accuracy of product recommendations, enhancing customer satisfaction and driving more successful sales.
|
@@ -36,22 +36,15 @@ Rhesis AI provides an advanced testing platform tailored for Large Language Mode
|
|
36 |
- **AI Support Chatbot**:
|
37 |
Ensure that your chatbot consistently delivers helpful, accurate, and empathetic responses across various scenarios.
|
38 |
|
39 |
-
###
|
40 |
-
|
41 |
-
1. **How does Rhesis AI contribute to LLM application assessment?**
|
42 |
-
Rhesis AI helps organizations assess the robustness, consistency, and compliance of LLM applications through automated testing. We focus on real-world scenarios, using adversarial tests and domain-specific benchmarks to uncover vulnerabilities and ensure your applications perform as expected.
|
43 |
-
|
44 |
-
2. **Why is benchmarking essential for LLM applications?**
|
45 |
-
Benchmarking ensures that your LLM application performs reliably under various conditions and use cases. Our continuously updated benchmarks, based on industry standards, allow organizations to assess their applications' resilience to threats and evolving compliance requirements.
|
46 |
|
47 |
-
|
48 |
-
LLM applications are dynamic and often undergo changes due to updates or external factors like fine-tuning. Continuous testing helps identify emerging issues, ensuring that your application maintains performance and reliability over time. With real-time feedback and transparent insights, you can stay ahead of risks and improve your application's quality.
|
49 |
|
50 |
-
|
51 |
|
52 |
-
|
53 |
|
54 |
-
|
55 |
|
56 |
### Visit Us
|
57 |
For more details about our testing platform, datasets, and solutions, including the Rhesis AI SDK, visit [Rhesis AI](https://www.rhesis.ai/).
|
|
|
1 |
# Rhesis AI - Ship Gen AI applications that deliver value, not surprises!
|
2 |
|
3 |
+
Open-source SDK for testing and validating LLM applications. Rhesis AI helps you build reliable LLM applications by providing curated test sets, dynamic test generation, and seamless workflow integration. Our goal is to help organizations validate, evaluate, and ensure the robustness, reliability, and compliance of LLM applications, across multiple domains and use cases. Below are the key features of Rhesis AI:
|
4 |
|
5 |
### Key Features:
|
6 |
|
7 |
+
1. **Comprehensive test sets**
|
8 |
+
Test LLM applications rigorously across multiple dimensions, including security, bias, reliability, and compliance. Built on industry standards from NIST, MITRE, and OWASP, ensuring robust and defensible evaluations.
|
9 |
|
10 |
+
2. **Adaptive & context-aware**
|
11 |
+
Automatically generate multi-turn, scenario-driven test cases tailored to your application. Test suites dynamically refine based on real-world usage and expert feedback to improve accuracy and relevance.
|
12 |
|
13 |
+
3. **Domain-specific coverage**
|
14 |
+
Leverage pre-built, domain-specific test benches designed to detect sector-specific vulnerabilities in financial services, insurance and more—ensuring reliability and reducing operational risk.
|
15 |
|
16 |
+
4. **Always up-to-date**
|
17 |
+
Stay ahead of emerging threats with automated test updates. Our SDK helps to continuously integrate new adversarial patterns and business-relevant risks, keeping your evaluation process current and effective.
|
18 |
|
19 |
+
5. **Automated & scalable**
|
20 |
+
Run iterative, large-scale test evaluations with minimal setup. Our SDK integrates into CI/CD pipelines, enabling automated, repeatable testing for robust AI validation at scale.
|
21 |
|
22 |
+
6. **Expert-guided**
|
23 |
+
Enhance collaboration between developers, domain experts, and compliance teams. Our SDK allows human-in-the-loop evaluations, integrating expert feedback to refine test cases and improve Gen AI performance iteratively.
|
24 |
|
25 |
### Example Use Cases:
|
26 |
|
27 |
- **AI Financial Advisor**:
|
28 |
+
Evaluate the reliability and accuracy of financial guidance provided by LLM applications, ensuring sound advice for users.
|
29 |
|
30 |
- **AI Claim Processing**:
|
31 |
+
Test for and eliminate biases in LLM-supported claim decisions, ensuring fair and compliant processing of insurance claims.
|
32 |
|
33 |
- **AI Sales Advisor**:
|
34 |
Validate the accuracy of product recommendations, enhancing customer satisfaction and driving more successful sales.
|
|
|
36 |
- **AI Support Chatbot**:
|
37 |
Ensure that your chatbot consistently delivers helpful, accurate, and empathetic responses across various scenarios.
|
38 |
|
39 |
+
### How to Use Our Datasets
|
|
|
|
|
|
|
|
|
|
|
|
|
40 |
|
41 |
+
Rhesis AI provides a [SDK on Github](https://github.com/rhesis-ai/rhesis-sdk) and curated selection of datasets for testing LLM applications. These datasets are designed to evaluate the behavior of different types of LLM applications under various conditions. To get started, explore our datasets on Hugging Face, select the relevant test set for your needs, and begin evaluating your applications.
|
|
|
42 |
|
43 |
+
For more information on how to integrate Rhesis AI into your LLM application testing process, or to inquire about custom test sets, feel free to explore our [Rhesis SDK on Github](https://github.com/rhesis-ai/rhesis-sdk) or reach out to us at: [email protected].
|
44 |
|
45 |
+
### Disclaimer
|
46 |
|
47 |
+
Our test sets are designed to rigorously evaluate LLM applications across various dimensions, including bias, safety, and security. Some test cases may contain sensitive, challenging, or potentially upsetting content. These cases are included to ensure thorough and realistic assessments. Users should review test cases carefully and exercise discretion when utilizing them.
|
48 |
|
49 |
### Visit Us
|
50 |
For more details about our testing platform, datasets, and solutions, including the Rhesis AI SDK, visit [Rhesis AI](https://www.rhesis.ai/).
|