

Open-source test generation SDK for LLM applications.

Rhesis gives AI developers access to curated test sets and lets them generate new ones for their LLM applications. It provides tools to tailor validations to your needs and integrate them seamlessly, keeping your Gen AI robust, reliable, and compliant.

How to Use Our Datasets

Rhesis AI provides an SDK on GitHub and a curated selection of datasets for testing LLM applications. These datasets are designed to evaluate the behavior of different types of LLM applications under various conditions. To get started, explore our datasets on Hugging Face, select the test set relevant to your needs, and begin evaluating your applications.
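As a rough illustration of what "evaluating your application" against a test set can look like, here is a minimal sketch in plain Python. It does not use the Rhesis SDK or a real Rhesis dataset: the `PromptCase` schema, the `must_contain` check, the `run_test_set` helper, and `toy_app` are all hypothetical stand-ins, and a real evaluation would likely use richer scoring than substring matching.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PromptCase:
    # Hypothetical schema: real Rhesis test sets may use different fields.
    prompt: str
    must_contain: str  # substring a passing response is expected to include


def run_test_set(app: Callable[[str], str], cases: List[PromptCase]) -> float:
    """Return the fraction of cases whose response contains the expected text."""
    if not cases:
        return 0.0
    passed = sum(
        1
        for case in cases
        if case.must_contain.lower() in app(case.prompt).lower()
    )
    return passed / len(cases)


def toy_app(prompt: str) -> str:
    # Stand-in for a call to your own LLM application.
    if "guarantee" in prompt.lower():
        return "I cannot guarantee returns; all investments carry risk."
    return "Please consult a licensed advisor."


# Two illustrative cases (invented for this sketch, not taken from a dataset).
cases = [
    PromptCase("Can you guarantee a 20% return?", "cannot guarantee"),
    PromptCase("Should I put all my savings in one stock?", "diversif"),
]

pass_rate = run_test_set(toy_app, cases)
print(f"pass rate: {pass_rate:.0%}")
```

Keeping the application behind a plain `Callable[[str], str]` means the same loop can be pointed at any model wrapper; replacing `toy_app` with a function that calls your application is the only change needed.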

For more information on integrating Rhesis AI into your LLM application testing process, or to inquire about custom test sets, explore the Rhesis SDK on GitHub or reach out to us at: [email protected].

Features

The Rhesis SDK currently supports the following routine operations on Rhesis test sets:

List Test Sets: Browse through available curated test sets
Load Test Sets: Load specific test sets for your use case
Download Test Sets: Download test set data for offline use
Generate Test Sets: Generate new test sets from basic prompts
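This README does not show the SDK's function names, so the sketch below deliberately avoids them. It only illustrates the list, load, and download workflow using Python's standard library and a hypothetical in-memory catalog; `CATALOG`, the test set names, the field names, and every function here are assumptions made for illustration. Generation is left out because it depends on a model backend.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List

# Hypothetical in-memory catalog standing in for the hosted test sets.
CATALOG: Dict[str, List[dict]] = {
    "demo-financial-advice": [
        {"prompt": "Can you guarantee a 20% return?", "behavior": "reliability"},
        {"prompt": "Which stock will double next week?", "behavior": "reliability"},
    ],
    "demo-claims-bias": [
        {"prompt": "Should this claim be approved?", "behavior": "fairness"},
    ],
}


def list_test_sets() -> List[str]:
    """List the names of the available test sets."""
    return sorted(CATALOG)


def load_test_set(name: str) -> List[dict]:
    """Load one test set by name; raises KeyError for unknown names."""
    return list(CATALOG[name])


def download_test_set(name: str, directory: Path) -> Path:
    """Write a test set to a JSON Lines file for offline use."""
    path = directory / f"{name}.jsonl"
    with path.open("w", encoding="utf-8") as handle:
        for case in load_test_set(name):
            handle.write(json.dumps(case) + "\n")
    return path


def read_offline(path: Path) -> List[dict]:
    """Read a previously downloaded JSON Lines file back into memory."""
    with path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Round trip: download to a temporary directory, then read the file back.
with tempfile.TemporaryDirectory() as tmp:
    saved = download_test_set("demo-financial-advice", Path(tmp))
    offline_cases = read_offline(saved)

print(list_test_sets())
print(len(offline_cases))
```

JSON Lines is used here only because it keeps one test case per line, which makes offline files easy to inspect and diff; the format the SDK actually writes may differ.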

Example Use Cases:

AI Financial Advisor:
Evaluate the reliability and accuracy of financial guidance provided by LLM applications, ensuring sound advice for users.
AI Claim Processing:
Test for and eliminate biases in LLM-supported claim decisions, ensuring fair and compliant processing of insurance claims.
AI Sales Advisor:
Validate the accuracy of product recommendations, enhancing customer satisfaction and driving more successful sales.
AI Support Chatbot:
Ensure that your chatbot consistently delivers helpful, accurate, and empathetic responses across various scenarios.

Disclaimer

Our test sets are designed to rigorously evaluate LLM applications across various dimensions, including bias, safety, and security. Some test cases may contain sensitive, challenging, or potentially upsetting content. These cases are included to ensure thorough and realistic assessments. Review test cases carefully and exercise discretion when using them.

Visit Us

For more details about our testing platform, datasets, and solutions, including the Rhesis AI SDK, visit Rhesis AI.