{ "cells": [ { "cell_type": "markdown", "id": "8ec2fef2", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Software Engineering Applied to LLMs\n", "* **Created by:** Eric Martinez\n", "* **For:** Software Engineering 2\n", "* **At:** University of Texas Rio-Grande Valley" ] }, { "cell_type": "markdown", "id": "02d71e5e", "metadata": {}, "source": [ "## Concerns with Quality and Performance Issues of LLM-Integrated Apps" ] }, { "cell_type": "markdown", "id": "60fef658", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "* Applications depend on external APIs, which can be flaky and costly. How do we avoid hitting APIs in testing?\n", "* Responses may not be correct or accurate. How do we increase confidence in the results?\n", "* Responses may be biased, unethical, or otherwise unwanted. How do we stop this type of output?\n", "* User requests may be unethical or otherwise unwanted. How do we filter this type of input?\n" ] }, { "cell_type": "markdown", "id": "5ef5865b", "metadata": {}, "source": [ "## Lessons from Software Engineering Applied to LLMs:" ] }, { "cell_type": "markdown", "id": "2fc1b19a", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#### Prototyping\n", "* Develop prompt prototypes early when working with customers or stakeholders; it is a fast and cheap way to test whether the idea will work.\n", "* Test against realistic examples early. Fail fast and iterate quickly.\n", "* Make a plan for how you will source dynamic data. If there is no path, the project is dead in the water." 
] }, { "cell_type": "markdown", "id": "2528a3c9", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#### Testing\n", "* Unit test prompts using traditional methods to increase confidence.\n", "* Unit test your prompts using LLMs to increase confidence.\n", "* Write tests that handle API errors or bad output (malformed, incorrect, unethical).\n", "* Use 'mocking' in integration tests to avoid unnecessary calls to APIs, flakiness, and unwanted charges." ] }, { "cell_type": "markdown", "id": "d9cdafd2", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#### Handling Bad Output\n", "* Develop 'retry' mechanisms when you get unwanted output.\n", "* Develop specific prompts for different 'retry' conditions. Include the context, what went wrong, and what needs to be fixed.\n", "* Consider adding logging to your app to keep track of how often your app gets bad output." ] }, { "cell_type": "markdown", "id": "8f7de0be", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#### Template Languages and Version Control\n", "* Consider writing your prompt templates in dynamic template languages like ERB, Handlebars, etc.\n", "* Keep prompt templates and prompts in version control in your app's repo.\n", "* Write tests for handling template engine errors." 
] }, { "cell_type": "markdown", "id": "3987a54c", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#### Prompt Injection/Leakage\n", "* User-facing prompts should be tested against prompt injection attacks\n", "* Validate input at the UI and LLM level\n", "* Consider using an LLM to check if an output is similar to the prompt\n", "* Have mechanisms for anomaly detection and incident response" ] }, { "cell_type": "markdown", "id": "a0e0c388", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#### Security\n", "* **Do not:** store API keys in application code as strings, encrypted or not.\n", "* **Do not:** store API keys in compiled binaries distributed to users.\n", "* **Do not:** store API keys in metadata files bundled with your application.\n", "* **Do:** store API keys in environment variables or cloud secrets.\n", "* **Do:** store API keys in a `.env` file that is blocked from version control. (Ideally these are encrypted with a secret that is not in version control, but that is beyond the scope of today's discussion.)\n", "* **Do:** create an intermediate web app (or API) with authentication/authorization that delegates requests to LLMs at run-time for use in front-end applications.\n", "* **Do:** if your front-end application does not have user accounts, consider implementing guest or anonymous accounts and expiring or rotating keys.\n", "* **Do:** when allowing LLMs to use tools, consider designing systems to pass through user IDs to tools so that the tools operate at the same level of access as the end-user." ] }, { "cell_type": "markdown", "id": "9d3ef132", "metadata": {}, "source": [ "## Production Deployment Considerations" ] }, { "cell_type": "markdown", "id": "a480904f", "metadata": {}, "source": [ "* Wrap LLM features as a web service or API. 
Don't give out your OpenAI keys directly in distributed software.\n", " - For example: Django, Flask, FastAPI, Express.js, Sinatra, Ruby on Rails" ] }, { "cell_type": "markdown", "id": "e6274d71", "metadata": {}, "source": [ "* Consider whether there are any regulations that might impact how you handle data, such as GDPR and HIPAA.\n", " - Regulation may require specific data handling and storage practices.\n", " - Cloud providers may offer compliance certifications and assessment tools.\n", " - On-prem deployments can provide more control of data storage and processing, but might require more resources (hardware, people, software) for management and maintenance.\n", " - Cloud providers like Azure offer tools such as Microsoft Defender for Cloud and Microsoft Purview for managing compliance." ] }, { "cell_type": "markdown", "id": "2f9b5cf9", "metadata": {}, "source": [ "- Using Cloud Services vs On-Prem\n", " - Cloud services offer many advantages such as scalability, flexibility, cost-effectiveness, and ease of management.\n", " - Easy to spin up resources and scale based on demand, without worrying about infrastructure or maintenance.\n", " - Wide range of tools: performance optimization, monitoring, security, reliability." ] }, { "cell_type": "markdown", "id": "17111661", "metadata": {}, "source": [ "- Container-based Architecture\n", " - Containerization is a lightweight virtualization method that packages an application and its dependencies into a single, portable unit called a container.\n", " - Containers can run consistently across different environments, making it easier to develop, test, and deploy applications. \n", " - Containerization is useful when you need to ensure consistent behavior across various platforms, simplify deployment and scaling, and improve resource utilization.\n", " - Common tools for deploying container-based architecture are Docker and Kubernetes." 
] }, { "cell_type": "markdown", "id": "56890eec", "metadata": {}, "source": [ "- Serverless Architectures\n", " - Serverless architecture is a cloud computing model in which the cloud provider manages the infrastructure and automatically allocates resources based on the application's needs.\n", " - Developers only need to focus on writing code, and the provider takes care of scaling, patching, and maintaining the underlying infrastructure. \n", " - Serverless architectures can be useful when you want to reduce operational overhead, build event-driven applications, and optimize costs by paying only for the resources you actually use.\n", " - Common tools to build serverless applications and APIs include Azure Functions, AWS Lambda, and Google Cloud Functions." ] }, { "cell_type": "markdown", "id": "b390fa49", "metadata": {}, "source": [ "- HuggingFace\n", " - Platforms like HuggingFace provide an ecosystem for sharing, collaborating on, and deploying AI models, including LLMs. \n", " - They offer pre-trained models, tools, and APIs that simplify the development and integration of AI-powered applications. \n", " - These platforms can be useful when you want to leverage existing models, collaborate with the AI community, and streamline the deployment process for your LLM-based applications." ] }, { "cell_type": "code", "execution_count": null, "id": "72115a67", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "celltoolbar": "Raw Cell Format", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.8" } }, "nbformat": 4, "nbformat_minor": 5 }