{ "cells": [ { "cell_type": "markdown", "id": "8ec2fef2", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Software Engineering Applied to LLMs\n", "* **Created by:** Eric Martinez\n", "* **For:** Software Engineering 2\n", "* **At:** University of Texas Rio-Grande Valley" ] }, { "cell_type": "markdown", "id": "02d71e5e", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#### 🔍 Concerns with Quality and Performance Issues of LLM Integrated Apps" ] }, { "cell_type": "markdown", "id": "449db7d4", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "❓ **Accuracy** - How do we boost confidence, correctness, and quality in outputs?" ] }, { "cell_type": "markdown", "id": "d86252a4", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "❓ **Validation & Verification** - How do we know we built the right thing? How do we know we built the thing right?" ] }, { "cell_type": "markdown", "id": "681ffe8e", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "❓ **Bias and Ethics** - How do we minimize harmful output?" ] }, { "cell_type": "markdown", "id": "441d106d", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "❓ **How** do traditional software engineering practices relate to LLM prompts and systems?" 
] }, { "cell_type": "markdown", "id": "5ef5865b", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Lessons from Software Engineering Applied to LLMs:" ] }, { "cell_type": "markdown", "id": "651ed117", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### 🎯 Agile & Iterative Design:" ] }, { "cell_type": "markdown", "id": "d784a289", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "🔸 **Rapid prompt prototyping** - Engage customers/stakeholders early on to validate that you are building the right prompts" ] }, { "cell_type": "markdown", "id": "907c5e83", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "🔸 **Real-world examples** - Test and iterate quickly against examples that simulate real-world inputs from users. Consider edge cases early" ] }, { "cell_type": "markdown", "id": "3392caa7", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "🔸 **Collaborate** - Discuss and plan ahead for how you might get access to external data sources or integrate with other systems" ] }, { "cell_type": "markdown", "id": "4aa1c373", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### 📏 Testing:" ] }, { "cell_type": "markdown", "id": "d40f58cf", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "✅ Unit test your prompts with traditional test frameworks" ] }, { "cell_type": "markdown", "id": "1e0c1912", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "✅ Leverage LLMs for non-deterministic unit testing and generating example data" ] }, { "cell_type": "markdown", "id": "98b83f99", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "✅ Handle API errors, bad output, and harmful output as part of your testing suite, CI practices, and team workflow" ] }, { "cell_type": "markdown", "id": "be00ab2a", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "✅ Use mocking to prevent unwanted API calls 
in integration tests (and save money!)" ] }, { "cell_type": "markdown", "id": "adb5e77a", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### 🔄 Handling Bad Output:" ] }, { "cell_type": "markdown", "id": "aceb00b5", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "💡 **Error handling** - Recover from unwanted or incorrect output with robust retry mechanisms" ] }, { "cell_type": "markdown", "id": "252c7d5b", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "💡 **Customized retry prompts** - Guide the LLM with custom prompts that include the desired goal, the output, and the error." ] }, { "cell_type": "markdown", "id": "374eb0be", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "💡 **Logging and Monitoring** - Track output quality, malicious input, and harmful output." ] }, { "cell_type": "markdown", "id": "95af337f", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### 📚 Template Languages & Version Control:" ] }, { "cell_type": "markdown", "id": "2972ba3f", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "📝 **Dynamic template languages** - Use templating engines like ERB, Handlebars, etc. for dynamically building prompts and leveraging existing testing tools" ] }, { "cell_type": "markdown", "id": "4f44c1c4", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "📝 **Version control** - Manage prompt templates in the app's repo" ] }, { "cell_type": "markdown", "id": "642bd8ae", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "📝 **Team Policy / Code Review** - Develop expectations and team practices around managing changes to prompts" ] }, { "cell_type": "markdown", "id": "e91111a4", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### 💉 Prompt Injection/Leakage:" ] }, { "cell_type": "markdown", "id": "db84eff5", "metadata": { "slideshow": { "slide_type": "fragment" } }, 
"source": [ "🔐 Test for prompt injection attacks" ] }, { "cell_type": "markdown", "id": "99a6f711", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "🔐 Validate input at UI and LLM level" ] }, { "cell_type": "markdown", "id": "df148645", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "🔐 Use LLMs to analyze input/output similarity to known prompt injection attacks and/or the prompt itself" ] }, { "cell_type": "markdown", "id": "d6885dd4", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "🔐 Implement anomaly detection & incident response" ] }, { "cell_type": "markdown", "id": "e6b3ef98", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### 🔒 Security (Do Not):" ] }, { "cell_type": "markdown", "id": "2b79206c", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "🚫 Store API keys in app code, binaries, or metadata files" ] }, { "cell_type": "markdown", "id": "2b33263f", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "🚫 Give LLM access to tools with higher privileges than the users themselves!" 
] }, { "cell_type": "markdown", "id": "1b4e2040", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### 🔒 Security (Do):" ] }, { "cell_type": "markdown", "id": "5cdf282f", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "✅ Store API keys in environment variables or cloud secrets" ] }, { "cell_type": "markdown", "id": "8093990a", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "✅ Block files containing secrets from entering version control by adding them to `.gitignore`" ] }, { "cell_type": "markdown", "id": "8b19f8e4", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "✅ Use intermediate web apps/APIs with authentication/authorization for accessing LLM features" ] }, { "cell_type": "markdown", "id": "5ce316c7", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "✅ Implement transparent guest/anonymous accounts & key rotation when apps don't require authentication to use LLM features" ] }, { "cell_type": "markdown", "id": "9d3ef132", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "## Production Deployment Considerations" ] }, { "cell_type": "markdown", "id": "a480904f", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "#### Wrap LLM features as a web service or API. 
Don't give out your OpenAI keys directly in distributed software.\n", "* For example: Django, Flask, FastAPI, Express.js, Sinatra, Ruby on Rails" ] }, { "cell_type": "markdown", "id": "e6274d71", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "#### Consider whether there are any regulations that might impact how you handle data, such as GDPR and HIPAA.\n", "- Regulation may require specific data handling and storage practices.\n", "- Cloud providers may offer compliance certifications and assessment tools.\n", "- On-prem deployments can provide more control of data storage and processing, but might require more resources (hardware, people, software) for management and maintenance.\n", "- Cloud providers like Azure offer tools such as Microsoft Defender for Cloud and Microsoft Purview for managing compliance." ] }, { "cell_type": "markdown", "id": "2f9b5cf9", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "#### Using Cloud Services vs On-Prem\n", "- Cloud services offer many advantages such as scalability, flexibility, cost-effectiveness, and ease of management.\n", "- Easy to spin up resources and scale based on demand, without worrying about infrastructure or maintenance.\n", "- Wide range of tools: performance optimization, monitoring, security, reliability." ] }, { "cell_type": "markdown", "id": "17111661", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "#### Container-based Architecture\n", "- Containerization is a lightweight virtualization method that packages an application and its dependencies into a single, portable unit called a container.\n", "- Containers can run consistently across different environments, making it easier to develop, test, and deploy applications. 
\n", "- Containerization is useful when you need to ensure consistent behavior across various platforms, simplify deployment and scaling, and improve resource utilization.\n", "- Common tools for deploying container-based architecture are Docker and Kubernetes." ] }, { "cell_type": "markdown", "id": "56890eec", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "#### Serverless Architectures\n", "- Serverless architectures are a cloud computing model where the cloud provider manages the infrastructure and automatically allocates resources based on the application's needs.\n", "- Developers only need to focus on writing code, and the provider takes care of scaling, patching, and maintaining the underlying infrastructure. \n", "- Serverless architectures can be useful when you want to reduce operational overhead, build event-driven applications, and optimize costs by paying only for the resources you actually use.\n", "- Common tools to build serverless applications and APIs include Azure Functions, AWS Lambda, and Google Cloud Functions." ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.8" } }, "nbformat": 4, "nbformat_minor": 5 }