Introduction
In the past two years, Large Language Models (LLMs) have become a powerful force driving innovation in the field of Artificial Intelligence. As LLMs' natural language processing capabilities continue to improve, Prompt Engineering, the practice of designing appropriate and refined prompts that help humans interact with LLMs, has greatly expanded AI's applications across industries, guiding LLMs to respond to requests and generate content in a suitable manner. For example, the recently trending technique of Retrieval-Augmented Generation (RAG) depends heavily on high-quality prompts.
Issues
Prompts are the inputs that interact directly with LLMs, and crafting high-quality ones requires certain skills and a considerable amount of debugging. In the course of our practice, we have found some relatively common issues:
- Prompt templates are often hard-coded into AI application code, which makes it hard to release prompt changes flexibly (see the sketch after this list).
- The development, testing, and deployment of prompts are often fragmented, with loose connections between the stages and no unified evaluation standard.
- Prompt development also needs a lifecycle management paradigm that supports team collaboration, with versioning and auditing capabilities.
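To make the first issue concrete, here is a minimal sketch (the file layout and template name are hypothetical) contrasting a hard-coded prompt with one loaded from a separately versioned template file:

```python
from pathlib import Path

# Anti-pattern: the template is baked into application code, so every
# prompt change requires editing the code and redeploying the service.
HARDCODED_PROMPT = "Summarize the following text in three bullet points:\n{text}"

# Decoupled pattern: the template lives in its own version-controlled file
# (e.g. prompts/summarize.txt), so it can be reviewed, audited, and
# released independently of the application code.
def load_prompt(name: str, prompt_dir: str = "prompts") -> str:
    return Path(prompt_dir, f"{name}.txt").read_text(encoding="utf-8")

prompt = load_prompt("summarize").format(text="<user input>")
```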
PE × PE: Platform Engineering for Prompt Engineering
Platform Engineering, a technology wave that emerged around the same time as LLMs in late 2022, offers insights into solving the aforementioned issues. Platform Engineering emphasizes the separation of concerns and constructs a golden path for application lifecycle management through techniques such as XaC (Everything as Code) and GitOps, achieving automated deployment, flexible version control, efficient team collaboration, and developer self-service.
When Platform Engineering meets Prompt Engineering, we anticipate the emergence of an efficient and intelligent paradigm for developing prompt templates and AI applications. To address the Prompt Engineering problems above with the concepts of Platform Engineering, we can abstract and encapsulate the tools and configurations required to develop, debug, and deploy a set of high-quality prompt templates, such as a Python SDK, LLM parameters, and application types. By exposing only a simple, easy-to-use UI or SDK to developers, we can save them from reinventing the wheel and let them focus on feature implementation and innovation.
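As a toy illustration of this idea (all names below are hypothetical, not a real SDK), the platform could hide the model choice, parameters, and template wiring behind a single developer-facing call:

```python
from dataclasses import dataclass

# Hypothetical developer-facing interface: the platform team maintains
# the template, model, and parameters; developers only pass business inputs.
@dataclass
class PromptApp:
    name: str
    env: str
    template: str = "Summarize: {text}"   # resolved from a registry in practice
    model: str = "gpt-4"
    temperature: float = 0.3

    def run(self, **inputs: str) -> str:
        rendered = self.template.format(**inputs)
        # Placeholder for the actual LLM call the platform would make.
        return f"[{self.model}@{self.env}] {rendered}"

app = PromptApp(name="summarizer", env="prod")
print(app.run(text="Platform Engineering meets Prompt Engineering."))
```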
Some open-source projects, such as Dify.AI, Pezzo, and AIConfig, have already taken solid steps in this direction. Dify.AI is a comprehensive AI application development platform that offers a prompt IDE, in which users can compose prompt templates, integrate them with business code, and deploy the AI application. Pezzo is also a one-stop AI application development platform, but it focuses more on prompt management, providing versioning, observability, and cost monitoring for prompt templates. AIConfig, meanwhile, is used mainly as a Python SDK that developers embed in their code to build prompt-decoupled, model-agnostic AI applications.
Next, we will introduce these three open-source projects in more detail.
Dify.AI
Code Repository: https://github.com/langgenius/dify
Dify is an open-source LLM application development platform that integrates the concepts of Backend as a Service (BaaS) and LLMOps, allowing both developers and non-technical personnel to quickly define and build generative AI applications and participate in the related data operations. Dify comes with the technology stack required to build LLM applications, including support for mainstream open-source or commercial models, an intuitive prompt IDE, RAG engines, and a flexible agent framework.
Dify offers an easy-to-use prompt IDE and supports two common application types, Chat App and Text Generator. Developers can follow the steps below to complete the writing and debugging of prompt templates.
- Identify the application scenario and functional requirements.
- Design and test the prompt templates and model parameters.
- Orchestrate the prompt templates with user input.
- Deploy the application (a sketch of calling a deployed app over its API follows these steps).
- Observe and continuously iterate.
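After deployment, the app is reachable over HTTP. The sketch below follows the request shape of Dify's app API at the time of writing; treat the endpoint, fields, and key format as assumptions to verify against your own deployment and the official docs:

```python
import requests

API_KEY = "app-..."                    # the app's API key from the Dify console
BASE_URL = "https://api.dify.ai/v1"    # or your self-hosted endpoint

# Send an end-user message to a deployed Chat App; `inputs` carries the
# variables defined in the prompt template, `query` is the user message.
resp = requests.post(
    f"{BASE_URL}/chat-messages",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "inputs": {},
        "query": "What can you do?",
        "response_mode": "blocking",
        "user": "demo-user",
    },
    timeout=30,
)
print(resp.json().get("answer"))
```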
Pezzo
Code Repository: https://github.com/pezzolabs/pezzo
Pezzo aims to simplify the development process of AI applications, providing capabilities such as version management, real-time integration and deployment, observability, and cost estimation for prompt templates.
Similar to Dify, Pezzo provides AI application developers with a unified workspace that makes it convenient to develop and test prompt templates. In addition, Pezzo offers observation and diagnostic tools for LLM API requests, which can be used for cost monitoring and prompt optimization. Pezzo also provides GitHub-like version management for prompts and introduces the concept of Environments (e.g., dev, pre, and prod), similar to application deployment, facilitating collaborative development around prompt templates within a team.
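The resulting workflow looks roughly like the sketch below. The client and method names are hypothetical stand-ins for illustration, not Pezzo's actual SDK; the point is that promoting a new prompt version from dev to prod requires no application code change:

```python
import os

# Hypothetical prompt-registry client illustrating the Pezzo-style pattern.
class PromptRegistry:
    def __init__(self, api_key: str, environment: str):
        self.api_key, self.environment = api_key, environment

    def get_prompt(self, name: str) -> dict:
        # A real client would fetch the version published to this
        # environment and report latency/cost metrics for observability.
        return {"content": "Classify the sentiment of: {text}",
                "model": "gpt-3.5-turbo", "temperature": 0.0}

registry = PromptRegistry(
    api_key=os.environ.get("PEZZO_API_KEY", ""),
    environment="prod",  # e.g. dev, pre, prod
)
prompt = registry.get_prompt("AnalyzeSentiment")
print(prompt["content"].format(text="I love this product!"))
```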
AIConfig
Code Repository: https://github.com/lastmile-ai/aiconfig
AIConfig is a configuration-driven tool designed to decouple LLM parameters and prompt templates from the business code of AI applications, enabling developers to build model-agnostic AI applications. Its Python SDK supports several mainstream LLMs and flexible parameter configuration, simplifying the development and updating of prompt templates.
AIConfig uses a standardized JSON format to store LLM model configurations, the inputs and outputs of prompt templates, and other AI application metadata that may change frequently. The AIConfig Python SDK is designed to be independent of the backend LLM it calls and supports multi-modal data, including text, images, and audio. As a result, AIConfig helps developers realize the idea of "one set of application business code connecting to and running on multiple base models."
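For illustration, the sketch below writes a minimal aiconfig document and runs one of its prompts via the SDK's documented `AIConfigRuntime.load`/`run` entry points; the file contents and prompt names are made up, and running it assumes the `aiconfig` package is installed and model credentials (e.g. an OpenAI key) are configured:

```python
import asyncio
import json
from aiconfig import AIConfigRuntime

# Model settings and prompt templates live in JSON, outside the app code.
doc = {
    "name": "travel_demo",
    "schema_version": "latest",
    "metadata": {"models": {"gpt-4": {"model": "gpt-4", "temperature": 0.7}}},
    "prompts": [
        {"name": "get_activities",
         "input": "Suggest three activities to do in {{city}}.",
         "metadata": {"model": "gpt-4"}},
    ],
}
with open("travel.aiconfig.json", "w") as f:
    json.dump(doc, f)

async def main():
    config = AIConfigRuntime.load("travel.aiconfig.json")
    # Switching the backend model means editing the JSON, not this code.
    result = await config.run("get_activities", params={"city": "Paris"})
    print(result)

asyncio.run(main())
```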
AI-Native and Cloud-Native
In the previous section, we introduced scenarios where Platform Engineering is applied to Prompt Engineering. Beyond prompt management, Platform Engineering can play a crucial role throughout the entire development, testing, and operations process of AI applications.
As the capabilities of LLMs continue to grow and LLMs gradually become an important piece of infrastructure, the role of LLMs and their API providers in the internet world may increasingly resemble that of today's operating systems (as in the LLM OS concept proposed by Andrej Karpathy) and cloud service vendors. More and more applications will use LLMs to process text, image, audio, and video data. Future AI-native applications may therefore face the same issues that cloud-native applications face today, including:
- A growing number and complexity of LLM-related configurations (such as temperature, top-k and top-p, and context window size).
- Important configurations, such as model parameters and prompt templates, scattered throughout the application code, making them hard to manage and change flexibly.
- The need to avoid lock-in by LLM providers, to reduce potential risks related to security and compliance, and to improve cost-effectiveness.
Platform Engineering provides effective methodological guidance for the lifecycle management of cloud-native applications. For the potential issues facing AI-native applications, its principles, such as separation of concerns, Infrastructure as Code, and configuration-driven operations, suggest a promising division of labor: providers of LLM infrastructure (algorithm engineers) focus on model training and fine-tuning and expose CRUD APIs for LLM resources; platform engineers encapsulate these APIs into the atomic capabilities AI applications need and implement the invocation logic on top of the backend model APIs; finally, platform engineers abstract these capabilities into simple, easy-to-use declarative interfaces for application developers, covering model types, prompt templates, and other related configurations.
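As a rough sketch of this layering (all names are hypothetical), the developer-facing declarative interface might reduce an AI application to a small spec that the platform layer resolves against whichever backend model API is configured:

```python
from dataclasses import dataclass, field

# Developer-facing declarative spec: state *what* the app needs.
@dataclass
class AIAppSpec:
    name: str
    model_type: str                     # e.g. "chat" or "completion"
    prompt_template: str
    parameters: dict = field(default_factory=dict)  # temperature, top_p, ...

# Platform layer: decide *how* to serve it, keeping the app free of
# provider lock-in by mapping the spec onto a configurable backend.
PROVIDERS = {"openai": "https://api.openai.com/v1",
             "local": "http://llm.internal/v1"}

def deploy(spec: AIAppSpec, provider: str = "openai") -> str:
    endpoint = PROVIDERS[provider]
    # A real implementation would call the provider's model-serving
    # (CRUD) APIs here; this just reports the resolved binding.
    return f"{spec.name} -> {endpoint} ({spec.model_type}, {spec.parameters})"

spec = AIAppSpec(
    name="faq-bot",
    model_type="chat",
    prompt_template="Answer the question: {question}",
    parameters={"temperature": 0.2, "top_p": 0.9},
)
print(deploy(spec))
```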
Summary
In this article, we analyzed some issues in the currently popular practice of prompt lifecycle management for LLM applications and, through three typical open-source projects, proposed a solution based on the principles of Platform Engineering. We will continue to introduce and explore the intersection of Platform Engineering and AI/ModelOps. We believe that in the future, Platform Engineering and Prompt Engineering, among other AI application development practices, can spark even more innovations. 💥