This Y Combinator video discusses cutting-edge prompt engineering techniques used by leading AI startups. The speakers share examples, insights, and best practices for creating effective prompts, including metaprompting, the use of examples, and strategies for handling longer prompts. They also address the importance of evaluations (evals) and the evolving role of founders as "forward-deployed engineers."
In the Parahelp example, the system prompt defines the AI agent's overall role (a customer service manager); the developer prompt carries customer-specific details on how to handle support tickets (not shown in the example); and a user prompt is unnecessary because the end user never interacts with the prompt directly. By analogy, the system prompt defines the high-level API, while the developer prompt supplies each customer's specific calls against it, as in the sketch below.
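To make the layering concrete, here is a minimal sketch assuming an OpenAI-style chat API that accepts distinct "system" and "developer" roles; all prompt text and names are illustrative, not Parahelp's actual prompts.

```python
# Minimal sketch of the three-layer prompt split, assuming an
# OpenAI-style chat API with distinct "system" and "developer" roles.
# All prompt text below is illustrative, not Parahelp's real prompt.

SYSTEM_PROMPT = (
    "You are a customer support manager. Review each draft reply from "
    "the support agent and either approve it or reject it with a reason, "
    "following the policies provided in the developer message."
)

# Generated per customer at deployment time.
developer_prompt = (
    "Customer policies: escalate refund requests over $100 to a human; "
    "answer return questions using the RETURN_POLICY document only."
)

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},        # high-level "API"
    {"role": "developer", "content": developer_prompt},  # per-customer specifics
    # No "user" message: the end user never addresses this agent directly;
    # ticket content arrives as part of the developer-supplied context.
]
```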
Prompt folding is a metaprompting technique in which a prompt is used to generate an improved version of itself. Rather than manually rewriting the prompt, you feed the LLM the current prompt along with examples where it failed or fell short of expectations, and let the model draft the refinement; repeating this loop yields iterative improvement, as sketched below.
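A minimal sketch of what that loop can look like, assuming a generic `complete()` helper that wraps whatever LLM client you use; the metaprompt wording and the failure-case format are assumptions, not a canonical recipe.

```python
# Prompt-folding sketch: the model rewrites its own prompt using
# failure cases collected from evals. `complete` is a placeholder
# for your LLM client; the metaprompt wording is an assumption.

def complete(prompt: str) -> str:
    raise NotImplementedError("wrap your LLM client here")

METAPROMPT = """You are an expert prompt engineer.
Below is a prompt and examples where it produced bad outputs.
Rewrite the prompt so these failures would not recur.
Return only the improved prompt.

## Current prompt
{prompt}

## Failure cases
{failures}
"""

def fold(prompt: str, failed_cases: list[str], rounds: int = 3) -> str:
    """Iteratively ask the model to improve its own prompt."""
    for _ in range(rounds):
        prompt = complete(METAPROMPT.format(
            prompt=prompt,
            failures="\n---\n".join(failed_cases),
        ))
    return prompt
```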
For longer prompts, several strategies are suggested. One is to keep a running Google Doc noting areas for improvement; that document, along with the original prompt, is then fed to a long-context model like Gemini Pro, which proposes edits. Another is to read Gemini Pro's "thinking traces" during evaluations to understand the model's reasoning and spot the parts of the prompt that need refinement. Using Gemini directly through its website also lets you drag and drop JSON eval files, so you can iterate without building special tooling around the API. A sketch of the running-doc workflow follows.
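The sketch below again uses a placeholder `complete()` wrapper, here imagined around a long-context model such as Gemini Pro; the file names and request wording are assumptions.

```python
# Running-doc workflow sketch: feed the full prompt plus the
# accumulated improvement notes to a long-context model and ask it
# to propose edits. File names and request wording are assumptions.

from pathlib import Path

def complete(prompt: str) -> str:
    raise NotImplementedError("wrap a long-context model, e.g. Gemini Pro")

EDIT_REQUEST = """Below is a long production prompt and a document of
notes on where it underperforms. Propose a revised prompt that addresses
the notes, and list every change with a one-line rationale.

## Prompt
{prompt}

## Improvement notes
{notes}
"""

def suggest_edits(prompt_file: str, notes_file: str) -> str:
    prompt = Path(prompt_file).read_text()
    notes = Path(notes_file).read_text()  # exported from the running Google Doc
    return complete(EDIT_REQUEST.format(prompt=prompt, notes=notes))
```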
Different LLMs exhibit different "personalities." Claude is described as more human-like and easy to steer, while Llama 2 is more challenging, demanding more precise prompting and sometimes hands-on, almost RLHF-style (Reinforcement Learning from Human Feedback) steering. Prompt engineering strategies therefore need to be tailored to each model's characteristics.
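As a toy illustration of tailoring prompts per model family (the specific wording differences below are assumptions, not tested guidance):

```python
# Illustrative registry of per-model prompt styles. The wording
# differences are assumptions meant to show the pattern, not
# recommendations from the talk.

MODEL_STYLE_HINTS = {
    # Claude: loose, conversational instructions tend to work.
    "claude": "Use the rubric below; when unsure, briefly explain your reasoning.",
    # Llama: tighter, more explicit constraints are usually needed.
    "llama": (
        "Follow these steps exactly, in order. Output ONLY the JSON object "
        "described in step 4. Do not add commentary."
    ),
}

def build_prompt(model_family: str, task: str) -> str:
    """Prefix the task with the steering style suited to the model."""
    return f"{MODEL_STYLE_HINTS[model_family]}\n\nTask: {task}"
```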