Transcript: ‘How to Win With Prompt Engineering’

‘AI & I’ with PromptLayer founder Jared Zoneraich

The transcript of AI & I with Jared Zoneraich is below.

Timestamps

  1. Introduction: 00:01:08
  2. Jared’s hot AGI take: 00:09:54
  3. An inside look at how PromptLayer works: 00:11:49 
  4. How AI startups can build defensibility by working with domain experts: 00:15:44
  5. Everything Jared has learned about prompt engineering: 00:25:39
  6. Best practices for evals: 00:29:46
  7. Jared’s take on o1: 00:32:42
  8. How AI is enabling custom software just for you: 00:39:07
  9. The gnarliest prompt Jared has ever run into: 00:42:02
  10. Who the next generation of non-technical prompt engineers are: 00:46:39

Transcript

Dan Shipper (00:01:08)

Jared, welcome to the show.

Jared Zoneraich (00:01:09)

Thanks for having me.

Dan Shipper (00:01:10)

So for people who don't know, you are the cofounder and CEO of PromptLayer. And how do you describe PromptLayer these days?

Jared Zoneraich (00:01:16)

Yeah, we are a prompt engineering platform. So, we came out about a year ago—a year-and-a-half ago, actually. Right around when ChatGPT came out—not too long ago.

Dan Shipper (00:01:24)

I remember that.

Jared Zoneraich (00:01:26)

It was an internal tool, originally. We just put it on Twitter and people liked it. And we thought, maybe we'll do this.

Dan Shipper (00:01:33)

Kind of early Slack vibes a little bit.

Jared Zoneraich (00:01:36)

Yeah, exactly.

Dan Shipper (00:01:39)

So I guess you've been around for a year-and-a-half. So the prompt engineer is not dead yet? Have reports of the death of the prompt engineer been exaggerated? Or where are we in the timeline?

Jared Zoneraich (00:01:48)

I think so. I think so. This is a common question we hear a lot: Is prompt engineering going to be automated? What's going to happen here? I'll tell you my take on it. So, we're focused on building the prompt engineering workflow. How do you iterate on these things? How do you make the source code of the LLM application better? There are three primitives we see here: the prompt, the eval, and the dataset. You can automate the prompt, but you still have to build the eval and dataset. You can automate the dataset; you still need the eval and prompt. You could take out one of these elements from the triangle, but at the end of the day, our core thesis, our core theory, is that prompt engineering is just about putting domain knowledge into your LLM system. Whether you have to say please and thank you to the AI—that'll probably go away. But you still need to iterate on this core source code, as we could call it.
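The prompt/eval/dataset triangle Jared describes can be sketched as plain data plus a scoring function. Everything below is illustrative—these are not PromptLayer's API or the speaker's actual code, and the model call is a canned stand-in so the sketch runs offline:

```python
# A minimal sketch of the three primitives: the prompt, the dataset, and the eval.
# All names here are made up for illustration.

# Primitive 1: the prompt (a template with a slot for user input).
PROMPT = "You are a helpful math tutor. Answer concisely.\n\nQuestion: {question}"

# Primitive 2: the dataset (sample inputs paired with expected behavior).
DATASET = [
    {"question": "What is 7 * 8?", "must_contain": "56"},
    {"question": "What is 12 / 4?", "must_contain": "3"},
]

def call_model(prompt: str) -> str:
    """Stand-in for a real LLM call; returns canned answers so the sketch runs."""
    if "7 * 8" in prompt:
        return "7 * 8 = 56."
    if "12 / 4" in prompt:
        return "12 / 4 = 3."
    return "I'm not sure."

# Primitive 3: the eval (a score over the dataset for a given prompt).
def run_eval(prompt_template: str, dataset: list) -> float:
    """Fraction of dataset rows whose output contains the expected text."""
    passed = 0
    for row in dataset:
        output = call_model(prompt_template.format(question=row["question"]))
        if row["must_contain"] in output:
            passed += 1
    return passed / len(dataset)

score = run_eval(PROMPT, DATASET)
print(score)  # 1.0 with the canned model above
```

The point of the triangle: automating any one corner (say, generating the prompt) still leaves you building the other two, because they encode the domain knowledge.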

Dan Shipper (00:02:44)

Okay. So probably you're not going to have to be like, hey, like I'll tip you $5,000 in a couple of years, that thing is going to go away. But the sort of core primitives that you're talking about—so the prompt, the eval— Can you make that concrete for me? What do you mean by those things? Give me a specific example.

Jared Zoneraich (00:03:46)

Yeah, let’s talk about it. So let's say I'm building an AI math tutor. Pre-AI, there were a lot of math tutors you could go to—some good, some bad. There are almost infinite solutions to this problem. So there's never going to be one prompt that rules them all, where you say, build an AI math tutor, and it makes you the only solution. You can imagine 30 companies competing for this. So, when I was talking about those primitives: when you're building this, you have the prompt, or multiple prompts, for how you respond to each question the student asks. You have the ways to test that and the sample data you're testing it on. But all of those are just a more technical way of saying you have the actual knowledge the math tutor has.

Dan Shipper (00:03:57)

That's interesting. You're making me think about— I love that there's no one prompt to rule them all. There's no one dataset to rule them all—even in a particular domain. And the thing it made me think of is— Do you know the philosopher Leibniz? I'm sorry to take it to Leibniz.

Jared Zoneraich (00:04:19)

No.

Dan Shipper (00:04:20)

Okay. So he was around in Newton's time. He also invented calculus—they fought over that. But one of the things Leibniz was really into was the idea of creating a language where you could only say true things. The syntax of the language made it so that you can only say factually true things. And so if people had a difference of opinion, they would just say, write it in the language and then calculate, and get the answer. Just like if you have a question about how far a cannonball is going to travel, you can use calculus to figure that out—same thing for statements of fact or statements about the world with this hypothetical language. And obviously he never figured it out.

And the reason is because when you try to produce something that's so all-encompassing or totalizing of truth, it gets really brittle, it gets really hard. A lot of early AI attempts were sort of like that, and I think there's a thinking error in a lot of what people think of when they think, okay, what's the future of AI? Oh, I'm just going to tell it to be a math tutor, and it's going to be the best possible math tutor. It's the same kind of thing. What that means is incredibly different in many different contexts. And so there's a lot of room for many different attempts at finding the truth, or representing the truth of what is best, in any given scenario.

And I think you're kind of, you're getting at that with your answer of, yeah, we're going to have to do prompt engineering because there's going to be 30 different ways to make a good math tutor and who knows which one's going to be the right one for which person.

Jared Zoneraich (00:06:06)

Yeah, no, I mean, forget about the prompt. In this world of, we have the best AGI possible, it’s— I like that reference a lot because— How do you even define the problem to solve? That's the hard part. Let's assume you had millions of dollars. You just got a big VC round to start a new company. What would you build knowing AGI is going to come? The hard part is the problem. You would start working on what problem you are going to solve, because even with the best tool, you need to define the exact scope of the problem. And that's kind of the irreducible part.

Dan Shipper (00:06:40)

That's really interesting. And I think we're so used to building things being expensive that finding the problem is this invisible thing that everyone has to do. I don't know if this is your first company. I assume it's not. Every is not my first company. You spend freaking years trying to figure out what to solve. And the building is expensive and hard, but finding the right problem is quite a bit harder. And I guess you could sort of posit, okay, the AGI is going to figure out the problem for you to solve. But, I don't know. That requires a lot of real-world interaction and creativity and a viewpoint. I don't know. That feels quite a bit more complicated than building software.

Jared Zoneraich (00:07:25)

Well, this is my first company, actually.

Dan Shipper (00:07:30)

Oh, it is? Oh, congratulations. You're doing great.

Jared Zoneraich (00:07:32)

We had a small product before this that we ended up pivoting from. But yeah, there's a simple example I like to think about. If I wanted to build an AI secretary that booked flights, and I had a conference in Japan, there are a lot of different possible solutions. Do I want an aisle seat? Do I want a window seat? Do I prefer a nonstop over business class with a stop? And these are the differences that mean there's no one solution. I love the word irreducible here. And I stole it from— Stephen Wolfram wrote a few articles on ChatGPT.

Dan Shipper (00:08:09)

Like, computational irreducibility?

Jared Zoneraich (00:08:11)

Yeah. Yeah. I love that. I love that theory. And I think the irreducible part—

Dan Shipper (00:08:14)

Can you define that for people who are listening and haven't heard that before?

Jared Zoneraich (00:08:19)

Totally. I'm not going to do as good of a job at defining it, so I encourage everyone to look it up and read the original.

Dan Shipper (00:08:25)

Use ChatGPT to look it up, if you're listening to this or watching this.

Jared Zoneraich (00:08:29)

I would define computational irreducibility as: When you're solving a problem, you can collapse a lot of parts and speed up a lot of parts of solving the problem, but there's one part that you'll never be able to collapse. I like to think of when in school in math class, you're— What was it? Factoring, right? When you're kind of taking things out into the parentheses. But there's always that irreducible part that you can't factorize or you can't simplify. And in this case, we’re saying if you have an amazing AGI that can solve any problem, the hard part is: What do you even tell it to solve?

Dan Shipper (00:09:11)

Yeah, I think another way to think about computational irreducibility is if you imagine the world to be like a big computer—or the universe to be a big computer—there are certain operations that you actually have to run the computer to find out what happens. You have to run out the reality of the system in order to know. You don't have calculus to be like, okay, I know where it's going to be in 10 steps because the universe doesn't know yet. And I think that's super cool.

Jared Zoneraich (00:09:35)

Yeah. Halting problem-esque.

Dan Shipper (00:09:37)

Yeah. Okay. So you kind of have this pluralistic framework of the future of AI. How did you come to that?

Jared Zoneraich (00:09:51)

Well, it depends, I guess. What are you referring to as pluralistic—multiple models or multiple stakeholders?

Dan Shipper (00:09:52)

Multiple approaches to any given problem.

Jared Zoneraich (00:09:54)

Yeah, I think it's just observation. Honestly, I'm a hacker. That's my background. I'm not necessarily a researcher and not necessarily— To be honest, and this might be a controversial thing, AGI kind of doesn't interest me—whether AI is going to take over.

Dan Shipper (00:10:15)

I love that. You're the first person to say that on this show. Hot take alert.

Jared Zoneraich (00:10:20)

It is. And I get into it. We have a lot of debates on it. When we interview someone, we usually take them to lunch and ask them what they think about AGI. Just curiously, just to get their takes.

Dan Shipper (00:10:28)

What's the bad response?

Jared Zoneraich (00:10:30)

There is no bad response.

Dan Shipper (00:10:31)

There is no bad response. So why do you ask?

Jared Zoneraich (00:10:35)

I’m curious what they think. We have a lot of different ideas. But what I'm most interested in is not so much: Is AI going to take over the world? Is it going to kill us all? I'm interested in: How do you build with it? I think it's just a really cool technology. And, as a hacker, I am like, my mental model of AI is not the reasoning part as much as language-to-data and data-to-language. And how much does that open for us? So that's how I get here, it's just a tool in the toolbox. We have a lot of tools in the toolbox and what can you solve? In the same way that my non-tech friends will ask me, why does Facebook have engineers? The site works for me, right? Why do we have to hire more engineers? I think it's the same thing. I look at AI the same way. You're going to always be iterating. There's so many degrees of flexibility in these things.

Dan Shipper (00:11:29)

You said earlier, before we started recording, that one of the things you're kind of excited about or thinking about a lot is the non-technical prompt engineer. Can you just open that up for me? What does that mean? And how did you start knowing that was a thing you wanted to pay attention to?

Jared Zoneraich (00:11:49)

Yeah, so when we launched PromptLayer, as I was saying, it was kind of an internal tool we built for ourselves, and people started liking it. So, we didn't have this idea when we started, but I'll tell you, there was kind of a light-bulb moment I look back on. It's this team, Parent Lab. They're building a very cool app—basically an AI parenting coach to help you be a better parent. We got on a call with them for feedback on our product about a year ago and asked them, how do you like PromptLayer? What can we improve? But the most interesting thing is that someone on the call, one of their prompt engineers, was a teacher—15 years as a teacher, not technical at all. And we're like, what are you doing? Why are you using PromptLayer? What's going on? And she explained to us that the engineers had set it up for her. She would go onto PromptLayer, edit the prompts, and then pull up the app on her phone, and it would pull down the latest version of the prompt. And from there, the gears started turning. And where we're at now—the core thesis of what we're building with PromptLayer itself—is that you're not going to win in the age of GenAI by hiring the best engineers. You're not going to build defensibility through machine learning, for most companies. You're going to win by working with the domain experts who can write down that problem, as we were talking about earlier—who can define the specs of what you're solving. And in cases of building AI teachers, AI therapists, AI doctors, lawyers— I'm an engineer. I don't know how a therapist should respond to someone who's depressed, right? I'm not the right person to be in the driver's seat of building that. Does that make sense?
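The decoupling Jared describes—a non-technical editor updating prompts without an engineer redeploying—can be sketched as a versioned prompt registry the app queries at runtime. To be clear, this is not PromptLayer's actual API; every name below is invented for illustration:

```python
# Illustrative sketch only: prompts live in a registry the app fetches at
# runtime, so an editor can publish a new version without any code deploy.

class PromptRegistry:
    """In-memory stand-in for a hosted prompt store with version history."""

    def __init__(self):
        self._versions = {}  # prompt name -> list of template strings

    def publish(self, name: str, template: str) -> int:
        """Editor saves a new prompt version; returns the version number."""
        self._versions.setdefault(name, []).append(template)
        return len(self._versions[name])

    def latest(self, name: str) -> str:
        """App pulls the newest version on each request."""
        return self._versions[name][-1]

registry = PromptRegistry()
registry.publish("parenting-coach", "You are a parenting coach. Question: {q}")

# The teacher edits the prompt in the registry; the engineers never touch it.
registry.publish(
    "parenting-coach",
    "You are a warm, supportive parenting coach. Suggest non-violent, "
    "age-appropriate strategies. Question: {q}",
)

# The mobile app simply asks for the latest version at runtime.
prompt = registry.latest("parenting-coach")
```

The design choice is that the prompt is treated as content, not code: version history gives you rollback, and the app always renders whatever the domain expert last published.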

Dan Shipper (00:13:27)

Yeah, I think it makes sense. Can you give me a concrete example of what this teacher was prompting? I don't really understand what she's going in to change, and then what changes she's able to see in the app.

Jared Zoneraich (00:13:36)

Yeah, so when I say she's prompting, she's editing the prompt that powers the app. So maybe they're opening the app, and a user could be saying, how do I discipline my child? I need to respond very specifically to that. And maybe it was saying, I'm a language model trained by OpenAI, I don't support any discipline—and then she needs to go into it and say, okay, in that use case, let me make sure that I respond— I don't have children, so maybe I'm not the best one to say how it should respond, but she makes it respond in the right way.

Dan Shipper (00:14:14)

Smack them.

Jared Zoneraich (00:14:15)

Yeah. Hit them with a belt. 

And you don't want to ruin all the other cases, though, where they're asking about what to feed them. You shouldn’t also hit them with the belt. So it's kind of that systematic process. And it's just an iteration. We really believe prompt engineering is about closing the feedback loop: How do you iterate as quickly as possible?
