Episode 18

You Can't Vibe-Code Trust: Why Real SaaS Still Wins in the AI Era

with Kelly Sutton

About Kelly Sutton

Kelly Sutton is the CTO and co-founder of Scholarly, a vertical SaaS faculty information system for universities. He is a Ruby community member and engineering leader focused on building software that meets the trust, security, and compliance demands of higher education.

About This Episode

On the Ruby AI Podcast, hosts Valentino and Joe Leo welcome Scholarly CTO and co-founder Kelly Sutton to discuss building a vertical SaaS faculty information system for universities.

Sutton explains why competitors can't easily replicate Scholarly: higher ed is moving off decades-old homegrown software, and the product must meet trust, security, compliance, and regulatory demands such as SOC 2 Type II. He describes how Scholarly expanded from replacing Excel and Access tracking to sophisticated workflow automation, and how universities recently shifted from AI skepticism to AI FOMO.

Scholarly uses AI in product surfaces, heavily in engineering, and via an admin MCP server that helps ops and customer success rapidly configure workflows from faculty handbooks with human-in-the-loop review. The conversation debates MCP's likely temporariness versus traditional APIs, emphasizes smaller reviewable outputs, and frames AI as an implementation detail focused on customer value.

Valentino also shares an experiment training Claude to build products, including ups.dev and an open-source Ruby uptime-monitoring gem.

Full Transcript

Valentino Stoll (00:02.51)
Hello everybody, welcome back to another episode of the Ruby AI Podcast. I'm Valentino joined by my co-host.

Joe Leo (00:11.097)
Hi, I'm the co-host, I'm Joe Leo. Very excited today, we've got Kelly Sutton who's gonna join us in just a second, but first, Valentino, NASA has developed an AI nose, an E-nose technology that can help to sniff out things, including tracking the health of astronauts on long distance or long duration missions.

What will you do with your first AI nose?

Valentino Stoll (00:46.126)
Bullshit. It's gotta be able to detect bullshit, you know?

Joe Leo (00:49.336)
That's true. It's got to be able to detect bullshit. I love that answer. I thought what I'm going to use it for is I'm going to use it to determine which car on the subway to step into and which to avoid because that is a real bummer getting into the smelly train car. But I like the bullshit answer better. All right. So without further ado, let's welcome the CTO and co-founder of Scholarly, Kelly Sutton. Welcome to the show, Kelly.

Kelly Sutton (01:17.392)
Thanks for having me.

Joe Leo (01:19.422)
It is great to have you on. We've known each other and been in similar circles for a long time. I love the work that you do in Ruby. And now we've got this thing called AI, which is kind of taking over. You run some pretty important software that I think could be described as software as a service. Would you describe it as a SaaS product?

Kelly Sutton (01:42.935)
It is a vertical SaaS product.

Joe Leo (01:46.441)
a vertical SaaS product, and yet you are still in business. How is that possible?

Kelly Sutton (01:48.005)
And yet we still exist. In fact, we're doing better than ever. Don't believe all the headlines that you read.

Joe Leo (01:57.495)
I don't want to believe all the headlines I read, but I do want to understand: what is the thing that sets Scholarly apart, that makes it so that the next person who comes around and says, oh, OK, I can recreate Scholarly in 15 minutes, and then I'll charge half the price. What prevents that from happening?

Kelly Sutton (02:20.228)
Yeah, so Scholarly is, we call it a faculty information system, and you can also run some pretty sophisticated workflows off of the platform. So you can think of it as really just an HR platform for higher ed. So think like Gusto, Rippling, Workday for colleges and universities in the US. I think there are two things that really prevent folks from just vibe coding a competitor into existence.

One is that I think colleges and universities are on this current track where they're trying to move away from operating their own software. Over the last 20, 30, 40 years, very capable, talented

Joe Leo (02:57.878)
Hmm.

Kelly Sutton (03:06.915)
professors, usually in the computer science department, built a tool for managing some piece of the faculty life cycle there. But they didn't understand the maintenance costs of that. And so a lot of customers that switch to us are actually turning off old systems that sometimes predate the internet, which is always a fun challenge there. So universities in general are getting out of the "we want to maintain software that we built that is bespoke for us" mindset.

Joe Leo (03:19.678)
Mm-hmm.

Kelly Sutton (03:36.77)
And then the second thing is we deal with very sensitive information. We are an HR product, so there are legal, regulatory, security, and compliance aspects to what we do. And you don't necessarily just get a SOC 2 Type II certification overnight. That requires some time.

Joe Leo (03:49.332)
Hmm.

Joe Leo (04:00.819)
Right.

Right. I think that's true. There's also a trust component there that I want to get into, but a quick sidebar. I think that that is how Ruby luminary and my coauthor of The Well-Grounded Rubyist, David Black, got his start. It was building applications that were pre-web and then web applications in the communications department at his university. And, you know, then he came across Ruby at some point, and the rest is kind of history. But yeah, you're right. That story of

you know, software being developed and managed by just somebody who, you know, is at the university and had an interest in programming is real. I like that. So you mentioned this piece with SOC 2 Type II compliance, and that's real. You also said this is a regulated industry. There's high risk there. And I think because of that, you could start to sort of build your moat as a business

with the trust that you build up over years. So first, I guess I'd ask you whether you agree with that statement or not; that's kind of a softball question. And then the harder question is, well, how do you do it with your engineering team, with the need to, of course, move fast and service your customers?

Kelly Sutton (05:18.496)
Sure. The answer to the softball question is yes, I do agree with that. And I would also agree with it were I on the other side. I'm not going to trust a two-person startup running my university's payroll. That's nuts. That's an objectively poor idea. And so I think, much like with any B2B SaaS company that is

Joe Leo (05:33.444)
Mm-hmm.

Joe Leo (05:37.606)
Yeah.

Kelly Sutton (05:46.441)
taking on a big piece of what their customers are trying to do, you need to find customers who are willing to stomach that risk appetite of, okay, we're gonna go with these folks who've only been around for six months, 12 months, because we are so desperate for something, okay? And then as the company matures, we're almost three years old now, we've got more than maybe two dozen folks working for us or in our orbit here.

Those effects compound, right? It's like, well, they've been around for two years now. They've been around for three years now. You have the social proof of, other folks trust them, and those other folks seem to be doing really well. Maybe the risk calculus is, we're willing to take a bet on this young company. But there's still institutions out there where

Joe Leo (06:27.28)
Mm-hmm.

Kelly Sutton (06:43.39)
They're like, it doesn't make sense for them, right? And those are usually the larger institutions or larger systems. And that's okay. We'll get there. We'll get them all eventually.

Joe Leo (06:52.208)
Hmm.

Joe Leo (06:55.535)
Yeah, that's spoken like a co-founder. Because you make this interesting comment that I think is true, that a company shouldn't trust, a university shouldn't trust a two-person payroll team that just spun up last week. But at some point, you were a two-person team. And so at some point, you need to build that traction and get somebody to make the leap. So what was that story like for you three years ago?

Kelly Sutton (07:00.337)
Yeah.

Kelly Sutton (07:14.533)
Mm-hmm. Yeah.

Kelly Sutton (07:23.837)
It was not starting with payroll. Yeah, we still don't do payroll. Someday we might. It was starting with what are parts of the process that are important, but also something that you wouldn't mind going with a vendor for. So a lot of our earliest customers were switching off of Microsoft Excel spreadsheets or Microsoft Access databases for tracking.

Joe Leo (07:26.38)
Okay.

Joe Leo (07:31.609)
Okay.

Kelly Sutton (07:52.356)
faculty activity data and faculty rosters there. And so the ability to show, okay, a connected web database for this information is vastly superior, because more than two people can work on it at the same time. Yeah, it was a pretty easy sell. So that's where we started, and then we started

adding to what we did, got into more workflow management, and then we continued to add functionality to what we call our workflow engine. So rather than just being able to run a very simple process, you can now run very sophisticated processes that have 50-plus people involved for a single case or a single run of this workflow here.

Joe Leo (08:38.189)
Mm.

Kelly Sutton (08:43.555)
And so we just continue to add to that. And all of that really comes from just sitting down with customers or prospects and saying, okay, what do you need? What problems are you trying to solve? Explain it to us in plain language. Don't tell us, I want the button over here and I want a spreadsheet over here. Tell us, okay, what is this process? Explain it to us in depth. And then we don't stop until they're happy and their problems are solved.

Joe Leo (08:54.241)
Mm-hmm.

Joe Leo (09:09.312)
Yeah, that's great.

Valentino Stoll (09:11.853)
So I'm curious, from an institution perspective, I imagine they have a lot of departments, and it's maybe not one person you're talking to to set these up. So how do you kind of deal with that from an AI business perspective? To disseminate and connect all of the related parties so you're not relearning how a university works differently, right?

Kelly Sutton (09:22.104)
Mm-hmm.

Valentino Stoll (09:40.589)
when you're trying to onboard somebody new? Do you have systematic approaches? What's your workflow, as a business adopting AI and implementing AI? Do you have channels set up for certain workloads, to line up your customers' workflows with yours? Is it that in depth? Did you have this desire to jump into that?

Kelly Sutton (09:40.92)
Mm.

Kelly Sutton (09:53.401)
Mm.

Valentino Stoll (10:07.881)
Kind of preemptively, without knowing it, you know, how do you handle all the trade-offs and balance there?

Kelly Sutton (10:13.941)
Yeah, our platform has been built to be extremely flexible from day one. No two universities are alike, right? And we're talking hilariously different sets of requirements. Just when you think, this one, this requirement, this is going to be set in stone, the privacy of this document coming out of this workflow is always going to be this way,

I guarantee you the next customer is going to show up and say, nope, we do it totally differently. So we've always been, I would say, a very nimble engineering and product team. And then when it comes to how we plug AI into that: I think universities were very AI-skeptical up until about two to three months ago. Now they very much don't want to get left behind.

Kelly Sutton (11:10.666)
Extreme FOMO, really trying to lean into how they can responsibly integrate AI into their administration and all of their operations there. When it comes to how we use AI, there are kind of three places we use it. We have surfaces in our product, everything from a chat surface to, okay, this complicated thing is actually being offloaded to an LLM for building the draft of a workflow, for example.

So you can drop in a document and we put together a workflow. We've got an MCP server, which our administration team is using a lot for setting up new customers and guiding them through implementation. That's been a surprising win in the last few months here. We're building fewer administration dashboards and less tooling, and just letting the LLMs

handle that. And then we also use it as an engineering team all the time. Probably 80 to 90% of the code that we write has Claude somewhere on it.

Joe Leo (12:12.734)
Mm-hmm.

Joe Leo (12:18.408)
So I wanted to zoom in on this, and I want to get Valentino's take on it also. I was reading some of the stuff you wrote, which is excellent. Your retrospective posts are great. And you made a comment about MCP being useful, but likely temporary. And I think you're not alone in that assessment, though nobody really knows. But I'm curious why you came to that, not really conclusion, but why you came to that prediction. And I'm also kind of curious how that

Kelly Sutton (12:34.858)
Mm-hmm.

Kelly Sutton (12:39.443)
Mm-hmm. Mm-hmm.

Joe Leo (12:47.141)
resonates with V.

Kelly Sutton (12:50.397)
V, you want to go first? You want me to take this?

Valentino Stoll (12:53.932)
I want to hear you first.

Joe Leo (12:55.013)
haha

Kelly Sutton (12:55.207)
Sure, I think it's fair to say that the three of us have been around the block a few times, right? Yeah.

Joe Leo (13:04.101)
Yes, often brought up maybe a little bit too much on the show, but thank you, Kelly, for doing that. Yeah.

Kelly Sutton (13:07.957)
Yeah, yeah, I know. I keep mentioning it, sorry. You develop a nose for the things where a new technology is introduced that solves a problem, but it has some echoes of the past. This technology solves this problem, sure.

But is it necessary? No. Or are there things that already do this, or are you going to run into a constraint or a shortcoming? So I always like to pick on GraphQL. I've picked on GraphQL for a decade now. Similar thing. I'm like, I don't think we need this. And at this point GraphQL is, I don't know if it's dead, but it's just not really used as much. Rewind five years, and everyone's all in on GraphQL.

And if you look at what GraphQL was providing, it was some very useful stuff, like partial field sets, related resource fetching, the ability to issue one request and get many things back. But GraphQL didn't have a monopoly on that. MCP, similarly, it's a way of programmatically speaking with the web application. But we've had...

Joe Leo (14:27.717)
Mm-hmm.

Kelly Sutton (14:36.221)
JSON APIs for 10, 20 years now, and those are well documented. If you choose a spec like JSON:API, it's very predictable how you can interact with something. And I think a lot of the early MCP stuff threw in the kitchen sink of technology, when we're kind of just reverting to, OK, this thing is kind of just issuing requests. And the RPC-style request is like,

All right, I think for us as a business, we must adopt this, right? But if we fast forward five years and someone's like, yeah, we just stopped using MCP because all the models actually could write curl commands, I'd be like, yeah, of course. That seems like a perfectly fine place to land.

Joe Leo (15:09.732)
Mm-hmm.

Joe Leo (15:19.268)
Hmm.

Kelly Sutton (15:26.876)
That's my thought.
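
Kelly's point about spec-driven JSON APIs being predictable can be sketched in Ruby: a JSON:API document always uses the same `data`/`type`/`id`/`attributes` envelope, so any client, human, script, or model, can consume it generically. The document below is an invented example for illustration, not Scholarly's actual API.

```ruby
require "json"

# An invented JSON:API-style response (hypothetical data, not a real
# endpoint). The spec fixes the envelope -- "data", "type", "id",
# "attributes" -- so a client can pull fields out without reading
# per-endpoint documentation.
body = <<~JSON
  {
    "data": {
      "type": "workflows",
      "id": "42",
      "attributes": { "name": "Annual Review", "steps": 13 }
    }
  }
JSON

resource = JSON.parse(body).fetch("data")

# The same lookups work on any conforming resource.
puts resource["type"]               # => workflows
puts resource["attributes"]["name"] # => Annual Review
```

That predictability is exactly what makes "the model just writes a curl command against a documented API" a plausible endgame instead of a bespoke protocol.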

Joe Leo (15:29.676)
All right, Valentino, he's been sitting on his hands. Let's hear it.

Valentino Stoll (15:29.728)
Yeah, I mean...

I'm so torn on MCP, because from an efficacy perspective, models perform much better with CLI-style discoverability. Claude Code itself adopts memory which is progressively disclosed via successive files, right? And it circles back to the whole hypermedia idea, right? To your point, Kelly...

Regular APIs, RESTful APIs, are meant for that discoverability, and if it needs more information, you can link to that in successive documents and kind of reflect that same idea. But I think a lot is lost in the architecture of MCP from a fruitfulness perspective. I think it's one of those things like GraphQL where, if you don't adopt it as, you know,

in GraphQL's example, if you don't adopt it as your data delivery strategy, then it becomes very problematic, in that its abstractions and complexity kind of proliferate throughout the rest of your systems. And so that does complicate things more. But if you have things that need that complexity, it makes things simpler for all of your downstream services and systems. And like that, MCP is the same, right?

If you adopt the entirety of MCP, it's got great resource control, great discoverability. It has a lot of things that agents lack as far as accuracy. But to your point, Kelly, do you need GraphQL for a 20-page or 20-route Rails app? Probably not, right? And just automatically dumping it in any repo

Joe Leo (17:20.769)
Mm.

Valentino Stoll (17:27.328)
Maybe isn't a great idea, right? And I think the same is true for MCP, right? Maybe you don't need MCP to just deliver tools to your agent. Start with an array or, you know, something very minimal, and then build up as you get there. But I think what we're seeing from a scale perspective is that, you know,

MCPs have a hard time with quality of delivery when the tools scale. If you get beyond even the 50 mark of tools, it starts to fall apart quickly. It picks the wrong ones, it uses the wrong parameters, and it becomes hard to manage the context and communicate that with the tools.

So there's a lot of stuff.

Joe Leo (18:24.979)
Are you saying this, you know, independent of how things are connected and whether you're utilizing MCP, that there's kind of this threshold of tools?

Valentino Stoll (18:37.717)
Yeah, and you know, if you lean heavy into MCP, then it does that resource restriction and it helps limit the number of tools, but it's still a number of tools. If you have a big domain, even for an institution like a university, it has all its own departments, and each department may have 100 tools that it wants to use, but it doesn't need each one for every task, right? And so even within the domains

Joe Leo (18:42.269)
Mmm. I see.

Valentino Stoll (19:07.179)
of the larger organization, it becomes hard to answer, well, what do you present to the model for a given user request, right? You can get so far, but it still has its limitations. And so even if you adopt it at a large scale and embrace it fully, you're still gonna hit those limits, and you end up having to build in different mechanisms to get there. But at the same time, you know, there's a whole

other aspect of MCP land that I think is often lost: prompt resources and other kinds of resources that you can make available to an LLM that can benefit your application. But more than anything, it's just consistency. If you could just implement consistent abstractions, like Rails has for a long time, that is better than the actual technical specification, right?

Joe Leo (20:02.3)
Hmm.

Valentino Stoll (20:06.947)
Whether it's MCP or something else, as long as it's all following a convention, it's gonna perform the same. I don't know. I'm torn on MCP.

Joe Leo (20:15.837)
Yeah.

Joe Leo (20:20.316)
Well, Kelly, can you walk us through? You said that it's being used to great effect with onboarding, and it's replaced kind of building out dashboards, which of course takes time. And as you've already alluded to, every university is paying attention to something different, different KPIs, different requirements. So what does it look like on the ground? A new university or a new prospect decides, yeah, okay, I'm going to take the plunge with Scholarly.

Kelly Sutton (20:30.708)
Mm-hmm.

Kelly Sutton (20:37.864)
Yeah.

Kelly Sutton (20:42.058)
Hmm.

Joe Leo (20:46.414)
Do your engineers, do you have some forward-deployed engineers that set up MCP connections, or what happens? Take us through it.

Kelly Sutton (20:46.622)
Mm-hmm.

Kelly Sutton (20:53.578)
So we have an MCP server, which is just for administrators. So this is our operations and customer success folks, right? And so they plug in their Claude, yeah, their Claude. We're using Claude now. Might be ChatGPT in six months. I love being a consumer in a highly competitive market like AI right now.

Joe Leo (21:02.78)
Hehe.

Joe Leo (21:13.581)
Right.

Joe Leo (21:21.691)
It certainly has its benefits, yeah.

Kelly Sutton (21:23.209)
So there's a lot of setup that is required when onboarding onto a platform like Scholarly. And our admin MCP tools help us take a good first-draft approach of, OK, we need to configure a lot of things, what are known as faculty activities on the platform. So these are things like

publications, grants, service, the things that make up teaching, research, and service in a university that are required to be tracked for things like accreditation or promotion or any sort of merit processes, as well as the workflows themselves. And the good news is, higher ed loves writing stuff down, so there are a lot of faculty handbooks out there. And so using something like Claude

Joe Leo (22:11.705)
Hmph.

Kelly Sutton (22:19.858)
plugged into our admin MCP tools, we're able to take a faculty handbook, drop that into Claude, and say, spin up all of the workflows that you find, right? And the faculty handbook is going to spell out in detail, okay, here's what the annual review process looks like for a faculty member. Here's what the promotion, or the tenure process of going from assistant to associate professor, looks like at this institution.

Claude, as it does with our code, will sit there and make a plan and then say, OK, I'm going to create these workflows. Someone on our end says, yep, looks good. And then it sets it up in the app. We have a web interface to do all of that, but that does take hundreds or thousands of clicks, because on the sixth step of this 13-step process, this

question on the fourth section of the second survey should only be shown to folks whose last names begin with the letters N through Z, right? And so there's just an incredible amount of detail that we have to plug into these workflows on behalf of our customers. And that is something that LLMs are great at doing, or I guess these harnesses like Claude and Claude Cowork.

Joe Leo (23:30.236)
Right. Yeah.

Joe Leo (23:47.094)
Yeah. And that's interesting. I see this part, and I see you also do some CV import flow, and in both of those, it sounds like the human in the loop is by design. And I like the "absolutely," the very affirmative thing, like, we would never just trust it. Is that the case? Do you see a time where you can just trust it?

Kelly Sutton (23:57.286)
Mhm.

Kelly Sutton (24:02.051)
Absolutely.

Kelly Sutton (24:07.142)
Mm-hmm.

Joe Leo (24:14.584)
And I guess what's the failure rate when your humans are going and verifying this stuff?

Kelly Sutton (24:21.061)
I think, really in the last three months, these models are getting so good that most of the time when I see a model make a mistake and we drill in and ask, okay, why did it make this mistake? It turns out the source documentation was wrong. Meaning it was doing the right thing based on the information that it had, right?

Joe Leo (24:44.576)
Hmm.

Joe Leo (24:49.257)
Interesting, yeah.

Kelly Sutton (24:50.058)
And that's a little spooky to me, because previously it'd be like, okay, it got it wrong, and it just got it wrong. Now it's like, no, there was a mistake in that doc and it just did what it said. And some of them are even highlighting, hey, I did this, but I don't think this is right, you know? So that's fascinating, in that in some respects it might be better than a human, but

Joe Leo (25:11.253)
Yeah.

Kelly Sutton (25:19.727)
For the time being, just given the type of software that we build, to help me sleep better at night, I want a human in the loop, right? As much as possible. I think there's probably some interesting stuff that happens when the human in the loop becomes lazy, and we trust it too much, and we aren't keeping as critical of an eye

toward the outputs or what it's asking us to do. But that exists kind of regardless of AI. That's just any kind of management. If a manager isn't verifying the results, or if someone's not doing acceptance testing and just saying, yep, looks good, yeah, you're cutting a corner. So not new to AI there.

Joe Leo (25:51.177)
Mm-hmm.

Joe Leo (26:06.107)
Right, Yeah, agreed.

Valentino Stoll (26:09.352)
Yeah, this topic fascinates me. I have so many questions. So like, how are you?

I guess, how are you evaluating that they're doing their job? As much as a human can get lazy, an AI can be lazy as well, and kind of gloss over details, or follow the instructions that you gave it maybe overly so. Like you mentioned, it may notice, but maybe it doesn't.

Kelly Sutton (26:23.394)
Hmm.

Mm-hmm.

Kelly Sutton (26:39.874)
Mm-hmm.

Valentino Stoll (26:44.19)
And so how are you tracking that? Where does that human sit from a quality-control and feedback perspective, and how are you managing that aspect of things? What do you actually care about and test?

Kelly Sutton (27:01.089)
I take a lot of inspiration from receiving a pull request. You get a pull request that is a 10,000-line change. You are either saying, I'm not reviewing this, or you are going through it, and you get through the first three files and you're like, I guess this is good enough. So when the human is in the loop, it's much better to

package things in a size that a human can consume, right? So I'd much rather get a 100-, 200-, 300-line pull request to review, whether from an engineer or a tool. I think the same goes for any business process as well. If you can't easily verify that the outputs are correct, then

one strategy that you can take is, let's make the outputs a little bit smaller, right? So in our example, rather than look at the whole workflow, which might be a 12-month process with a dozen-plus steps, it's like, okay, well, let's have it do a step at a time and verify as it goes. And then as reviewers, we also get a sense for what it does well and where it's likely to make a mistake. Because it's all being driven by MCP, there are just certain

capabilities that we haven't built into the MCP server. So our ops or CS folks might file a ticket saying, hey, I can't set what shows up in a select question on our platform, because that's not available to the MCP yet; can we build that? So just scoping down what you are reviewing and getting that cycle time up, so you're reviewing smaller things more frequently, I think is a great strategy for

keeping the human in the loop and keeping them in the loop where they're doing more than just rubber stamping things.

Valentino Stoll (29:02.558)
So are you allowing the humans in the loop to have access to AI tooling as well, to help their work? Or is that not necessarily part of it, and it's more that you focus the tooling around the pipeline that they fall into?

Kelly Sutton (29:20.67)
So the MCP tools look a lot like our APIs, right? And this is why I also don't think MCP is long for this world. It's like, okay, so we build some APIs, and then we go build the same things into MCP, and they're doing the same things. So for me, as someone maintaining this stuff, I would rather maintain one thing than two. But yeah, our ops and CS folks, they're the biggest users of our

internal MCP tooling for our platform. And it's really given them the superpower of doing things that would previously require a Ruby script to do something programmatic on the platform, like loading data in, or standing something up, or moving an implementation forward. It gives them that programmatic power without being programmers themselves. They're just doing that out of Claude Cowork these days.

Valentino Stoll (30:25.319)
It makes me wonder if a .mcp format for Rails would be valuable here, where any Rails endpoint can be an MCP tool.

Kelly Sutton (30:32.764)
That's a good idea, that is.

Joe Leo (30:37.358)
Right.

Kelly Sutton (30:38.876)
Yeah, and if you look at how we're organizing the code as well, it's like the controller and the MCP tool both point at the same service class or business-logic doer. So thankfully those layers are pretty thin.
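
The layering Kelly describes, where a thin controller and a thin MCP tool both delegate to one service object, can be sketched in plain Ruby. All class and method names here are hypothetical stand-ins, not Scholarly's actual code, and the Rails and MCP plumbing is omitted:

```ruby
# The shared business-logic "doer": one place that validates and
# creates a workflow definition. (Illustrative names throughout.)
class CreateWorkflow
  Result = Struct.new(:name, :steps, keyword_init: true)

  def call(name:, steps:)
    raise ArgumentError, "workflow needs at least one step" if steps.empty?
    Result.new(name: name, steps: steps)
  end
end

# Thin HTTP-style layer, standing in for a Rails controller action.
class WorkflowsController
  def create(params)
    result = CreateWorkflow.new.call(name: params[:name], steps: params[:steps])
    { status: 201, body: { name: result.name, step_count: result.steps.size } }
  end
end

# Thin MCP-style tool layer: string-keyed JSON arguments in, the same
# service object underneath, an MCP-shaped content payload out.
class WorkflowMcpTool
  def call(arguments)
    result = CreateWorkflow.new.call(
      name: arguments["name"],
      steps: arguments["steps"]
    )
    { content: [{ type: "text",
                  text: "Created #{result.name} (#{result.steps.size} steps)" }] }
  end
end
```

Because validation and construction live in `CreateWorkflow`, each surface stays a few lines long, and adding a third surface later (say, a CLI task) is another thin wrapper rather than duplicated business logic.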

Valentino Stoll (30:43.465)
Great.

Joe Leo (30:57.806)
Yeah, you've talked a little bit about, and I think we're touching on it here as well, the idea of AI becoming an implementation detail versus a feature. And I think that's very quickly where we're shifting as an industry, right? It's becoming really kind of

Kelly Sutton (31:10.62)
Mm-hmm.

Joe Leo (31:22.682)
rote, I guess, to say, well, you know, we're doing this thing with AI. And it's like, well, of course you are. If you're not doing it with AI, why would we do business with you? That said, as you've mentioned, you've got a customer base that has been shifting more recently; it started very conservative around AI and now is moving forward quickly. So, I guess, in your words, what's the difference? What's the difference between AI as a feature and AI as an implementation detail?

Kelly Sutton (31:52.958)
You got to look at the value. What is valuable to our customers or just any customer? The customer does not care that your Pentium processor clocks at 75 megahertz.

Kelly Sutton (32:12.41)
Your customer does not care if your server has 16 megabytes or 32 megabytes of RAM. Your customer does not care if you're using GPT 5.4 or 5.3 nano, right? Or whatever the models are, right? The customer cares: are you helping them make more money, or saving them a lot of time? And I think since GPT-3.5, we've gotten a new tool in our tool belt, which is this magical probabilistic computing thing

Kelly Sutton (32:41.69)
that we can plug into places, and it speaks in plain language. And so, being around the block a few times, you kind of see the arc of people being very chat-forward: we're gonna put a chat, everyone's gonna put a chat into their stuff. And we're kind of in this, yeah, exactly. And then you're like, well, are we a chat

Joe Leo (33:00.921)
Oh my God, this was a bad time in our AI development. Yes, I agree. This was a real bummer. Right. Right.

Kelly Sutton (33:09.163)
company? And it's like, no, leave that to ChatGPT and Claude. And so we're kind of settling into, okay, what are the new things that we can do that are very valuable to our customers using this probabilistic genie, right? And is it appropriate for the tasks that we're giving it? You know, we've been living in the discrete world for a long time as programmers here.

And even most of us have been trained on, like, yeah, it either works or it doesn't. What do you mean it might work? Right? That's a weird thing to get our heads around. And like, how do we need to change the products around that? But over time, I don't think we're even going to be adding the sparkles icon or a chat surface; just parts of an application are going to be powered by a probabilistic LLM to help you save time or accomplish tasks.

Joe Leo (33:42.345)
Mm-hmm.

Kelly Sutton (34:05.56)
And whether that's, you know, falling in like the fully discrete or like more probabilistic area, like you won't be able to tell. Does that make sense?
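What Sutton describes, a feature where the caller can't tell whether a discrete code path or a probabilistic model produced the answer, can be sketched roughly like this. The class, the prompt, and the `llm` collaborator are illustrative assumptions, not Scholarly's actual code:

```ruby
require "date"

# Minimal sketch: an LLM-backed feature hidden behind a plain method.
# Callers get a Date or nil either way; the probabilistic path is an
# implementation detail, and its output is validated before being trusted.
class DueDateExtractor
  def initialize(llm: nil)
    @llm = llm # anything callable that answers a prompt with a string
  end

  # Try the cheap discrete parser first; fall back to the model.
  def call(text)
    deterministic_parse(text) || probabilistic_parse(text)
  end

  private

  # Discrete path: a strict pattern we fully trust.
  def deterministic_parse(text)
    match = text.match(/\b(\d{4})-(\d{2})-(\d{2})\b/)
    match && Date.new(match[1].to_i, match[2].to_i, match[3].to_i)
  end

  # Probabilistic path: ask a model, then validate its answer.
  def probabilistic_parse(text)
    return nil unless @llm
    answer = @llm.call("Extract the due date as YYYY-MM-DD from: #{text}")
    Date.iso8601(answer.strip) rescue nil
  end
end
```

A caller just writes `DueDateExtractor.new(llm: some_client).call(handbook_text)` and never sees which path answered, which is the "you won't be able to tell" property in miniature.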

Joe Leo (34:14.953)
Mm-hmm. Yeah, yeah, it does. I am interested in this just from a business standpoint, because I'm seeing the same trend you are. You know, at first AI is too out there, people don't understand it, and then, actually, it's really exciting, so let's everybody make a chat, which is basically just funneling money into

ChatGPT right through an API. And then, you know, predictably, smart companies started using it for implementation details early. But I bet it's still hard to resist, from your salespeople's and your marketing team's perspective, to say, well, you know, we're building this thing with AI, it's really exciting, which is something that's different than

Kelly Sutton (35:03.895)
Mm-hmm.

Joe Leo (35:07.516)
Well, we're building this thing at 75 megahertz, right? Which is a thing that doesn't even appear on anybody's radar. But maybe that's all marketing and sales fluff, and, you know, internally you guys can just focus on what's actually working.

Kelly Sutton (35:09.93)
Mm-hmm. Yeah.

Mm-hmm.

Kelly Sutton (35:23.134)
Yeah, you kind of need to do both. You've got to embrace it. I think a mistake that I made in my career, like five to 10 years ago, was actually not embracing GraphQL, to go back to that. I should have just embraced it, because there's a lot to be gained with just going along with the flow and hyping people up, while still maintaining that critical, what are we doing here? What problems are we solving? So...

Joe Leo (35:36.969)
here we are back at GraphQL. Yeah.

Kelly Sutton (35:52.501)
because GraphQL is maybe a bad example of this. But when it comes to a customer, they've been told, in some cases, whatever you need to do, go figure out how we're using AI, right? And so in our case, they're gonna go look for a vendor that's very AI-forward or AI-first. And so we're doing ourselves a disservice if we're not just embracing this. And it is a good type of waste, where we're just experimenting with stuff, we're...

Joe Leo (36:02.247)
Yeah, that's true.

Joe Leo (36:10.705)
Mm-hmm.

Kelly Sutton (36:21.844)
throwing it against the wall, one out of every 10 experiments hits and we're like, ooh, that is actually really useful. this admin MCP was not something that I even predicted would be useful six to 12 months ago. And now it's like, oh, this is actually the only way that we do certain things at this company now. And you don't get that if you say, well, we're not going to experiment with this at all. You really have to just dive in and.

Joe Leo (36:40.892)
Yeah.

Kelly Sutton (36:50.567)
figure it out and know that 90% of the stuff you build or try isn't going to do it, but you at least get to be in that conversation.

Joe Leo (36:58.586)
Yeah. You know, it's interesting, because you've said this a couple of times, and it's now 12:15 on the East Coast and I've already had three conversations about this, where I've heard anything from September and October, but no later than January of this year, where there has been some kind of sea change. And I wonder if that is, you know, it seems to be among the engineering set, probably the more senior or experienced engineering set, but it's also among, in your case, Kelly, your customers.

So what do you think? What do you attribute that to?

Kelly Sutton (37:30.643)
Opus 4.5, 4.6, whichever one came out around that time. I think it's the models. I think, I would hate to be running an AI tooling company right now, because you're just like, is today the day that I wake up and Anthropic publishes a blog post and we're just toast? I think their speed of execution, their focus on

Joe Leo (37:35.193)
Yeah, you think it's specific to the models.

Joe Leo (37:46.916)
Hmm.

Kelly Sutton (38:00.442)
enterprise market, right? They're kind of the Microsoft of these AI companies, where it's like, no, we just build tools for productivity, for professional use here. We're not trying to build an advertising company. And they've just nailed it with Claude Code. And they're like, okay, let's try Claude Cowork, and anyone who's not an engineer loves that internally. Yeah.

Joe Leo (38:23.118)
You

Kelly Sutton (38:27.633)
I don't know, it's just a very exciting time, and I think the models getting better just obviates all sorts of minutiae that we might have been doing a year or two ago around orchestration and tooling and CLAUDE.md files. It's like, well, if you just wait three months, you can just delete your CLAUDE.md file and you don't need that anymore. The model's that good, right? Yeah, so I don't know. It's the models.

Valentino Stoll (38:54.888)
So I guess the question is, do you find yourself being more productive as a business owner and producing more value faster for customers?

Kelly Sutton (39:02.961)
I am, for the first time in my career, consistently overestimating how long things will take from an engineering perspective. Usually it's like, yeah, we could get that done in an afternoon, and three months later, where is that? Now, you know, I'm chatting with folks on the business side, like, yeah, it feels like a week or two. And then that afternoon it's in production, right? Without a bug, right?

Joe Leo (39:13.571)
Amazing.

Joe Leo (39:17.377)
Yeah.

Joe Leo (39:30.754)
Yeah.

Kelly Sutton (39:32.65)
And so that is, I'm still adjusting. I still haven't found my ground yet. You're usually not punished for under-promising and over-delivering, so I'll stay there for a little while. But it is just a weird world that we're living in here, where what used to be one of the longest poles in the tent, engineering time, has basically become zero.

Joe Leo (39:44.291)
Right, right.

Valentino Stoll (39:54.761)
So I guess as that execution timeline tightens, do you find your planning and the strategy side of things changing, because the execution feedback loop is so tight?

Kelly Sutton (40:00.622)
Mm-hmm.

Kelly Sutton (40:13.911)
We got really lucky in how we set up the business and the relationships that we built with our early customers. So every customer gets a Slack Connect channel. And we train our customers: you know, you're on our board too, right? You are looking at the tickets that are moving through, you get to prioritize them. You are kind of doing some acceptance testing, depending on what

the task is in front of us here. And we're gonna train you to get really good at a high cycle time here, or I guess a low cycle time. So where our competitors would say, yep, we'll put a version of that in front of you in three months, it's like, okay, we need you to test this this afternoon and tell us if it's meeting your needs or not. So from the early days, we've been optimizing for that really quick cycle time.

Joe Leo (41:10.335)
And do your customers do that? Do they jump in and test things? Excellent.

Kelly Sutton (41:13.122)
Yeah, yeah. Because if they're not, you know, there's a chance that we're not building something valuable. So it's a nice, you know, lowercase-a agile, extreme programming, whatever you want to call it. We're just in it with our customers, whole team, solving their problems.

Joe Leo (41:26.006)
Hehe.

Kelly Sutton (41:38.365)
Yeah, so the LLMs have shortened that. OK, it's going to take a few days for engineers to put that together. So it's shortened that part of the process. But there's still everything that kind of goes into adding a feature or something into the platform here. So that's always existed. But we've always tried to keep that pretty short. That's it.

Joe Leo (42:04.324)
We've got a couple of minutes left, and you did just mention this, that you got lucky with how you structured the business. And by the way, I really love that approach, because I've done a lot of work with customers that expect a three-to-six-month timeline. And there is, I think you've framed it perfectly, there is a teaching moment there to say, no, no, we're going to release very, very frequently, but you gotta be part of it or else we can't, right? And that,

Kelly Sutton (42:33.069)
Mm-hmm.

Joe Leo (42:34.246)
I think that's great to build that in from the beginning. So I have another question for you, which is just zooming out. I love the timeline of three years. Well, I don't love it, but I think it's interesting that you've been around for three years, and that is the very beginning slash predating the release of ChatGPT, to now, where so much has changed. What, if anything, would you have changed, knowing that we were going to end up where we are right now?

Kelly Sutton (43:09.862)
I feel like we just got so lucky. I don't know. I don't know. Yeah.

Joe Leo (43:19.006)
Well, all right, let me ask you this. What are you glad that you did knowing where we are?

Kelly Sutton (43:25.798)
Yeah, so I was listening to the episode with Mr. Searls on this show before coming on. And I consider myself a pretty particular programmer, with some strongly held ways of doing things, but he might be the person who's more, yeah, he might be more particular than me. And so I think early on, you know,

Joe Leo (43:34.555)
Yeah.

Joe Leo (43:45.39)
Yeah, until you talk to him.

Kelly Sutton (43:54.282)
when we were just starting this company, GPT-3.5 or 3 was the model du jour. And it couldn't really do anything, right? And so we were still copy-pasting Ruby code in and out of the ChatGPT interface. But we were able to set up the Rails code base exactly how I wanted it, right? And the model is very much like a junior or mid-level engineer: it's just gonna copy what it sees, right?

And so we got to be very particular in how we set this thing up. And so now it just follows the patterns and some of the conventions that we have in our code base. So it's just an extension of us. I think one of the shortcomings around AI that I hear people complain about is that it still kind of doesn't know what to do with the zero-to-one. And it's like, well, yeah, it doesn't know your way of doing things. But we've got, you know,

Joe Leo (44:27.057)
Mm-hmm.

Joe Leo (44:47.228)
Mm-hmm.

Kelly Sutton (44:50.152)
12, 18, 24 months of history of like, here's how we do things here. So plug in and get with the program and that's been great.

Joe Leo (44:58.594)
Yeah, that's a great answer. And, little teaser for our next episode, we have somebody coming on who found the exact opposite. In other words, they went to try to use AI to do a bunch of code gen on a legacy application and said, wait a second, we don't want it copying these practices. We've got to clean stuff up first, right? And so maybe that is also a benefit of timing for you, right? You started in a place where it was not mature, but you were able to get something out of it.

Joe Leo (45:27.947)
And then subsequent iterations of LLMs could build on a good foundation.

But in five to 10 years, when you've had enough turnover and things are really atrophying, Def Method is great at modernizing and de-risking applications. Let's throw that in there. Hey, we've only got a couple of minutes, but Valentino, if you don't mind, I want to hear a little bit about ups.dev.

Valentino Stoll (45:49.627)
Ha ha ha.

Valentino Stoll (46:02.137)
Oh man, I have so much to say about this, but the TLDR of it is: in January I decided to spin up my own OpenClaw, as you do, and, you know, what does it do? And I was very underwhelmed by the code output that it made. I was just like, hey, make a Rails app that does this stuff and use all the latest stuff, and it produced very lukewarm results. And I'm like, okay,

Joe Leo (46:05.283)
Yeah

Joe Leo (46:12.341)
as did we all.

Joe Leo (46:22.041)
Hmm.

Valentino Stoll (46:31.355)
clearly this is a context problem, right? Like, you know, garbage in, garbage out; it's probably just making up a ton of stuff, it has a limited context window, and XYZ. How can I condense the knowledge chunks that it takes on each iteration of its loops and make more use of them? So I went through it, like, okay, let me train it. All the main models are running training boot camps, right? And that's how they improve their models. And so I'm like, okay,

what does that look like on a smaller scale? Can I just train it on Ruby? Can I just train it on Rails? And so I was like, okay, you know, Claw, go use the latest educational science: what does a program look like to train you on using Ruby? And it was like, here's what I'd do, the rubrics, the pipeline. I was like, okay, that sounds great.

I was like, so what do you need? And it was like, well, I need a list of materials to build the course, and stuff like that. So I went out and purchased all the materials for that course, so that way at least all of the authors are getting paid for me using the content. So now it's a distribution problem: can I share this knowledge module later? I don't know, but that's a future problem. So then it actually worked and produced much better results. And so I just gave it more domains. I was like, all right, try Rails now.

Joe Leo (47:43.754)
Heh.

Valentino Stoll (47:54.504)
Can you, like, build a Rails app and make something useful? And so it did, and it produced better results. And I'm like, okay, this is great. And I'm sitting there like, you know, I get a bill all the time for these stupid domains that I own, and they've just been sitting there for years burning money. And I'm like, I'm either just gonna get rid of them, you know, eat that burden and that's it, or I'm gonna start making use of them. I'm like, well, this seems perfect, right? Like maybe I could train this model to

Joe Leo (48:08.151)
Me too.

Valentino Stoll (48:24.229)
make use of and monetize these domains for me. And so I set up a product-building training camp. And as a result of this training camp, it produced the very first product built entirely by a Claw, I think, maybe not, but: ups.dev. And it basically took the business model of statuspages.com and undercut their pricing,

and then added a nice little feature, I thought, which was agent status pages. So it built a Ruby gem to make it easy to monitor the health and uptime of an agent. And I went and added it to another site I built called Daily Vibe. And I said, okay, add this heartbeat to all of your agents that you built for the app, so I can know what the status is of

Joe Leo (48:59.325)
Yeah, yeah, I think so too.

Valentino Stoll (49:21.445)
all the agents running and operating this site. And sure enough, we have, you know, Daily Vibe's ups.dev page that will show you the uptime of all the agents that are running and operating dailyvibe.ai. And it's fantastic. I love it. I hope it's successful, because if not, I'm going to have to decommission it and get rid of the domain. That's part of the project. But I think

Joe Leo (49:39.763)
Me too.

Joe Leo (49:44.085)
That's part of the pact,

Valentino Stoll (49:47.439)
better than that, it's more like, okay, does this experiment work? Can I distribute these knowledge modules to others? And how might we be able to create these pipelines where we can start sharing and distilling knowledge in general, right? So we'll see. I have an article started. We'll see where it goes, but it's promising.

Joe Leo (50:08.211)
And you open sourced a gem as a result of this too, the Ruby LLM UPS gem.

Valentino Stoll (50:14.33)
Yep, so there's now a simple one-line change you can add to your Ruby LLM agents to include monitoring, and it will automatically send those heartbeat updates to the status pages. So your customers can know that your operations are operational.
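For a rough idea of what that one-line integration could look like, here's a minimal sketch. The module name, `with_heartbeat`, and the pluggable reporter are my own illustrative inventions, not the gem's real interface (a real gem would POST each event to the ups.dev status page):

```ruby
require "time"

# Hypothetical heartbeat mixin in the spirit of the Ruby LLM UPS gem.
module AgentHeartbeat
  class << self
    # Pluggable reporter; swap in an HTTP client in production.
    attr_accessor :reporter
  end
  self.reporter = ->(event) {} # no-op by default

  # Wrap an agent's unit of work so every run emits an up/down heartbeat.
  def with_heartbeat(agent_name)
    started = Time.now
    result = yield
    AgentHeartbeat.reporter.call(agent: agent_name, status: "up",
                                 duration_s: Time.now - started,
                                 at: Time.now.utc.iso8601)
    result
  rescue StandardError => e
    # Failures still heartbeat, flagged as down, then re-raise.
    AgentHeartbeat.reporter.call(agent: agent_name, status: "down",
                                 error: e.message, at: Time.now.utc.iso8601)
    raise
  end
end

# An agent opts in with a single `include` and a wrapped run method.
class SummarizerAgent
  include AgentHeartbeat

  def run(text)
    with_heartbeat("summarizer") { text.split.first(5).join(" ") }
  end
end
```

The design point is that the agent code stays one line away from unmonitored: the mixin owns timing, success/failure classification, and delivery, so the status page reflects every run without the agent knowing about it.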

Joe Leo (50:22.259)
Mm-hmm.

Joe Leo (50:32.147)
Awesome.

Joe Leo (50:36.179)
All right, yeah, I love it. I think the best way to get people to use this is just to threaten them with Valentino's products, so that we don't have to put ads in our show. We gotta make money somehow.

Valentino Stoll (50:36.198)
Super fun.

Valentino Stoll (50:53.35)
Yeah, please do. My bill is $10 a month, so I think just a few customers would make it stay afloat. So if you have $10 to spare, throw it at it and it'll stay up. That's right. That's right.

Joe Leo (51:00.317)
Mm-hmm.

Yeah, let me go.

And you got some agents that need monitoring? Unlimited status pages, come on. Well, okay, we're running up against time here. Kelly, anything that you'd like, any parting words you'd like to give us?

Kelly Sutton (51:21.947)
No, thanks for having me on the show. I really enjoyed it.

Joe Leo (51:26.831)
Yeah, it's been great having you, great talking about the details and getting into the weeds about what's working on a startup that is maturing like yours is. And of course, we wish you all the best with Scholarly as a Ruby product. We're all rooting for you. And yeah, I guess we'll see you around the conferences sometime soon.

Kelly Sutton (51:50.505)
Sounds good.

Joe Leo (51:52.025)
Alright, take care everybody.

Valentino Stoll (51:52.186)
Yeah, thanks for coming on, Kelly.

Want to modernize your Rails system?

Def Method helps teams modernize high-stakes Rails applications without disrupting their business.