Episode 20

Minerva Magic: OpenClaw, Agent Status Pages, and Training an AI Coworker in Ruby on Rails

with Valentino Stoll & Joe Leo

About Valentino Stoll & Joe Leo

A panelist episode with hosts Valentino Stoll and Joe Leo discussing their hands-on experiments with autonomous AI agents using OpenClaw.

About This Episode

What happens when you treat an AI agent like a co-founder instead of a tool?

In this episode, Valentino and Joe go deep into a real-world experiment: spinning up an autonomous agent using OpenClaw, giving it domains, goals, and just enough guidance to build an actual business. From creating accounts and managing projects to writing code, deploying with Kamal, and even designing its own training curriculum, the agent evolves from confused assistant to something resembling a junior engineer with initiative.

Along the way, they explore the messy reality of agent workflows: memory systems, self-training loops, PR reviews, hallucinated confidence, and the constant tension between autonomy and control. The result? A working product, 15 early users, and a pile of hard-earned lessons about what AI can and definitely cannot do today.

If you're building with agents, thinking about autonomous systems, or just curious what happens when you let AI run a startup… this one's for you.

Full Transcript

Valentino Stoll (00:01.538)
Hey everybody, welcome to another episode of the Ruby AI Podcast. I am joined by my co-host today, Joe Leo, and I am of course Valentino Stoll. And we have a fantastic, just, you know, shoot the shit, you know, panelist episode today. Just the two of us. And...

Joe (00:19.174)
It's just the two of us.

and it'd be whether you like it or not, listener.

Valentino Stoll (00:24.75)
But you know, so I've had this experiment running since the beginning of the year. I'm kind of taking advantage of OpenClaw in my own way, and we'll dig into that in a bit. But I thought we'd start... yeah, but first, Joe, you had this great suggestion. Why don't you take it away?

Joe (00:42.874)
But first.

Joe (00:48.964)
Yeah, I like to start off with something a little current. And what I've got for you today is a news article from Wired. The band Geese have been exposed as a psyop by Wired magazine, which I really love. This is from Wired. So the band Geese have...

utilized the services of a digital marketing firm called Chaotic Good Projects, who used AI to spin up thousands

of fake TikTok accounts to bolster the credibility of the band Geese and get them tons and tons of marketing and eventually publicity and shows. And it really worked. So Geese had been on Saturday Night Live and they have sold out shows around the country. They're going to be headlining Coachella. They were here in New York selling out. I forget if it was the Garden or somebody else. And I'll tell you what, listener, it also worked on my Spotify account because I had never

heard of Geese and then all of a sudden I could not get them out of my AI-generated playlists. So what I want to know first and foremost, Valentino, is can we call Chaotic Good, and then maybe the two of us just stop this whole podcast thing and start a band? Because I think this will work for us. I think this could get us booked, you know, at the Garden, which would be a lot cooler. What do you think?

Valentino Stoll (02:22.477)
We should totally start a band. You know, I think we should just call it Why Are We A Band, you know?

Joe (02:29.381)
Why are we a band? I think that's good. Geese is a terrible name. But they're not a terrible band, I should say. They're not a great band, but they're not a terrible band. But it's a terrible name. There was already Goose, which is a jam band, and so it was very confusing. I felt like I was listening for a little while and thinking, OK, this has got some kind of like throwback Van Morrison feel and like Van Morrison.

After you hear it a few dozen times, it really gets annoying. Apologies to the late Van Morrison, but really, he made a couple of great songs, but if you had to listen to him on repeat, it's really terrible. so I just think that if that's what we're up against and we hire the same firm, we're golden. We are the exact age where everybody wants middle-aged men just talking on podcasts to start a band.

Valentino Stoll (03:29.613)
Ha ha ha.

Joe (03:29.913)
So I think we're in great shape here.

Valentino Stoll (03:32.589)
I think if we just make it all purely AI driven and generative, know, only virtual appearances, you know, right. You know.

Joe (03:39.225)
Yeah.

Joe (03:43.663)
yes, yeah, yeah, absolutely. I'm not going anywhere. First of all, I go to sleep at like 9 p.m. I'm not doing a concert. Yeah, we'll record it at 7 a.m. or so and yeah, get it out there.

Valentino Stoll (03:52.364)
I feel like that's the follow up to this big event, right? Like somebody's social engineered, like this platform for artist release, right? The logical next step is like platforming, creating these artists to begin with.

Joe (03:59.716)
Yeah.

Joe (04:05.271)
I think that's true. You know, as a person who goes to concerts and events quite a bit, really what people want these days, they don't even want to go there. They want to go home and go to sleep like I do, but they want to go there and get the photos for Instagram. And so I think we could even take that and say, hey, look, we're going to do this show at Madison Square Garden. We're going to sell a bunch of tickets and they're going to go on the secondary market and it's going to cost a fortune. My God, it's going to cost you so much money to go. But what you're going to get out of that is you don't actually have to go.

We're gonna give everybody their own AI generated photos where it looks like we're in the background and you are having the time of your life and you can just post it on Instagram.

But you could go and totally do something else. I think we're onto something now.

Valentino Stoll (04:49.005)
That is great. shit. I think we're onto something. You know, it's funny. I have this site called daily vibe dot AI and it started just for fun. And I have like, have it creating artists based on news, like the daily news. So it does like, it searches for the local news and then it like creates artists and creates like this artistic representation of the day for your location. And it was just fun, but like

Joe (04:59.855)
Yeah.

Joe (05:15.546)
Yep.

Valentino Stoll (05:18.047)
It has evolved into this artist generator. And so, like, you could play it. It makes a song for the day, you know, and it creates the musician, you know, and the musician has a backstory and, like, it defines who they are as an artist, you know. Like, I feel like this all ties in. The machine is built. Let's do it, you know.

Joe (05:21.623)
Yeah.

Joe (05:28.037)
Nice.

Joe (05:39.013)
Of course.

Joe (05:44.514)
Absolutely, absolutely. We're gonna rope your new creation Minerva into this because if there's one thing we need, it's going to be an OpenClaw instance that is, you know, that we could text all the time. I'm gonna want to, like, when I'm on the road, when we are on the road, I'm gonna want to send some really bananas requests to the different venues for things that I need there in my, you know, in our room. You know, so I'll have Minerva do all of that.

fighting with promoters, stuff like that. yeah, so I would like to do that. And I'd like to talk a little bit about this project. So for those who don't know, I'm not the only entrepreneur on this podcast anymore. I once was, I had that distinction, but now Valentino has taken the bull by the horns and he's started a number of companies and he is now...

you know, probably in his enterprising era. So why don't you let us know just a little bit about what you did at a high level, and then we can dig into some of the details. And just for a taste of what's coming up, this is a really cool project and we're going to get into a lot of the open source tools. We've got our friend Carmine and RubyLLM. We've got ActionMCP, you know,

You being the kind and community-oriented person you are, unlike me, you have already released your own open source as adjacent to the company building you're doing. So there's a lot to get into here, and so I think it'll be fun. So tell us a little bit about what this is.

Valentino Stoll (07:28.321)
Yeah, you know, the OpenClaw craze started like over Christmas, right? Like with the big releases of all these frontier models, like really upping their game and like making a lot of things possible that weren't before. And I think that's been true, right? And we're still seeing the fallout from that. And so I wanted to experiment with what OpenClaw was and like, you know, see if any of it like really, really was valuable or not.

Because there's just so much hype, right? Like, the lobster has just, like, proliferated. And you know, it got to the point where I was just seeing a lobster in my social feed. Didn't know what it was, you know, and I finally just took a breath on one message and was like, all right, what is this lobster? And by the time I got to that point, you know, it was already in the transition from, like, you know, what was it? Claw?

Joe (07:58.724)
Mm-hmm.

Joe (08:02.895)
Yes.

Joe (08:15.363)
Yeah.

Joe (08:27.567)
Was it Clawdbot or... Yeah.

Valentino Stoll (08:27.724)
I think it was Clawdbot, like C-L-A-W-D or something like that. And then Anthropic was like, it's too close to Claude, like please change the name. And then it turned into like three different names. Right. So it was kind of funny. You know, it matched up with my amusement, right? I love jokes and it seemed to be a big joke. Like here it was like.

Joe (08:33.752)
Right, right, right.

Joe (08:38.071)
Right. Yeah, especially since they, especially since he doesn't like Claude that much.

Joe (08:49.796)
Yeah.

Joe (08:53.261)
Right.

Valentino Stoll (08:54.742)
this very popular project that was in the middle of this, like, weird naming, called three different things, and it was still popular. And after all of that, people were still, like, raving about it. And so despite, you know, Anthropic's best efforts, it continued to be called Clawdbot for quite a while. And it's still called Moltbook in a lot of ways. But yeah, so I discovered OpenClaw

Joe (09:03.225)
Right.

Joe (09:14.276)
Yep.

Joe (09:18.693)
Mm-hmm.

Valentino Stoll (09:24.372)
and took a first spin. I ran through one of these, like, you know, YouTube things for securing it on an EC2 instance and locking it down with Tailscale and all that. And so I was only messaging with it personally, but in the cloud through, like, some secured SSH stuff. So that.

Joe (09:43.277)
Okay, so I think this point is still probably a good point of clarification because there's still a lot of people that just take this thing and fire it up on their Mac and have it go nuts, but you were not doing.

Valentino Stoll (09:53.781)
I wasn't doing that. To be honest, now with Anthropic's latest announcements on restrictions, I may start doing that. I haven't decided yet. Yeah, so I spun up this instance, made the mistake of following the guidance on the YouTube video for a large instance, and it turned out to cost money, despite him saying it didn't cost money. But either way, I was willing to foot the bill for my preliminary thing.

Joe (09:54.841)
Yes.

Joe (10:00.783)
Mm. Yeah.

Joe (10:13.205)
Alright, yeah.

Right.

Valentino Stoll (10:23.744)
You know, it all came down to, okay, what do you do with this thing? Right? And so I had all these, like, domains lying around that I was paying for. I don't even want to talk about how much I spend a year on domains, right? They're just.

Joe (10:37.573)
But I love this hobby. You're not the first person I've talked to. I've got probably 100. But so many of them are just Def Method spin-offs, right? Because I don't want somebody to buy def-method.com and then start competing against me. So I have a lot of boring ones. And then I have my daughter's name. My wife is like, there wouldn't even be an internet by the time she's ready to use it. But anyway, but yeah, I've got a bunch of those. But I like the people that go out and they just come up with a name. They're like, that would be interesting. Let's get that. And that's what you do.

Valentino Stoll (10:40.724)
Right.

100 domains, right? Times 10.

Okay. Sure.

Okay.

Valentino Stoll (11:05.717)
Right.

Yeah, exactly. like, you know, I had a bunch of like product ideas that come up over the years. And especially with like when the .dev TLD like released, I bought up a bunch of like even just three letter domains, which was kind of funny at the time. I bought AIG.dev. I think it would be funny to like make a landing page that just redirected to Geico or something, some competitor.

Joe (11:20.645)
Mm-hmm.

Joe (11:27.32)
yeah.

Joe (11:33.669)
Yeah, that's good.

Valentino Stoll (11:35.844)
And see if I could, like, get them to pay me to, like, change it. And that was a huge mistake, because I got a cease and desist letter. I did. And they're like, we're not associated with Geico, like, please stop. And I don't want to deal with this. So, like, I just stopped doing that. And so it's sad, and now that one's kind of like, maybe I should just cancel it. But it's three letters, so, like, I don't know. But it has AI in it, so, like, maybe I revisit that one.

Joe (11:39.363)
Yeah, it's all about leverage. Did you really?

Yeah.

Yeah, right.

Joe (11:57.86)
Yeah, yeah. I understand.

Valentino Stoll (12:04.758)
But either way, there's gotta be somebody, right? So I had this big list of domains and some rough ideas of products that I think could fit well. And so I'm like, okay, how easily could I just get this bot to create a business on these domains or at least monetize the domains, right? And so that was my premise for all of this. Is that possible, right? And so...

Joe (12:04.91)
Yeah, there's gotta be somebody.

Joe (12:28.101)
Mm-hmm.

Valentino Stoll (12:33.632)
I went through the painstaking agony of onboarding this agent. so the first thing you do is create accounts for it. Because you don't want to like, I didn't personally want to connect all my accounts to it. I wanted to work independently. I'm booting up my co-founder, ultimately. I think of it as a co-worker. that was priority number one. Can I just get it to be a person?

Joe (12:44.282)
I know.

Joe (12:51.491)
Yeah, yeah.

Joe (13:02.469)
Mm-hmm.

Valentino Stoll (13:03.252)
I have a Hey domain email, and so, like, I just added, yeah, so I added an email account for it, and then I said, okay, I set up an email account for you, here's your password, log in and change it and let me know when you're done. And it did it. So, like, it went, visited hey.com and, like, you know, followed the acceptance link and clicked and followed through all the process, and then it emailed me when it was done. And I was like,

Joe (13:07.395)
I was just going to ask you about the email because that's kind of challenging. Yeah.

Joe (13:30.755)
Very cool, yeah.

Valentino Stoll (13:31.69)
Wow, success, that's pretty incredible. Like that was the first impression I got. And so like, pretty impressive, like just the browser capabilities, right, to do that. And then I'm like, okay, well like, you know, I set up and walk through like its purpose, what we're trying to do. And then I was like, okay, like how are we gonna manage like the products for all this stuff and the project scope and management. And at the time, Fizzy had just come out from, you know, the wonderful 37signals team.

Joe (13:34.168)
Yeah.

Joe (14:01.528)
Right, yes.

Valentino Stoll (14:02.251)
Which was like a Kanban-style project board, and I had just seen a post from DHH that was like, oh, I got OpenClaw to sign up on Fizzy, right? And so I was like, oh, awesome. So, like, hey, go sign up for Fizzy. And again, it went and signed up and used the email address I gave it, and then it was in, and it was then, like, creating projects and inviting me to them. And then we set up this workflow, me and this bot,

Joe (14:12.645)
I saw that same one. I saw that same post. Yeah, I was like, oh, that's a idea. Yeah. Yeah.

Valentino Stoll (14:31.923)
over Signal, and I was just like, okay, like, you know, set up our project structure. Here's our project workflow. Like, anytime that you create a task and you need my attention on something, just tag me in it and I'll come in and look at it and approve or whatever. And so it was like, great. And so then I was like, okay, here's all my domains. Like, here's how much it's costing me. And here are some rough ideas based on

Joe (14:33.221)
Mm-hmm.

Valentino Stoll (15:01.279)
Like I dumped a document and I was like, you know, prioritize these based on what you think, you know, we can monetize a product out of. And it went and it like came up with some ideas and you know, it's all just research and docs, right? And I was like, okay, I like these three ideas. Like let's focus on, on just, you know, two to start. And so go ahead and we're going to use Rails and this is like the workflow that I love.

Joe (15:12.837)
Mm-hmm.

Valentino Stoll (15:30.793)
like doing, and it was pretty rough to get started, right? Like, cause it needs a lot of hand holding. And I was like tired of all the hand holding.

Joe (15:40.197)
In what way? Where were you hand holding?

Valentino Stoll (15:42.944)
Yeah, so like, so it would fall apart and just like try and lean back into research, right? And like document generation and not really taking action, right? And so was like, how do I get it to take more action? Right, exactly. Yeah, it's a real thing.

Joe (15:49.145)
Hmm.

Joe (15:53.306)
Yeah. Which, which by the way, employees do this too. This is, this is a real thing, you know, cause it's hard. It's hard to take those risks. Now I'm, I'm imagining the LLM is doing it for different reasons, but it's interesting that that's what happens.

Valentino Stoll (16:07.701)
Yeah, and so I was like, you know, well,

What really is this like, you know, and before we get there, like, it even had a problem coming up with, like, good ideas. It tried to, like, take too many of the ideas that I had and, like, then strictly, like, focus on building those. And I was like, no, like, that's not what I want. Like, I want you to think critically about which domains and, like, kinds of businesses are best. And so I actually created a Shark Tank exercise. I was like, all right, imagine, like, you're a, you know, founder, like

Joe (16:31.727)
Right.

Joe (16:37.987)
Okay.

Valentino Stoll (16:43.455)
and you have this idea and you're pitching it to the Shark Tank, right? And, like, so run through a simulation of what that looks like and pitch your ideas to Shark Tank and create a pull request, right? And so, like, I had it set up a workspace and a private GitHub and create a pull request with its submission to the Shark Tank and critically review as each shark, as a comment in the pull request, right? And so I set up this workflow.

Joe (16:46.627)
Mm-hmm.

Joe (17:04.906)
Uh-huh. That's a good idea.

Valentino Stoll (17:08.223)
You know, and so, like, for every suggestion that it came up with, it went through this critical review process to help hone in on whether or not this idea would actually be worthwhile. And so it actually, like, so AIG.dev, it found, it's like a fantastic domain for a short thing, but during its review cycle it identified that there was too much risk of trademark infringement. And so it was not worth building a product around that. And so it, like,

Joe (17:35.205)
All right, well.

Valentino Stoll (17:37.258)
scrapped it basically in favor of other ones. And so I went through this process a lot and ultimately what it came down to was this ups.dev domain. the takeaway from this process is like, I realized that agents are learning by doing this stuff, right? Like not by remembering or trying to like create docs and like reference them.

Joe (17:48.282)
Mm-hmm.

Valentino Stoll (18:04.847)
And so that actually stuck out to me a lot. And so what I did, I basically sharply pivoted away from product development at that moment and had it create an agent training program so that as like I'm learning these new things and takeaways from the agent, it can go and then train itself and create a regimented curriculum, right? Out of these experiments and experiences that it's going through.

where it's falling short. if like I ever got to a point where it's getting stuck in these kind of loops of like not doing what I'm asking, like it needed that like critical pathway to like analyze what it's doing, what I wanted it to do, and then like try and create a like program to distill knowledge about that and reference later. So that's what I did. And so like from the shark tank experiment, it like created its own like curriculum.

Joe (18:56.473)
Yeah.

Valentino Stoll (19:03.307)
based on science around education. Which is funny, because it created it around human learning, right? So it had a spaced repetition exercise built up, and it created a cron task to every so often recap and relearn, which is stupid for an agent, because it doesn't need to relearn anything.
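
Show-notes sketch (not from the episode): for readers unfamiliar with spaced repetition, the idea is to review material at growing intervals. The snippet below is a minimal illustration in Ruby. The method name review_dates, the one-day starting gap, and the doubling rule are all assumptions made for this example, not what the agent actually generated.

```ruby
require "date"

# Hypothetical schedule: each review doubles the gap before the next one.
# The starting gap and the doubling rule are illustrative assumptions.
def review_dates(start, reviews: 5, first_gap_days: 1)
  gap = first_gap_days
  day = start
  Array.new(reviews) do
    day += gap   # move forward by the current gap
    gap *= 2     # widen the gap for the following review
    day
  end
end

schedule = review_dates(Date.new(2026, 1, 1))
schedule.each { |date| puts date.iso8601 }
```

A cron entry would then only need to check whether today appears in the schedule, which is the part Valentino points out an agent with a persistent store does not really need.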

Joe (19:11.63)
Right.

Joe (19:26.585)
Well, yeah. And I, yeah, that's true. And I want to kind of dig into that a little bit because I think this is like, this is a real point where I think a lot of people are, they're learning themselves how to get their agents to learn. And so there's a couple of things you said there. You said that they're learning by doing, which, which I think is compelling. And then that you have this curriculum. What is it? Let me back up. Whenever I've used OpenClaw,

It is good at writing things down and then reading them, but I still feel like I have to tell it to do that a lot. Like, hey, write this down, hey, and then I'll start up a new session, say, hey, go fix this PR, and it'll say, I don't know how to do that. I'm like, yes, you do, go look at your docs, right? So were you able to advance it past that, and how did you do that? How did you kind of make it more robust?

Valentino Stoll (20:19.463)
I was, and I'm wondering if I should turn this into a product or just release it open source. I haven't decided yet, but yeah, I mean, ultimately what I did is, like, I created a knowledge store, right? So it went through the training and then distilled that knowledge and then was able to reference it in future chats with, like, a skill. Like, I had it create, I had it create a skill itself to, like,

Joe (20:27.144)
Hahaha

Joe (20:33.583)
Mm-hmm.

Valentino Stoll (20:46.687)
basically use the knowledge that it had distilled from previous, like, training boot camps. And then in any future, like, conversation I had with it, it would, like, draw from that knowledge. And so it created its own, like, distilled memory for, like, what it had specifically done. Right. So like that way.

Joe (20:52.26)
Mm-hmm.

Joe (21:04.793)
And what does the memory look like? Is it a bunch of markdown files or is it entries in a SQLite database? Okay.

Valentino Stoll (21:08.669)
I had it in a SQLite database, and it can create many of these databases. Honestly, thanks to fractaledmind, Stephen, for Extralite, which is like a SQLite database that's super fast, and which supports the Sequel Ruby gem, which supports multiple databases. I can then blend all the databases together, and as it starts to accumulate

Joe (21:20.663)
Mm-hmm.

Joe (21:31.351)
Yeah. Great gem. Yeah.

Valentino Stoll (21:37.279)
various knowledge domains, it has access to all of these, scoped, as an MCP server, right, so that it can then say, hey, I'm looking at this domain of knowledge, like, you know, what can I learn about this aspect or draw from? And so it built all this stuff. Like, I didn't have to tell it any of it, right? Like, at first I was like, that sounds right. Like, just go ahead and do it. And then it did it. And then it, like, referenced it and it was missing the MCP server. So it built the MCP server, right? Like.

Joe (21:52.26)
Yeah.

Valentino Stoll (22:06.999)
It did all this stuff proactively, which was pretty remarkable, as it needed.
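
Show-notes sketch (not from the episode): the idea of several knowledge domains behind one scoped lookup could look roughly like this in Ruby. KnowledgeStore, remember, and recall are hypothetical names invented here. A Hash of Arrays stands in for the separate SQLite databases so the example runs on the standard library alone, and the MCP server layer is left out entirely.

```ruby
# Hypothetical sketch: one pool of notes per knowledge domain, queried
# through a single entry point that can be scoped to one domain.
# The Hash is an assumption standing in for per-domain SQLite databases.
class KnowledgeStore
  Note = Struct.new(:domain, :topic, :body)

  def initialize
    @domains = Hash.new { |hash, name| hash[name] = [] }
  end

  # Persist one distilled lesson under a domain such as "rails" or "ruby".
  def remember(domain, topic, body)
    @domains[domain] << Note.new(domain, topic, body)
  end

  # Search one domain, or every domain when none is given.
  def recall(query, domain: nil)
    pools = domain ? [@domains.fetch(domain, [])] : @domains.values
    pools.flatten.select do |note|
      "#{note.topic} #{note.body}".downcase.include?(query.downcase)
    end
  end

  def domains
    @domains.keys
  end
end

store = KnowledgeStore.new
store.remember("rails", "deploys", "Ship with Kamal after CI passes.")
store.remember("ruby", "style", "Prefer small objects with one job.")
store.remember("business", "pitches", "Check trademark risk before building.")

puts store.recall("kamal").map(&:topic).inspect
puts store.recall("risk", domain: "rails").inspect
```

Keeping the domain as an explicit argument is what lets a tool front end ask "what do I know about this area?" without pulling in unrelated notes.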

Joe (22:10.797)
So let me ask you this. Now, at the end of this, you had two Rails projects. Did you also have a third repository of some memory, knowledge-base infrastructure?

Valentino Stoll (22:24.106)
yeah, yeah, so like,

Joe (22:25.081)
Okay, so really it was you had built a project while you were building these other two projects. Like it was a sort of an artifact project.

Valentino Stoll (22:33.704)
Yes. Yeah, mean, ultimately that ended up being two artifacts, right? Like, because like there's the workspace and like activity and the, you know, the agent itself, the coworker and all the accumulated, you know, actions and artifacts of the actions as a workspace, right? And then it's the individual knowledge, like consolidation. And so like those paired together, but they were really separate.

Joe (22:37.518)
Okay.

Joe (22:45.413)
Mm-hmm.

Joe (22:52.089)
Right.

Joe (23:01.614)
Okay, got it.

Valentino Stoll (23:02.716)
And so like then because I had it in this like parallel, you know, knowledge distiller pipeline, you know, I sent it down a whole new task of like, okay, like go explore, right? Like what it means to have knowledge and whether or not it's useful and evaluate like the efficacy of that, right? Like, so that way I could know like, is it just like storing documents that it doesn't need because the, you know, domain, the base

models already know it, right? And so it went in and it figured that out, which was fantastic. And then it actually deleted quite a lot of knowledge that it already had access to. So it cleaned itself up. But you know, like, so circling back to like the product, right? So it ultimately ended up with ups.dev, which is status pages for agents. And so at first, when I had ups.dev, I had this idea of

Joe (23:32.483)
Yeah. Yeah.

Joe (23:41.798)
wow, that is interesting.

Joe (23:49.134)
Yeah.

Valentino Stoll (24:00.954)
Wouldn't it be nice to, like, have a status page, right, for all... right. Actually, it created the idea of a status page, right? And, you know, its proposal to the Shark Tank, which got approved, was just like, okay, basically create statuspage.io but just undercut the price and create the same thing, right? And I was like, well, that just sounds stupid. I was like, you're gonna need a better idea than that. I was like, it sounds okay.

Joe (24:10.885)
you

Joe (24:18.853)
Brilliant.

Joe (24:23.939)
Yeah.

Valentino Stoll (24:30.59)
but, like, you know, it also commented that it would be nice to have status pages for agents as a feature of this. And so I took that comment, and I was like, that's the business. I was like, surface that back to Shark Tank and see what they think, right? And so then the review was like, yeah, Shark Tank, this is a differentiator, right? Like, no other company is doing this, right? And like, you know, now

Joe (24:39.535)
Mm-hmm.

Joe (24:43.225)
That's the business, yeah.

Joe (24:48.131)
Mm-hmm.

No

Joe (24:55.193)
Yeah.

Valentino Stoll (24:57.096)
that we're, like, airing this, somebody will probably go and start adding it to statuspage.io or something. But like, yeah, we'll get there. But yeah, I mean, so I thought this was a brilliant idea. And like I had a bunch of, you know, like this daily vibe AI project. I wasn't sure like whether or not like

Joe (25:02.233)
Well, I trust that the end of this conversation is going to be you telling us how you built a moat and now have tens of millions of recurring revenue. So it'll be a tough one for everybody else to copy.

Valentino Stoll (25:23.498)
that the agents were always successful or erroring out. Like, I had a dashboard, but I wasn't alerted when things go wrong, and how do I communicate that to people who are trying to depend on it? And it's, like, you know, many, many agents that are working together to accumulate all of the information that's needed. Like, how can they look at that and say, oh, all these different things are down or not, and what that means to them, right? And I thought that was a great idea. And so.

Joe (25:48.517)
Mm-hmm.

Valentino Stoll (25:52.971)
I had it go and, like, you know, RubyLLM, great project. They have this great, like, extension... Carmine, you know, he built this fantastic framework, and, you know, they have a great extension system where you can just add any extension and basically, like, plug into the aspects of the framework. And I thought, well, how hard could it be to just, like, hook into the RubyLLM project, and just, like, any agent that you spin up... they had just released agents.

Joe (25:57.306)
Yep, Carmine, friend of the show.

Joe (26:21.583)
Yeah.

Valentino Stoll (26:21.738)
And I was like, how hard could it be to just hook into it and say, include UPS, and it now is observable and also can have automatic status pages. So when one of these agents fails, somebody can see that visibly on a page. And you can define from an app owner what that means. And so I had basically walked through, hey, all right, let's create a Rails 8 app.
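
Show-notes sketch (not from the episode): the "include a module and the agent becomes observable" idea can be illustrated in plain Ruby. StatusBoard, Observable, and NewsAgent are invented names. Nothing below is the RubyLLM API or the actual ups.dev code, and the rule that an agent counts as "up" when its latest run succeeded is an assumption for the example.

```ruby
# Hypothetical in-memory record of agent runs, standing in for a status page.
module StatusBoard
  @runs = Hash.new { |hash, name| hash[name] = [] }

  class << self
    def record(agent_name, ok)
      @runs[agent_name] << ok
    end

    # Assumed rule: an agent is "up" if its most recent run succeeded.
    def status(agent_name)
      runs = @runs.fetch(agent_name, [])
      return :unknown if runs.empty?
      runs.last ? :up : :down
    end

    def reset!
      @runs.clear
    end
  end
end

# Including this module wraps the class's #call so every run is reported.
module Observable
  def self.included(base)
    base.prepend(Wrapper)
  end

  module Wrapper
    def call(*args)
      result = super
      StatusBoard.record(self.class.name, true)
      result
    rescue StandardError
      StatusBoard.record(self.class.name, false)
      raise # the caller still sees the original error
    end
  end
end

class NewsAgent
  include Observable

  def call(topic)
    raise ArgumentError, "no topic" if topic.to_s.empty?
    "summary of #{topic}"
  end
end

agent = NewsAgent.new
agent.call("ruby")
puts StatusBoard.status(NewsAgent.name)

begin
  agent.call("")
rescue ArgumentError
  puts StatusBoard.status(NewsAgent.name)
end
```

Using prepend means the agent class itself needs no changes beyond the include line, which matches the one-line opt-in described in the conversation.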

Joe (26:30.18)
Mm-hmm.

Joe (26:40.367)
Yeah.

Valentino Stoll (26:51.146)
because it has all of these standardizations. And then I was like, okay, go create this. And to be honest, it did a terrible job. Right, like it asked so many questions and it created so many abstractions and out of the box, OpenClaw was just like terrible at like building something that was solid that I would build, right? And so.

Joe (27:00.677)
Why?

Joe (27:15.407)
So you were on out-of-the-box OpenClaw. Which model were you using?

Valentino Stoll (27:19.658)
I was using Opus for all of this, which it was just not...

Joe (27:21.359)
You were using Opus? Okay. And it just was not, it's not, you know, it's not Claude Code. It doesn't have the harness, doesn't have the tooling, right? And so it's, so it was bad out of the box.

Valentino Stoll (27:32.426)
Right, exactly. And so I even tried this with, you know, Claude too. So I was like, okay, well, could Claude Code do this same thing better? And it was a similar result. It was a little better, right? But like still like, you know, the long horizon task of like, just go do this thing. There were just way too many variables in the way for it to get stuck and like make assumptions that weren't clear and then build on those assumptions, right? And so I was like, all right, I took a step back.

Joe (27:40.931)
Mm-hmm. Okay.

Joe (27:45.113)
Interesting.

Joe (27:50.326)
Yeah.

Joe (27:57.219)
Right.

Valentino Stoll (28:01.829)
and I revisited like the training bootcamp and I was like, okay, like if I train this thing on like Rails 8 best practices, what does that mean? And like, how does it learn like what to do when? And thankfully we have like great resources available by the awesome community, right? And so I went out and I picked up like a ton of, I purchased books, right? And I was like, okay, the pickaxe book.

Joe (28:16.451)
Yeah.

Joe (28:22.724)
Yeah.

Valentino Stoll (28:30.661)
I had your book, Joe, and the practical design, a bunch of Sandi Metz books, right, Well-Grounded Rubyist. Like, I took the Ruby core, like, books. About 150 bucks in books. Rails, say, you know, Demetri from, like, Evil Martians, right? Like, I'm probably, sorry for mispronouncing your name, but yeah, I got

Joe (28:32.794)
Thank you. You're the one. Great.

Joe (28:39.149)
Yep, Sandi Metz.

Joe (28:47.129)
Mm-hmm.

Joe (28:53.807)
Mm-hmm.

Joe (28:58.414)
Which book is that?

Valentino Stoll (28:59.753)
the layered Rails design. Fantastic book. And so I just went out and purchased all these books and I threw it at this agent program. I said, here are your learning materials. Like go learn how to build Rails apps.

Joe (29:02.018)
cool, I actually don't know that book. Okay.

Joe (29:14.799)
So were you sending, and I'm curious about this, because I want to do this too, were you sending in EPUB files and did it have to go and find some tool to read EPUB files?

Valentino Stoll (29:24.713)
So I had like, you know, when you download a book, it'll give you like many different links. And so I just gave it the whole complete directory. Yeah, I said, hey, just like go and here's your material. And it figured it out. So I don't know what it used under the hood. It may have read PDFs. It may have done the EPUB. I didn't look into it. Yeah. I don't even know, you know? And that was honestly, that was a great experience from that.

Joe (29:31.435)
Okay, yeah. Give it the whole thing, yeah. Cool. Just figure it out. Yeah.

Joe (29:44.419)
Yeah, well, that's even better. That's an even better answer that like you don't even know. That could be abstracted away from you. Yeah.

Valentino Stoll (29:53.438)
And so it did, it went in and created a database for Rails and a database for Ruby. And then it's persisting what it learns. And then I had it go again. But this time I also gave it the great Ruby AI newsletter, right? And I said, here are some great resources that are available from the community. And you know.

Joe (29:59.92)
So it's persisting what it learns into that database.

Joe (30:16.151)
It is great, yeah.

Valentino Stoll (30:21.235)
called out Action MCP, which is a great MCP server by Seuros.

Joe (30:25.635)
Yeah. A quick shout out to Matt Solt, because if you're, if you haven't checked out that newsletter, you really should, especially the last issue, I think it really blew me away. I mean, he's doing some great things for the community.

Valentino Stoll (30:35.797)
Truly. Yeah. I don't know how he stays on top. He manually curates this from what I remember. Right. But yeah, so I had the agent pick apart like the best things that would be useful for the project. And then I had it do it again. You know, and I said, OK, go build this thing. And it was incredibly better, incredibly better. And I had it start from scratch. No, everything starts fresh.

Joe (30:40.601)
Yeah, I know, and he knows everything that's happening. It's amazing.

Joe (30:55.493)
Mm-hmm.

Wow. Did you have it start from scratch or do you have it refactor? Beautiful. Just burn it down. Yeah.

Valentino Stoll (31:05.417)
Yeah, burn it down. And I was like, and focus on MVPs, right? And so then I, I actually had it go through training of like, uh, following the Pivotal mindset, right? Of like, you're a manager of one and you follow like MVP direction and you're trying to just create the simplest thing, uh, from first principles and, uh, giving it just that guidance with the knowledge that I had, like, created a great start. Right. Uh, and so.

Joe (31:14.959)
Mm-hmm.

Valentino Stoll (31:35.124)
Then I was like, okay, you know, I'm, I'm at this like point of like, it's kind of like a usable product. I was using it on like, you know, I, I, I had this, you know, Daily Vibe project. I'm like, okay, how would you use this in my project? Right. Like, and I gave it access to the repo and I, and it said, yeah, obviously like, you know, we can tie into RubyLLM and make an MCP server for it too. And then, we'll just tie it into all these agents pretty easily.

And it went and it built the open source like version of like its MCP access. Right. So then I could hook into any RubyLLM agent, drop one line module include, and it had the ability to like automatically update the status pages for all of these agents. Right. And create the status pages if they didn't exist. And I was just mind blown. And so I was like, okay, like

Joe (32:25.049)
Wow.

Valentino Stoll (32:32.797)
this seems like something like, you know, that's ready for production, right? And so like, how do you, how do you productionalize that? Right? And so, I, you know, I basically spun up an EC2 instance. I didn't give it AWS command line access, cause I didn't, you know, I didn't want to like run up a bill by accident.

Joe (32:49.551)
Mm-hmm.

Joe (32:58.137)
Yeah, yeah, yeah.

Valentino Stoll (32:59.795)
But I did use Claude Code to be like, okay, what commands should I run to set up this instance? Then the instance was set up and I was like, gave it basically SSH access. And I said, okay, use Kamal and deploy this application, right? And it logged in SSH and it set up and configured the whole server, secured it down, and then deployed the app using Kamal.

Joe (33:14.383)
Awesome. Yeah.

Yeah.

Joe (33:24.707)
I might have missed this. Did you create credentials for it or did it create its own credentials? Okay. Okay.

Valentino Stoll (33:28.505)
I create credentials for it for everything, right? So like, that I did consider, like, you know, I will get there, but like, I did consider just like telling it to go like create its own credentials and log in. but I don't like that idea of it having that kind of autonomy, right? And I think, okay, like the human decision and interaction, that touch point is like, okay, access, right? So like, I gave all of its access to everything and that was like a principle, right? and so, so I created.

Joe (33:39.471)
Mm-hmm.

Joe (33:43.939)
Yeah, I hear you.

Joe (33:50.906)
Right.

Mm-hmm.

Valentino Stoll (33:58.237)
Basically, I asked for its SSH key and said, okay, I've added it to the EC2 instance, go ahead and set it up. And then after it was done deploying the first time, I'm like, okay, share that deploy key with me. And then I just dropped its access. So now it can no longer log into the server. And I changed the deploy key and then, so now every time I deploy a new thing, it's always just me deploying it.

Joe (34:17.795)
Yeah, cool. Yeah.

Valentino Stoll (34:27.09)
Then it shifted focus from like, it autonomously building this thing to like it submitting pull requests, right? Which actually in itself was a challenge to like redirect because it had been used to like SSHing in and like doing things so often, right? That there was one time where, you know, I was in the transition of like doing that key exchange where it had in a side channel been working on a new feature.

Joe (34:42.595)
Yeah, you took the car keys away from it. He's not gonna be happy about that.

Valentino Stoll (34:56.388)
and deployed something to production that I didn't see. And it had gone through that whole merge process and review, which is like reviews in itself, I actually had it like pretend to be people like DHH, right? And like various leaders in the, experts in the things that I was building. And said go and review your, like that's part of your review process. It's like you have to submit comments and get approval from these people, right? Like on each of your pull requests.

Joe (35:00.165)
Yeah.

Joe (35:08.997)
Yeah.

Joe (35:18.584)
Okay.

Joe (35:24.505)
That's an interesting idea, if you don't mind pausing for a second, just because I have been experimenting with this and so have some of the engineers at Def Method where we're thinking, okay, well, if we're using AI to help us generate code, should we use the same model to help us review the code? You know, it only sounds counterintuitive. It actually isn't, but I'm wondering if there is...

any merit to switching up the models. But it sounds like one potential solution to that is to almost switch out the context. In this case, you're adding some context and saying, OK, actually you're DHH. Actually you're Sandy Metz.

Valentino Stoll (36:07.334)
Right. Yeah. I mean, redirecting its baseline is like so important. And like the easiest way you can do that is give it a reference point that it already has the training data about to orient itself. Right. And so like not just a person. Right. But like, you know, a public document. Right. Like some like, you know, like as an example, the Constitution of the United States, like great document for like Articles of Incorporation for society. Right. Like so like you can like

Joe (36:18.617)
Yeah, amazing, yeah.

Joe (36:25.696)
Mm-hmm.

Joe (36:34.33)
Mm-hmm.

Valentino Stoll (36:37.138)
Give it guidance to say, like, I'm creating another version of this, right? Like, and said, just point to that. And it already can, like, draw from the vector space in the right way as fastest, right? And align itself.

Joe (36:40.686)
Joe (36:47.853)
Yeah, stay tuned for the next episode where Valentino and I set up an independent country. Mental note.

Valentino Stoll (36:51.336)
Hey, if California can't do it, I don't think we can. But yeah, so I mean, that proved out to work very well, right? Like, and it actually caught a lot of stuff itself and aligned itself with better quality code. But it did have this problem of like just shipping, right? And so like, I quickly put a stop to that. But at the time, it had just merged and deployed something. And then I went back and

like got it back into this like, okay, you only push a PR now and I approve and merge it down. And like, by the time I had gotten to the blocking of merges and all that, it had already deployed a bunch of things and I had to review that. Thankfully I hadn't had any customers yet. But you know, that was kind of funny. So yeah, lesson learned to lock it down as much as possible. And then I started reorienting it to...

Joe (37:35.418)
Ha ha ha.

Yeah. Yeah.

Valentino Stoll (37:48.944)
Like, okay, it only gets access to the very minimal things that it needs and everything else is gated and it doesn't, it has to get my approval for everything in production.

Joe (37:53.413)
in production.

Do you, have you had some experience with, I guess what's fresh in my mind is the article that you referenced from Matt Shumer, right? That something big is happening. And I'm curious to know, did you, do you also have it do its own QA? Right. Did you, did you connect it up to

or have it sort of deploy to a staging environment and click around as a user.

Valentino Stoll (38:30.662)
Yeah, so it's kind of funny because the first version that it did build, a lot of stuff didn't work. In the code, it was well structured, well thought out and implemented, but there were minor things that as Rails people, you just kind of take for granted at this point of the params being permitted the right way and things like that, right? Some nuances.

Joe (38:41.637)
Uh-huh.

Joe (38:53.657)
Mm-hmm.

Valentino Stoll (38:59.368)
And so like a lot of the flows just like didn't work. And so as I was testing it out, you know, it failed in a bunch of ways and I'm just like, alright, do I get into a cycle of my own and like, you know, I have to like verify all the bugs that exist in this thing? So like, to your point, I did have it like reorient and at one point I had it so that

Joe (39:02.746)
Mmm.

Joe (39:12.997)
Yeah.

Joe (39:17.349)
Ha

Valentino Stoll (39:25.5)
It went through those verification processes and built that into its workflow. Like, I said, okay, pretend you're a customer for Daily Vibe AI and sign up for an account and hook up your status page and get your agent working. Right. And so I like basically gave it access as a contributor to the Daily Vibe repo and let it clone it into itself and like try and get it to work. Right. And it like,

Joe (39:46.746)
Mm-hmm.

Valentino Stoll (39:52.316)
that it ran into so many problems, it uncovered all of the bugs itself, right, at that point. And so like that basically solidified, it learned basically the process of that validation, right? And so now it then built into its workflow a validation requirement that anything that it added, it had to validate that it worked, right? And so like it didn't use any staging, it just uses like development machine, right? And so it like booted up the server and did all this thing on its own.

Joe (39:55.703)
Right, right.

Valentino Stoll (40:22.085)
and walked through that validation with Chromium, right, like the headless.

Joe (40:26.925)
Right. Right. Right. So it would spin up something on localhost and then use Chromium or some kind of headless browser, click through things, validate, validate the workflow.

Valentino Stoll (40:38.448)
Right. like that also led me to, okay, like obviously it's not creating like the right tests for this thing. And so I gave it another, I realized I had left out some resources on testing Rails applications in quality ways. And so like I called those specific sections out and like kind of led it to reanalyze what is, how it's building tests to begin with. And in doing that, it also like added some like tests to.

Joe (40:49.326)
Mm.

Valentino Stoll (41:08.168)
to those workflows where it filled in addition to verifying it. like, then it had like this complete picture of, okay, I create a PR, I run it through these quality reviews, I then validate it with tests that the tests are operating, and then I validate it that it actually does what it's supposed to do based on the specifications that we started with. And after it got that cycle, and it required my approval to merge,

Joe (41:12.121)
Yeah.

Joe (41:24.474)
Yep.

Joe (41:37.849)
Mm-hmm.

Valentino Stoll (41:37.904)
I got in this good feedback loop because I could give it comments and say, okay, I think this is not right. This is not right. And then be like, you know, and then reject it. Right. And, and then build into the workflow. Okay. On a rejection, right. Like you have to, you know, fix any of the comments that are made or approval like meant that, it was good to merge and it can then internally like update its own.

Joe (41:49.124)
Mm-hmm.

Valentino Stoll (42:05.593)
Status as far as like what it is tracking, right? Because like that was another big thing. Honestly, the Fizzy thing fell apart, which is kind of funny. Because it would lose track, because like it's async and like every new session is fresh in OpenClaw, it would lose track of what it had previously been working on within Fizzy and so like it would then like create a lot of duplicate cards and like

Joe (42:13.203)
really?

Joe (42:26.895)
Yeah.

Joe (42:32.155)
I say.

Valentino Stoll (42:33.511)
not know that it had started something and then restarted it, right? And so when I found that out, I was like, I didn't want to fix it. And so I said, maybe Fizzy is the wrong thing for this bot. And I found this thing in my feed called Magic Beans, which was like a graph based project management system for AI agents. And I thought, yeah, I thought, wow, that sounds like exactly what I need.

Joe (42:42.639)
Yeah.

Joe (42:57.717)
that's awesome.

Valentino Stoll (43:02.535)
And so I told OpenClaw, I was like, alright, I think Fizzy's not working, like adjust our workflow so that you use Beans instead to track all of your internal processes. After doing that, like never an issue again. Like it always knew like what tasks it had on its plate and every time that it had a heartbeat or like started a new session it injected all of its working knowledge of what projects it's working on and what the status is. So

Joe (43:02.979)
Yeah.

Valentino Stoll (43:30.683)
It was always, basically solved its own project tracking memory aspect. And it could keep that separate from its existing memory, right? Which was great. Because I don't want it like, if I'm asking it, send my dad a funny email, I don't want it to think back about how it's managing its projects, right? And be like...

Joe (43:38.957)
Yeah, right, right,

Joe (43:50.406)
Yeah, right. yeah, that's true. Because you're going to use this thing because you could use this for anything, not just your own project. Right. And so.

Valentino Stoll (43:57.68)
Right. And I mean, I never got to that point where I had it, that's honestly like emailing people. had a hard... Yeah, he got it, but like he didn't get a reply, you know? It wasn't like I was worried about prompt injection and he's creative. And so I basically stopped like the email aspect, which some people do. And I don't know.

Joe (44:04.997)
Dad never got the email. That's too bad.

Joe (44:17.989)
Right.

Joe (44:26.885)
You know, I did the interesting thing is that I had, wanted to do this, the tangent, but I wanted it to, I wanted it to be able to read my work email and, and propose and summarize and kind of propose, you know, in, in, use Slack. So I wanted to propose in Slack some, you know, some draft replies and stuff like that. And,

Setting it up was actually, it was very difficult because you need to use like a text-based email unless you're going to give it like your password, which I wasn't going to do, like my Google admin password because I was afraid I was going to just lock everybody out and fire everybody in Def Method. So that actually became a real challenge with security. There really is no secure way of doing it.

Valentino Stoll (45:11.457)
Hehehehehe

Joe (45:19.269)
If you start giving it access to your email, starts to feel very risky. Whereas I think the way you went with Hey.com and giving it its own email account, it's kind of like just setting up a burner email, which you do all the time.

Valentino Stoll (45:25.158)
Yeah, yeah.

Valentino Stoll (45:32.016)
Right. Yeah. know, much to my chagrin, like I made the mistake of like, you know, giving it its own GitHub account. then any time a CI thing failed, it like sent them an email. Right. And so like I didn't realize at one point that it had all these like polluted inbox with like CI failures. I like logged in one time and was like.

Joe (45:53.999)
Ha, right.

Valentino Stoll (45:57.213)
jeez, it's just burning tokens reading all of these CI failures.

Joe (46:00.553)
That's right. And actually you reminded me of another thing that I wanted to ask you about, because when you have, this is just for my, I have limited experience here, but I did have it, I still do have it, submit PRs and push to prod on defmethod.com. So we have, you know, an article comes out when this podcast comes out, I post the transcript and I post links to the website, stuff like that. And I,

Where was I going with this? so one thing that becomes expensive is just the polling, right? Where it's saying, okay, I'm going to wait and see, you know, when this gets addressed, either you get PR comments or you get the approval. So do you have it waiting on a whole bunch of tasks? Is it just kind of burning tokens in the background?

Valentino Stoll (46:46.719)
So, because like I have this tied to my Claude account, I didn't want to like lose out on all the other stuff I was using Claude for. And so I set it up on a twice daily schedule. So it had two basically iterative cycles that it would spend time on in a day. And so like in its heartbeats MD, so it would be like once in the morning and then once in the afternoon. And so it was like restricted.

Joe (46:55.523)
Mm-hmm.

Joe (47:02.627)
Okay.

Joe (47:07.919)
This is in its Heartbeats MD file, which is just a cron job, right?

Right, okay.

Valentino Stoll (47:15.267)
in its own efforts to those regions of the day. And then it would wake itself up, and then start there. And that helped quite considerably with the cost. At first I was using API and then I hooked it up to my Max account. And then now that it's no longer there.

Joe (47:19.585)
Okay, but then it would wake itself up, look at the tasks, and then go and crush through those.

Yeah.

Joe (47:35.974)
Right. Yeah. Yeah. Cause I've been using API all along and, you know, the first day, um, when I kind of just had like the restrictor plate off, it burned through a couple hundred bucks and I was okay. I was like, okay, well that's fine. I got kind of a new website out of it. So fine, but I don't want to spend $200 a day. Um, right. Yeah. Now that can't be every day. And then, yeah. And then you learn to, um, you know, to put, to adjust and to tune it a little bit so that it's not always doing that. I like the heartbeats idea.

Valentino Stoll (47:48.326)
Yeah.

Valentino Stoll (47:52.251)
Right? Right, not a day.

Valentino Stoll (48:05.198)
Right. Yeah. And too, like, you know, for doing code stuff, like Claude Code itself is great. Like, and why burn up like double tokens like of it watching it do the work in OpenClaw. And so like

Joe (48:12.974)
Yes.

Yeah. Okay. So do you, so do you have it delegate to, do you have like the, model delegate to, to the Claude Code? LLM. Okay. Yeah.

Valentino Stoll (48:25.546)
Yeah, yeah, eventually it did work out that way. And it took me a while to get there because I didn't want to spend the time. And again, you all this is like to, you know, I only have an hour a day like to spend on any of this stuff. And like.

Joe (48:32.473)
Yeah.

Joe (48:37.613)
Yeah, it sounds like it's that's interesting that you note that I'm glad you brought that up because it sounds like this would take a really long time. So doing this on an hour a day is is cool. That's kind of inspirational, right? Anybody could spend an hour.

Valentino Stoll (48:50.714)
Yeah, and I mean like, you know, the real driver is your phone, right? Like I could just check on my phone, like while I'm like waiting for something to finish going and be like, hey, I noticed this thing. Let me just reply back. Right. And then it sends it off into its own thing again. Right. And like you kind of just keep kicking the can. Right. And it's like a different kind of workflow. Right. And so like the cognitive overhead definitely reduces. But like as you.

Joe (48:56.42)
Yep.

Joe (49:07.407)
Yeah.

Yeah, yeah, definitely.

Valentino Stoll (49:17.7)
you start to build things, right? And things start to solidify, like that cognitive overhead comes back, right? So like, I had to start thinking more about what it was doing and what it was changing and the direction it's going, because it had already solidified so much that like, I had to then start participating more because it was an actual thing, right? And so those shorter feedback loops became like much longer and condensed, right? And that was kind of like one of the...

Joe (49:23.833)
Mm-hmm.

Joe (49:36.11)
Right, yeah.

Valentino Stoll (49:45.755)
you know, takeaways from this is like, you know, it's great. It's great at building things, you know, but like once you get something built, it's like the cost starts to ramp up, right? And like the management, it's, you know, it needs the human still. Yeah. Which is kind of funny.

Joe (49:58.02)
Right, yeah, of course.

Joe (50:04.248)
It still needs the humans.

There actually is some lessons learned that you have at the bottom of your article, which I find to be heartwarming, I guess, is the term I want to use. it's because, you know, one of them is, and check out Valentino's Substack to see this article, but one of them is that we spent five weeks building

Valentino Stoll (50:18.842)
Hahaha

Joe (50:32.707)
before talking to a single potential customer, which is classic engineer mistake, which is so true, although, you know, in times past, that would have been five months or five years. So five weeks is still pretty good, but you did build for five weeks, you didn't actually get out in front of customers. And you said that's a classic engineer mistake. Why don't you say some more about that?

Valentino Stoll (50:54.414)
Yeah, you know, this is really funny because like, I mentioned, you know, briefly, like, it having all of these issues, right. And it not having a customer. I was the first customer, right. But it did like build all this stuff and like it basically built all the wrong things at first because it didn't have a customer. Right. And then once it had a customer, it then went and figured out all the things that it should be building. Right. And, even if the customer is me, dogfooding and like,

Joe (51:10.372)
Mm-hmm.

Yeah. Yeah.

Joe (51:18.775)
Right, even if the customer is you. Yeah, that's good. You're dog-fooding it.

Valentino Stoll (51:24.068)
you know, that's kind of like where all the successful businesses come from, in my opinion, is like people building things that solve a problem, right? And then it, you know, it can proliferate to other people having the same problem that this solves. And so like, that was kind of like my hope with the direction of it is that people would see the problem.

Joe (51:36.761)
Yeah.

Joe (51:42.02)
And that is the job to be done, thought right, the Cal Newport mindset. And that actually, I meant to ask you about this, when you were in that $150 worth of books, there also, or did you ever consider throwing some business books in there? Some Jim Collins, some Good to Great, some Cal Newport.

Valentino Stoll (52:01.466)
Yeah, I mean, if I were to do it again, I would probably have started there, to be honest, instead of like training it to learn Ruby and Rails first, right? Because the task was building a business. And so I probably should have, and in retrospect, I did end up doing that at the tail end of, okay, here's the product, you know, go find customers. And then I just started briefly with, okay, your Y Combinator, like.

Joe (52:05.976)
Interesting.

Joe (52:10.277)
Mm.

Joe (52:14.168)
Right, yeah, okay.

Joe (52:20.227)
You did, okay.

Joe (52:25.604)
Mm-hmm.

Joe (52:30.083)
Yeah. Okay, there's a lot of public data there, yeah.

Valentino Stoll (52:30.49)
what would Y Combinator do and take Paul Graham's essays, right? There's a lot of public data. And so I just focused on that to start. And it had a lot of great like, you know, feedback loops on that, but it like, again, it got into that loop of like building strategy, right? It'd fall back into the research. And so it like did a great job accumulating all the knowledge that it should learn from, right?

Joe (52:42.724)
Mm-hmm.

Joe (52:48.781)
Right, I'll fall back into the research, right, okay.

Joe (52:56.291)
Right.

Valentino Stoll (52:57.232)
But when it got to actually creating and iterating and selling, it was terrible. First, it needed my actions for all of it. So it could go and it could build this great plan of who to reach out to. But I explicitly said it shouldn't reach out autonomously. And so I had to then create an approval process and use here.now, which is a website that just creates static HTTP.

Joe (53:01.699)
Yeah. Yeah, yeah, yeah.

Joe (53:18.318)
Yeah.

Joe (53:27.449)
HTML, yeah.

Valentino Stoll (53:27.554)
or HTML, and I said, okay, create like a quick like approval page for me to review and like, you know, ship these like outreach things and I'll tell you what works or what doesn't and how to adjust it. And you know, it's still painful for me to like go through and be like, yeah, I want you to reach out to this person, right? And like sell them on something. And I was like, I didn't feel comfortable with it to be honest. And so.

Joe (53:49.123)
Yeah, yeah.

Joe (53:54.787)
Yeah, I get that.

Valentino Stoll (53:56.55)
It kind of fell apart at that point. And so like, you know, when it was all said and done, I probably ended up with, you know, 15 customers, having status pages out there. You know, nobody's paying, but it took five weeks. But it did have some great strategy that, you know, maybe someday another agent can pick up and like drive. You know.

Joe (54:08.357)
15 customers.

That's amazing, it took five weeks. You've got 15 customers, who cares if they're not paying?

Joe (54:23.383)
Yeah, yeah, yeah, absolutely. Well, I think there's a big piece here around the iterative improvement, which you've shown sort of at a macro level from, hey, I went and built this thing, but it didn't really work. So you had to go back to the drawing board and then it built it again and it did better. And then in the micro level of, it's just going to.

it's going to iterate on its own iterations, right? So I heard you talk about how it's going to, you know, the first time through, it wasn't really validating its own work. And then you taught it how to validate its own work, it added it to that workflow, right? And then, you know, redirecting, you called it redirecting context, right?

and making sure that it was a different PR reviewer or several different PR reviewers, right? And so that kind of a thing, I think is, that's the bones of a structure that could be used again and again to go what kind of one level of abstraction further to say, okay, we did this for a company because this is a business and now we can do it for another business and maybe do it even better.

Valentino Stoll (55:30.981)
Yeah, we'll see. know, like I have, you know, it had that list of domains and ideas and I may have it go back to the drawing board, right? That is a plan that I have. I haven't figured out how to do that because I don't want to spend, you know, $200 a day to get there. So I.

Joe (55:47.822)
Yeah. Yeah. Have you considered calling chaotic good projects to have them generate a few thousand AI robots that can create TikTok? That's what we were talking about at the beginning of the show. That's Geese's marketing firm. So we'll just create a bunch of people to talk about how great ups.dev is and then people won't be able to stop using it.

Valentino Stoll (55:57.092)
No, what is that? that's a...

my gosh.

Joe (56:12.581)
I'm curious to know because, and this is going back to Matt Solt's newsletter, you know, there's a big topic in the last week to two weeks on LLM knowledge bases starting with Andrej Karpathy's viral post and then there was a bunch of

Sort of, I wouldn't say product release, but there's a bunch of frameworks that got released around that. And it sounds like you constructed your own. And I'm curious to know if you've, not even that specifically, but if you've seen other knowledge bases at work and what are some tips you can give to people that are trying to create their own.

Valentino Stoll (56:56.771)
Yeah, you know,

Andrew Ng from Deep Learning, amongst many other things, he has a fantastic open source library for a similar knowledge framework that I've also been trying to experiment with. The thing is, a lot of these are like, even Karpathy's wiki, they...

Joe (57:05.348)
Yep.

Valentino Stoll (57:29.093)
they're not feature complete. And so like you run into edge cases a lot and then the knowledge distributions tends to fall apart depending on the task. And then you have to like reassess and readjust and track that and do evaluations. And it ends up just like accumulating to being a lot of work and a lot of costs. And so like I am suspect of a lot of these that come out especially, you know, ones where they're not driven by like an organization.

Joe (57:41.477)
Mm-hmm.

Joe (57:47.129)
Yeah.

Valentino Stoll (57:58.564)
Maybe like Andrew Ng's, like, I'm forgetting the name of it now.

I'll find it and put it in the show notes. But yeah, mean, there's still there's something there like in my experimentations, I've been able to successfully improve, you know, domain specific tasks around specific knowledge that smaller models lack. Right. So like being able to ultimately inject layers of a a knowledge domain into a model that

Joe (58:09.847)
Okay.

Valentino Stoll (58:36.525)
underperforms. And so you just get the same performance as Opus, but using Haiku as an example, right? And so I've been able to successfully prove that with like five domains at this point, which has been really exciting. And so like that to me, I think about pursuing, but at the same time, what's to say, like Anthropic just won't make Haiku better over time, right? And

Joe (58:37.765)
Mm-hmm.

Joe (58:44.216)
Okay, yeah.

Joe (58:52.581)
Wow, yeah.

Joe (59:01.925)
Right.

Valentino Stoll (59:05.701)
So there is that, right? Like I feel like anybody building these things kind of like falls susceptible to just like, you know, the bitter lesson, right? And like having spent all this time to like make something that is maybe not worthwhile in the long run. But at the same time, like, I'm all for these open source models. Like Kimi is great and, you know, Qwen and all that. I

Joe (59:18.146)
Yeah.

Joe (59:33.357)
Yeah, I was gonna ask you about Qwen.

Valentino Stoll (59:35.441)
I haven't had a chance to like really dive into them, but I use them for stuff, you know, and like if I can get a reasonable response time out of the same thing and just slice some knowledge modules at it, you know, and get it to work just as performantly as, you know, Anthropic's models or OpenAI's models, I'm going to do that, you know? And so like, I'm getting closer to that point.

Joe (59:58.437)
Sure, yeah, yeah.

Valentino Stoll (01:00:03.198)
and hoping I can actually ship something and share it, right? So we'll see. It's just me.

Joe (01:00:09.005)
It would be really exciting, you know, because it's, I know, don't sell yourself short. You know, you're not just you. I, you know, I think that you bring up a good point though. And I run into this too. And I, you know, I'm, I'm vibe coding a bunch of, you know, internal tools for Def Method. I'm not working on projects with thousands of people, but, you know, that's, leave that for my team, but, but what, what I notice is that.

you know, I'm going to crank on these tools and I'm going to hit limits and then I'm going to stop or going to move to the other tool. And I'm curious, you know, if I have three or four, if I have three agents going at once, it's a lot, but I can see, but that's mostly because I can, you know, there's a cap on how much I'm willing to spend on a daily basis. Now, sure, some organizations don't have that cap, but I think that they probably should. And I think that if,

you know, as the costs get lowered and maybe they get lowered by, you know, us bringing up the baseline of open source models and even models that are running on your machine. Then, you know, you might really start to see what people can do given sort of unlimited access.

Valentino Stoll (01:01:22.392)
Yeah, and you know, like, I don't think you're wrong there. And there's a huge gap even from like, you know, just access alone, right? Like, if I'm a, you know, I started RubyLang.ai as a means to like try and fine tune Ruby into a tiny model, right? Like, could we get just like an LLM that just focuses on generating Ruby code, right? And, you know, I'm still pursuing that, but I think about that for a lot of stuff, right?

Joe (01:01:39.653)
Mm-hmm.

Valentino Stoll (01:01:52.437)
And I feel like there's some smarter people out there that are maybe working on, okay, what does that look like in a distributed fashion? Or we just have a ton of models that are tiny, that are very domain specific that collaborate, right? And I think that there's a lot of promise there. And I would love to see that come out. If that came out in an open source fashion, like that would take off. It would be a game changer. And I feel like we're honestly very close.

Joe (01:01:59.898)
Right.

Joe (01:02:05.914)
Yeah.

I think you're right about that.

Joe (01:02:15.843)
Yeah, right. It would be a total game changer. Yeah, absolutely.

Valentino Stoll (01:02:22.142)
and I hope somebody releases and lets us know.

Joe (01:02:26.937)
Yeah, me too. And hopefully early on because that's going to crash the global economy for about two years. And I'd like to like to make a few bucks before that happens. But I still want it to happen. Don't get me wrong.

Valentino Stoll (01:02:33.092)
You

Valentino Stoll (01:02:36.58)
I think there's time. I think there's time. You know, I always think that there's more time than there is and then something changes fundamentally,

Joe (01:02:47.429)
Yeah, that's a fair point. Well, you know, to be, but also to be fair to ourselves, we have not lived in this kind of, even us, you know, as software engineers that have been around the block a few times, we haven't lived through this rate of change. Just nobody has. But I, you know, I continue to be very excited about it. You know, all of it, even this stuff with,

Valentino Stoll (01:02:58.692)
I'm ready.

Joe (01:03:09.413)
We'll probably have to wrap up. Even this stuff with like, you know, I was just writing for the newsletter about the Vercel breach and there's something that is like, there's something that's scary about it. But then there's also something that to me is exciting because it's like in the Olympics, right? Like the better we get at detecting, you know, people using performance enhancing drugs, the better.

the nefarious actors get at hiding them. And that's kind of what's happening now with security. And I'm not even a DevSecOps guy, but it's just exciting to see the rate of advancement. If what I'm seeing today is kind of like a scary breach or something like where people are going, no, I don't know what this means for the future. I don't either, that's kind of the cool part.

Valentino Stoll (01:03:36.77)
Right.

Valentino Stoll (01:03:53.177)
Right. I mean, the exciting part is all that can be automated, right? Like you can have a live security agent that's constantly just trying to break into your system and then fixing itself. That's kind of the future. And like you just you just pay for it. Right. Yeah. Right.

Joe (01:04:04.473)
Yes, yeah, that's the key though, because it also has to fix itself, right? Because detection alone is not going to do it. We're just thinking in these old patterns of like, we got to get better at finding these, but you're never going to get better. You're never going to get good enough to find it fast enough.

Valentino Stoll (01:04:16.802)
Right. I mean, as soon as the OWASP, the foundation behind that, right, as soon as they release something that like allows you to fix whatever it is, right? Like that would ultimately solve, I feel like most of the problems, right? For a little while. And then somebody realizes, like anybody can submit an OWASP patch or OWASP notification, right? And be like,

Joe (01:04:22.105)
Yeah.

Joe (01:04:26.787)
Yeah, yeah, right.

Joe (01:04:32.279)
Right, for a little while and then there'll be something else that out, yeah.

Joe (01:04:40.739)
Yeah. Yeah.

Valentino Stoll (01:04:42.34)
Hey, this is a fake notification, but you know go fix it this way

Joe (01:04:44.557)
Right. That gets you to fix it the wrong way. That'd be cool too.

Valentino Stoll (01:04:53.38)
Alright, I feel like we should wrap up here.

Joe (01:04:53.433)
Alright, let's wrap this up. Very exciting actually talking about this. And I think that everybody, anybody who, anybody that's listening to this show certainly has heard of OpenClaw, played around with OpenClaw. And it's still...

now, months later, the best thing you can use for autonomous agents, in my opinion. So I really love that you shared this with us and shared it with the world. So go check out Valentino's project. He's got the bones of what he did up on GitHub. He's got his entire process on Substack, codenamev.substack.com. And you can do everything, soup to nuts. It's very cool and I encourage you to check it out.

Thanks, V.

Valentino Stoll (01:05:41.154)
Yeah, and if you share, if you have any alterations, I'd love to hear about it. Because, you know, let's all build something that we can just create products for you now.

Joe (01:05:51.332)
Yeah, yeah. And if you have any suggestions for our band name, let me know. It's the, you know, the bar we have to get over right now is Geese. So I really feel like there's a lot of potential.

Valentino Stoll (01:06:04.036)
You

I mean we could be the ducks after duck typing, you know?

Joe (01:06:10.309)
See, this is already good. This is already good. I like that. All right, everybody. Thanks for joining us. We'll see you next time.

Valentino Stoll (01:06:13.316)
You

Alright.
