Houses on the sand
I decided to go all in on vibe coding recently. I try to stay objective on topics until I have my own experience with them, so I did my best to put my initial biases aside and jumped in. This post is about that experiment.
I’ve always jotted down ideas, usually things I would never get around to for one reason or another. Usually the ideas aren’t worth the time, even when they are good ideas. I know that sounds like a contradiction, but I work full time and I have an 18-month-old son, and he’s my priority. I’d rather be present for him than hacking away on something that might be useful but is ultimately inconsequential.
My first project was a Gym Clock, similar to one you might see in a CrossFit gym, which looks like an old-school alarm clock with LCD lettering. This was mostly an excuse to try an interaction I thought would be pretty cool: in portrait mode you would see all the settings you could configure, with a small preview of the clock; in landscape you would get a fullscreen display of the clock. I booted up VS Code, chose Opus 4.5 (I think), wrote out a big prompt for exactly what I wanted, and to my surprise it more or less one-shotted the app. I did suggest the technology in the prompt (I decided on React Native and Expo) and nudged it towards some better practices, e.g. don’t use useEffect for data, use dumb components, make use of hooks, etc. The code was okay. It wasn’t perfect, but it wasn’t spaghetti, and it was easy enough to work with and to refactor if I needed to. The scope of this app was very small and the code footprint matched it.
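The portrait/landscape interaction boils down to one decision based on the window size. Here is a minimal sketch of that decision, not the code the LLM generated. The names `Layout` and `pickLayout` are made up for illustration, and treating a square window as portrait is an assumption.

```typescript
// Hypothetical names: Layout and pickLayout are illustrative only.
type Layout = "settings-with-preview" | "fullscreen-clock";

// Landscape (wider than tall) shows only the clock; portrait shows the
// settings with a small clock preview. A square window counts as portrait.
function pickLayout(width: number, height: number): Layout {
  return width > height ? "fullscreen-clock" : "settings-with-preview";
}
```

In a React Native component, the width and height could come from the `useWindowDimensions` hook, so the layout switches as the device rotates.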
A few other honorable mentions before I talk about the main project I focused on.
- RSS App: I wanted to make a single-feed RSS app that resembled what you might see in a modern social media app. The goal was to have something that could work completely offline and wouldn’t require a login. This obviously comes with its own limitations, but it is something I’ve wanted for a while. This went similarly to the Clock app: the code was okay and manageable, and if I wanted to progress with it, it would be simple enough.
- “Fitness” Competition app: basically there is a gap in the market for a good CrossFit/Hyrox/fitness competition application. I identified a number of issues that could be solved with good software, and a native application to tie them all together. A big part of this was going to be WebSockets, and this was where I wanted to put Laravel through its paces, as a library called Reverb had emerged which seemed pretty promising. You might ask why I didn’t use something like Go or even Node.js/Bun. My main concern here was security: I didn’t trust the LLM with this, and Laravel has policies included in the framework, so I saved a lot of time and tokens by making use of those. The backend of the project actually came out really well, and it relied heavily on Laravel primitives. The native side of this was, again, fine, but would need work.
- Server Hardening CLI tool: I always forget what I need to do when provisioning a server and making it secure, which is part of the reason why I rely on services like Railway. I am primarily a front end developer, so I don’t often interact with servers directly day to day, and this is something I always just forget about. So I used Go with Bubble Tea and other Charm libraries to make an interactive CLI tool that I can run on a server to make sure it is set up correctly. It’s presented in a friendly way so I can see exactly what is happening, and since a few things need input during setup, I wanted it to be less scary for someone who may not have done this before. I am not a Go dev, so I am not the best person to judge the code output, but similarly to the React Native code, it was fine and inoffensive.
- CMS: this I could rant about for hours, not because what the AI generated was bad, but because the CMS landscape is hell. There was the whole WordPress drama, Payload is tied to Next.js (for how I’d like to work with it), and headless CMSs are overkill for most small sites and, similarly to Payload, require a meta front end framework to render content properly. Anyway, my requirements were to have custom post types, be able to create forms that can be embedded on the front end, have “blocks” that you could build up content with on the page, and be able to preview updates “live” from the editor. I again used Laravel for this and I was surprised at how well it came out. I again focused on only requiring JS when it was needed, rather than using something like React from the get-go for the UI. This is definitely a project I will return to, because it ticked all the boxes I think most people would want.
I think I got through thirteen projects in total during this time. I don’t want to ramble on about all of them, but I plan to write about some of the more interesting ones in more depth on here once I have the “experiments” section added. I felt great being this productive. Approaching vibe coding with a rapid prototyping mentality meant I wasn’t precious about the output; it was just enough to demo functional concepts to people.
The more “serious” project I undertook was very much inspired by how much I dislike Jira, and the main reason is context switching. Every team I have ever worked on has had its own setup for Jira, and it’s just another thing to contend with when working, on top of really disliking writing tickets using their UI. At the company I worked for at the time, I found that Atlassian had an MCP server, and to my surprise it wasn’t terrible to use. I started interacting with the tickets directly in my IDE and this was a game changer. Being able to ask which ticket was next in line to work on, create tickets for bugs, and create documentation (Confluence), all with the context of the codebase right there, was great. So I decided to make my own self-hosted version of this. I also thought it would be practical for all these projects, as I could track where things were and the LLM could fully interact with the tickets it would be taking on.
The main issue with MCP is security, but as of this experiment and the time of writing, OAuth 2.1 seems to be the way to authenticate yourself with a service. This meant that, yet again, Laravel was the answer, as there is a library called Passport that supports this out of the box. Time and tokens weren’t wasted on writing this part of the application; the LLM just needed to configure it. There is also a first-party Laravel MCP library, so getting the first POC of this working was relatively fast. It took maybe 3 hours to get to something I could confidently use through my IDE. If I had stopped here, things might have been different. As with the other projects, I kept asking it to keep the code “simple”, using Blade templates and minimal JS (mostly just used for drag and drop on the board). However, the scope changed as I got more excited about the project.
I started to think this could be a product people might want, and not just a toy project you could spin up yourself. The scope/spec fundamentally changed. To keep this short, I wanted to bring some common things together under one roof: communication, documentation & idea generation. This exploded the scope of the app. A desktop app would be needed down the line, which meant I had to decouple the Blade templates into a dedicated front end application, so that I could reuse that code for the Electron app in the future.
This became hell. What I will say is that the backend part of the application was able to change well, and the code quality remained the same (good). If you’re using an established tool with opinions, be that Laravel, Rails, Django, NestJS or AdonisJS (more could be named), you’re probably going to have a good time with specs changing. LLMs are already trained on these frameworks, so they know how to, say, make a controller/model/view; they don’t need to read your code to figure out what the hell is going on. The front end however…
This is where the pain really started. Front end applications are known for not having a standard structure. It doesn’t really matter what method or framework you use, there are just an infinite number of ways of doing it. LLMs are prediction machines, and if you give them chaos to train on, chances are you will get chaos out. React especially. I am not going to dunk on React too much, as I’d probably still reach for it 9/10 times (this is mostly because of React Native/Expo), but it isn’t without fault. LLMs seem to love using useEffect and Tailwind, and creating giant hooks, if you leave them to their own devices.
During this journey I had spent very little time looking at the output, as I wanted to keep to the ethos of what vibe coding is supposed to be, and for the majority of the time I was using it for rapid prototyping, not to ship everything it built to production. Once I started taking the project seriously, I needed to start looking under the hood more, and it got noticeably bad when I migrated from the Blade templates to a dedicated React app. The complexity in some parts was staggering. Even though I got it to use TanStack Query, the mutations for dragging and dropping tickets, as one example, had nested conditionals with nested loops. Compare that to the plain JS version, which was only 14 lines: it optimistically updated the list of tickets for the column and then made a PATCH request to the server. The LLM didn’t take any inspiration from the simplicity that existed prior to the migration.
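For a sense of how small the optimistic approach can be, here is a rough, framework-free sketch: update the board in memory first, then persist, and roll back if the request fails. This is not the original 14 lines or the generated React code. Every name in it (`Board`, `PatchTicket`, `moveTicket`, `moveTicketOptimistically`) is hypothetical, and the shape of the PATCH body is an assumption.

```typescript
type Ticket = { id: number; title: string };
type Board = Record<string, Ticket[]>; // column name -> ordered tickets

// Hypothetical transport: a real app would wrap fetch() with a PATCH request here.
type PatchTicket = (
  ticketId: number,
  body: { column: string; position: number },
) => Promise<void>;

// Returns a new board with the ticket moved; the input board is not mutated.
function moveTicket(board: Board, ticketId: number, toColumn: string, toIndex: number): Board {
  let moved: Ticket | undefined;
  const next: Board = {};
  for (const [name, tickets] of Object.entries(board)) {
    const found = tickets.find((t) => t.id === ticketId);
    if (found) moved = found;
    next[name] = tickets.filter((t) => t.id !== ticketId);
  }
  // Unknown ticket or unknown column: leave the board as it was.
  if (!moved || !(toColumn in next)) return board;
  const target = next[toColumn];
  const index = Math.max(0, Math.min(toIndex, target.length));
  next[toColumn] = [...target.slice(0, index), moved, ...target.slice(index)];
  return next;
}

// Apply the move immediately, then persist it; restore the old board on failure.
async function moveTicketOptimistically(
  getBoard: () => Board,
  setBoard: (board: Board) => void,
  patch: PatchTicket,
  ticketId: number,
  toColumn: string,
  toIndex: number,
): Promise<void> {
  const previous = getBoard();
  setBoard(moveTicket(previous, ticketId, toColumn, toIndex));
  try {
    await patch(ticketId, { column: toColumn, position: toIndex });
  } catch {
    setBoard(previous); // simplification: overwrites any changes made in the meantime
  }
}
```

Keeping `moveTicket` pure means the same function can back a plain JS drag handler or a TanStack Query mutation's optimistic update, without nested loops spread through the UI code.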
I decided that I wanted to remove Tailwind and opt for CSS. Why? I wanted to play with some “modern” things like popovers, anchor positioning, :user-valid/:user-invalid & container queries, and I found sticking to traditional ways of writing CSS was more efficient, as fewer tokens were needed for the component changes overall. This did end up being the most painful part and where I burnt the most tokens, as LLMs love Tailwind and really struggle with modern CSS (nesting, not using BEM etc).
I’m not sure what the final straw was, but last week I threw in the towel. I just wasn’t confident in the output that was being produced, and I ended my experiment with vibe coding. I found myself spending more time telling the LLM to fix what it had done than working on the fun parts. I feel like LLMs are like a genie or a monkey’s paw: you can wish for what you want and you will get it, but it will come with some weird twist. You can ask a genie to build a house on the beach, and it would, but whether it has the structural integrity to be sound in that location would be questionable. I am lucky enough to be technical enough to know when it’s bad and what to fix, but others don’t have this knowledge, and that worries me.
Overall I will be returning to a more trad way of coding, and making use of LLMs where they excel, which I found was small, well-scoped tasks. This also comes at a time when prices seem to be increasing. I relied heavily on Copilot for most of this, as you could get away with a lot on its plans, which (predictably) are changing on June 1st.