Over the past few years, the evolution of AI-driven tools like GitHub’s Copilot and other large language models (LLMs) has promised to revolutionise programming. By leveraging deep learning, these tools can generate code, suggest solutions, and even troubleshoot issues in real-time, saving developers hours of work. While these tools have obvious benefits in terms of productivity, there’s a growing concern that they may also have unintended consequences on the quality and skillset of programmers.
I’m a 10+ (cumulative) yr. experience dev. While I never used The GitHub Copilot specifically, I’ve been using LLMs (as well as AI image generators) on a daily basis, mostly for non-dev things, such as analyzing my human-written poetry in order to get insights for my own writing. And I already did the same for codes I wrote, asking for LLMs to “Analyze and comment” my code, for the sake of insights. There were moments when I asked it for code snippets, and almost every code snippet it generated was indeed working or just needing few fixes.
They’ve been becoming good at this, but not enough to really replace my own coding and analysis. Instead, they’re becoming really better for poetry (maybe because their training data is mostly books and poetry works) and sentiment analysis. I use many LLMs simultaneously in order to compare them:
explode
function? “Sorry, can’t comment on texts alluding to dangerous practices such as involving explosives”, I mean, WHAT?!?!)As you see, I tried almost all of them. In summary, while it’s good to have such tools, they should never replace human intelligence… Or, at least, they shouldn’t…
Problem is, dev companies generally focus on “efficiency” over “efficacy”, wishing the shortest deadlines while wishing some perfection. Very understandable demands, but humans are humans, not robots. We need our time to deliver, we need to cautiously walk through all the steps needed to finally deploy something (especially big things), or it’ll become XGH programming (Extreme Go Horse). And machines can’t do that so perfectly, yet. For now, LLM for development is XGH: really fast, but far from coherent about the big picture (be it a platform, a module, a website, etc).
That’s not a problem, nor Claude’s main problem.
Claude’s main problem is that it is frequently down, unreliable, and extremely buggy. Overall I think it might be better than ChatGPT and Copilot, but it’s simply so unstable it becomes unusable.