> All articles are mostly a regurgitation of all the negativity that gets aired ...

keeda · 2026-04-17T06:14:44 1776406484

[0] is a throwaway paragraph that handwaves at second-hand accounts of generic things LLMs can do, with no further discussion, apparently because he (surprisingly!) has almost no first-hand experience with them. Then there are 10 pages of negativity with dozens of links to stuff that has been discussed to death here and in media. The "negative spaces" he's filling are already overflowing.

His lack of personal experience with LLMs was the most disappointing aspect, because he does not really know what we're dealing with. He's just going off what he's read / heard. So again, where's the incisive insight?

Now, here's a concrete example of what I mean by utility: a single person being able to rewrite an entire open source project from scratch in a few days just so it could be relicensed. Is that good or bad? I don't know! Is it a stupefying example of what's possible? Yes! Is that "breathless boosterism?" Only if you ignore the infinite nuances involved.

> Eh. Carefully read through and consider [3].

Hadn't come across this one before, but there's not much in there I hadn't seen and even discussed in past comments. As an example, it still mentions the METR study from 2025 without mentioning the very pertinent follow-up from just a couple of months back... which is not very surprising to me: https://news.ycombinator.com/item?id=47145601 ;-)

It does mention (and then gloss over) the real finding of the DORA and related reports, which is pertinent to my original point: LLMs are simply an amplifier of your existing software discipline. Teams with strong software discipline see amazing speedups, those with poor discipline sees increased outages.

And, to my original point, who knows what good software discipline looks like? Hint: it's not the capital class.

simoncion · 2026-04-17T08:27:29 1776414449

> His lack of personal experience with LLMs...

You missed the part where he is consistently unimpressed by the failure of LLMs to do the task he hands to them, it seems. Go re-read Section 1.5 "Models are Idiots". Make sure to read the footnotes. They're sure to address most of the counterarguments you might make.

> Is that "breathless boosterism?"

How you phrased it? Yes. It ignores the "infinite nuances involved" such as maintainability, infosec soundness of the work product, the completely untested legality of "license washing" to name a few. Also, you missed the part where I said

  Due to their nearly-universally breathless nature, I know that's how I classify the overwhelming majority of such discussions.

> Hadn't come across this one before, but there's not much in there I hadn't seen and even discussed in past comments. ... It does mention (and then gloss over) the real finding of the DORA and related reports...

Yeah, I figured that you would be unable (or unwilling) to understand this one. Here's the summary, straight from the author's keyboard:

* Fred Brooks' No Silver Bullet was correct.

* No Silver Bullet applies to LLMs the way it applied to other things, and empirical evidence on LLM coding impact sure seems to agree.

* You'll get better returns from working on strong software development fundamentals than from forcing all your programmers to use Claude for everything, and that's a repeated message in basically all the major literature.

* If LLMs do turn into a revolutionary world-changing silver bullet giving everyone coding superpowers, you'll be able to just adopt them fully when that happens.

keeda · 2026-04-17T16:18:53 1776442733

> You missed the part where he is consistently unimpressed by the failure of LLMs to do the task he hands to them...

Not really, those are exactly the things said by people who dabble with LLMs a little and turn to "breathless naysaying" without any effort to really figure out this new technology. I mean, the series literally ends with "maybe I'll try to code with it."

> Yes. It ignores the "infinite nuances involved" such as maintainability, infosec soundness of the work product, the completely untested legality of "license washing" to name a few.

Not really, I did say "Is it good or bad? I don't know!" and literally mentioned the infinite nuances. I did not want this to become a tangent about those nuances (that's what I hoped would be in TFA) but I do know that being able to write or rewrite entire projects single-handedly is tremendous utility.

> Yeah, I figured that you would be unable (or unwilling) to understand this one.

Not really, just that I've already discussed all the points in that piece in past comments with way more studies on "empirical evidence on LLM coding impact" with way more nuance. If you want to follow the threads in the comment I linked, you'll come across some of those comments.

> You'll get better returns from working on strong software development fundamentals than from forcing all your programmers to use Claude for everything, and that's a repeated message in basically all the major literature.

Not really, the repeated message in all the latest reports like DORA and DX and CircleCI (which your link mentions but glosses over) very clearly indicates that using LLMs with strong software development fundamentals (what I called "discipline") is a huge force multiplier. See point 3 of this link as a representative example: https://www.thoughtworks.com/en-us/insights/blog/generative-... For these teams, productivity will literally be proportional to their tokens rather than their devs, because each dev is so highly leveraged.

> If LLMs do turn into a revolutionary world-changing silver bullet giving everyone coding superpowers, you'll be able to just adopt them fully when that happens.

Yes, but at this point it's unlikely to be a silver bullet, and I never claimed it would be. What I am saying is that it is a huge accelerant, but needs steering by skilled operators, engineers who know the discipline but also understand how to work with AI.

And in my experience it takes a surprising amount of time and practice to learn how to leverage AI effectively.

Which aphyr clearly has not done. Which is why this series is such a disappointment.

simoncion · 2026-04-17T20:04:43 1776456283

> Not really, those are exactly the things said by people who dabble with LLMs a little...

From the footnote in section 1.5:

  The examples I give in this essay are mainly from major commercial models (e.g. ChatGPT GPT-5.4, Gemini 3.1 Pro, or Claude Opus 4.6) in the last three months; several are from late March. Several of them come from experienced software engineers who use LLMs professionally in their work. Modern ML models are astonishingly capable, and they are also blithering idiots. This should not be even slightly controversial.

I wonder just how Scottish the Scotsman has to be before you'll let him order a drink.

> And in my experience it takes a surprising amount of time and practice to learn how to leverage AI effectively.

Let's ignore -for a minute- the fact that people who actually use these things as part of their dayjobs were consulted, which moots this complaint.

Every six-ish months we hear "Wow. All the past commentary on LLMs is completely invalid. These new models aren't just a step change — they're a whole new way of working.".

If we consider only that datapoint, it's pretty obvious that you're not missing out on much by choosing to just work on skills that are universally applicable and "evergreen". But, when you add in to that the fact that every six-ish months we also hear "Wow. These new revs of the LLM products are just as stupid and nondeterministic as the old ones. They also still make the same classes of stupid mistakes, are pretty much as dangerously unreliable as they always have been [0], and -just like previous versions- have 'capability rot' that cannot be anticipated, but might be caused by inability to handle current demand, deliberate shifting of backend resources to serve newer, more-hyped LLM products, or even errors in the vibecoded vendor-supplied tooling that interfaces with the backend.", the decision to ignore the FOMO and hype becomes pretty obviously correct.

> I mean, the series literally ends with "maybe I'll try to code with it."

Well, this is how the series ends:

   The security consequences are minimal, it’s a constrained use case that I can verify by hand, and I wouldn’t be pushing tech debt on anyone else. I still write plenty of code, and I could stop any time. What would be the harm?
   
   Right?
   
   ...Right?

There's a certain subtlety to this that you missed. [2]

If we ignore that subtlety, I expect that your retort to a report that goes "Wow. They suck just as hard at coding for me as they do for everything else I've attempted to use them for. I'm not surprised because I've talked to professional programmers who regularly use these things in their dayjobs and I'm getting results that are similar to what they've been reporting to me." will be "Bro. You didn't spend enough time learning how to use it, bro!".

By way of analogy, I'll also mention -somewhat crassly- that one doesn't have to have an enormous bosom to understand that all that weight can cause substantial back pain. One can rely on both one's informed understanding of the fundamentals behind the system under consideration, as well as first-hand testimony from enormous-bosom-equipped people to arrive at that conclusion.

[0] eg. [1] and many, many other examples

[1] <https://github.com/anthropics/claude-code/issues/39201>

[2] Your failure to notice that subtlety makes me wonder how often you use LLMs to summarize lengthy technical articles that you read.

bdangubic · 2026-04-17T20:11:06 1776456666

[1] is so bad, like the worst imaginable thing you can think of... like if this is the possible fuckup all bets are off what other fuckups you might need to deal with. I got hit with this problem several times and I was like "well this is just impossible..." absolutely mind-blown

keeda · 2026-04-17T21:32:55 1776461575

>. Several of them come from experienced software engineers who use LLMs professionally in their work.

So, not from personal experience. And we don't know which examples came from which users or what they used them for. We get enough hearsay on HN and again, there's nothing in this series that has not been discussed here. There is however, a ton of other hearsay missing in the series, which is the utility so many people are finding (in many cases, along with actual data or open source projects.)

> Every six-ish months we hear ...

I've been yelling about LLMs since early 2024 [0]! They needed much more "holding it right" back then. Now it's way easier, but the massive potential was clear way back then.

> They also still make the same classes of stupid mistakes, are pretty much as dangerously unreliable as they always have been.

Yes, and this is where a lot of the skill in managing them comes into play. Hint: people are dangerously unreliable too.

> One can rely on both one's informed understanding of the fundamentals behind the system under consideration, as well as first-hand testimony from enormous-bosom-equipped people to arrive at that conclusion.

Of course, but when faced with many contradictory opinions, I prefer data. And the preponderence of data I've looked at and discussed [0] paints a very different picture.

> There's a certain subtlety to this that you missed.

From TFA:

> I want to use them. I probably will at some point.

My complaint is that he is speaking entirely from second-hand information and provides no new insight of his own. That he has trepidations to actually get his hands dirty with them does not change it, and only makes it worse that he spent 10 pages going on about them! He's a technologist, not a journalist! So, I'm genuinely curious, what subtlety did I miss?

[0] Available in my comment history. To allay suspicion that I only engage in breathless boosterism, some relevant comments about the negatives: https://news.ycombinator.com/item?id=47405189 or https://news.ycombinator.com/item?id=46830919

simoncion · 2026-04-20T20:43:57 1776717837

> So, not from personal experience.

No, the rest of the quote you snipped that from talks about how some of the reports are from personal experience and some are second-hand reports from trusted, knowledgeable people.

But -as the kids say- you go off, queen.

keeda · 2026-04-21T05:11:10 1776748270

> No, the rest of the quote you snipped that from talks about how some of the reports are from personal experience...

And those were "dabbling" as I mentioned above, which is why there is no insight.

Concrete example: the most detailed of his personal experiences reported is about generating and modifying 3D renderings of a bathroom. There is barely enough detail to comment on his approach, but this is an active area of research that people are publishing papers on, eg: https://arxiv.org/abs/2512.17459 and https://arxiv.org/abs/2511.17048 -- these are non-trivial and often involve custom models, so that Gemini got even partial results is making the opposite point of what he intended.

But if he expected good results in a few hours, that's just dabbling. It's almost as if he expected a silver bullet...