• hedgehog (1 point) · 8 months ago

    If you’re using Claude on the web, sure. If you use Claude Code or another tool that’s intended for that purpose, then you can have it write code against a test suite you wrote and keep iterating until the tests pass. It can read your other source code files, run unit tests, linters, build and run the app, install dependencies, etc., and it can analyze the output of those commands to confirm that the code it wrote actually works.
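    To make that concrete, here’s a minimal sketch of the kind of test file you’d write first and then point the agent at. The `slugify` function and its behavior are hypothetical, just to illustrate the loop: the agent edits the implementation and reruns the tests until everything passes.

```python
import re

def slugify(text: str) -> str:
    # The implementation the agent iterates on until the tests below pass.
    s = text.strip().lower()
    s = re.sub(r"[^a-z0-9]+", "-", s)
    return s.strip("-")

def test_basic():
    assert slugify("Hello, World!") == "hello-world"

def test_whitespace():
    assert slugify("  many   spaces  ") == "many-spaces"

if __name__ == "__main__":
    test_basic()
    test_whitespace()
    print("all tests passed")
```

    The point is that the tests, not the model’s own confidence, decide when it’s done.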

    If you don’t even have search enabled when you query it, then how is it supposed to validate whether a theoretical library actually does exist? It’s basically whiteboard programming at that point.
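    This is the kind of check a tool with environment access can do and a web chat can’t. A quick offline sanity check (checking PyPI itself would need network access) is just asking the interpreter whether the suggested module can actually be found:

```python
import importlib.util

def is_installed(module_name: str) -> bool:
    """Return True if the module can be found in the current environment."""
    return importlib.util.find_spec(module_name) is not None

print(is_installed("json"))                 # stdlib, True
print(is_installed("totally_made_up_lib"))  # hallucinated name, False
```

    An agentic tool effectively runs this kind of validation for free whenever it tries to import or install what it suggested.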

    And that’s just how all these LLMs have been built. They MUST provide a solution, so they all lie. They’ve been programmed this way to ensure maximum profits.

    In a roundabout way, sure, but only because they’d be worse if they were more capable of saying “I don’t know.” Because LLMs operate on probabilities and relationships between words, if they were trained with a large number of responses like “I’m sorry, Dave. I’m afraid I don’t know that,” then the “I don’t know” response would likely come up regularly in cases where the model otherwise would have given a correct answer. Besides, as a person sourcing training data for the LLM: if you’re coming up with questions to train it on and a question has an answer, wouldn’t you want to include the answer rather than an “I don’t know” response?

    Training on a large number of “I don’t know” responses could also limit the LLM’s ability to give correct answers it does “know” but that weren’t explicitly in its training data.
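    The probability argument above can be sketched with a toy model. This is not how an LLM actually works (there’s no tokenization or context here, just frequency-weighted sampling), but it shows the mechanism: the more “I don’t know” appears in the training responses, the more it crowds out the correct answer at sampling time.

```python
from collections import Counter
import random

def sample_response(training_responses, rng):
    # Sample a response in proportion to how often it appeared in "training".
    counts = Counter(training_responses)
    responses = list(counts)
    weights = [counts[r] for r in responses]
    return rng.choices(responses, weights=weights, k=1)[0]

# Hypothetical training data: 3 correct answers, 7 refusals.
data = ["Paris"] * 3 + ["I don't know"] * 7
rng = random.Random(0)
samples = [sample_response(data, rng) for _ in range(1000)]
print(samples.count("I don't know") / len(samples))  # roughly 0.7, the training proportion
```

    In this toy setup, the refusal rate simply tracks its share of the training data, regardless of whether the model “knows” the answer.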