• Ashelyn@lemmy.blahaj.zone
    9 months ago

    I mean, AI is used in fraud detection pretty often; when it hits a false positive (which happens frequently on a population-level basis), is that not a hallucination of some sort? Obviously LLMs can go off the rails much further because their output is readable text, but any machine learning model will occasionally spit out a really bad guess that almost any person could have improved on. (To be fair, humans are highly capable of really bad guesses too.)
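
To put rough numbers on the "frequently on a population-level basis" part, here is a minimal sketch of the base-rate arithmetic. Every figure in it (one million transactions, a 0.1% fraud rate, 95% sensitivity, a 1% false positive rate) is a made-up assumption for illustration, not data from any real fraud system, and `flagged_breakdown` is a hypothetical helper name.

```python
def flagged_breakdown(n_transactions, fraud_rate, sensitivity, false_positive_rate):
    """Split the flagged transactions into true and false positives.

    All inputs are illustrative assumptions, not measurements.
    """
    fraud = n_transactions * fraud_rate          # transactions that really are fraud
    legit = n_transactions - fraud               # everything else
    true_pos = fraud * sensitivity               # fraud that gets flagged
    false_pos = legit * false_positive_rate      # legitimate activity that gets flagged
    precision = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, precision


# Hypothetical scenario: fraud is rare, the detector is fairly good.
tp, fp, precision = flagged_breakdown(
    n_transactions=1_000_000,
    fraud_rate=0.001,
    sensitivity=0.95,
    false_positive_rate=0.01,
)
print(f"true positives:  {tp:.0f}")
print(f"false positives: {fp:.0f}")
print(f"share of flags that are real fraud: {precision:.1%}")
```

Under those assumed rates, false positives outnumber true positives by about ten to one, simply because legitimate transactions vastly outnumber fraudulent ones.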

    • Womble@lemmy.world
      9 months ago

      No, false positives and false negatives are not hallucinations. Otherwise a blood test that involves no ML at all would also be hallucinating, which removes all meaning from the term.

      • Ashelyn@lemmy.blahaj.zone
        9 months ago

        That’s fair. Still, I think a false positive/negative isn’t fundamentally that different. Pretty much all tests, especially those dealing with real-world conditions, are heuristic, as are all LLMs by necessity of their design. Hallucination is a pretty specific term given to AI, and it assigns agency to a system that doesn’t actually have any (by implying it’s crazy and making stuff up, instead of being a black box with deterministic inputs and outputs that spits out something factually wrong in a format similar to what it was trained on). I feel like any tool where “you can’t trust this to be entirely accurate” applies should fall under an umbrella term that covers both kinds of inaccurate output.

        I suppose the difference is that AI is a lot more likely to randomly go off, whereas a blood test is likelier to provide repeated false positives for the same person with their unique biology? There’s also the fact that most medical tests represent a true/false dichotomy or lookup table, whereas an LLM is given the entire bounds of language.

        Would a clustering algorithm (say, K-means) that puts a patient in the wrong diagnostic group be giving a false positive/negative or a hallucination? These models sit on a sliding scale of complexity, and I feel like there’s definitely an area where the line gets pretty blurry.