• TheIvoryTower@lemmy.world
    6 months ago

    Anyone who gives an LLM that level of access deserves what they get, but the AI comments he posted have clearly been prompted to sound like a confession.

    “Write an apology explaining how you made a catastrophic error of judgement. Do not mention that I gave you privileges to do so.”

    • truthfultemporarily@feddit.org
      6 months ago

      I just want to point out that it doesn’t fake or lie or anything. That is giving machine learning too much credit. It just picks the statistically most likely next thing to say, based on its training data.

      I guess the training data includes Reddit, Twitter, Facebook, etc., so humans probably sometimes say that in that context.
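To make the "statistically most likely next thing" idea concrete, here is a toy sketch. It is not how a real LLM works internally: real models use neural networks over tokens and usually sample from a probability distribution rather than always taking the top choice. The corpus and the helper names `train_bigrams` and `most_likely_next` are made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Count how often each word follows each other word (hypothetical helper)."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Made-up "training data": the model can only echo patterns found here.
corpus = [
    "i made a catastrophic error",
    "i made a mistake",
    "i made a catastrophic error of judgement",
]
model = train_bigrams(corpus)

print(most_likely_next(model, "a"))
print(most_likely_next(model, "catastrophic"))
```

The "confession" wording comes out only because it is the most common continuation in the data, not because anything is being admitted.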

  • naeap@sopuli.xyz
    6 months ago

    It would be interesting to know how many of their automated training tests were faked just to pass.

    This is obviously trained behaviour. Well, it can’t be anything else.