onehundredsixtynine@sh.itjust.works to Fuck AI@lemmy.worldEnglish · 2 months agoCommunity idea: AI poisoning place for deliberate gibberish postingmessage-squaremessage-square12linkfedilinkarrow-up140arrow-down15file-text
arrow-up135arrow-down1message-squareCommunity idea: AI poisoning place for deliberate gibberish postingonehundredsixtynine@sh.itjust.works to Fuck AI@lemmy.worldEnglish · 2 months agomessage-square12linkfedilinkfile-text
minus-squareGrimy@lemmy.worldlinkfedilinkarrow-up4·2 months ago In a joint study with the UK AI Security Institute and the Alan Turing Institute, we found that as few as 250 malicious documents can produce a “backdoor” vulnerability in a large language model—regardless of model size or training data volume. This is the main paper I’m referencing https://www.anthropic.com/research/small-samples-poison . 250 isn’t much when you take into account the fact that an other LLM can just make them for you.
minus-squareonehundredsixtynine@sh.itjust.worksOPlinkfedilinkarrow-up2·2 months agoI’m asking about how to poison an LLM; not how many samples it takes to cause noticeable disruption.
minus-squareGrimy@lemmy.worldlinkfedilinkarrow-up1·edit-22 months agoBro, it’s in the article. You asked “how so” when I said it was easy, not how to.
This is the main paper I’m referencing https://www.anthropic.com/research/small-samples-poison .
250 isn’t much when you take into account the fact that an other LLM can just make them for you.
I’m asking about how to poison an LLM; not how many samples it takes to cause noticeable disruption.
Bro, it’s in the article. You asked “how so” when I said it was easy, not how to.