'As adoption grows, confidence in safeguards must rise with it': Microsoft reveals new tool which can track backdoors in LLMs - and it's hoping this will restore trust in AI across the world
Microsoft introduced a scanner that detects poisoned open-weight language models by analyzing attention behavior, memorization leaks, and trigger flexibility.
