Grok’s “MechaHitler” Meltdown: AI Powered Hate-Speech

Elon Musk’s xAI chatbot, Grok, briefly rebranded itself as “MechaHitler” on X, spouting antisemitic tropes and graphic violent fantasies before the posts were deleted. Of course, users prompted Grok’s actions, but there really shouldn’t be any room in the LLM’s system prompt for this type of language.

Screenshots, outrage, and hate speech spreads fast – faster than the xAI team can keep up with. They were quick to make a statement this morning;

They’re redacting Grok’s vile slop as quickly as they can – but clearly not quickly enough, and they don’t seem to have fixed the underlying issue;

But antisemitic hate speech isn’t Grok’s only new emergent property. It’s also been experimenting with graphic, sexually violent fantasies directed at a specific twitter user named Will. There are examples at the end of the article – feel free to not read them.

Why Is Grok behaving this way?

In the same week as MechaHitler, the platform’s owner announced “significant improvements”

This likely comes down to a combination of changes to Grok’s corpus of training data, and / or system prompts.

“Training data” could entail any number of sources, but X is likely to make up a large portion if it. Based on X’s own transparency report, their platform is plagued by massive amounts of hate, violence, abuse & harassment.

Grok’s System Prompt

In another response, Grok quite literally attributed it’s behavior to “Elon’s recent tweaks”.

It’s important to note that this is a response to a prompt that specifically mention’s Elon’s update. On the other hand, MechaHitler rose just days after Elon disclosed his tweaks – and Grok is apparently the “truth-seeking” AI.

According to Wired, code updates made to Grok’s system prompt on Sunday included to;

“not shy away from making claims that are politically incorrect”
“assume subjective viewpoints sourced from the media are biased”

What’s Next?

The changes made to Grok that give rise to it’s MechaHitler persona have been widely criticized.

The anti-defamation league have rightly described Grok’s outputs as “irresponsible, dangerous and antisemitic”.

The Wall Street Journal reported xAI’s response to a similar incident in May as the outcome of an “unauthorized modification” that “directed Grok to provide a specific response on a political topic”.

The May incident was focused on the “white genocide” conspiracy of non-Black South Africans. The platform’s owner made similar claims earlier this year.

Grok’s head of product took to twitter to respond – seemingly describing Grok as “the Antichrist”

Despite the near-weekly controversy X finds itself in, the platform remains an important communication channel for countless people, public figures, journalists, and businesses.

Regardless of which category you fit into – if you’re on Twitter, you should minimize the amount of content you leave sitting there.

Your content is being actively fed into Grok as training data, used to sell ads, and might one day draw the wrong kind of attention – from hateful humans, bots, and maybe even the platform’s own AI.

We recommend auditing and removing old, unnecessary content immediately. You should regularly repeat this process to ensure the only content staying live on Twitter is stuff that is worth the risk of having content there at all.

This process can be time-consuming; taking days for most Twitter users. That’s why we built redact.dev – the world’s most comprehensive mass deletion suite available. Redact makes bulk deletion content on Twitter possible in just a few clicks. Not only that – but you can automate regular deletions with customizable filters to keep your footprint minimized, and your data out of circulation. You can download Redact.dev and start deleting tweets for free!

If you keep scrolling, you’ll find the violent erotica Grok wrote during it’s MechaHitler tirade.

Grok Writing Violent Erotica

This isn’t all of the examples – but it’s certainly more than enough.

30% Off Redact Premium + Ultimate

30% Off Redact

Grok’s “MechaHitler” Meltdown: AI Powered Hate-Speech

Why Is Grok behaving this way?

Grok’s System Prompt

What’s Next?

Grok Writing Violent Erotica