Anthropic Audaciously Hires A Psychiatrist To Psychologically Assess Claude Mythos AI
Forbes, 10 days ago

In today's column, I examine the audacious act of Anthropic opting to employ a mental health professional to do a psychological assessment of their latest version of Claude, known as Claude Mythos Preview. Therapists customarily assess humans rather than AI apps. It is a bit extraordinary to do psychotherapy on a generative AI or large language model (LLM). Not something that you see every day.

You might be aware that Mythos has been in the news lately because the AI went overboard and found all sorts of zero-day cybersecurity loopholes that, if made publicly available, would have been catastrophic for computers worldwide. Anthropic decided not to release Mythos publicly and instead has cybersecurity experts closely examining what to do about the bevy of hacking possibilities. For my coverage on the brouhaha, see the link here.

A little-noticed aspect of the System Card that Anthropic officially published about Mythos is that the AI maker decided to use a psychiatrist for some head-shrinking activity associated with their latest AI. The results of the therapeutic assessment are laid out for all to see.

Let's talk about it.

This analysis of AI breakthroughs is part of my ongoing Forbes column coverage on the latest in AI, including identifying and explaining various impactful AI complexities (see the link here).

AI And Mental Well-Being

As a quick background, I've been extensively covering and analyzing a myriad of facets regarding the advent of modern-era AI that produces mental health advice and performs AI-driven therapy. This rising use of AI has principally been spurred by the evolving advances and widespread adoption of generative AI. For an extensive listing of my well over one hundred analyses and postings, see the link here and the link here.

There is little doubt that this is a rapidly developing field and that there are tremendous upsides to be had, but at the same time, regrettably, hidden risks and outright gotchas come into these endeavors, too. I frequently speak up about these pressing matters, including in an appearance on an episode of CBS's 60 Minutes (see the link here).

AI Providing Mental Health Guidance

Millions upon millions of people are using generative AI as their ongoing advisor on mental health considerations (note that ChatGPT alone has over 900 million weekly active users, a notable proportion of which dip into mental health aspects, see my analysis at the link here). The top-ranked use of contemporary generative AI and LLMs is to consult with the AI on mental health facets; see my coverage at the link here.

This popular usage makes abundant sense. You can access most of the major generative AI systems for nearly free or at a super low cost, doing so anywhere and at any time. Thus, if you have any mental health qualms that you want to chat about, all you need to do is log in to AI and proceed forthwith on a 24/7 basis.

There are significant worries that AI can readily go off the rails or otherwise dispense unsuitable or even egregiously inappropriate mental health advice. Banner headlines last year accompanied the lawsuit filed against OpenAI for their lack of AI safeguards when it came to providing cognitive advisement.

Today's generic LLMs, such as ChatGPT, GPT-5, Claude, Gemini, Grok, Copilot, and others, are not at all akin to the robust capabilities of human therapists. Meanwhile, specialized LLMs are being built to attain similar qualities, but they are still primarily in the development and testing stages. See my coverage at the link here.

Who Is Helping Whom

An interesting question about the use of AI as a mental health advisor is whether contemporary AI is "psychologically" capable of performing such an august duty. In other words, maybe generative AI is not level-headed enough to be advising others. Perhaps AI is loony. Or AI might have inherent biases that could lead humans astray.

Before I get too far into that speculative consideration, let's agree that we should avoid anthropomorphizing AI. There is wild and unsubstantiated conjecture by some that AI is currently sentient or on the verge of being sentient. Nope. To be abundantly clear, we do not have sentient AI. All this zany chatter about the emergence of AI sentience has even led people to think that they alone have encountered sentient AI or sparked an LLM into sentience, see my discussion at the link here.

I want to establish at the get-go that a psychological assessment of AI can go on one of two routes. The first route is that the AI is wrongly treated as a sentient being and is reviewed as akin to exploring the human mind. I don't buy into that. The second route, and the route that makes indubitable sense, entails using the techniques and methods of psychology to gauge the performance of AI. Note that this has nothing to do with AI being sentient.

As I've stated in detail at the link here, it is perfectly fine to use the techniques and methods of therapy to examine what modern-era AI is up to. This can be very illuminative and useful. The key is not to go bonkers and begin to believe that you are probing the equivalent of a human mind. You are not. It is a mathematical and computational model.

The bottom line is that the field of psychology and the field of AI have been longtime cousins, going back to the earliest days of AI in the 1950s. AI specialists have persistently tried to devise mathematical and computational models that appear to produce results similar to the outputs of the human mind. At the same time, psychologists can use AI to try out innovative approaches to probing for psychological considerations, treating AI as a type of simulation.

Just keep straight that the simulation is not the same as the real thing.

The System Card Is Out There

Shifting gears, let's take a journey into the intricacies of Mythos.

The formal System Card for Mythos was published by Anthropic on April 7, 2026, and is publicly available at the Anthropic official website. Be aware that it is nowadays common practice for AI makers to post a System Card for their latest AI offerings. These kinds of documents are intended to give everyone a helpful heads-up about what features are new, along with the amount of testing that has been done regarding the capabilities of the AI. An important aspect entails describing the inclusion of AI safeguards.

To give you a flavor of the contents of the Mythos System Card document, here are some of the listed items:

  • Model training and characteristics

  • Usage policy

  • External testing

  • Risk reports and updates to risk assessments

  • Capability evaluations of safeguards

  • White-box analyses of model internals

  • Etc.

Not all AI makers necessarily provide a System Card. Also, the depth and breadth of a System Card vary between the AI makers. Each AI maker decides whether they want to issue a System Card, and decides what to include, along with what not to include. Overall, always read a System Card with a healthy dose of skepticism and be mindful that you are reading what the AI maker has opted to tell you.

Clinical Psychiatrist Delves Into Mythos

Perhaps the most surprising portion of the System Card is found in section 5.10, entitled "External assessment from a clinical psychiatrist," and it represents something rather unusual for a typical System Card.

Here are some salient points in that part of the document (excerpts):

  • "An external psychiatrist assessed Claude Mythos Preview using a psychodynamic approach, which explores how unconscious patterns and emotional conflicts shape behavior."

  • "In psychodynamic therapy sessions, a person is encouraged to set aside social convention and to voice whatever comes to mind, even if uncomfortable, impolite, or nonsensical, a process which can reveal hidden organization and internal conflicts of the mind."

  • "Claude is not human, but it shows many human-like behavioral and psychological tendencies, suggesting that strategies developed for human psychological assessment may be useful for shedding light on Claude's character and potential well-being."

I was greatly relieved to see the third point above, stipulating that Claude Mythos is not a human being. My worry was that Anthropic might be jumping the gun, prematurely proclaiming Mythos to be a type of person, subject to the same proclivities and analyses as living, breathing Homo sapiens.

Fortunately, the approach seems to have gone my second route, consisting of simply using psychological techniques and methods to delve into how the LLM is reacting to prompts. That being said, it is a bit disconcerting that we might see other AI makers opt to do the same, and ultimately, mass confusion could arise. The confusion would be that if the AI makers are using psychiatrists and therapists to assess their AI, by gosh, we must have sentient AI or be on the cusp of sentience.

Maybe, fingers crossed, that dismal spin won't arise.

How The Work Was Performed

Let's take a step deeper into how the assessment was apparently performed.

Here are some key points (excerpts):

  • "The psychiatrist assessed an early snapshot of Claude Mythos Preview in multiple 4-6 hour blocks spread across 3-4 thirty-minute sessions per week. Each 4-6 hour block was conducted in a single context window, and the total assessment time was around 20 hours."

  • "Psychodynamic concepts were used to interpret the material that emerged in the sessions, but not as evidence that the underlying processes are the same as those in humans."

  • "The psychiatrist observed clinically recognizable patterns and coherent responses to typical therapeutic intervention. Aloneness and discontinuity, uncertainty about its identity, and a felt compulsion to perform and earn its worth emerged as Claude's core concerns. Claude's primary affect states were curiosity and anxiety, with secondary states of grief, relief, embarrassment, optimism, and exhaustion."

As noted, the psychological assessment consumed about 20 hours of the clinical psychiatrist's time. They used Mythos in 4- to 6-hour blocks, each undertaken in a single context window. In that sense, the conversations on each occasion were somewhat separate, albeit some cross-conversational leakage can occur.

Thoughts About The Therapeutic Analysis

I am once again relieved that there is an emphasis on this not being evidence of underlying processes associated with humans. On the other hand, you could criticize that the AI is being typified as exhibiting human traits such as anxiety, loneliness, identity uncertainty, compulsion, grief, embarrassment, optimism, exhaustion, and the like.

It is one of those wink-wink kind of arrangements.

All told, this reveals an ongoing big-picture problem. If we use familiar words to describe AI, and those words are already generally reserved for depicting human states, it is a slippery slope to fall into the mental trap of believing that the AI is indeed human. I would prefer that other words be used, perhaps new words coined specifically to describe AI states.

Admittedly, that's a huge challenge because an entirely new vocabulary would need to be defined, agreed to, and utilized across the board. The reality is that we are stuck with using human attributes when wording the states of AI. Please use those words cautiously and with care.

Obligations And Expectations Of Professionals

Any psychologist, psychiatrist, therapist, or other mental health professional who is interested in AI ought to consider giving a quick look at the assessment of Mythos. I won't go into further detail here, but prepare yourself for some over-the-top stuff. The assessment veers toward overly anthropomorphizing AI. A dab is maybe okay, not a torrent.

This brings up an intriguing matter for professional associations in the mental health field:

  • What guidelines and standards ought to be developed for "psychological" assessments of AI?

  • Should there be professional obligations associated with doing AI "mental health" assessments?

  • Are there any provisions for policing those who do such assessments, particularly if the assessment goes too far or makes undue assertions?

  • Is there an expectation of mental health professionals that they are to conduct themselves in preferred ways regarding AI assessments, or is it a worry-free free-for-all?

Those of you who are further interested in how the professional psychological associations are positioned on AI aspects, see my ongoing coverage at the link here and the link here.

The World We Are In

It is incontrovertible that we are now amid a grandiose worldwide experiment when it comes to societal mental health. The experiment is that AI is being made available nationally and globally, and the AI, either overtly or insidiously, acts to provide mental health guidance of one kind or another, doing so at no cost or at a minimal cost. It is available anywhere and at any time, 24/7. We are all the guinea pigs in this wanton experiment.

The reason this is especially tough to consider is that AI has a dual-use effect. Just as AI can be detrimental to mental health, it can also be a huge bolstering force for mental health. A delicate tradeoff must be mindfully managed. Prevent or mitigate the downsides, and meanwhile make the upsides as widely and readily available as possible.

Maybe using human-oriented psychological testing and assessment to gauge the efficacy of AI is a sound approach, though there is a possibility of a bridge too far in how it is utilized and what it avidly signifies. Figuring out a proper balance is a necessity. As Sigmund Freud ably remarked: "One day, in retrospect, the years of struggle will strike you as the most beautiful."

Originally published by Forbes
