
Jungian Typology and related questionnaires (MBTI, Insights Discovery…)

(This chapter is an adaptation and extension of a peer-reviewed article I wrote for the Dutch psychological scientific journal Gedrag & Organisatie [Behavior and Organization].)

Psychoanalytic thinking is like an ill weed that grows apace. How many generations will go by before psychiatry and psychology forever abandon the confabulations of notorious psychoanalysts like Sigmund Freud, Carl Gustav Jung, or (bloody hell) Jacques Lacan? Judging by his omnipresence in organizations, Jung is probably the most resistant pest of them all. Let’s take a look at why his typology is wrong, along with the numerous questionnaires and measures allegedly based on it.

Key words

Typology, MBTI, psychoanalysis, Jung, paranormal, mythology, the fad that won’t die.

What is Jungian Typology about?

In 1921, Carl Gustav Jung (1875–1961), a Swiss doctor interested in people with mental disorders, described a number of ‘types’ that he claimed were based on the speculative work of William James (1842–1910) (Bair, Fontijn, Nieuwkoop, & Visser, 2004, pp. 292 and 328). Jung’s typology draws a distinction between three dimensions. First, he argued that each of us is born with a dominant attitude or style (extraversion or introversion). Second, people were said to have ‘irrational’ information processing preferences, namely experience (sensory perception) versus intuition. Third, he postulated that people have ‘rational’ preferences, reflected in the dichotomy of thinking versus feeling. Jung claimed that these rational and irrational preferences (which he called cognitive or mental functions) formed a hierarchy within the personality. According to Jung, of the four ‘functions’ (thinking, feeling, sensing, and intuition), one would be dominant and a second would be what he called an auxiliary function. Furthermore, those auxiliary functions develop with age. Jung arrived at a combination of these three dimensions, which he described in his book Psychological Types (1921). On a side note, it is difficult to tell if Jung envisaged two, four, eight, or even sixteen types, because of his ambiguous writing style, filled with neologisms and complex terms (which he said were designed to frighten off lunatics).

The theoretical foundations of Jung’s Typology originate from psychoanalysis as well as belief in the paranormal, telepathy, and mythological thought. The first theoretical foundation, psychoanalysis, is largely based on the idea that psychological problems are caused by forgotten or repressed conflicts, experiences, and desires from childhood. The originator of this idea was, of course, Sigmund Freud, who used the term psychoanalysis for the first time in 1896. He introduced talk therapy on the basis of his hypotheses and provided the following definition:

“Psychoanalysis is the name (1) of a procedure for the investigation of mental processes which are almost inaccessible in any other way; (2) of a method (based on that investigation) for the treatment of neurotic disorders; and (3) of a collection of psychological information obtained along those lines, which is gradually being accumulated into a new scientific discipline.” (Freud, 1923, p. 211)

Initially, Jung was an adherent of Freud’s theories. The official version was that Jung turned away from Freud because the latter too often attributed psychological problems to repressed sexual trauma. Jung developed his own ‘school of thought’ and drew some of his ideas from mythology. Initially, Jung developed ideas about a collective evolutionary unconsciousness (1912), building on Ernst Haeckel’s speculations (1834–1919) that during pregnancy fetuses develop psychologically and physically according to a number of stages which are analogous to millions of years of evolution. Jung thought he could describe this collective unconsciousness in ‘archetypical symbols’ and in Chinese Taoist alchemy (trying to convert base metals into gold). Jung developed the concepts of anima and animus, analogous to the Chinese concepts of Yin and Yang. He developed the hypothesis that the ‘false self’ has to disappear to make way for the ‘true self.’ Men can use their feminine side (anima) and women can use their masculine side (animus) to discover their true self via dreams and active imagination.

Jung also developed his own psychoanalytical ideas. For example, he invented the terms individuation (a personal development process that establishes a connection between the ‘ego’ and the ‘self’), and the shadow (a form of archetype containing all the negative characteristics that the individual wishes to deny, including our animal instincts). He also believed that dreams served to restore psychological balance.

The second foundation of Jung’s ideas was his belief in the paranormal and telepathy. For example, he borrowed the term synchronicity from Lamarckian biologist Paul Kammerer (1880–1926), who, based on his diary, concluded that strange concurrences of circumstances could not be based on coincidence. Jung defined synchronicity as “meaningful coincidences” (Jung, 1952) in which mental processes coincide in time with ‘phenomena in the world of perception.’ The concept of synchronicity led Jung to claim that all patients over the age of 35 could be helped through the knowledge of archetypes found in the collective unconsciousness.

These mythological archetypes were the third foundation of his type theory. These archetypes are not a result of the physical world but exist separately in a “parallel universe” (metaphysical). However, according to Jung, every human brain has access to this world. Psychological disorders arise because people are beleaguered by spirits from the metaphysical world of the collective unconsciousness. Access to the collective unconsciousness would provide the solution to such psychological disorders.

Jung’s belief in the paranormal and the metaphysical archetypes, in combination with psychoanalysis, prompted him to develop his now renowned ‘Types.’ These types constitute the foundation for a number of popular (personality) measures, such as MBTI®, TDI®, JTI, Insights Discovery®, and Golden Personality Type Profiler™ (see below). What these tests have in common is that they operate on the assumption that all people can be divided into a limited number of types that differ from each other qualitatively. Some of these tests are based on Jung’s three original dimensions (e.g., Insights Discovery), though most of the tests are based on four dimensions (e.g., MBTI, TDI, and JTI), and the Golden Personality Type Profiler is based on Jung’s original three dimensions plus two extra dimensions.

■ Executive Summary Theory

The Jungian type theory is about 100 years old. Psychology was far from a real science at that time, the era of cruel brain lobotomies. Carl Gustav Jung was once a follower of fraudulent Freud, and psychoanalytic thinking was one of the three pillars for his theory of archetypes. The two other pillars were his belief that mythological archetypes are stored in a parallel universe and that we can gain access to these archetypes through a kind of paranormal process. He hypothesized that we have evolved a ‘collective unconscious’ of archetypes and that access to it can help treat psychiatric disease. He also believed coincidence didn’t exist and borrowed the term ‘synchronicity’ from another believer in telepathy. Jung’s theory has been thoroughly discredited in most academic circles and even ridiculed. Academics don’t understand why HR is lagging so far behind.

Empirical data

Personality is not captured in (dichotomous) types. Personality can be best understood and presented in continuous scales, or dimensions. Any form of typology reasoning is problematic for two main reasons:

(1) the enormous variation in personalities, which results from evolutionary influences, other biological influences, and developmental processes; and

(2) the influence of environmental or contextual factors that cause people to react differently in different situations.

As some academics have put it: the archetypes are really stereotypes that don’t reflect our unique personality or its underlying traits. If the Jungian theory is flawed, then it is impossible to measure personality adequately based on it. The oldest and still very popular test, the MBTI, is psychometrically flawed and has been widely discredited by professors in psychology. It not only uses a form of ipsative scoring, but the most worrisome aspect is that it is notoriously inconsistent; as many as 50 to 60% of people receive a different type the second time they take the test. Numerous scientists, including Wharton professor Adam Grant, have warned against use of the questionnaire.

Test construction

● The MBTI suffers from unforgivable flaws.

● The Insights Discovery tool has not (yet) been subjected to independent peer-reviewed research. That doesn’t really matter though, since it is based on an entirely wrong theory.

Theoretical/empirical grid

Conclusion

The Jungian archetype theory is false and consequently it is impossible to build a good measure. You can’t adequately measure something that is false or doesn’t exist. Even if it is possible to build scales with internal consistencies without actually measuring the confabulated constructs in the fields of psychology or astrology, Jungian typology proponents never even succeeded in building a sound psychometrical measure.

Moral Assessment

In my view, typologies are a special sort of poison—a dangerous ‘meme.’ A major problem that results from the unreliable human brain is categorization, or the tendency to put people, animals, and things into categories. ‘Type thinking,’ or putting people into boxes, fits perfectly with this tendency and even encourages division. Categorization does have one beneficial effect: it helps us simplify the world. But it also leads to dangerous problems such as creating prejudice, in-group versus out-group thinking, and racism. Furthermore, despite all warnings (e.g., not to use a test for selection purposes) and privacy legislation, people fail en masse on these points. I regularly see CVs in which people quote their ‘MBTI type’ and even occasionally visit organizations where the profiles for the entire team are mindlessly displayed on the wall. What a sad waste of good company and tax money.

Last but not least, I agree with Joseph Stromberg and Estelle Caswell of Vox, who write that “there is something wrong with CPP peddling the test” on their website as “reliable and valid, backed by ongoing global research and development investment” (October 15, 2015).

Discussion

Typologies are as old as the road to Rome. In ancient times, Hippocrates and Galenus divided people into types according to the mixture of ‘the four bodily fluids’: the sanguine temperament (warm blooded), the choleric temperament (hot tempered, due to too much yellow bile), the melancholic temperament (due to too much black bile), and the phlegmatic temperament (due to too much phlegm). We still come across these ideas in our daily use of language (warm-blooded, hot-headed, and phlegmatic). Furthermore, the psychoanalytical line of thinking hasn’t entirely disappeared, certainly not in organizations and companies. Although this is less true of Freud’s ideas, Jung’s archetypes are not only unusually popular, they are also undergoing a genuine re-emergence. However, many people don’t know the origin of these archetypes and types and whether or not they have a sound foundation. This chapter is a critical evaluation of Jung’s theories and the tests based on them. First, I discuss the theoretical foundations of Jungian theory, and then I consider the problems with Jung’s typologies and the tests based on those typologies. Next, I consider the spread and popularity of typology-based tests, and I conclude with a discussion of some of their harmful effects and some lines of thought for resolving this issue.

On the Internet, typologies (and typology tests) based on the ideas of Carl Gustav Jung are extremely popular. The Type Association Benelux 140 web site includes the MBTI (Myers-Briggs Type Indicator, 16 types) and the following instruments, which refer explicitly to Jung’s personality types: MTR-I (Management Team Roles Indicator); TDI (Type Dynamics Indicator); JTI (Jungian Type Indicator); and Insights Discovery, an instrument that is quickly gaining interest in Europe. A recent newcomer, also based on Jung, is the GPTP (Golden Personality Type Profiler). Below I briefly discuss just two of these tests, for want of any independent publications on the reliability and validity of the others. However, there is little point in holding any discussion at all, since most scholars and philosophers of science regard a model, typology, or test without a sound theoretical foundation as futile.

■ Theoretical soundness

The least that can be said is that Jung’s (arche)type hypothesis is plagued by many flaws. For that reason, I have split the discussion into several sections, dealing with the flaws one by one.

Problem 1: Unsound theoretical foundations

The first major problem is that there is no empirical evidence for the key concepts of psychoanalysis, paranormal phenomena, and mythology. Psychoanalytic concepts have been refuted one by one: memory research has shown that “unconscious repression” doesn’t exist (e.g., Loftus, 1994a and 1994b); dream interpretation doesn’t yield any workable hypotheses (not according to either Freudian or Jungian interpretation; e.g., Lavie & Hobson, 1986; Hobson, Pace-Schott, & Stickgold, 2000); schizophrenia is not caused by “regression of the libido” (as claimed by Freud); and autism is not caused by insensitive “refrigerator mothers.” Autism and schizophrenia both have a strong genetic component and are currently regarded as developmental disorders of the brain. For example, using twin research, several research groups have calculated that the heritability of schizophrenia is 80 to 84% (Cardno et al., 1999; Kendler, Myers, Potter, & Opalesky, 2009). Similarly, there is no empirical support for other ideas from psychoanalysis such as penis envy (i.e., all girls are said to envy boys because of their penis), or the Oedipus complex (i.e., all boys between the age of three and five secretly dream of having sex with their mothers and killing their fathers). A more extensive discussion of the problems with psychoanalysis as a theory and as a therapy falls outside the scope of this article, but let me conclude here by referring readers to a study led by epidemiologist Yolba Smit, which resulted in the discontinuation of reimbursement for psychoanalytical therapy by health insurers in the Netherlands (Smit et al., 2010; 2012), or to more extensive reviews (e.g., Buekens, 2006).

140 Downloaded on January 12, 2012 from www.type-association.org

Furthermore, Jung’s ideas that stem from mythology and metaphysics lack empirical evidence as well. No one has ever provided any evidence for the existence of synchronicity, and when Jung defended himself, he often made use of fallacies such as, “…because statistics are possible only if there are exceptions” (Adler et al., 1973, C. G. Jung Letters, vol. 2, p. 246).

As described earlier, Jung postulated that of the four functions (thinking, feeling, sensing, and intuition), one was dominant and another was auxiliary. On a test, the difference between the dominant function and the auxiliary function should be reflected as a higher score for the dominant function and a lower score for the auxiliary function. In the MBTI, this is expressed in the so-called JP index (Judgment-Perception Index), which was designed to determine a person’s dominant function. However, the existence of dominant and auxiliary functions has never been confirmed in research, or in studies by the Myers themselves (Myers & Myers, 1980), or by others (e.g., McCrae & Costa, 1989). Moreover, test results have not shown that auxiliary functions develop with age.

If anyone is still in any doubt after this consideration of the main ‘theoretical foundations,’ let me repeat that Jung’s typologies have never been empirically proven. The excuse that they cannot be tested using current scientific methods is not convincing. Jung did not conduct any scientific studies, and based his theories mainly on his own observations and anecdotal accounts during a period in which many people described him as psychologically sick. The major reason why measures based on Jung’s ‘theory’ are almost always given an unfavorable assessment is because they are based on an unsound theoretical foundation (e.g., the assessment of the MBTI by the Dutch COTAN). Jung—like Freud—never made a secret of the fact that he did not follow the path of academic science: “Anyone who wants to know the human psyche will learn next to nothing from experimental psychology. He would be better advised to abandon exact science, put away his scholar’s gown, bid farewell to his study, and wander with human heart through the world.” (Jung’s New Paths in Psychology, Collected Works, London, 1916)

Problem 2: Type is at odds with biological variation

Tests based on Jung’s ideas generally divide people into a series of distinct types. However, the reasoning behind typology contains a major fallacy, namely the assumption of dichotomy and bipolarity. Type-thinking is based on the principle that the scales are discontinuous, dichotomous, or bimodal. This would mean that the population could be divided into two groups per scale, with a ‘gap’ in the middle of each distribution. To understand how impossible this is, compare this to the premise that the male population consists of two groups: men between 4 ft 9’’ and 5 ft 3’’ tall and men between 5 ft 11’’ and 6 ft 6’’ tall, with hardly any men between 5 ft 3’’ and 5 ft 11’’ tall. In reality, most human characteristics are distributed normally, whether it is a question of height, muscle power, intelligence, or personality traits.

Figure III.12: the left graph represents a distribution of features of body height, intelligence, personality, etc. The right graph represents a normal distribution, as predicted by both evolutionary theory (reproduction creates variation) and what is found in real data.

Differences in personality are therefore more gradual or ‘fluid.’ Modern personality psychology favors the trait approach, in which traits are presented on continuous scales, or dimensions, instead of in dichotomies. The most accepted and scientifically established model is, of course, the Big Five or Five Factor Model (FFM: five major trait domains). The most well-known and researched test is the NEO-PI-R by Costa and McCrae (1995), which has since been replaced by the NEO-PI-3. Both the five major domains and the thirty underlying facets (six facets per domain) show a normal distribution. Whichever scale one chooses, the mutual combinations of the many facets of our personality produce a vast potential for variation between personalities! Thanks to more powerful computer processing, a new factor, i.e., another personality dimension, has been identified: ‘honesty–humility’ (HEXACO model, Lee & Ashton, 2004).
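The height analogy can be checked with a quick calculation. The sketch below is an illustration of mine, not something taken from the cited studies: it assumes a trait that follows a standard normal distribution and a type boundary placed at the mean, and it asks what share of people sit within a given distance of that boundary. The function name and the half-standard-deviation band are arbitrary choices for the example.

```python
from statistics import NormalDist


def share_near_cutoff(band_sd: float) -> float:
    """Share of a normally distributed population within +/- band_sd
    standard deviations of a type boundary placed at the mean.

    Assumption (illustrative only): the trait is standard normal and the
    dichotomy splits the population at the mean.
    """
    trait = NormalDist(mu=0.0, sigma=1.0)
    return trait.cdf(band_sd) - trait.cdf(-band_sd)


if __name__ == "__main__":
    # Roughly 38% of people fall within half a standard deviation of the cut.
    print(f"{share_near_cutoff(0.5):.3f}")
```

Under these assumptions, well over a third of the population lies within half a standard deviation of the dividing line, which is the opposite of the ‘gap’ that a true dichotomy would require.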

Typologies don’t take into account gradual differences in personality and the enormous variation of human characteristics and their possible combinations: for example, the MBTI states that people can be divided into sixteen types; LIFO uses four types and the Enneagram uses nine types. According to typology, a person definitively belongs to either one category or the other. In other words, one category excludes the other. People are either extraverted or introverted in MBTI; however, the bulk of the population is neither extraverted nor introverted but lies somewhere in between (referred to as ambiverted). The use of ipsative questionnaires (forced choice) broadens the dichotomy (see below) as the items are explicitly bipolar in nature. Any form of typology reasoning is problematic for two main reasons:

(1) the enormous variation in personalities as described above, which is an effect of evolutionary influences (e.g., random mutations combined with nonrandom natural selection, genetic drift, balancing selection, etc.), other biological influences (e.g., hormonal influences during pregnancy or viruses), and developmental processes; and (2) the influence of environmental or contextual factors that cause people to react differently in different situations (Barkow, Cosmides, & Tooby, 1992; Moscowitz & Zuroff, 2004). 141

Thus, the origin of differences between people can be explained parsimoniously by mechanisms driving evolution, but not on the basis of Jung’s ‘parallel metaphysical world.’

Problem 3: fictitious and incorrectly used scales

Jungian theory operates on the existence of three dichotomies, yet all three have been subjected to scientific criticism. First is the dichotomy of sensing versus intuition. Intuition, as described by Jung, arose from his faith in the paranormal, but, as I have already said, there is no evidence for this. Intuition is a concept deployed in modern psychology, but with a different definition: it is the whole of implicit knowledge acquired by multiple experiences in a regular and therefore predictable environment and the opportunity to learn these regularities by lengthy practice (Kahneman, 2011, p. 252). This holds up well for professions such as fire-fighting, medicine, and nursing, but not so much for professions such as financial investment consultancy, political science, and psychotherapy. Similarly, the feeling versus thinking scale is based on an untenable dichotomy. Research in both clinical psychology and neurobiology (e.g., Damasio et al., 2001) has shown that a distinction cannot be made between emotions and thoughts. In fact, they are linked to each other indivisibly in neural networks in the human brain.

Clinical psychologists (especially those trained in cognitive behavioral therapy) have firmly adopted that point of view. Anger, for example, is always related to the same sort of thoughts, namely thoughts that involve a command or prohibition. This often finds expression in thoughts involving the words must or not allowed or that’s not possible because a particular desire or objective is under threat. Fear is always related to thoughts that express negative effects: ‘That dog will bite me,’ ‘My partner will be angry,’ ‘The client won’t like that,’ or ‘The dentist is going to hurt me.’ In terms of human behavior, as is the case with animals, fear almost always leads to active or passive avoidance, flight, fight, or freeze. Some people are less emotionally stable than others, but that has nothing to do with intellectual capacity. It is scientifically untenable to portray people who are easily frightened as less capable in ‘thinking.’ In the FFM, emotional stability is represented as a dimension ranging from highly unstable to highly stable. However, the question of how prudently someone can think is a completely independent dimension.

141 For more detailed information see the chapter on the Enneagram.

Finally, the way certain measures based on Jung’s ideas (such as the MBTI) deal with the scale of extraversion versus introversion is problematic. Once again, these measures present this scale as a dichotomy. However, Jung himself argued that there was no such thing as a person who was solely an extravert or an introvert, and that these were factors or dimensions (this is also what contemporary research in this field has shown—see above). Jung reportedly said that anyone who was only an extravert or an introvert should be “admitted to an asylum.”142 Therefore, it is the developers of the MBTI and other Jungian typology tests who have introduced this dichotomy and formulated introvert versus extravert as a type antithesis. A proposition such as ‘introverts draw their energy from within themselves, while extraverts draw energy from others’ cannot be tested from a scientific point of view and contradicts other scientific disciplines such as physics. Or do introverts have a faucet to drain out energy?

The curious case of… the therapist’s beetle

Jung was a notorious believer in alchemy, astrology, spiritism, telepathy, telekinesis, clairvoyance, and extrasensory perception. Historical sources have revealed that he was influenced by William James, who, among other things, believed in communication with spirits via mediums in séances.

Jung maintained that some people sense things ‘intuitively,’ for example that a yellow car will come around the street corner.

Jung also cited examples “from his practice.” In one, a gold (scarab) beetle flew into the window just as a patient was relating a dream that featured a beetle; according to Jung, this showed that there has to be a non-coincidental link between the mental world and phenomena such as this one from the physical world (Jung, 1960, p. 142).

So far, however, no one has been able to demonstrate the existence of paranormal gifts or extrasensory perception under controlled conditions, despite a reward of one million U.S. dollars offered by James Randi several years ago.

What else is wrong with this stuff?

Not only does the theory suck, most of the efforts to operationalize its concepts into measures like the MBTI suffer from flawed methodology. I will deal with those in a later section.

What does my Champions League of experts say?

They are in unison: both top philosophers and psychologists consider all psychoanalytic theories from Sigmund Freud, Carl Gustav Jung, and Jacques Lacan to be sheer pseudoscience. Paranormal phenomena are rightly ridiculed. Of course, people like Richard Dawkins, Steven Pinker, and other members of my Champions League ridicule belief in the paranormal and people like Bert Hellinger and his New Age soulmates Albrecht Mahr and Rupert Sheldrake. Some of them have also addressed the problems with the MBTI. For example, Judith Rich Harris wrote: “…or tests (such as the Rorschach or the Myers-Briggs) that have not held up to scientific scrutiny” (2005, pp. 22–23).

142 “The classification of individuals [By Type] means nothing at all.” Carl Jung in the Jung-Evans Conversations, Transcripts of the 1957 filmed interviews of Carl Jung by Richard Evans, p. 23. You can watch it here: http://e-jungian.com/jung-film-interview-c-g-jung-dr-richard-evans/

What does the majority of the field of experts think?

Adam Grant, psychology professor at the Wharton School of the University of Pennsylvania, wrote an article143 with the all-revealing title: “Goodbye to MBTI, the fad that won’t die.” He concluded the article by saying: “we all need to recognize that four letters don’t do justice to anyone’s identity. So leaders, consultants, counselors, coaches, and teachers, join me in delivering this message: MBTI, I’m breaking up with you. It’s not me. It’s you.”

Robert Hogan wrote:

“Most personality psychologists regard the MBTI as little more than an elaborate Chinese fortune cookie – each of the 16 MBTI types is described in a chirpy and upbeat fashion as having important and distinctive qualities. Nonetheless, psychological consultants have discovered that the business community has an endless appetite for MBTI-based feedback. Since 1975, the MBTI has become one of the best-selling psychological tests of all time. Although academics are baffled and annoyed by the astonishing popularity of the MBTI, MBTI feedback is harmless and it provides a nice income for many consultants. However, the important thing about the MBTI that many personality psychologists overlook is that its sheer popularity has served to legitimize the concept of personality assessment in the business community.” (2007, p. 28)

Dutch psychology professor and personality researcher Boele De Raad has also railed against typologies (translated): “It is indeed easy for companies to work with neatly arranged models or typologies like the enneagram or MBTI. But some descriptions are rather a caricature than an adequate description of personality.”144

Psychology professor Dr. David J. Pittenger (1993) wrote the following about the MBTI:

“Many very specific predictions about the MBTI have not been confirmed or have been proved wrong. There is no obvious evidence that there are 16 unique categories in which all people can be placed. There is no evidence that scores generated by the MBTI reflect the stable and unchanging personality traits that are claimed to be measured. Finally, there is no evidence that the MBTI measures anything of value.” (p. 6)

And

“The MBTI reminds us of the obvious truth that all people are not alike, but then claims that every person can be fit neatly into one of 16 boxes. I believe that MBTI attempts to force the complexities of human personality into an artificial and limiting classification scheme. The focus on the “typing” of people reduces the attention paid to the unique qualities and potential of each individual.” (p. 6)

Scott O. Lilienfeld and three other psychology professors (2010, p. 94) write that “the Myers-Briggs Type Indicator is a psychoanalytically oriented personality inventory.” They also explain why it is not good practice to try to find the ‘root cause’ of problems in childhood, as is often done in psychoanalytic psychotherapy: “We can thank – or blame Sigmund Freud and his followers for most, if not all, of these popular beliefs” (p. 236). I could go on citing many more psychology professors, but this could quite possibly double the volume of this book. I rest my case.

143 The article can be consulted at Huffingtonpost.com or psychologytoday.com.

144 Interview in “Het Financieele Dagblad,” September 6, 2008.

The theoretical score: -5, because its proponents present it as science even though it contradicts both physics and biology.

■ Empirical findings

What is the level of evidence for the theory?

The hypothesized (1) paranormal access to (2) archetypes stored in (3) a parallel universe is of course in stark contradiction to the regularities of physics, and also to those of chemistry, biology, and psychology. Over a period of 150 years, millions of dollars have been spent on studying the paranormal or ‘psychic phenomena,’ only to conclude that such phenomena most likely don’t exist.

By contrast, evidence available from both human and animal research clearly demonstrates that personality should be viewed as multifaceted. Each aspect of personality (intelligence, traits…) follows a normal (Gaussian) distribution. These findings are in line with expectations from evolutionary biology (‘descent with modification creates variation’) and refute the entire theory of typology.

What about the level of evidence for the measures?

As not all measures have been researched by independent academic researchers, I will address each measure separately.

The measures based on Jungian typology are unreliable, artificially reliable, or haven’t been researched.

A problem arises when Jung’s archetypal theory is put into operation, and the problem lies in the measures themselves. Some of these measures have been researched to evaluate their psychometric quality, but the results have proven extremely problematic. With the MBTI, for example, there is up to a 60% chance of a person being classified under a completely different type after just four weeks, meaning the test-retest reliability is unacceptably low (see below under MBTI).
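Part of this instability follows from dichotomizing continuous scores, which a short calculation can illustrate. The sketch below is a simplification of mine and not a reanalysis of MBTI data: it assumes that each scale’s test and retest scores are bivariate normal with a test-retest correlation r, that the type boundary sits at the mean, and that the four scales are independent. Under those assumptions, the probability of landing on the same side of the boundary twice is 1/2 + arcsin(r)/π. The value r = 0.8 is hypothetical.

```python
import math


def same_letter_probability(retest_r: float) -> float:
    """P(same side of the cut on test and retest) for one scale.

    Assumes bivariate normal scores with correlation retest_r and a cut
    at the mean (an illustrative simplification).
    """
    return 0.5 + math.asin(retest_r) / math.pi


def type_change_probability(retest_r: float, n_scales: int = 4) -> float:
    """P(at least one letter differs on retest), assuming independent scales."""
    return 1.0 - same_letter_probability(retest_r) ** n_scales


if __name__ == "__main__":
    # retest_r = 0.8 is a hypothetical value, not an MBTI statistic.
    print(f"{type_change_probability(0.8):.2f}")
```

With a hypothetical per-scale correlation of 0.8, which would look respectable for a continuous scale, about 60% of respondents would be expected to change at least one letter on retest under these assumptions. The instability comes from the cut itself: people who score near the boundary flip sides with only a small change in score.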

Without delving into an analysis of all the problems again, I will briefly revisit the ipsative nature of these measures.

“Ipsative scoring is a system whereby the respondent actually distributes a constant number of points (the available rankings) over (usually) a number of scales that are included in the measure. Therefore, the sum of the different item scores will be equal for each respondent. The result is an ordering of the scales on the basis of their importance to the respondent.” (Danny Rouckhout in a memo to me)
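The constraint in this definition has a direct statistical consequence that a small simulation can show. The sketch below is my own illustration with made-up data: it draws independent, normally distributed scores for k = 4 hypothetical scales, makes them ipsative by subtracting each respondent’s own mean (so every respondent’s scores sum to the same constant, zero), and compares the average correlation between scales before and after. For scales with equal variances, the constant-sum constraint pushes the average intercorrelation toward -1/(k - 1), whatever the true relations between the constructs. Full ipsatization is used here for simplicity; it is not a model of any particular questionnaire.

```python
import random
from itertools import combinations
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (pstdev(x) * pstdev(y))


def mean_intercorrelation(rows):
    """Average correlation over all pairs of scales (columns)."""
    n_scales = len(rows[0])
    cols = [[row[j] for row in rows] for j in range(n_scales)]
    pairs = combinations(range(n_scales), 2)
    return mean(pearson(cols[i], cols[j]) for i, j in pairs)


def ipsatize(rows):
    """Subtract each respondent's mean so that every row sums to zero."""
    result = []
    for row in rows:
        row_mean = mean(row)
        result.append([score - row_mean for score in row])
    return result


def simulate(n_people=2000, n_scales=4, seed=0):
    """Return (raw, ipsative) mean intercorrelations for simulated data."""
    rng = random.Random(seed)
    # Hypothetical data: independent standard-normal scores per scale.
    raw = [[rng.gauss(0.0, 1.0) for _ in range(n_scales)]
           for _ in range(n_people)]
    return mean_intercorrelation(raw), mean_intercorrelation(ipsatize(raw))


if __name__ == "__main__":
    raw_r, ipsative_r = simulate()
    print(f"raw: {raw_r:+.2f}  ipsative: {ipsative_r:+.2f}")
```

The raw scales correlate near zero by construction, while the ipsatized versions average close to -1/3 for k = 4. A correlation matrix of ipsative scales therefore reflects the scoring format at least as much as it reflects the constructs.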

The biggest problem with ipsative scores lies in the artificial reliability of the tests.145 Among other things, factor analysis is necessary to demonstrate construct validity and scale independence. For example, a measure that contains 6 constructs (scales) needs to demonstrate that each construct or scale measures something different than the other five scales: extraversion should not measure conscientiousness, nor emotional stability, etc. A very common procedure is to calculate an intercorrelation matrix of the different scales. As the number of scales rises, the individual correlation values will gradually decrease, so a high correlation in the matrix will rarely be encountered. Such matrices are often incorrectly interpreted as evidence for a set of independent variables, certainly whenever the reader is not aware of the ipsative nature of the scales. In other words, the independence of the scales is purely artificial in the case of ipsative questionnaires.

145 For an extensive discussion of the problems with ipsative or ipsatized scores, see the chapter on Belbin Team Roles.

Since the MBTI uses ipsative scales, both the reported internal reliability of the scales and the reported independence of the scales are unreliable because they are artificial. But the problems with the ipsative nature of the MBTI are not as bad as in other measures, since the MBTI never compares items across scales; it only makes you choose between two bipolar items in one scale (e.g. between an item expressing introversion and an item expressing extraversion). Therefore, the MBTI obtains smaller intercorrelations between scales (Myers herself reported -0.10 to -0.12 values, whereas the Clemans formula would lead to an intercorrelation value of -0.33).

Already in 1962,146 Lawrence Stricker and John Ross noted that most people don’t fall into distinct categories, which of course is the most striking feature of a type: if there is no dichotomy, then why would you use type and not dimensions, like modern science does? People are just pigeonholed into one category or the other. Stricker and Ross concluded that their research findings offered little support for “any of the structural properties attributed to the typology” and they wrote that either “(a) Jung’s typology is not consistent with the real world; or (b) the Indicator does not correspond to the theoretical formulation of the typology” (1964, p. 62).

Well, they were wrong to present these as alternatives: both interpretations were right. The theory is wrong, and the operationalization is flawed too, as we will see in the following paragraphs.
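The artificial negative intercorrelation of fully ipsative scales can be illustrated with a small simulation. What follows is only a sketch under stated assumptions, not a reanalysis of any real instrument: the raw scale scores are simulated as independent random numbers (so the 'true' correlation between scales is zero), and ipsatization is modeled by subtracting each respondent's own mean so that every respondent's scores sum to the same constant. The function names and sample size are hypothetical. Under these assumptions the average correlation between scales should land near -1/(k-1) for k scales, which is consistent with the -0.33 quoted above for four scales, and it shrinks toward zero as k grows.

```python
import random


def pearson(x, y):
    """Plain Pearson correlation between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mean_intercorrelation(k, n_respondents=2000, seed=1):
    """Average correlation between k ipsatized scales.

    Assumption: raw scores are independent standard normal draws,
    so any correlation found afterwards is produced by ipsatization alone.
    """
    rng = random.Random(seed)
    columns = [[] for _ in range(k)]
    for _ in range(n_respondents):
        raw = [rng.gauss(0.0, 1.0) for _ in range(k)]
        centre = sum(raw) / k  # the respondent's own mean
        for scale, value in zip(columns, raw):
            scale.append(value - centre)  # ipsatized score; each row sums to 0
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    return sum(pearson(columns[i], columns[j]) for i, j in pairs) / len(pairs)


for k in (4, 6, 10):
    # simulated average correlation next to the theoretical value -1/(k-1)
    print(k, round(mean_intercorrelation(k), 2), round(-1 / (k - 1), 2))
```

A reader who sees only the small correlations for ten scales, without knowing the scales are ipsative, could easily mistake them for evidence of independent constructs.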

Huge problems with the ever-popular MBTI

The MBTI was developed in 1942 (“Type Indicator”) by Isabel Briggs Myers and Katharine Cook Briggs (her mother). It was based on Jung’s entirely untested theory. Briggs and Myers initially gave labels to the types, like the scientist, the idealist, the caregiver, etc. Both the theory and the test were concocted (construction is not a good term for this) in an era when psychology was not yet a science. Moreover, neither of the two was trained as a psychologist. The MBTI (Briggs & Myers, 1987) is an ipsative (forced choice) questionnaire which, according to proponents, indicates the course of a preference on four bipolar dimensions (Hicks, 1970):

1. extraversion-introversion (E-I),

2. sensing-intuition (S-N),

3. thinking-feeling (T-F), and

4. judging-perceiving (J-P).

The MBTI theory specifies that there are three levels at which a typological distinction could be made:

(1) qualitative differences between the opposite preferences (listed for that purpose);

(2) statistical interactions between preferences based on external criteria (such as performance); and

(3) the difference between a dominant function and an auxiliary function.

The current version uses 93 questions, yet it still claims it can group all people in the world into 16 different and discrete types. As always, proponents claim it can cure all sorts of problems: it can help you build better relationships, perform better, be successful, be useful for teambuilding, help you make career choices, etc.

146 I consulted both the 1962 draft for the Educational Testing Service and the 1964 publication in the APA Journal of Abnormal and Social Psychology.

The MBTI theory differs from Jung’s theory

Briggs Myers and Cook Briggs claimed that Jung’s theory differed from their tool on the basis of there being four rather than three dichotomies. This is simply not true, as in Psychological Types, Jung clearly referred to three distinct dichotomies: extraverted versus introverted attitudes and two opposite pairs of functions: rational (judging) functions (thinking-feeling) and the irrational (perceiving) functions (sensing-intuition). The MBTI developers added the ‘judging versus perceiving’ (judging-perceiving) dimension. This dimension is supposed to indicate whether someone prefers to organize his or her life according to plans and structure (J), or in a more flexible manner (P). So, this dimension differs in relation to content from Jung’s written assertions.

Other basic principles of the MBTI have emerged that are also contrary to Jung’s theory. I have already discussed the problematic dichotomy of extraversion versus introversion, which was criticized by Jung himself in 1921; he regarded these dichotomous definitions to be ‘fictions.’ Of course, that criticism is true for all four of the main MBTI categories. The fact that the authors regard each type as fundamentally equivalent and positive (Myers & Myers, 1980) is untenable because people with pronounced tendencies often experience conflicts or problems. For example, dominant and aggressive leaders cause a lot of adverse effects and are often labeled in the subject literature as abusive supervisors (e.g., Tepper, 2000). McCrae and Costa (1989) argue that the Big Five dimension of neuroticism (the measure of emotional stability) on its own provides evidence that not all ‘types’ can be regarded favorably.

McCrae and Costa (1989) conducted research into the soundness of the MBTI because they were looking for a good personality test for their research at the National Institute on Aging. While the Big Five theory emerged as solid, they also researched whether the MBTI was a good tool for putting Jung’s theory into operation. And so, (once again) they examined whether people could really be grouped into sixteen types, whether dichotomous preferences really existed, whether interactions existed between the preferences (55 possible interactions in all), and whether or not people developed their auxiliary functions alongside their naturally dominant functions as they got older, as Jung had claimed. McCrae and Costa found no confirmation for any of the MBTI developers’ claims summarized above: “There was no evidence that preferences formed true dichotomies, the 16 types did not appear to be qualitatively distinct, because analyses of their joint effects on personality dimensions showed that only 1 of 55 interactions was significant, and only in women, and, contrary to hypothesis, the theoretically dominant function was no more clearly preferred than the auxiliary. The Jungian prediction that opposing functions should be developed in later life was not confirmed using the MBTI.” (1989, p. 32)

McCrae and Costa also argued: “Weighing the evidence to date, the MBTI does not seem to be a promising instrument for measuring Jung’s types, those who embrace Jung’s theory should probably avoid the MBTI.” (1989, p. 32) The authors also argue that either Jung’s theory is faulty, or the fault lies with the way in which it is put into operation, namely the MBTI.

The third level that MBTI theoreticians describe (dominant versus auxiliary functions) should be easily found in test results: dominant functions must receive a higher (preference) score in these tests than auxiliary functions (Myers & McCaulley, 1985, p. 58). This could not be confirmed in any study, not even in their own research, although they didn’t seem to realize that if these interactions could only be found among half of the types (Myers & McCaulley, 1985, p. 60), then this yields the same result as what could be expected based on pure coincidence.

The psychometric properties are inadequate

The reliability of the measure is also problematic. Independent research has shown that, on average, 60% of MBTI test participants receive a different result when retested. In a 1979 study, Howes and Carskadon established that the test-retest reliability was very weak: after a test-retest interval of a mere five weeks, 50% of participants showed up under a different type. In 1983, McCarley and Carskadon replicated these results. In a study carried out by the National Research Council (NRC, 1991), it emerged that “from all studies examined, only 24 to 61% of the participants showed stability in type.” In total, 15 psychologists worked for the NRC on the MBTI test and investigated 11 relevant studies with test-retest data. Their findings showed that 39 to 76% of participants were given a different type in a repeat of the test administered no more than three to five weeks later. The NRC calculated a median of 60% uncertainty regarding the allocated type. To be precise, this would mean that over half of people change their personality type each month!
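A little arithmetic shows why a four-letter type is so fragile. The sketch below rests on a simplifying assumption that is mine and not the NRC's: that being reclassified on one of the four scales is independent of being reclassified on the others. Under that assumption, the share of people who keep their full type is the per-scale agreement raised to the fourth power. The function names are hypothetical, and the three input values are the lower bound, the complement of the median, and the upper bound of the stability figures quoted above.

```python
def type_stability(per_scale_agreement, n_scales=4):
    """Expected share of people keeping the same full type on retest,
    assuming (simplification) that the scales are reclassified independently."""
    return per_scale_agreement ** n_scales


def per_scale_agreement_needed(type_level_stability, n_scales=4):
    """Per-scale agreement implied by a given type-level stability,
    under the same independence assumption."""
    return type_level_stability ** (1.0 / n_scales)


for stable in (0.24, 0.40, 0.61):
    changed = 1.0 - stable  # share of people who receive a different type
    implied = per_scale_agreement_needed(stable)
    print(f"stable {stable:.0%}, changed {changed:.0%}, "
          f"implied per-scale agreement {implied:.2f}")
```

Under this assumption, even a per-scale agreement of 0.80, which sounds respectable for a single dichotomy, leaves only about 41% of people with the same four-letter type.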

Similarly weak test-retest reliability results have been demonstrated by scholars such as Bess and Harvey (2002), Fleenor (2001), and Mastrangelo (2001). Even Isabel Myers herself reported in 1998 that 35% of the test subjects had different type-scores after a test-retest interval of only four weeks. That percentage is much higher in independent studies. Other researchers also argue that the 5FM measures are clearly superior to the MBTI (e.g., Furnham, 1995; Pittenger, 1993 and 2005). Although correlations have sometimes been found between the scales of 5FM measures and the MBTI, this does not provide support for the proposition that the MBTI’s theoretical foundation is sound. As discussed previously, the internal reliability of the instrument is obviously partly artificial, as it is with all ipsative tests.

The Dutch Committee on Tests and Testing (COTAN) of the Dutch Association of Psychologists (NIP) provides the following assessment of the MBTI in its Documentation of Tests and Test Research:

● Basic principles of the test construction: insufficient; unsound theoretical foundation.

● Quality of the test material: good.

● Quality of the manual: good.

● Standards: insufficient; standards too small and standards not representative and/or the representation cannot be assessed.

● Reliability: good.

● Concept validity: insufficient; not enough research.

● Criterion validity: insufficient (no research).

In short, most academics generally agree that the MBTI is not based on a sound theory, it does not possess good psychometric qualities, and there are much better tests available. Even Consulting Psychologists Press (CPP), the publisher of the MBTI, warns against the MBTI for a number of purposes, including hiring decisions.

Insights Discovery: increasingly popular, but no trace of independent research

The Insights Discovery instrument (referred to as an ‘evaluator’ in the quotation below) has been heralded as follows: “The Insights Discovery model is based on the extensive research of Swiss psychologist Carl Jung and the subsequent work of Jolande Jacobi, one of his leading students.” And: “The evaluator has been tested and updated to measure the quality of the 100 word pairs and weaker ones replaced with stronger ones. They’ve also been tested for reliability and validity in huge numbers to help gauge how robust the word pairs used in the evaluator are.”147 The following can be read about this instrument elsewhere: “There is still a lot of unrelenting scientific interest in the work of Jung.”148

Apart from the incorrect claim that Carl Gustav Jung was a psychologist (he graduated with a degree in medicine in 1902 and specialized in psychiatry) or that he did “extensive research,” the claim that Jungian theory is sound is, of course, laughable. Proponents claim the test is both valid and reliable. However, my search of the APA database on January 12, 2013 did not yield a single article on Insights Discovery (“0 hits”). There are also no entries in the Dutch COTAN documentation (Documentation of tests and test research in the Netherlands–NIP or Dutch Association of Psychologists, January 12, 2013), nor does the Buros Institute have even a single review (consulted on January 12, 2013). This is a classic move: people make strong claims about the scientific status of an instrument because they reckon the vast majority of people won’t test it or aren’t capable of assessing it properly.

In 2018, I found two papers on Insights Discovery that are not peer-reviewed. The first is a very brief document (only 3 pages!), called “fact sheet,” claiming that Confirmatory Factor Analysis (CFA) has revealed that the circular representation is warranted.149 The problem is that the document does not show a table with the results (e.g. fit indexes such as goodness of fit) of this CFA. They conclude that ‘cool blue’ and ‘sunshine yellow’ are opposites, because the loadings on factor two are +0.53 and -0.52 respectively, and they draw the same conclusion from the finding that ‘earth green’ and ‘fiery red’ have loadings on factor one of +0.52 and -0.57 respectively, but opposite poles (i.e. the midpoint of each quadrant) should show factor loadings of -1.0. An MDS solution would have shown us whether the four colors really form a correct circumplex with opposite poles.

I again asked psychometrician Danny Rouckhout to look at the document (personal correspondence April 23, 2019). He found at least 4 problems that I will explain hereafter.

(1) He concluded that the presented data pointed to the use of an Exploratory Factor Analysis (EFA) and not a CFA. He stated that EFA should not be used to analyze 4 scales.

(2) Rouckhout questions the value proposed by the (unknown) authors of the three pages (“factor loadings greater than 0.3 or less than -0.3 are considered acceptable”), as he has never encountered such an ‘acceptance’ in a statistics book. The minimum value that statisticians consider ‘on the border of important’ is 0.50. The correlations of the four scales or colors are in the range of 0.50, as can be seen in the table below. More importantly, the Insights Discovery fact sheet mentions that “the fundamental explanation of the four Insight colour preferences is contained in the first two factors that account for the bulk of the variance.” There are two problems with that:

(3) The first problem concerns the low Eigenvalues of the factors. The Eigenvalue is an indicator for assessing how important a factor is. According to the Kaiser Criterion, an Eigenvalue >1 is considered important. But the Eigenvalues of the factors are only 0.60 and 0.57 respectively.

(4) The second problem is that the explained variance of the communality of the four scales should be higher than 0.450, but they only reach 0.29. Rouckhout concludes the four scales do not match the two factors.

He then raises further questions, such as what about the factor analysis at the item level? And, what were the other factors?
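The order of magnitude of these figures can be checked from the four loadings quoted earlier in this section. The sketch below is my own back-of-the-envelope calculation, not Rouckhout's: it assumes that each color loads only on 'its' factor (the cross-loadings are not reported in the text, so they are set to zero here), it takes the sum of squared loadings per factor as that factor's Eigenvalue, and it takes the sum of squared loadings per scale as that scale's communality. The dictionary and function names are hypothetical.

```python
# Loadings as quoted in the text. ASSUMPTION: cross-loadings are unknown
# here and are set to 0.0, so the results are rough lower bounds.
LOADINGS = {
    "Earth Green": (0.52, 0.0),       # (factor 1, factor 2)
    "Fiery Red": (-0.57, 0.0),
    "Cool Blue": (0.0, 0.53),
    "Sunshine Yellow": (0.0, -0.52),
}


def eigenvalues(table):
    """Sum of squared loadings per factor."""
    n_factors = len(next(iter(table.values())))
    return [sum(row[f] ** 2 for row in table.values()) for f in range(n_factors)]


def communalities(table):
    """Sum of squared loadings per scale: variance shared with the factors."""
    return {scale: sum(v ** 2 for v in row) for scale, row in table.items()}


ev = eigenvalues(LOADINGS)
comm = communalities(LOADINGS)
mean_communality = sum(comm.values()) / len(comm)

print("Eigenvalues:", [round(e, 2) for e in ev])
print("Kaiser criterion met (Eigenvalue > 1):", [e > 1 for e in ev])
print("Mean communality:", round(mean_communality, 2))
```

With the cross-loadings set to zero, the Eigenvalues come out at roughly 0.60 and 0.55 and the mean communality at roughly 0.29, in the same range as the values reported above and well below both the Kaiser Criterion and the 0.450 threshold.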

147 Downloaded on January 12, 2013 from www.insights.com/2119/validating-the-insights-discoverymodel.html

148 Downloaded on January 12, 2013: Report titled Frank Sample, 2005, p. 5; www.christelclear.nl/files/Frank%20Sample%20-%20Basis-Man-Pers.ontw.%20NL.pdf

149 “Insights Discovery: Validating the System.” I downloaded the document in April 2019 from www.insights.com, a website belonging to “The Insights Group Ltd.”

Scale factor 1 factor 2 communality

Table above: calculations provided by Danny Rouckhout from the University of Antwerp.

The second document is from 2005 and is still… awaiting peer-review (Benton et al., 2005 draft version). From it, we learn that the Evaluator 3.0 is a kind of ipsative (forced choice) questionnaire, though the authors call it a hybrid form. As you know from the Belbin chapter, that should set our alarm bells ringing. The Evaluator contains 25 frames, each of which has 4 word pairs: participants must choose one pair of words “that MOST describes you in your work environment and circle M next to this,” one pair of words “that LEAST describes you in your work environment and circle L next to this,” and for each of the remaining word pairs, a weighting with the values 1, 2, 3, 4, and 5. This makes it difficult to interpret, and the authors go to great lengths to frame the criticism of using factor analysis (FA) on ipsative scores in such a way that people might think that using FA is not a problem: “the narrow classical view about the use of Factor Analysis” is that “only ‘interval’ data types can be used,” and “in summary, although at odds with the narrow classical view, there is sufficient evidence to support the valid use of these techniques on the Insights Discovery Evaluator data” (all quotes from the 2005 version, p. 26, bold emphasis my own to demonstrate their framing efforts).150

In both the 2005 and 2008 versions, we read that “Ipsative (forced-choice) scales are based on ‘ordinal’ (i.e. ranked) data types and this ‘forces’ a correlation between items that artificially inflates the correlations in the correlation matrix” (2008, p. 21). Their conclusion is unwarranted for at least two reasons: first of all, the authors are wrong when they state that ipsative scores are merely ordinal scores. I explained the true problem in the Belbin myth: the problem is the interdependence of the scores; if you score high on one, you must score low on another. This results in an unacceptable dependence of the scores and an artificial interdependency in the scales. Second, as I mentioned in the Belbin chapter, ipsatized scores rule out techniques such as factor analysis. If you have forgotten the main problems with ipsative scores, I recommend you read the paragraphs on this topic in the Belbin Chapter once again.

Both documents I mentioned report very high (all above 0.90) Cronbach’s alphas, an indicator of the internal consistency of a scale. Danny Rouckhout confirmed for me that the literature has mainly investigated the effects of true ipsativity. So, we have no idea how their ‘hybrid ipsative’ scoring will affect the Cronbach’s alpha values. And having to choose between 4 alternatives is still kind of ipsative or ‘forced choice,’ of course. The burden of proof rests on the proponents of the Insights Discovery Evaluator: they will have to prove that their hybrid ipsative system does not distort true interval type scores (e.g. 5- or 7-point Likert normative scores).

150 I also found an updated version from 2008. The comments to peer reviewers have disappeared from this version. The framing about the ‘narrow classical’ view still remains, however.

The factor loadings reported in this document are almost identical to the loadings in the table above stemming from the 2018 ‘fact sheet,’ meaning the problem with the low total variance explained by the two factors remains. As I wrote before, most of the time, problems don’t stand on their own: if there are serious problems with the theory, then the empirical findings will present problems as well, in addition to the psychometric practices and outcomes.

Now, if you still aren’t convinced that you shouldn’t use Insights Discovery and would like to give it the benefit of the doubt, you have surely forgotten that the underlying Jungian theory is outrageously ridiculous. Moreover, the ‘inventors’ of Insights Discovery added ‘color energies’: “We call these the color energies, and it’s the unique mix of Fiery Red, Sunshine Yellow, Earth Green and Cool Blue energies, which determines how and why people behave the way they do.”151 I could not find any good explanation for where these colors came from or through what mechanism they exert power over our personalities. I only found references to the ‘four humors’ of Hippocrates (460-370 BCE), who believed that certain human moods and related behaviors were caused by an excess of four body humors or body fluids: blood, yellow bile, black bile, and phlegm. It goes without saying that modern psychiatrists and psychologists do not adhere to such ancient ‘wisdom.’ One website on Insights Discovery writes that Hippocrates’ four humors are “knowledge” upon which “many researchers have subsequently expanded.”152 I don’t think so. On wearebowline.com I read that “the order and strength of the four colour energies in each participant generates eight types” (bold emphasis my own). These eight types are defined as follows:

● “Director—extraverted thinking—results focus, decisive, assertive

● Motivator—extraverted intuition—drive, enthusiasm, positive thinking

● Inspirer—extraverted feeling—persuasive, creative, people skills

● Helper—introverted intuition with extraverted sensing—flexible and helps others, shared ideas

● Supporter—introverted feeling—listens, loyal, team approach

● Coordinator—introverting sensing—planning, organizing, time management

● Observer—introverted thinking—sets standards, product knowledge, analysis

● Reformer—extraverted sensing with introverted intuition—determination, monitoring performance, discipline”

(www.wearebowline.com, April 16, 2019, color, capitals and bold emphasis omitted)

So, we are left with Jung’s incorrect theory, a color scheme that is seemingly based on Hippocrates’ four humors, and questionable psychometric practices and results. I come to no other conclusion than this: using Insights Discovery is akin to taking a wild guess. Taking into consideration the Forer effect and the problematic decisions that people tend to make based on personality measures, its use is highly immoral in my opinion.

The other Jungian tests—a deafening absence of evidence

There is a profound lack of evidence regarding the soundness of the other tests. Apart from journals devoted to Jungian typology and psychoanalytical publications, such as the Journal of Psychological Type, there is not a single reference to most of these tests in sound, peer-reviewed journals. A consultation of the APA database on January 12, 2013 yielded the following results: for TDI I found only a brief definition of the test with no peer review; regarding its validity, I found 0 hits; for MTR-I, I found the same results.

151 Source: the official website of Insights Discovery (April 16, 2019): https://www.insights.com/us/products/insights-discovery/

152 See for example: http://www.inside-inspiration.com.au/insights-discovery/insights-colour-energies.html#.XLW1YC-YOF0

The empirical score: -4, because so many reviews point to the completely flawed theory, and so many reviews discuss the many problems with measures like the MBTI.

Why do people believe these questionnaires, or believe the theory can offer them valuable insights?

To begin with, the most positive views are held by people who have a stake in selling these tools. As psychology professor Rob Briner put it:

“The most positive views are held by highly enthusiastic (bordering-on-evangelical) believers. For them such tools are essential for almost everyone and everything: career development, selection, team building and so on.”

Rob Briner, November 15, 2018

This is no exaggeration, as we can learn from a 2012 Washington Post interview (Cunningham, 2012), in which Barry Edwards—who has described himself as a big fan—says: “It’s like religion. Believe what you want. Get out of it what you want.”

The usual suspects, such as authority arguments (“the famous Swiss psychiatrist Jung”) and biases, offer us the most probable explanation. But given that the MBTI was one of the first ‘personality measures’ ever to be used in personnel matters, it has become a kind of perpetuum mobile: because it is so well-known and widely used, few dare to oppose it or refuse it to customers (for many, money is more important than honesty or the principle of do no harm to others). One publisher of tests recently tried to evade responsibility by explaining to me that “Yes, but there’s a demand for those tests, so we supply them.”

People who take these tests are not only lured in by authority arguments, they are also emotionally influenced by the flattering labels: you are categorized as a caregiver or nurturer, a thinker, a performer, etc. People feel happy after taking the test and the Barnum or Forer effect takes hold: if people believe it is a valid instrument developed by trustworthy people, like psychiatrists and psychologists, then they will tend to believe the outcomes are accurate. This is especially true if the descriptions are mainly positive, as if there were no bad apples in the bunch, or no inappropriate behaviors or reactions ever existed. Moreover, the proponents argue that other people need to accept your type, and this acceptance would help make the world a better place. Hallelujah, Platonic Idealism is raising its angel-like head again.

Consider the fact that some business schools (Vlerick Business School in Belgium, the Rotterdam School of Management in the Netherlands, IMD in Switzerland, INSEAD in France, etc.) and even (non-psychology) university faculties use the MBTI. Universities are apparently houses with so many rooms that faculty members can’t find their way to ask their psychology professor colleagues for advice. Gullible or powerless public servants help spread its popularity by putting it in their invitations to tender, etc. These thoughtless promotions lend the model undeserved credibility.

However, a number of these providers seemingly act in good faith because they are simply untrained in major fields such as behavioral biology, evolutionary psychology, personality psychology, and labor (I/O) psychology. No doubt some practitioners consciously opt for the easy money that can be made by offering the most popular models, but a good number of them act simply out of ignorance, having fallen for (false) lines of authoritative reasoning and manipulation techniques (Cialdini, 2009). However, judging by the vast range of models, typologies, and tests available from training and coaching providers, they don’t seem to be making any effort to distinguish between sound and unsound theories. They may not be capable of distinguishing between good and bad scholarship and don’t go to the trouble of immersing themselves in the scholarly literature of the aforementioned fields. Research in Belgium (Segers, Vloeberghs, De Prins, & Henderickx, 2009) and the Netherlands (Groen, Sanders, & Van Riemsdijk, 2006; Sanders, Van Riemsdijk, & Groen, 2008) has confirmed that HR professionals have very little training in psychology, let alone training in assessing the soundness of research articles. This shows that there is a very low level of knowledge of academic findings on HR concepts and instruments for HR professionals. This applies to both external providers (consultants, trainers, and coaches) and internal HR staff and line managers.

Apart from popular claims like ‘millions of people use it and they can’t all be wrong,’ which are discussed extensively in the subject literature (e.g., Bardone & Magnani, 2010; Cialdini, 2009; Goodwin, 1998; Walton, 2000), there are four commonly heard rationalizations for the use of Jungian typology, as described below:

“It doesn’t matter if it’s scientific or not, it is just a discussion/conversation starter.”

“Sometimes typologies can be useful to help you communicate reality and make it somewhat comprehensible.”

“In my experience people get a lot out of it” or “I’ve had good experiences with it.”

“Science often contradicts itself; maybe they’ll be saying ten years from now that XXX is a good measure.”

In Part I of this book, I already dealt with these rationalizations and thinking errors. Let me just repeat that what we see here is lazy thinking, switcheroos, reverse switcheroos, self-justification, the Forer or Barnum effect, etc. Lastly, it’s easy to imagine that many employees prefer to refrain from criticizing the choices made by their managers or the HR or training department, out of fear of a negative impact on their careers.

How likely is it this theory will ever prove to be valid?

It is about as likely as my being able to prove that invisible fairies do my gardening every Friday night.

Can you use the MBTI or Insights Discovery for selection purposes?

In most countries, this is simply illegal, as necessary prudent steps must be taken to avoid discrimination or the use of invalid measures. For example, OPP, the publisher of the MBTI, has repeatedly warned against this and explained that the instrument must not be used for selection. Practitioners who “misuse” the instrument for selection purposes are no longer supplied with the instrument (OPP, 2017). Now you know why.

Can you use the MBTI or Insights Discovery for development purposes?

The answer is a straightforward no. A tool that is not a valid measure of personality cannot offer any practical value. As I wrote in the introductory chapter: it is not even a good conversation starter. Garbage in, garbage out: nothing good can be expected from giving people incorrect information about their personality: “...it’s wrong, misleading, practically unhelpful and even harmful” (Briner, 2018). What you wouldn’t accept for a medical diagnosis (e.g. blood test), you shouldn’t allow in any other tool or measure that could dramatically influence your self-perception, self-esteem, well-being and career choices. It is clearly a no-go for me!

Believe it or not, even one of the psychologists on the board of CPP,153 the company that distributes the MBTI, doesn’t use the MBTI in his research: “In part because it would be questioned by my academic colleagues” (statement by Carl Thoresen, a Stanford psychologist and CPP board member, in the Washington Post in 2012; see the Cunningham reference). What’s more, by 2012, he had published close to 150 papers, yet none of them mention the MBTI. If he doesn’t believe it is a valid tool, why should you? In 2012, then 86-year-old Katharine Myers was quoted in the same Washington Post article saying: “It was a family that didn’t think you had to go to a class to learn something. You could just learn it on your own.” The author of the article, Lillian Cunningham, cynically replied to this quote, saying “The 2,500 people who got their Myers-Briggs certification last year likely agree.”

And can you imagine the self-assuredness (one could even say arrogance) of consultants who work for CPP? Without hiding their commercial interests, Penny Moyle and John Hackston managed to get their opinion article published in the Journal of Personality Assessment in 2018. In it, they launched a vehement attack on academics, accusing them of sitting in their “Ivory Tower” and pursuing their own interests, as “Many academic writers have their own assessments, commercial associations with rival test publishers, or sell their own consulting services, and like journalists, want to capture attention with a memorable headline” (p. 513). As I predicted in Part II, these believers defend themselves with invalid ad hominem attacks. A glance at their reference list shows that the studies they cite in support of MBTI validity were published in the biased Journal of Psychological Type or Research in Psychological Type (biased because these are outlets for type research), in OPP or CPP publications, in non-peer-reviewed books and conference papers, or in journals on accounting, information and software technology, marketing, nutrition, and… cyberpsychology (I kid you not), whose relevance to the topic totally escapes me. None of these articles that allegedly support the validity of the MBTI was published in a top-ranked journal such as the APA Journal of Applied Psychology (JAP) or the Academy of Management journals. Only one article (Insko et al., 2001) appeared in the prestigious APA Journal of Personality and Social Psychology (JPSP). But guess what: when I dug into this article, I found that it didn’t in the slightest ‘demonstrate the validity of the MBTI,’ as Moyle and Hackston claimed. Insko et al. simply referred to the 1989 correlational research by McCrae and Costa on the NEO (5FM) instrument and the MBTI (remember their devastating conclusion). Insko and colleagues merely explained why they used only the Sensing-Intuition scale of the MBTI (as well as the Openness-Intellect scale of the Big Five Inventory) to assess the predisposition towards abstract thinking. This is more than cherry picking by Moyle and Hackston. Either a mistake was made in citing this article as a validity report, or it’s a lie. I’ll let you be the judge.

As I already anticipated (see Part II), Moyle and Hackston hold the CEBMA model against itself, claiming that Eric Barends from CEBMA has acknowledged that the four sources of evidence should be considered. Throughout the article, they suggest that evidence from MBTI practitioners and test-takers should be taken into account. I wholeheartedly disagree; the opinions of MBTI and other Type ‘experts’ and practitioners who embrace the MBTI as if it were a religion do not carry the same weight as published scientific evidence (the first source), as they are fatally biased.

153 The company name became OPP Ltd. If you search for www.cpp.com or www.opp.com you are now redirected to the myersbriggs.com website.

■ Original sources consulted

(Articles preceded by ** contain no empirical research, but are either opinion articles or narrative reviews)

Adler, G., Jaffe, A., & Hull, R.F.C. (1973). C.G. Jung Letters Vol. 2, 1951-1961. Princeton, NJ: Princeton University Press.

Bair, D., Fontijn, B., Nieuwkoop, P., & Visser, W. (2004). Jung: Een biografie. De Bezige Bij.

Bardone, E., & Magnani, L. (2010). The appeal of gossiping fallacies and its eco-logical roots. Pragmatics & Cognition, 18, 365–396.

Barkow, J., Cosmides, L., & Tooby, J. (eds) (1992). The adapted mind: Evolutionary psychology and the generation of culture. New York: Oxford University Press.

Baron, H. (1996). Strengths and limitations of ipsative measurement. Journal of Occupational and Organizational Psychology, 69, 49–56.

Bartram, D. (1996). The relationship between ipsatized and normative measures of personality. Journal of Occupational and Organizational Psychology, 69, 25–39.

Benton, S., van Erkom Schurink, C., & Desson, S. (2008). An overview of the development, validity and reliability of the English version 3.0 of the Insights Discovery Evaluator. London: University of Westminster’s Business Psychology Centre, version 5.0 (229 pages). This is a non-peer-reviewed publication.

Bess, T.L., & Harvey, R.J. (2001, April). Bimodal score distributions and the MBTI: Fact or artifact? Paper presented at the Annual Conference of the Society for Industrial and Organizational Psychology. San Diego.

Briggs, K., & Myers, I. (1987). Myers-Briggs Type Indicator Form G. Palo Alto, CA: Consulting Psychologists Press.

Briner, R. (2018). Do personality-typing tools have a place in HR? HR Magazine (UK). Downloaded from: https://www.hrmagazine.co.uk/article-details/do-personality-typing-tools-have-a-place-in-hr

Buekens, F. (2006). Freuds vergissing. De illusies van de psychoanalyse. Leuven: UMCO / Uitgeverij Van Halewyck.

Cardno, A.G., Marshall, E.J., Coid, B., Macdonald, A.M., Ribchester, T.R., Davies, N.J., Venturi, P., Jones, L.A., Lewis, S.W., Sham, P.C., Gottesman, I.I., Farmer, A.E., McGuffin, P., Reveley, A.M., & Murray, R.M. (1999). Heritability estimates for psychotic disorders: The Maudsley twin psychosis series. Archives of General Psychiatry, 56, 162–168.

Carroll, R.T. (2000). Becoming a critical thinker - A guide for the new millennium. Boston: Pearson Custom Publishing.

Chamorro-Premuzic, T., Winsborough, D., Sherman, R. A., & Hogan, R. (2016). New talent signals: Shiny new objects or a brave new world? Industrial and Organizational Psychology, 9(3), 621–640.

Cialdini, R. B. (2009). Invloed. De zes geheimen van het overtuigen. Den Haag: Academic Service. (Original English title: Influence).

Clemans, W. V. (1956). An analytical and empirical examination of some properties of ipsative measures. Psychometric Monographs, 14. Richmond, VA: Psychometric Society.

Costa, P. T., Jr., & McCrae, R. R. (1995). Domains and facets: Hierarchical personality assessment using the Revised NEO Personality Inventory. Journal of Personality Assessment, 64, 21–50.

Coyne, J. (2009). Why evolution is true. Penguin Books.

**Cunningham, L. (2012). Myers-Briggs: Does it pay to know your type? (including the interview with Carl Thoresen, a CPP board member who doesn’t defend the MBTI). The Washington Post. https://www.washingtonpost.com/national/on-leadership/myers-briggs-does-it-pay-to-know-your-type/2012/12/14/eaed51ae-3fcc-11e2-bca3-aadc9b7e29c5_story.html?noredirect=on&utm_term=.e85feaace95b

Damasio, A.R. (2001). Emotion and the human brain. In A.R. Damasio, A. Harrington, J. Kagan, B.S. McEwen, H. Moss, & R. Shaikh (Eds.), Unity of knowledge: The convergence of natural and human science. Annals of the New York Academy of Sciences, 935, 101–106. New York: New York Academy of Sciences.

Darwin, C.R. (1859). On the origin of species, by means of natural selection. London: John Murray.

Dawkins, R. (2009). The greatest show on Earth: The evidence for evolution. London: Transworld Publishers.

Dennett, D. (1995). Darwin’s dangerous idea. London: Penguin Press.

Dunlap, W. P., & Cornwell, J. M. (1994). Factor analysis of ipsative measures. Multivariate Behavioral Research, 29, 115–126.

Eysenck, H.J. (1997). Personality and experimental psychology: The unification of psychology and the possibility of a paradigm. Journal of Personality and Social Psychology, 73, 1224–1237.

Fleenor, J. W. (2001). Review of the Myers-Briggs Type Indicator Form M. In B.S. Plake & J.C. Impara (Eds.), The fourteenth mental measurements yearbook (pp. 816–818). Lincoln, NE: Buros Institute of Mental Measurements, The University of Nebraska Press.

Freud, S. (1923/1976). «Psychoanalyse» und «Libidotheorie». In S. Freud Gesammelte Werke 13. (pp. 209–233). Frankfurt am Main: S. Fischer Verlag.

Furnham, A. (1995). The big five versus the big four: The relationship between the Myers-Briggs Type Indicator (MBTI) and NEO-PI five factor model of personality. Personality and Individual Differences, 21, 303–307.

Goodwin, J. (1998). Forms of authority and the real Ad Verecundiam. Argumentation, 12, 267–280.

Groen, B., Sanders, K., & van Riemsdijk, M. (2006). De kloof tussen theorie en praktijk. Een onderzoek naar de kennis van HRM’ers over arbeids- en organisatiepsychologie. Tijdschrift voor HRM, 9, 33–52.

Hicks, L. E. (1970). Some properties of ipsative, normative, and forced choice normative measures. Psychological Bulletin, 74, 167–184.

Hobson, J. A., Pace-Schott, E. F., & Stickgold, R. (2000). Dreaming and the brain: Toward a cognitive neuroscience of conscious states. Behavioral and Brain Sciences, 23, 793–842.

Howes, R.J., & Carskadon, T.G. (1979). Test-retest reliabilities of the Myers-Briggs Type Indicator as a function of mood changes. Research in Psychological Type, 2, 67–72.

Insko, C. A., Schopler, J., Gaertner, L., Wildschut, T., Kozar, R., Pinter, B., . . . Montoya, M. R. (2001). Interindividual–intergroup discontinuity reduction through the anticipation of future interaction. Journal of Personality and Social Psychology, 80(1), 95–111.

Johnson, C. E., Wood, R., & Blinkhorn, S. F. (1988). Spuriouser and spuriouser: The use of ipsative personality tests. Journal of Occupational Psychology, 61, 153–162.

Jung, C.G. (1912). The psychology of the unconscious. (Translation of Wandlungen und Symbole der Libido.) London: Kegan Paul Trench Trubner.

Jung, C.G. (1921). Psychologische typen. Zurich: Rascher Verlag. Translation in G. Adler & R.F.C. Hull (Eds. & Trans.), Collected Works of C.G. Jung (1957–1979), ‘Psychological Types’ (volume 6, 1971). Princeton, NJ: Princeton University Press.

Jung, C.G. (1950). Foreword. The I Ching or Book of Changes. (Translation) Princeton, NJ: Princeton University Press.

Jung, C.G. (1952). Synchronizität als ein Prinzip akausaler Zusammenhänge. Ges. Werke, Bd, 8, 15.

Jung, C.G. (1957–1979). Psychology and Alchemy (volume 12, 1968) and Alchemical Studies (volume 13, 1968). In G. Adler & R.F.C. Hull (Eds. & Trans.), Collected Works of C.G. Jung. Princeton, NJ: Princeton University Press.

Jung, C.G. (1957–1979). The Structure and Dynamics of the Psyche (volume 8, 1960). In G. Adler & R.F.C. Hull (Eds. & Trans.), Collected Works of C.G. Jung. Princeton, NJ: Princeton University Press.

Jung, C.G. (1964). Man and His Symbols. New York: Doubleday.

Kahneman, D. (2011). Thinking, fast and slow. New York: Farrar, Straus and Giroux.

Kendler, K.S., Myers, J., Potter, J., & Opalesky, J. (2009). A web-based study of personality, psychopathology and substance use in twin, other relative and relationship pairs. Twin Research and Human Genetics, 12, 137–141.

Kurzban, R. (2010). Why everybody (else) is a hypocrite: Evolution and the modular mind. Woodstock (UK): Princeton University Press.

Lavie, P., & Hobson, J. A. (1986). Origin of dreams: Anticipation of modern theories in the philosophy and physiology of the eighteenth and nineteenth centuries. Psychological Bulletin, 100, 229–240.

Lee, K., & Ashton, M. C. (2004). Psychometric properties of the HEXACO personality inventory. Multivariate Behavioral Research, 39, 329–358.

Lilienfeld, S.O. (2007). Psychological Treatments That Cause Harm. Perspectives on Psychological Science, 2, 53–70.

Lilienfeld, S.O. (2012). Public skepticism of psychology. Why many people perceive the study of human behavior as unscientific. American Psychologist, 67, 111–129.

Lilienfeld, S.O., Lynn, S.J., Ruscio, J., & Beyerstein, B.L. (2010). 50 Great myths of popular psychology. Shattering widespread misconceptions about human behavior. Chichester, UK: Wiley-Blackwell.

Loftus, E. (1994a). The myth of repressed memory. New York: St. Martin’s.

Loftus, E. (1994b). The repressed memory controversy. American Psychologist, 49, 443–445.

Marczyk, J. (2013). I find your lack of theory (and replications) disturbing. www.psychologytoday.com. Blog post in May 2013.

Mastrangelo, P. M. (2001). Review of the Myers-Briggs Type Indicator, Form M. In B.S. Plake & J.C. Impara (Eds.), The fourteenth mental measurements yearbook (pp. 818–820). Lincoln, NE: Buros Institute of Mental Measurements, The University of Nebraska Press.

McCarley, N.G., & Carskadon, T.G. (1983). Test-retest reliabilities of scales and subscales of the Myers-Briggs Type Indicator and of criteria for clinical interpretive hypotheses involving them. Research in Psychological Type, 6, 24–26.

McCrae, R. R., & Costa, P. T., Jr. (1989). Reinterpreting the Myers-Briggs Type Indicator from the perspective of the Five-Factor Model of personality. Journal of Personality, 57, 17–40.

Meade, A. W. (2004). Psychometric problems and issues involved with creating and using ipsative measures for selection. Journal of Occupational and Organizational Psychology, 77, 531–552.

Moskowitz, D. S., & Zuroff, D. C. (2004). Flux, pulse, and spin: Dynamic additions to the personality lexicon. Journal of Personality and Social Psychology, 86, 880–893.

**Moyle, P., & Hackston, J. (2018). Personality Assessment for Employee Development: Ivory Tower or Real World? Journal of Personality Assessment, 1–11.

Myers, I.B., & McCaulley, M.H. (1985). Manual: A guide to the development and use of the Myers-Briggs Type Indicator. Palo Alto, CA: Consulting Psychologists Press.

Myers, I.B., McCaulley, M.H., Quenk, N.L., & Hammer, A.L. (1998). Manual: A guide to the development and use of the Myers-Briggs Type Indicator. Palo Alto, CA: Consulting Psychologists Press.

Myers, I. B., & Myers, P. B. (1980). Gifts differing. Palo Alto, CA: Consulting Psychologists Press.

National Research Council; Druckman, D., Bjork, R.A. (Eds.) (1991). In the mind’s eye: Enhancing human performance. Commission on Behavioral and Social Sciences and Education. Washington, D.C.: National Academy Press.

Pittenger, D. J. (1993). Measuring the MBTI ... and coming up short. Journal of Career Planning and Employment, 54, 48–53.

Pittenger, D.J. (2005). Cautionary comments regarding the Myers-Briggs Type Indicator. Consulting Psychology Journal: Practice and Research, 57, 210–221.

Sanders, K., van Riemsdijk, M., & Groen, B. (2008). The gap between research and practice: A replication study on the HR professionals’ beliefs about effective human resource practices. The International Journal of Human Resource Management, 19, 1976–1988.

Segers, J., Vloeberghs, D., De Prins, P., & Henderickx, E. (2009). Niets is zo praktisch als een goede theorie: Wie weet dit in HRM? Tijdschrift voor HRM, 2, 7–28.

Smit, Y., Huibers, M., Ioannidis, J., van Dyck, R., van Tilburg, W., & Arntz, A. (2010). The effectiveness of psychoanalysis: A systematic review of the literature. (Study conducted for insurance companies, www.cvz.nl).

Smit, Y., Huibers, M., Ioannidis, J., van Dyck, R., van Tilburg, W., & Arntz, A. (2012). The effectiveness of long-term psychoanalytic psychotherapy: A meta-analysis of randomized controlled trials. Clinical Psychology Review, 32, 81–92.

Stricker, L. J., & Ross, J. (1964). An assessment of some structural properties of the Jungian personality typology. The Journal of Abnormal and Social Psychology, 68(1), 62–71.

Tenopyr, M.L. (1988). Artifactual reliability of forced-choice scales. Journal of Applied Psychology, 73, 749–751.

Tepper, B. (2000). Consequences of abusive supervision. Academy of Management Journal, 43, 178–190.

Tooby, J., & Cosmides, L. (1992). The psychological foundations of culture. In J. Barkow, L. Cosmides, & J. Tooby, (Eds), The adapted mind: Evolutionary psychology and the generation of culture. New York: Oxford University Press.

Trivers, R. (2002). Natural selection and social theory. Selected papers of Robert Trivers. New York: Oxford University Press.

Trivers, R. (2012). The folly of fools. The logic of deceit and self-deception in human life. New York: Basic Books.

Vermeren, P. (2006). De hr-ballon. 10 populaire praktijken doorprikt. Gent: Academia Press.

Walton, D. (2000). Evaluating appeals to popular opinion. Inquiry: Critical Thinking Across the Disciplines, 20, 33–45.

Williams, G.C. (1996). Plan and purpose in nature. London: Weidenfeld & Nicolson.