Byline: A study has outlined eight indicators of toxic masculinity in heterosexual men — and finds that ‘manliness’ is not necessarily a problematic aspect of masculinity.
How rife is the problem of ‘toxic masculinity’ in Western societies? A research study run in New Zealand has found that only a small percentage of men surveyed fell into the worst category of hostile toxicity — and that a desire to feel ‘manly’ wasn’t necessarily indicative that a person held socially damaging views.
In 2024, Sanders and his colleagues published a ‘toxic masculinity scale’, identifying 28 questions that assessed the degree of toxicity expressed by white male university students in the United States. Psychology doctoral candidate Deborah Hill Cone at the University of Auckland in New Zealand and her colleagues have now added to this with a more all-encompassing view of toxicity and a larger, broader sample of men in a study published in Psychology of Men & Masculinities.
The team dug into the results of the 2018–19 New Zealand Attitudes and Values Study, a broad survey with responses from nearly 50,000 people. More than 15,000 of the participants identified as heterosexual males and had answered relevant questions such as “being a woman/man is an important part of how I see myself” and “inferior groups should stay in their place”.
In a statistical analysis, the respondents fell into five groups. The good news is that only the smallest group (3.2% of the men) was characterized by the researchers as ‘hostile toxic’, whereas the largest group was ‘atoxic’ (35.4%)… Hill Cone and her colleagues found two moderate groups split between those who were more- or less-tolerant of people from sexual and gender minorities (LGBTQ+), and a ‘benevolent toxic’ group, whose members got relatively high scores in measures of sexism but not in hostility… The odds of men in the sample having the hostile toxic profile were higher for those who were older, single, unemployed, religious or an ethnic minority, as well as those high on scales of political conservatism, economic deprivation or emotional dysregulation, or who had a low level of education… “The entitled rich tech bro or frat boy didn’t really appear” in the hostile toxic group, says Hill Cone. Instead, the hostile toxic group was made up mainly of marginalized, disadvantaged men… Importantly, how central ‘being a man’ was to someone’s sense of self wasn’t particularly predictive of which group they landed in. Although the men in the hostile toxic group did tend to report that their gender was important to them, so did many men in the other categories.
Of course, as pointed out, this is a well-executed study, but it covers only New Zealand, so results may vary by location. Overall, the results are not surprising.
The two featured key studies are both open access:
Wait, I thought toxic masculinity was a broader social concept that harmed men directly by enforcing narrow definitions of what qualifies as manhood?
E.g., toxic masculinity makes a man feel like he can’t sew or watch a romcom without being labeled “not a real man.”
I think it is. The first linked paper is the one designing the scale… so they went into more details on this:
The definition of toxic masculinity fluctuates depending on context. For example, hegemonic masculinity, sometimes used as proxy for toxic masculinity, is a manifestation of masculinities that is characterized by the enforcement of restrictions in behavior based on gender roles that serve to reinforce existing power structures that favor the dominance of men (e.g., [7,8,9]). Hegemonic masculinity speaks to the systems and processes that elevated men to positions of power and maintain their dominance (e.g., [10,11]). Additionally, traditional masculinity is marked by stoicism, competitiveness, dominance, and aggression, characterizing it by an adherence to gendered attitudes [3].
Their final scale uses five factors: “masculine superiority”, “domination and desire”, “gender rigidity”, “emotional restriction”, “repressed suffering” (and a sixth one that they dropped). So some of these are indeed related to enforcing narrow definitions.
And that broader social concept is enforced by other men being toxic to people who don’t fit that narrow definition. It only takes a small, loud group to make a much bigger group feel bad about themselves if they don’t fit.
Here’s the actual article and abstract:
https://psycnet.apa.org/fulltext/2027-02373-001.html
Despite being frequently discussed in both mainstream discourse and academic scholarship, little empirical work defines “toxic masculinity.” We address this oversight by estimating the prevalence of men’s distinct response patterns to eight indicators of problematic masculinity: gender identity centrality, sexual prejudice, disagreeableness, narcissism, hostile sexism, benevolent sexism, opposition to domestic violence prevention initiatives, and social dominance orientation. Latent profile analysis of a nationwide random sample of heterosexual men from New Zealand (N = 15,808) identified five profiles. The largest profile (35.4%), “Atoxics,” scored low across all focal measures, whereas two other profiles (totaling 53.8%) expressed low-to-moderate support across indicators. The remaining two profiles reflected distinct forms of problematic masculinity marked by contrasting forms of sexism: “Benevolent Toxics” (7.6%) and “Hostile Toxics” (3.2%). Notably, gender identity centrality was only a weakly informative indicator of problematic masculinity. We thus demonstrate the need to separate problematic masculinity from other constructive forms of masculinity.
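For a rough sense of scale, the profile shares in the abstract can be turned into approximate head counts. This is only a back-of-the-envelope sketch: the percentages are rounded to one decimal place in the abstract, so the counts are estimates and not figures reported by the authors. The label for the two moderate profiles is my own placeholder.

```python
# Approximate head counts per profile, from the abstract's rounded percentages.
N = 15_808  # heterosexual men in the sample (from the abstract)

# Shares in percent, as given in the abstract. The two moderate profiles are
# only reported as a combined total there; the label is a placeholder.
shares = {
    "Atoxics": 35.4,
    "Two moderate profiles (combined)": 53.8,
    "Benevolent Toxics": 7.6,
    "Hostile Toxics": 3.2,
}

# Rounded estimates only: the underlying percentages are themselves rounded.
approx_counts = {name: round(N * pct / 100) for name, pct in shares.items()}

# The shares should account for the whole sample.
total_pct = round(sum(shares.values()), 1)

for name, count in approx_counts.items():
    print(f"{name}: ~{count} men ({shares[name]}%)")
print(f"Total share: {total_pct}%")
```

So the “Hostile Toxics” profile corresponds to roughly 500 men out of nearly 16,000.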
For the measurement:
We utilize data from Time 10 (2018/2019) of the New Zealand Attitudes and Values Study (NZAVS), the largest cross-sectional sample to date that contains all our focal variables. The NZAVS is an ongoing national probability panel study of New Zealand adults that began in 2009 and examines myriad variables, including personality, social attitudes, and health and well-being each year. The NZAVS was approved by the University of Auckland Human Ethics Committee and is renewed every 3 years. Although the data presented in this study are not publicly available due to restrictions imposed by our Ethics Committee, a deidentified data set containing the variables analyzed here is available upon request from the authors for the purpose of replication. The NZAVS uses extensive recruitment strategies to ensure broad national representation and achieves large sample sizes (i.e., N = 47,948 in the wave utilized in the present study), which helps mitigate concerns about sampling error and enhances the stability and precision of parameter estimates. Sibley (2021) provided full details of the sampling procedure, retention rates, and ethics approvals for the NZAVS (see also the NZAVS Open Science Framework page at https://osf.io/75snb/?view_only=dc7e2214ec194a63a0401a442e69354d).
Gender identity centrality was assessed with a single item adapted from Leach et al. (2008): “Being a woman/man is an important part of how I see myself.”
Sexual prejudice was assessed using one item adapted from the Pew Forum on Religion and Public Life U.S. Religious Landscape Survey (Pew Research Center, 2007): “I think that homosexuality should be accepted by society” (reverse-scored).
Disagreeableness was measured by reverse-scoring participants’ responses to the Agreeableness facet of the Mini-International Personality Item Pool–6 (Sibley et al., 2011). Participants were asked to respond to four items: I … (a) “sympathize with others’ feelings” (reverse-coded), (b) “feel others’ emotions” (reverse-coded), (c) “am not interested in other people’s problems,” and (d) “am not really interested in others” (α = .71).
Narcissism was assessed using three of the highest loading items from Campbell et al.’s (2004) Psychological Entitlement Scale. Participants rated how strongly they (a) “feel entitled to more of everything,” (b) “deserve more things in life,” and (c) “demand the best because I’m worth it” (α = .70).
Hostile and benevolent sexism were captured using 10 items from the 22-item Ambivalent Sexism Inventory (Glick & Fiske, 1996). Hostile sexism was measured using the mean of items 5, 11, 14, 15, and 16 (e.g., “Women seek to gain power by getting control over men”; α = .84). Benevolent sexism was measured using the mean of items 8, 9, 12, 19, and 22 (e.g., “Women should be protected and cherished by men”; α = .70).
Opposition to domestic violence prevention was measured with a single item (Sibley et al., 2020). Participants were asked to rate how strongly they support “Greater investment in reducing domestic violence” on a 1 (strongly oppose) to 7 (strongly support; reverse-scored) scale.
Social dominance orientation was assessed using the mean of six items from Sidanius and Pratto’s (2001) 16-item SDO6 scale: (a) “It is OK if some groups have more of a chance in life than others,” (b) “Inferior groups should stay in their place,” (c) “To get ahead in life, it is sometimes okay to step on other groups,” (d) “We should have increased social equality” (reverse-scored), (e) “It would be good if groups could be equal” (reverse-scored), and (f) “We should do what we can to equalize conditions for different groups” (reverse-scored; α = .74).
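For anyone unfamiliar with the scoring terms used above, here is a minimal sketch of reverse-scoring, scale means, and Cronbach’s α. It assumes every item uses a 1 to 7 response scale (the excerpt only states this for the domestic violence item), and the function names and example responses are made up for illustration; this is not the authors’ code.

```python
from statistics import mean, variance


def reverse_score(response, lo=1, hi=7):
    """Flip a Likert response so that high becomes low (assumes a lo..hi scale)."""
    return lo + hi - response


def scale_score(responses, reversed_items=(), lo=1, hi=7):
    """Mean of one respondent's items after flipping the reverse-coded ones.

    `reversed_items` holds the zero-based positions of reverse-coded items.
    """
    adjusted = [
        reverse_score(r, lo, hi) if i in reversed_items else r
        for i, r in enumerate(responses)
    ]
    return mean(adjusted)


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents x items.

    Items must already be reverse-scored so that they all point the same way.
    """
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical respondent on the four disagreeableness items (a)-(d):
# (a) and (b) are reverse-coded, (c) and (d) are scored as answered.
disagreeableness = scale_score([6, 5, 2, 3], reversed_items={0, 1})
print(disagreeableness)
```

A higher score then means more disagreeable, regardless of how each individual item was worded.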
The analysis to find the profiles:
We followed Johnson’s (2021) recommendations and used Mplus Version 8.10 to estimate LPAs with between 1 and 10 profiles under four distinct variance–covariance structures (see Supplemental Tables S3).
To assess model fit, we examined the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and the sample-size adjusted BIC (aBIC), with lower values indicating relatively better model fit. We also examined the Lo–Mendell–Rubin adjusted likelihood ratio test (see Lo et al., 2001; Nylund et al., 2007; Vermunt, 2024) and the Parametric Bootstrapped Likelihood Ratio Test (Curran & Bauer, 2021) to determine if a model with k-profiles significantly improves model fit relative to the k − 1 profile solution. Finally, we evaluated the entropy of the different model solutions, with values closer to 1.0 indicating clearer separation into distinct profiles (see Collins & Lanza, 2010).
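The fit indices named above have standard textbook definitions, sketched below. I am assuming Mplus uses these same formulas; the log-likelihood and parameter counts in the example are invented purely for illustration and are not values from the paper.

```python
import math


def aic(loglik, n_params):
    """Akaike information criterion: -2*LL + 2*p."""
    return -2 * loglik + 2 * n_params


def bic(loglik, n_params, n):
    """Bayesian information criterion: -2*LL + p*ln(n)."""
    return -2 * loglik + n_params * math.log(n)


def abic(loglik, n_params, n):
    """Sample-size adjusted BIC: replaces n with (n + 2) / 24."""
    return -2 * loglik + n_params * math.log((n + 2) / 24)


def relative_entropy(posteriors):
    """Classification entropy scaled to 0..1 (1 = perfectly separated profiles).

    `posteriors` is a list of per-person lists of profile membership
    probabilities, each summing to 1.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    if k < 2:
        raise ValueError("entropy is undefined for a one-profile model")
    # Terms with p == 0 contribute nothing (the limit of p*ln(p) is 0).
    uncertainty = -sum(p * math.log(p) for row in posteriors for p in row if p > 0)
    return 1 - uncertainty / (n * math.log(k))


# Invented numbers: compare two hypothetical solutions on the same sample.
n = 15_808
four_profile = bic(loglik=-200_000.0, n_params=44, n=n)
five_profile = bic(loglik=-199_500.0, n_params=53, n=n)
print("lower BIC is better:", min(four_profile, five_profile))
```

Lower AIC, BIC, and aBIC indicate relatively better fit, while entropy near 1.0 means people are assigned to profiles with little ambiguity.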
The BIC under the Type 1 variance–covariance structure increased after the eighth profile, suggesting model fit declined beyond this point. Moreover, models with nine and 10 profiles did not converge. The Lo–Mendell–Rubin adjusted likelihood ratio test for the eighth profile was nonsignificant (p = .99), indicating that adding an eighth profile did not improve model fit relative to the seven-profile solution. Of the seven models that converged and produced improvements to model fit, the five-profile solution had the highest entropy (0.82), indicating a clear separation of these data into distinct profiles (see Collins & Lanza, 2010). Both the Lo–Mendell–Rubin adjusted likelihood ratio test and Bootstrapped Likelihood Ratio Test also supported the five-profile solution over the four-profile solution.
Here are the five profiles they discuss:

Always attempting to quantify nebulous labels we’ve assigned to collections of behaviours.
You can’t ungeneralise generalisations that society has made about the most complex known entities in the universe.





