Unfin(n)ished Business: What the Finnish Registry Study Can and Cannot Tell Us About Mental Health in Gender-Dysphoric Youth
On April 4, 2026, the Society for Evidence-Based Gender Medicine (SEGM), a group of clinicians who promote and support research that seeks to raise doubts about the effectiveness and safety of medical and surgical interventions for gender dysphoria, shared on X its initial impressions of a recently published study out of Finland, claiming that it showed “youth gender transitions (under age 23) did not improve mental health symptoms.” The post referred to a study titled “Psychiatric Morbidity Among Adolescents and Young Adults Who Contacted Specialized Gender Identity Services in Finland in 1996–2019: A Register Study,” published a few days earlier in Acta Paediatrica, a peer-reviewed journal with a solid reputation in its field. In this study, early career clinician-researcher and adolescent psychiatrist Sami-Matti Ruuska and colleagues examined mental health care utilization among every Finnish resident under age 23 who had been referred to the country’s only two clinics providing gender reassignment interventions, comparing service use before and after referral for assessment, and concluded that “psychiatric needs do not subside after medical gender reassignment.”
The study’s senior author was pediatric psychiatrist Dr. Riittakerttu Kaltiala, director of the Finnish Gender Identity Service (GIS), one of the more controversial figures in pediatric gender medicine. She was among the first clinicians to publicly raise concerns about the efficacy and safety of the prevailing affirmation-based model of care, and among the first to implement, on a national basis, a more restrictive approach that placed greater emphasis on mental health-focused support than on pharmacological or surgical intervention. While Kaltiala is not officially a member of SEGM or part of its leadership, her work has been prominently featured by the organization on social media, and she appeared as a speaker at a SEGM-organized conference in 2023.
At the time of writing, the SEGM post had been viewed more than 1.2 million times and reposted more than 2,100 times. Many of those amplifying it have suggested that this study represents some of the strongest evidence yet not only that gender-affirming care fails to improve the mental health of the young people who receive it, as its strongest advocates have often claimed, but that it actually makes their mental health significantly worse.
The study’s publication has also given fresh energy to gender-critical advocates seeking to ban or severely restrict access to medical transition for young people. Claire Lehmann, the editor of the right-leaning, free-speech-promoting Quillette, called youth gender transition “the biggest medical scandal of the 21st century” while quoting the original SEGM post. In his widely read Substack Hazard Ratio, the influential gender-critical journalist Ben Ryan proclaimed that “Major Psychiatric Problems Didn’t Improve After Youth Gender Transition Treatment in Finland. They Rose.” (note: Ryan later revised the headline to read: “Indicators of Major Psychiatric Problems Didn’t Improve After Youth Gender-Transition Treatment in Finland. They Rose.”)
Yet, as is often the case when studies in this field appear to cast doubt on the benefits of gender-affirming care, the certainty expressed in public discourse by opponents of that model far exceeds what the study can actually support. To understand why this paper has attracted so much attention, and why its findings require careful interpretation, it is worth looking closely at what the research actually showed, and at why Finland, despite being a relatively small country, plays such a central and disproportionate role in debates over the effectiveness, safety, and appropriateness of medical interventions for adolescents and young adults seeking gender reassignment.
The question of whether gender-affirming care (the constellation of medical and surgical interventions that support a person’s ability to live and be legible as a member of the opposite sex and function in that role) produces a sustained and meaningful improvement in mental health remains highly contested. Though trans-health advocates and multiple mainstream medical societies have long maintained that it does, the evidence supporting that claim rests largely on studies with substantial methodological limitations. Many of the studies reporting mental health benefits have examined highly selected patient cohorts that may not be representative of the broader population of young dysphoric patients seeking care. This concern may be especially important given the widely reported shift in the composition of the youth gender-dysphoria cohort over time, which has become much more natal-female predominant and has come to include a higher burden of pre-existing psychiatric comorbidity, a greater prevalence of autism spectrum traits, and more frequent histories of adverse childhood experiences. It is therefore fair to ask whether positive effects observed in earlier, more selected cohorts can be generalized to this substantially different contemporary population.
Many studies have used pre-post treatment designs without a control group, making it difficult to determine whether any observed improvement reflects the intervention itself, regression to the mean, or spontaneous recovery unrelated to treatment. Others rely primarily on patient self-report rather than more objective indicators of poor mental functioning, such as hospitalizations, health care visits, or standardized validated questionnaires. Follow-up periods are often short, leaving long-term effects uncertain. Patients may also be lost to follow-up or fail to complete scheduled assessments, and the reasons they stop attending a clinic are often unknown. This can create a healthy survivor effect, in which the patients who remain under observation are the very ones most likely to improve. When prospective studies with closer follow-up have been conducted in more recent treatment cohorts, improvements in mental health have not been consistently detected.
As a result, multiple systematic reviews have concluded that there is insufficient evidence to determine whether gender-affirming care, or any of its component interventions, leads either to improvements or to decrements in mental health among adolescents and young adults with gender dysphoria. That does not mean gender-affirming care does not work, or that it is harmful. It means only that the research to date has not established with sufficient confidence that it works well enough, and consistently enough, to justify broad claims about its mental health benefits. Studies using stronger designs and more bias-resistant data sources may therefore provide important clarifying information.
This is where the Finnish data hold real promise. Gender care in Finland is highly centralized and publicly administered. All young people aged 22 and under seeking gender reassignment services are seen at one of two publicly funded clinics overseen by the Finnish GIS. Very few young people in Finland have been able to receive gender-affirming care without first being assessed within that system.
Another distinctive feature of Finland’s health care system is that nearly all encounters, including physician visits, hospitalizations, surgical interventions, and prescription drug use, are recorded in nationally representative health care registries. Each physician or hospital encounter is associated with one or more diagnostic codes, giving some indication of the condition that prompted the visit. Every Finnish resident has a unique numerical health identifier, which allows researchers to construct anonymized longitudinal patient profiles and track the complete health care trajectory of individuals over time. That gives Finnish researchers a uniquely powerful tool for evaluating trends and outcomes in young people seeking gender care, one that is difficult to replicate in most other health care environments.
The completeness of registry data, however, comes at the expense of clinical depth. Researchers can observe patterns of health care utilization, but they cannot observe the patients themselves. A registry can show that a patient saw a particular physician in a particular setting and received a particular diagnosis. It cannot show how distressed that person was, how severe their symptoms were, how well they were functioning, or how they themselves experienced their condition. A patient who sees a cardiologist for congestive heart failure may be acutely unwell, clinically stable, minimally symptomatic, or living with a poor quality of life. The registry alone cannot tell us which.
Even with those limitations, the Finnish registry remains exceptionally well suited to identifying all GIS patients and tracking their health care utilization over time. It is therefore a high-quality data source for understanding how young people with gender dysphoria access care, and how their patterns of health care use compare with those of other patients. It also makes it possible to identify controls: people who are otherwise similar to GIS patients but do not have gender dysphoria, whose health care utilization can be compared with that of those who do.
But a high-quality data source does not guarantee a high-quality study. A Formula 1 car may be a high-performance vehicle, but it matters considerably whether it is being driven by the second coming of Michael Schumacher or a sixteen-year-old fresh off her learner’s permit. A skilled driver can navigate at high speed through the tightest chicanes; a less skilled one will be sent spinning off course. Similarly, a poorly designed research protocol, whether through sloppiness or intent, can undermine even an exceptional data source and yield a misleading conclusion. With that in mind, it is worth examining exactly what choices the authors made, what those choices allow the study to show, and, crucially, what they do not.
The authors identified 2,083 people who had received care at the GIS clinics between 1996 and 2019, before Finland adopted its more restrictive model of care. They then used the registry to ask two questions: whether each subject had seen a specialty-level psychiatrist at any point before their initial GIS assessment, and whether they had such a visit more than 730 days after that initial assessment. Psychiatric contacts during the two-year interval immediately following intake were excluded on the assumption that psychiatric involvement during that period would be common, if not nearly universal.
Because specialty-level psychiatric care in Finland generally requires referral from primary care, the authors treated any such visit as a marker of “severe psychiatric morbidity.” Subjects were also categorized by natal sex, whether they were seen before or after 2011, and whether they ever received any gender-reassignment medical or surgical intervention. In addition, the authors counted each subject’s total number of psychiatric visits, regardless of when those visits occurred in relation to GIS intake or any later intervention.
Each case was then matched to eight controls of the same age and index date who had never been seen at the GIS, four natal male and four natal female. The main outcome was whether the likelihood of specialty psychiatric contact was greater after GIS intake than before it. The authors also used a Cox proportional hazards model to estimate how quickly GIS patients entered specialty psychiatric care in the follow-up period compared with matched controls, while adjusting for natal sex, prior psychiatric contact, and receipt of gender-reassignment interventions.
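To make the outcome definition concrete, here is a minimal Python sketch of how the before/after flags described above could be derived for a single subject. It is an illustration under stated assumptions, not the authors' code: the function name, the sample dates, and the treatment of a contact falling exactly on the index date or exactly on day 730 are hypothetical choices made for the example.

```python
from datetime import date, timedelta

WASHOUT_DAYS = 730  # two-year window after intake that is excluded from the outcome


def classify_contacts(index_date, contact_dates):
    """Return (any_before, any_after) for one subject.

    any_before: at least one specialty psychiatric contact before the index date.
    any_after:  at least one contact more than WASHOUT_DAYS after the index date.
    Contacts on the index date or inside the washout window count toward neither
    flag (an assumption made for this sketch).
    """
    cutoff = index_date + timedelta(days=WASHOUT_DAYS)
    any_before = any(d < index_date for d in contact_dates)
    any_after = any(d > cutoff for d in contact_dates)
    return any_before, any_after


# Hypothetical subject: intake on 1 March 2012, so the cutoff is 1 March 2014.
index = date(2012, 3, 1)
contacts = [date(2011, 6, 10), date(2012, 9, 1), date(2013, 5, 20)]
print(classify_contacts(index, contacts))  # (True, False)
```

Note that the two contacts inside the washout window vanish from the analysis entirely, and that a single contact at any later date, however brief, would be enough to flip the second flag.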
Of the 2,083 patients seen over this 24-year period, 481, or 23%, were natal males and 77% were natal females. Fully 91% were seen during the nine-year period from 2011 to 2019. Most patients seen at the GIS did not receive any gender-reassignment intervention. Only 34% of males and 39% of females did so. The study did not report either the type of intervention received or its timing relative to the index date.
Unsurprisingly, GIS patients were more likely than controls to have seen a psychiatrist both before the index date and in the follow-up period beginning at the 730-day mark. The likelihood of pre-existing psychiatric contact was higher in the later era than in the earlier one, 47.9% versus 23.7% (p < 0.001), but there were no significant differences between eras in the likelihood of psychiatric contact during follow-up, 61.3% versus 66.1% (p > 0.2).
When these outcomes are examined by receipt of gender reassignment, a striking pattern emerges. Among the 62% of subjects who did not receive a gender-reassignment intervention, there was no meaningful difference between the probability of a psychiatry visit before GIS intake and the probability of one after the two-year mark. Among the 38% who did receive some form of gender-reassignment intervention, by contrast, only 19% had seen a psychiatrist prior to intake, whereas 56% had at least one such visit during follow-up. This is the finding that has attracted the greatest controversy, and that gender-critical commentators, including SEGM, have widely interpreted as evidence that gender reassignment worsened patients’ mental health. The study also found that prior psychiatric contact was more common among both natal males and natal females who did not go on to receive a gender-reassignment intervention, suggesting an inverse relationship between pre-existing psychiatric health care utilization and eventual progression down a medical transition pathway. But the study cannot tell us how much of that pattern reflects clinician gatekeeping within the Finnish system, patient self-selection, or some combination of the two.
Much of this is not especially surprising. Other studies have likewise shown that young people presenting with gender dysphoria in more recent years have a higher burden of prior psychiatric contact, suggesting a rising burden of psychiatric comorbidity over time. Nor is it surprising that psychiatric contact was more common among GIS patients than among controls, as that is consistent with the well-established excess of mental health comorbidity in this population. What has driven the fiercest condemnation of pediatric gender-affirming care is not those background findings, but the apparent near-threefold increase (from 19% to 56%) in medium- to long-term specialty psychiatric contact among those who ultimately received a gender-reassignment intervention. Before that result is treated as evidence that medical transition worsens mental health, it requires much closer scrutiny.
The problem can be framed in three questions. First, does a single visit to a specialty-level psychiatrist at any point more than 730 days after intake truly constitute a valid marker of “severe ongoing psychiatric morbidity,” as the authors suggest? If not, then the study’s central outcome is already on uncertain ground. Second, does the paper establish the temporal sequence clearly enough to show that the gender-reassignment intervention preceded the psychiatric contact being counted? The authors do not report either the type of intervention received or its timing relative to the index date or to subsequent psychiatric visits. Third, are there unmeasured differences between those who did and did not proceed to medical transition that could account for some or all of the association? That includes not only potential confounding by indication, but also differential gatekeeping, patient self-selection, or both.
On each of these points, there is enough uncertainty to cast serious doubt on the validity of the authors’ stated conclusions. More troublingly, the analytic choices are unusual enough that a methodologically literate reader is left asking why these particular decisions were made, and how much confidence they warrant.
Let’s deal with them one by one:
Operationalization of the Outcome Variable: Treating a single visit to a specialty-level psychiatrist as a marker of severe psychiatric illness is a curious choice, and it fails even a basic common-sense test of what methodologists call face validity. Let me explain with some personal history.
I first sought care for my gender dysphoria in late 2000, started hormones in 2001, and had sex reassignment surgery in 2003. I was lucky not to have any major psychiatric comorbidity aside from my dysphoria prior to seeking care, and had not seen a psychiatrist since my early childhood. While in the midst of my gender transition, I saw a psychiatrist as part of my dysphoria assessment and managed to avoid any diagnoses. In 2012, I experienced a bout of mild depression. Because of my trans status, my family doctor thought it would be best handled by a psychiatrist. As in Finland, patients in Canada generally need a referral from their primary care physician to see one. I saw the psychiatrist three times over the course of two months, and have not seen her again in the fourteen years since.
Now, most reasonable people would not consider that I had “major psychiatric comorbidity,” but under this protocol, I would have been counted as such. This definition of severe psychiatric morbidity makes no differentiation between the subject who sees a psychiatrist once at any point following the 730-day mark and someone who is seeing a psychiatrist repeatedly. While the authors would argue that minor psychiatric complaints will be managed by the subject’s primary care physician, with referrals reserved for major issues, the threshold for referring a trans patient for specialist involvement may be lower than it would be for a cis patient with the same severity of psychiatric symptoms, especially if a history of gender transition or having undergone gender reassignment is considered a significant comorbidity.
In addition, many GIS patients will already have an established relationship with a psychiatrist as part of their GIS assessment, and it is generally easier to re-establish care with a physician where there is a pre-existing patient-physician relationship. Ongoing psychiatric care continuing after an initial assessment may also be fairly routine for persons who had more intense psychiatric illness earlier in the transition journey, but whose symptoms are now in remission. In this scenario, seeing a psychiatrist in the ambulatory care setting a few times a year may not be indicative of ongoing severe psychiatric morbidity.
What makes this operationalization especially curious is that the richness of the Finnish registry should have allowed for measures of severe psychiatric illness with far greater face validity, and likely greater construct validity as well. Psychiatric hospitalizations, escalating frequency or clustering of encounters, medication initiations or switches, and other markers of intensifying care would all have been more credible indicators of active and clinically significant illness than the one the authors chose. Instead, the authors selected an outcome so crude that it risks collapsing mild, transient, routine, and severe psychiatric phenomena into the same category. It is like getting behind the wheel of our aforementioned Formula 1 racer and then driving it around the track at 25 kilometers an hour with the blinkers flashing.
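A short Python sketch shows how much information the dichotomous outcome discards. Everything in it is hypothetical: the `FollowUp` record, the three example subjects, and the cut-points in `intensity_band` are invented for illustration and are not validated severity thresholds or anything drawn from the study. Counting inpatient days as a contact is likewise an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class FollowUp:
    """Hypothetical summary of one subject's psychiatric care after day 730."""
    outpatient_visits: int
    inpatient_days: int


def any_contact(f):
    """Dichotomous outcome: a single contact is enough to count."""
    return f.outpatient_visits > 0 or f.inpatient_days > 0


def intensity_band(f):
    """Illustrative graded outcome; the cut-points are arbitrary, not validated."""
    if f.inpatient_days > 0:
        return "inpatient"
    if f.outpatient_visits >= 12:
        return "frequent outpatient"
    if f.outpatient_visits > 0:
        return "occasional outpatient"
    return "none"


subjects = {
    "brief episode": FollowUp(outpatient_visits=3, inpatient_days=0),
    "sustained care": FollowUp(outpatient_visits=40, inpatient_days=0),
    "hospitalized": FollowUp(outpatient_visits=25, inpatient_days=14),
}

for label, f in subjects.items():
    # The yes/no flag is True for all three; only the band tells them apart.
    print(label, any_contact(f), intensity_band(f))
```

The first subject mirrors the three-visit episode described earlier. Under the yes/no definition that person is indistinguishable from someone who was hospitalized for two weeks.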
Temporality of Gender Reassignment: As written, the study implies that any gender-reassignment intervention was initiated during the two years following the initial assessment, and therefore that the psychiatric outcome necessarily occurred afterward. But once again, despite the fact that both the occurrence and timing of these interventions appear to have been captured in the registry, the authors chose not to report them explicitly. It is entirely plausible that some patients did not receive their first intervention within the initial 730 days following intake. And because gender-reassignment interventions are not typically delivered all at once, it is also possible that not all desired or indicated interventions had been initiated by that point. If so, the assumption of temporality begins to weaken. Some of the psychiatric health care utilization recorded after the 730-day mark may have been driven not by regret, deterioration, or treatment failure, but by distress related to delays in receiving care or to an incomplete transition process. While these alternate explanations are speculative, it is speculation the authors could have largely foreclosed by reporting timing data that were plainly available to them. Their decision not to do so leaves a central causal assumption unproven and further erodes confidence in the study’s design and presentation.
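The temporality problem can be sketched in a few lines of Python. This is a hypothetical check, not something the paper reports: the function name and all dates are invented, and it assumes the date of the first intervention is available for each treated subject.

```python
from datetime import date, timedelta

WASHOUT_DAYS = 730  # contacts are only counted after this many days from intake


def split_followup_contacts(index_date, first_intervention, contact_dates):
    """Split counted follow-up contacts by their order relative to treatment.

    Returns (before, after): contacts later than the washout cutoff that fall
    before the first intervention, and those on or after it.
    """
    cutoff = index_date + timedelta(days=WASHOUT_DAYS)
    counted = [d for d in contact_dates if d > cutoff]
    before = [d for d in counted if d < first_intervention]
    after = [d for d in counted if d >= first_intervention]
    return before, after


# Hypothetical subject whose first intervention came more than three years
# after intake, well past the 730-day mark (cutoff: 15 January 2015).
index = date(2013, 1, 15)
first_intervention = date(2016, 4, 1)
contacts = [date(2014, 3, 1), date(2015, 6, 1), date(2017, 2, 1)]

before, after = split_followup_contacts(index, first_intervention, contacts)
print(len(before), len(after))  # 1 1
```

A design that pairs "ever received an intervention" with "any contact after day 730" would count this subject's June 2015 visit as a post-treatment outcome even though it preceded treatment. Reporting a split like this one would show how often that happens.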
Influence of Unmeasured Confounding: A confounder, by definition, is a factor that (a) is known to cause the outcome (here, psychiatric specialist visits) in the absence of the primary exposure (here, gender reassignment), and (b) is associated with the exposure of interest. Any unmeasured factor that is more likely to be present, or more severe, in persons who undergo gender reassignment and that may also promote poorer mental health can be a potential confounder, and may provide an alternate explanation for the observed increased likelihood of specialist psychiatric care. Unmeasured confounders come in two forms: those that cannot be measured because they are not recorded in the dataset, and those that are recorded but that the investigators chose not to account for.
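A toy calculation illustrates how a confounder of this kind can manufacture an association. The counts below are invented purely for illustration and have nothing to do with the Finnish data. They are constructed so that the exposure has no effect within either stratum of the confounder, yet the crude comparison still shows nearly double the risk.

```python
# Toy counts, invented for illustration only; not taken from any study.
# Key: (confounder_present, exposed) -> (subjects, subjects_with_outcome)
counts = {
    (True, True): (800, 480),    # confounder present, exposed:   60% risk
    (True, False): (200, 120),   # confounder present, unexposed: 60% risk
    (False, True): (200, 40),    # confounder absent, exposed:    20% risk
    (False, False): (800, 160),  # confounder absent, unexposed:  20% risk
}


def risk(cells):
    """Pooled risk across a list of (subjects, events) cells."""
    subjects = sum(n for n, _ in cells)
    events = sum(e for _, e in cells)
    return events / subjects


def crude_risk_ratio(table):
    """Exposed vs unexposed risk, ignoring the confounder."""
    exposed = [v for (_, is_exposed), v in table.items() if is_exposed]
    unexposed = [v for (_, is_exposed), v in table.items() if not is_exposed]
    return risk(exposed) / risk(unexposed)


def stratum_risk_ratio(table, confounder_present):
    """Exposed vs unexposed risk within one level of the confounder."""
    exposed = table[(confounder_present, True)]
    unexposed = table[(confounder_present, False)]
    return risk([exposed]) / risk([unexposed])


print(round(crude_risk_ratio(counts), 2))          # 1.86
print(round(stratum_risk_ratio(counts, True), 2))   # 1.0
print(round(stratum_risk_ratio(counts, False), 2))  # 1.0
```

The crude ratio of 1.86 arises only because the confounder is four times as common among the exposed. If that factor is never recorded, no amount of adjustment on the variables that are recorded will reveal that the within-stratum effect is nil.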
The most likely candidate for an unmeasured confounder is that once people begin a transition from their birth-sex-associated gender role toward their preferred gender role, they often also embark on a second transition: from being a member of a relatively non-stigmatized group to one that is heavily stigmatized. This stigmatization may be especially severe among young people early in transition who have not yet obtained visual congruence with the acquired gender (VCAG), a technical term for self-perceived passability, and an outcome that has been strongly associated with improved mental and social functioning as well as decreased stigmatization. Obtaining VCAG requires sustained exposure to hormones and, in natal males, facial feminization surgery; it is also more likely with early intervention, before irreversible pubertal changes have taken hold. While the authorship group cannot be faulted for not including stigma or VCAG as control variables, since they would not be recorded in the Finnish registry, the impact of this potential alternate explanation for psychiatric service use arising later in follow-up cannot simply be handwaved away.
So why did the authors make these curious choices? Given the smorgasbord, or more accurately the voileipäpöytä, of options available within the comprehensive Finnish health care utilization databases, why did they make the selections they did? Why look at a single instance of contact with a psychiatrist when the long-term trajectory of post-treatment mental health care use would have been more meaningful? Why did they merely control for the presence of pre-existing psychiatric contact instead of adjusting for its intensity? Why were they not more precise about the date of the first gender-reassignment intervention?
While it may be the first instinct of many to reach for conspiracy theories based on Kaltiala’s relationship with SEGM or simple partisan hackery, the most likely explanation is probably much more pedestrian.
When researchers wish to access a national health care registry like the one in Finland, they are not given carte blanche to inspect the full breadth of available data. They must submit a protocol in which a cohort is defined, and only the variables needed for the analysis are provided. They do not receive unfettered access to raw patient-level data; rather, internal analysts generate aggregate or structured datasets that the authors can then use as their working file. Data extraction is time-consuming and requires substantial analyst time. In Ontario, a data extraction request can run into the tens of thousands of dollars. To go back and request a new set of data, or even a different aggregation of the same data, is often cost-prohibitive. Just as in woodworking, you only get one cut. Study budgets are often limited, and researchers are stuck with the data they get.
The researchers did not request this dataset specifically for the present project. It appears to have been assembled originally to address a completely different research question, namely whether young patients with gender dysphoria were at increased risk for all-cause and suicide-related mortality, the results of which were published by this same authorship group in 2024. That study found that while raw mortality and suicide rates were elevated among persons with gender dysphoria, much of that risk was driven by the degree of psychiatric morbidity, operationalized as cumulative lifetime contacts with specialty psychiatry.
Rather than go through the trouble and cost of commissioning an entirely new data pull to address the research question in the current study, the authors seem to have decided to reuse the dataset they already had on hand, one that was not built for this purpose. Just as you could use a racquet you originally bought for badminton for a game of tennis, you may be able to get the job done, but probably not as well as you would with the one designed specifically for the game you are playing.
I do not offer this relatively innocent explanation for the deficiencies in the study design in order to absolve the authors of responsibility for releasing a flawed study into the public sphere. While the limitations section of the paper contains the appropriate disclaimers about the conclusiveness of its findings, given the centrality of this authorship group to the political and scientific debate surrounding pediatric gender care, they had to anticipate that results emerging from a protocol that was not fit for purpose would be overgeneralized and misconstrued in support of efforts to further restrict gender care. And given Kaltiala’s well-described public positioning on the dangers and excesses of pediatric gender care, even a relatively neutral observer could reasonably wonder whether that risk of misuse was merely foreseen or quietly welcomed.
So where do we go from here? The research question the authors set out to address, the impact of transition on long-term mental health, is of critical importance, and the Finnish national health care registries remain well suited to answering it, provided the proper data are extracted and the proper design is applied. It is well worth committing the resources needed to do this correctly. Most importantly, consideration should be given to a definition of mental health comorbidity that is more quantitative and multidimensional than a dichotomous yes/no outcome. Several algorithms have been developed in other domains that translate different aspects of mental health care utilization and diagnoses into a proxy for psychiatric disease severity; while they have not been formally evaluated in the gender dysphoria population, they would make for an interesting exploratory outcome. Recognizing the polarized nature of the discourse around pediatric gender care, it would also be a good-faith gesture on the part of the Finnish authorship team to include researchers (particularly those experienced in analyzing claims-based data) who have shown a commitment to unbiased inquiry but are more publicly identified as being supportive of the notion of transition. (Not naming names, but these people are definitely out there 😄)
This study did not cross the fin(n)ish line; there is little more clarity on the relationship between mental health and gender transition than there was prior to publication, only more polarization. Setting politics aside and aiming for the best design possible will ensure the next one does.

"While it may be the first instinct of many to reach for conspiracy theories based on Kaltiala’s relationship with SEGM or simple partisan hackery"
I guess I was more driven by patient accounts of borderline abusive behavior towards them while attempting to access care. This seems like another Blanchard situation, where the person in charge regards each person they prevent from transitioning as a win and everyone they have to allow to transition as a loss.
Good piece - great to see composed analysis of this sensitive but important topic.