UPDATED: A Detailed Commentary on a Really Bad Paper on IVF Outcomes in Older Women Using PGT-A

This is a detailed review of a single recently published paper, which our editors felt was timely for several reasons. First, its subject is preimplantation genetic testing for aneuploidy (PGT-A), which for over 20 years has been a core theme of research, clinical practice, and discussion at the CHR. Second, for all this time, PGT-A has remained a controversial subject even though, by clinical and ethical standards, it no longer should be. A third, very important motivation has been the rapidly declining quality of peer review in medical journals and, increasingly, also in science journals, another subject the CHR has been increasingly concerned about. That decline was obviously at play here, as this manuscript was published by Fertility and Sterility (F&S), one of the leading journals in the infertility field and the main organ of the American Society for Reproductive Medicine (ASRM). Remarkably, the article appeared in the June 2025 issue of F&S, even though in November of 2024 F&S had published the updated opinion of the Practice Committees of ASRM and the Society for Assisted Reproductive Technology (SART) on PGT-A, which completely contradicted much of what the paper discussed here claimed. Yet the ASRM/SART position paper was not even referenced in the manuscript. The reference list of the paper was, indeed, totally biased toward supporters of PGT-A (who, of course, were also likely the reviewers of the paper). This brings us to the fourth reason, a steadily growing publication bias we are seeing in journals: popular opinions are published, while unpopular opinions often do not even make it into peer review.
It is, of course, not all the editors’ fault. All of them are these days overwhelmed by rapidly increasing paper submission numbers and, at the same time, face declining reviewer availability, because especially good reviewers (who, of course, still work for free) are also overwhelmed by review requests and, therefore, increasingly have to decline reviews. In short, a rapidly deteriorating publication industry urgently requires a major reorganization.


Introduction 

When reading what many of our supposedly better medical journals at times allow into print these days, it sometimes feels like we live in an alternative universe. As fate would have it, a very recent paper in Fertility and Sterility (1) offers an excellent example of what we mean and, therefore, an opportunity for discussion.

 

Considering how much – forgive the word – “garbage” has over the last two decades been published about preimplantation genetic testing for aneuploidy (PGT-A) – of course, a favorite topic of permanent discussion at the Center for Human Reproduction (CHR) and, therefore, the VOICE as well as The Reproductive Times – it should not surprise that we here chose to address a PGT-A paper. 

 

It does not appear to matter that well-designed prospectively randomized studies have demonstrated no outcome benefits of PGT-A for IVF in general populations, as even a recent combined ASRM/SART Practice Committees’ opinion finally pointed out (2). Even this ASRM/SART guidance does not appear to have made the slightest dent in the number of “garbage” papers on this subject, many of them in the better medical journals. Who then can be surprised that PGT-A utilization in association with IVF in the U.S. (and in many other countries) is still increasing?

 

A Chinese paper on PGT-A

Coming back to the paper under discussion, it was authored by Chinese investigators, a point not made to disparage Chinese colleagues in general, who, of course, often publish very well thought-out and well-conducted studies. But papers mass-produced with AI by paper mills, mostly in China and a few other select countries, have become a serious problem for medical and science journals (3-5). A really bad paper from one of these countries that nevertheless sails undiscovered through peer review, therefore, deserves, and, indeed, mandates, special attention (hopefully also from the editorial peer review process at the journal).

 

The paper addressed here claimed to have investigated the effects of preimplantation genetic testing for aneuploidy (PGT-A) in IVF cycles of women who, and this is a very important point, were at an advanced age at the time of the IVF cycle (1). The paper, however, offers absolutely no rationale for why it would even conduct such a study, considering that even ASRM and SART, in a joint document, recently concluded (finally, and with a difficult-to-understand delay of not only months but years) that PGT-A, in all of its iterations and under all of its earlier names, has to this point been unable to demonstrate any IVF cycle outcome benefits in unselected patient populations (2).

 

And without clinical utility in general patient populations, why would anybody with just minimal common sense reach the conclusion that PGT-A, even theoretically, might work in older women? Women with advancing age, of course, produce progressively smaller numbers of eggs and embryos in IVF cycles, which, indeed, is one of two principal causes for declining pregnancy rates in IVF with advancing female age (the second cause is declining egg quality). Older women, based on biologically but also mathematically indisputable facts, must, therefore, be among the worst candidates for routine utilization of PGT-A, even though some “garbage” studies have suggested that women above age 35 benefit from PGT-A (6).

 

The problem with the still expanding use of PGT-A in IVF cycles is not only the procedure’s complete lack of clinical utility (i.e., PGT-A fails to improve IVF outcomes); by now, overwhelming evidence also suggests that, at least in some subpopulations, PGT-A actually reduces pregnancy and live birth chances in IVF. Older patients, for very obvious reasons, are, of course, among those subpopulations that are harmed by the procedure. Though there are several very good reasons for that, we here will point out only the most obvious one because it is simply indisputable: every embryo falsely labeled “aneuploid” (which, therefore, is not used for transfer or is even discarded) represents a lost pregnancy chance!

 

A word on the irrationality of PGT-A

That PGT-A produces an unacceptably high percentage of false-positive results has, as innumerable publications attest, been argued by the CHR against fierce opposition from PGT-A proponents since approximately 2007. By 2015, when the CHR reported the first chromosomally normal offspring in the world after transfers of embryos that had been reported as “aneuploid” and should have been discarded based on test laboratory recommendations (under current PGT-A practice, some of these embryos might be classified as “mosaic”), the CHR’s arguments against routine PGT-A utilization were no longer disputable.

 

Rather than reconsidering the whole concept (in medical terminology called a “hypothesis”) of biopsying embryos for chromosomal abnormalities, PGT-A proponents introduced the concept of “mosaicism” into their reporting repertoire and rechristened the test under its current name, PGT-A, though without reassessing whether this test really made any biological, genetic, or mathematical sense.


When a procedure has failed to demonstrate any utility for over 20 years and adds significant costs to IVF cycle costs that are already unaffordable for many, the only logical and ethically correct conclusion left is to abandon the use of the procedure, unless convincing indications exist for its use.


Refuting that any of the promised outcome benefits of PGT-A have been achieved, the recent ASRM/SART document on PGT-A really offers a clear answer (2). If a medical intervention does not fulfill its purpose and does not at least offer a compensatory outcome benefit in any of its presentations or under any of its protocols, and if, on top of all of this, the procedure adds significant costs (in this case at least $5,000) to IVF cycle costs that are already unaffordable for many, the only logical as well as ethical conclusion left is to abandon the use of the procedure unless convincing indications exist for its use (and nobody argues that some such rare indications do not exist).

 

Without further rehashing the history of PGT-A, which is obviously very closely related to the history of the CHR, only this much: PGT-A can, for several technical and biological reasons, produce false-positive results. One relatively recently recognized reason is the self-correction of embryos, first reported in mice by British investigators and more recently reported in human embryos by the CHR’s investigators in collaboration with colleagues in the Brivanlou Laboratory at Rockefeller University in NYC. Many embryos diagnosed by PGT-A as “aneuploid” at blastocyst stage, even though at that point correctly diagnosed as containing “aneuploid” cells (which means that an embryo is either “mosaic” or outright “aneuploid”), can still self-correct downstream from blastocyst stage (7) (i.e., expel “aneuploid” cells). Even a technically correct PGT-A diagnosis may, therefore, turn out to be biologically incorrect.

 

A biologically incorrect diagnosis, if such an embryo is then deselected from embryo transfer, reduces a woman’s cumulative pregnancy chance in the IVF cycle in which this embryo was produced. It, therefore, once again, does not take a genius (or, for that matter, large prospectively randomized studies) to figure all of this out. Yet “garbage” papers, making all kinds of convoluted arguments in support of PGT-A utilization, still make it into the literature.

 

Further evidence on why this paper should never have been published

So, this time, it was F&S that fell for the scam involving a paper that claimed to have investigated “embryo transfer outcomes in women of advanced reproductive age” (1). And to make the study apparently more interesting, the researchers decided to restrict the study population to only older women who at most produced three oocytes in a given IVF cycle. 

 

Among 230 studied IVF PGT-A cycles, 49 (21.3%) were repeat IVF cycles, representing the first major protocol design error, since repeat cycles in the same patients can lead to significant patient selection biases. Though the study alleges to have studied women of “advanced reproductive age,” astonishingly, all the manuscript reveals is that all women were above age 38 (no mean ± SD, median, or age range is given, though all of those, of course, greatly matter). In contrast, 309 patients received no PGT-A, and among those, 89 (28.8%) had more than one cycle (with, of course, the same concern about patient selection bias as in the PGT-A group).

 

Those two study groups of patients were, moreover, of course, not randomized, suggesting that there had to be reasons why all of these women either did or did not undergo PGT-A, very obviously creating further concerns about patient selection biases and confounding patient characteristics. Considering the relatively large number of women with repeat IVF cycles in the study, it also seems likely that at least some of them may have been in PGT-A as well as no PGT-A cycles.  In other words, study design alone should have disqualified this paper from publication. 

 

But there are significant other reasons as well: By restricting the study to women with maximally three retrieved oocytes, the authors are trying to give the impression that they were studying a relatively poor prognosis patient group with abnormally low functional ovarian reserve (FOR). That could be a correct assumption, but only if the mean age was under 40 years. If the mean age were above 40, retrievals yielding only 1-3 oocytes would not necessarily be abnormal.   

 

We, however, thought that we had also found a seemingly redeeming characteristic of the paper: in the Materials and Methods section, the authors claimed to have assessed IVF cycle outcomes with two reference points, cycle start (intent to treat) and egg retrieval (another valid reference point). Since many supposedly important PGT-A papers in the literature have, unfortunately, over the years reported IVF cycle outcomes with reference embryo transfer, a correct assessment of cycle outcomes would, of course, have been a point in the paper’s favor, because an intent-to-treat analysis includes all patients starting an IVF cycle, without selecting or deselecting specific patient sub-groups. Cycle outcome analyses with reference embryo transfer, in contrast, select for good-prognosis patients because they exclude patients who did not make it to embryo transfer and, therefore, artificially inflate outcomes.

 

But in proceeding with a review of the paper and data analysis, it very quickly became apparent that the credit we had given to the investigators was seriously misplaced. They reported that PGT-A cycles, as one would expect, had lower embryo transfer rates, but the difference was truly astonishing: only 14.1% of PGT-A cycles reached embryo transfer vs. 84.4% in controls (on a side note: in the abstract, the latter percentage was incorrectly listed as 73.2%). Beyond this point, the data analysis of the paper became outright deceptive because the paper reported a positive chemical pregnancy test in the PGT-A group in 71.4% of cycles (30/42, with 42 being the number of cycles that produced at least one transferrable embryo and, therefore, reached transfer).

 

This, of course, very clearly reflects data manipulation and misdirection because the Materials and Methods alleged reference points for analysis of IVF cycle outcomes were cycle start and retrieval, for which we, above, obviously prematurely, lauded the authors. They now, however, replaced cycle start and retrieval with the incorrect and biased reference point, embryo transfer, as earlier noted, a very typical statistical “trick” proponents of PGT-A have used in innumerable studies over the years. Every study with reference point embryo transfer obviously selects out the best prognosis patients because it takes at least one or more blastocysts to be included in IVF cycle outcome determinations. In contrast, poorer prognosis patients who do not produce transferrable blastocysts just don’t exist in outcome determinations. The obvious, indisputable consequence is a significant exaggeration of outcome benefits for PGT-A, as so many pro-PGT-A studies have now claimed for over two decades, thereby misleading the IVF field.  

 

The study discussed here demonstrates these facts, however, better than most other studies, considering the huge discrepancy in the percentage of cycles reaching embryo transfer (14.1% vs. 84.4%). How this point was not recognized during peer review is, therefore, difficult to understand. Among non-PGT-A cycles, 362 of 429 reached embryo transfer, and these cycles produced, in reference to transfers, only 62 pregnancies (62/362). The result was a seemingly highly statistically significant outcome benefit in positive pregnancy tests (in itself a very unreliable endpoint) of 71.4% in PGT-A cycles vs. 17.1% in non-PGT-A cycles (P<0.001).

 

Correct data analysis of positive pregnancy tests with reference point cycle start (i.e., intent to treat) should, however, of course have compared 30 pregnancies out of 298 cycle starts in PGT-A cycles (10.1%) to 62 pregnancies out of 429 cycle starts in non-PGT-A cycles (14.5%). That is not only an inversion of the reported outcomes but is also suggestive of a roughly 44% higher relative pregnancy rate for women not undergoing PGT-A.
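The effect of switching the denominator can be reproduced with a few lines of arithmetic. The following is only a minimal sketch in Python, using the cycle counts quoted in this commentary (298 and 429 cycle starts, 42 and 362 transfers, 30 and 62 positive pregnancy tests); the class and method names are our own illustrative choices and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arm:
    """Counts for one study arm, as quoted in this commentary (names are illustrative)."""
    name: str
    cycle_starts: int     # intent-to-treat denominator
    transfers: int        # cycles that reached embryo transfer
    positive_tests: int   # positive (chemical) pregnancy tests

    def transfer_rate(self) -> float:
        # Share of started cycles that reached embryo transfer
        return self.transfers / self.cycle_starts

    def rate_per_transfer(self) -> float:
        # Denominator used in the paper's reported result
        return self.positive_tests / self.transfers

    def rate_per_cycle_start(self) -> float:
        # Intent-to-treat denominator
        return self.positive_tests / self.cycle_starts


pgta = Arm("PGT-A", cycle_starts=298, transfers=42, positive_tests=30)
control = Arm("no PGT-A", cycle_starts=429, transfers=362, positive_tests=62)

for arm in (pgta, control):
    print(
        f"{arm.name:>9}: reached transfer {arm.transfer_rate():.1%}, "
        f"positive test per transfer {arm.rate_per_transfer():.1%}, "
        f"per cycle start {arm.rate_per_cycle_start():.1%}"
    )

# Relative difference between arms under the intent-to-treat denominator
relative_gain = control.rate_per_cycle_start() / pgta.rate_per_cycle_start() - 1
print(f"Relative advantage without PGT-A per cycle start: {relative_gain:.0%}")
```

The same counts thus favor PGT-A when divided by transfers and favor no PGT-A when divided by cycle starts, which is the inversion described above.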

 

The same principles, of course, also apply to other more telling outcomes than chemical pregnancies, including clinical pregnancy rates, clinical pregnancy rates per retrieval, live birth rates per retrieval, live birth per number of retrieval cycles, and – requiring slightly different considerations – time to pregnancy (see Table below).

In addition, the paper reported a significant difference in miscarriages with reference to retrieval of 4/298 (1.4%) vs. 18/429 (4.2%), to the disadvantage of non-PGT-A cycles (P<0.027); miscarriage rates per pregnancy of 4/28 (14.3%) vs. 18/45 (40%) were also significantly higher in non-PGT-A cycles. Time to pregnancy in months was non-significantly longer in PGT-A cycles (17.6 ± 4.6 vs. 16.7 ± 7.9 months), quite a long time in both groups.


SUMMARY AND CONCLUSIONS 

After multivariate logistic regression, the paper claimed a significant advantage (P<0.001) for PGT-A in chemical pregnancies and in clinical pregnancies, which we cannot agree with; marginally lower miscarriage rates per retrieval (P=0.038) and per pregnancy (P=0.013), which may or may not be correct; and a better live birth rate per embryo transfer (ET), unquestionably a spurious finding. Even the paper noted no difference between the two cycle groups when it came to live births.

 

Considering this study’s results, the authors note that “PGT-A significantly improved clinical pregnancies and live birth rates per embryo transfer in women with advanced reproductive age and diminished ovarian reserve.” What the paper, however, failed to say is that pregnancy outcomes with reference embryo transfer have a very different meaning than pregnancy outcomes with reference cycle start (or even egg retrieval). Consequently, the paper gives the false impression that PGT-A in older women improves IVF cycle outcomes. The correct interpretation of the paper’s data has to be that highly selected older women (likely the rare cases of older women with especially good ovarian reserve) may have an outcome advantage in time to pregnancy (and even that appears uncertain).

 

As already noted before, a conclusion that PGT-A would improve pregnancy and live birth rates in older women is also completely counterintuitive to all logical and mathematical considerations. Nothing in this paper would, indeed, suggest such improvements. Alleged observations with great certainty can be attributed to patient selection biases, which are so obvious in this manuscript. It, for example, seems intuitive to assume that older women with unusually good ovarian reserve will see more frequent utilization of PGT-A than older women with low ovarian reserve. 

 

The authors furthermore concluded that “although the cost-effectiveness of PGT-A in this population requires further investigation, it offers a valuable tool for patient selection and may reduce the emotional toll of miscarriage for some patients.” We again must politely disagree since we saw nothing in this study to suggest that it in any way contributed to patients’ selection of PGT-A. Even assuming a marginal non-significant reduction in miscarriage rate with PGT-A to be valid – a very questionable proposition in itself – one must question whether such a minor effect warrants the wide utilization of PGT-A, which in the U.S. already involves over half of all IVF cycles.


REFERENCES

Because of its subject, this article, exceptionally, offers an extended reference list with up to 5 authors and including titles of articles.

  1. Ou Z, Liu N, Chen A, Li Q, et al. Effects of preimplantation genetic testing for aneuploidy on embryo transfer outcomes in women of advanced reproductive age with no more than three retrieved oocytes. Fertil Steril 2025;123(6):991-998 WORST PAPER AWARD in this issue

  2. Practice Committees of the American Society for Reproductive Medicine and the Society for Assisted Reproductive Technology. The use of preimplantation genetic testing for aneuploidy: a committee opinion. Fertil Steril 2024;122(3):421-434

  3. Ro C, Leeming J. Buy the “BY”: Revealing the machinations of paper mills. Nature 2025;642:823-826

  4. Soriano JB, Ruano-Ravina A. The rising threat of predatory journals and paper mills in respiratory medicine and research. Lancet Resp Med 2025;13(6):E30-E31

  5. Parker L, Boughton S, Bero L, Byrne JA. Paper mill challenges: past, present, and future. J Clin Epidemiol 2024;111549

  6. Simopoulou M, Sfakianoudis K, Maziotis E, Tsisoulou P, Sokratis G, et al. PGT-A: who and when? A systematic review and network meta-analysis of RCTs. J Assist Reprod Genet 2021;38(8):1939-1957

  7. Yang M, Rito T, Metzger J, Naftaly J, Soman R, Hu J, et al. Depletion of aneuploid cells in human embryos and gastruloids. Nat Cell Biol 2021;23(4):314-3


IMPORTANT GENERAL ISSUE FOR THE PRACTICE OF IVF