Proving a negative? Methodological, statistical, and psychometric flaws in Ullmann et al. (2017) PTSD study

Ullmann et al. recently published a pilot study in Translational Psychiatry in which they report failing to find a statistically significant reduction in either hair cortisol or hair cortisone levels in circumcised men as compared with genitally intact (noncircumcised) men. On the basis of these null findings, the authors claim to have "refuted the psycho-pathological long-term effects of circumcision" and maintain that their nonsignificant results "add to the growing body of evidence in the literature that male circumcision is not likely psychologically traumatizing across the life-span." In addition, they claim to have demonstrated a "healthy functionality of the LHPA axis" in men subjected to circumcision during infancy or childhood. However, no such conclusions can be drawn from a null finding, especially one derived from an underpowered study in which the trend in the data suggests, if anything, that an adequately powered study might have shown the opposite of what the authors claim. Taken together with other weaknesses in study design, measurement, and interpretation, these problems make it apparent that the authors' conclusions are not supported by their data.
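To illustrate why a nonsignificant result from a small sample says little about the absence of an effect, the following minimal sketch approximates the statistical power of a two-sided, two-group comparison using a normal approximation. The effect size (Cohen's d = 0.5) and the group sizes are hypothetical values chosen for illustration only; they are not taken from Ullmann et al., and the function name `approx_power` is an assumption of this sketch.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison.

    Uses the normal approximation to the t-test, so values are slightly
    optimistic for very small groups. `d` is the standardized mean
    difference (Cohen's d); `n_per_group` assumes equal group sizes.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    ncp = d * sqrt(n_per_group / 2)     # noncentrality under the alternative
    # Probability of landing in either rejection region
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)


if __name__ == "__main__":
    # Hypothetical scenario: a medium true effect, varying group sizes
    for n in (10, 20, 64, 100):
        print(f"n per group = {n:3d}, approximate power = {approx_power(0.5, n):.2f}")
```

Under these assumed values, small groups leave a large chance of missing a true medium-sized effect, which is why a null result from such a design cannot be read as evidence that no effect exists.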
[1] Ullmann E, Licinio J, Barthel A, Petrowski K, Oratovski B, Stalder T, Kirschbaum C, Bornstein SR. Circumcision does not alter long-term glucocorticoids accumulation or psychological effects associated with trauma- and stressor-related disorders. Transl Psychiatry 2017; 7: e1063.
[2] Boyle GJ, Langley PD. Elementary statistical methods: for students of psychology, education and the social sciences (pp.188-192). Sydney: Pergamon. (1969).
[3] Trafimow D, Earp B. Null hypothesis significance testing and Type I error: the domain problem. New Ideas in Psychol 2017; 45: 19-27.
[4] Keppel G. Design and analysis: a researcher’s handbook (3rd ed.). Englewood Cliffs, NJ: Prentice Hall. (1991).
[5] Earp BD, Wilkinson D. The publication symmetry test: a simple editorial heuristic to combat publication bias. J Clin Transl Res 2017; 3(S2): 5-7.
[6] Trafimow D, Rice S. A test of the null hypothesis significance testing procedure correlation argument. J Gen Psychol 2009; 136: 261-269.
[7] Rietschel L, Streit F, Zhu G, McAloney K, Kirschbaum C, Frank J, Hansell NK, Wright MJ, McGrath JJ, Witt SH, Rietschel M, Martin NG. Hair cortisol and its association with psychological risk factors for psychiatric disorders: a pilot study in adolescent twins. Twin Res Hum Genetics 2016; 19: 438-446.
[8] Steudte S, Kirschbaum C, Gao W, Alexander N, Schonfeld S, Hoyer J, Stalder T. Hair cortisol as a biomarker of traumatization in healthy individuals and posttraumatic stress disorder patients. Bio Psychiatry 2013; 74: 639-646.
[9] Campbell DT, Stanley JC. Experimental and quasi-experimental designs for research (p. 12). Chicago, IL: Rand McNally. (1963).
[10] Nunnally JC, Bernstein IH. Psychometric theory (3rd ed.). New York: McGraw-Hill. (1994).
[11] Tabachnick BG, Fidell LS. Using multivariate statistics (4th ed.). Boston, MA: Allyn & Bacon. (2001).
[12] Winer BJ, Brown DR, Michels KM. Statistical principles in experimental design (3rd ed.). New York: McGraw-Hill. (1991).
[13] Ferguson GA. Statistical analysis in psychology and education (5th ed., p. 176). Auckland, New Zealand: McGraw-Hill. (1981).
[14] Levenstein S, Prantera C, Varvo V, Scribano ML, Berto E, Luzi C, Andreoli A. Development of the Perceived Stress Questionnaire: a new tool for psychosomatic research. J Psychosom Res 1993; 37: 19-32.
[15] Fliege H, Rose M, Arck P, Walter OB, Kocalevent RD, Weber C, Klapp BF. The Perceived Stress Questionnaire (PSQ) reconsidered: validation and reference values from different clinical and healthy adult samples. Psychosom Med 2005; 67: 78-88.
[16] Brahler E, Schumacher J, Brahler C. Erste gesamtdeutsche Normierung der Kurzform des Gießener Beschwerdebogens GBB-24. (First standardisation of the short version of the Giessen-Subjective Complaints List GBB-24 in re-unified Germany). Psychother Psychosom Med Psychol 2000; 50: 14-21.
[17] Zigmond AS, Snaith RP. The Hospital Anxiety and Depression Scale. Acta Psychiatr Scand 1983; 67: 361-370.
[18] Herrmann-Lingen C, Buss U, Snaith RP. Hospital Anxiety and Depression Scale - Deutsche Version (HADS-D). Diagnostica 2002; 48: 112-113.
[19] Antonovsky A. The structure and properties of the Sense of Coherence Scale. Soc Sci Med 1993; 36: 725-733.
[20] Schumacher J, Wilz G, Gunzelmann T, Brahler E. The Antonovsky Sense of Coherence Scale: test statistical evaluation of a representative population sample and construction of a brief scale. Psychother Psychosom Med Psychol 2000; 50: 472-482.
[21] Wagnild GM, Young HM. Development and psychometric evaluation of the Resilience Scale. J Nurs Meas 1993; 1: 165-178.
[22] Leppert K, Koch B, Brähler E, Strauß B. Die Resilienzskala (RS) Überprüfung der Langform RS-25 und einer Kurzform RS-13. (The Resilience Scale: validity of the long-form RS-25 and a short-form RS-13). Klin Diagnostik und Evaluation 2008; 1: 226-243.
[23] Earp BD. The need to control for socially desirable responding in studies on the sexual effects of male circumcision. Available online: http://journals.plos.org/plosone/article/comment?id=info:doi/10.1371/annotation/d9e45961-b986-40f7-9268-8fb843d80797
[24] Boyle GJ, Matthews G, Saklofske DH. Personality measurement and testing: an overview. In GJ Boyle, G Matthews, DH Saklofske (Eds.), The SAGE handbook of personality theory and assessment, Vol. 2: personality measurement and testing (pp. 1-26). Los Angeles, CA: Sage. (2008).
[25] Boyle GJ, Helmes E. Methods of personality assessment. In PJ Corr, G Matthews (Eds.), The Cambridge handbook of personality psychology (pp. 110-126). Cambridge, UK: Cambridge University Press. (2009).
[26] Boyle GJ, Saklofske DH, Matthews G. (Eds.), SAGE benchmarks in psychology: psychological assessment, Vol. 2: personality and clinical assessment. London, UK: Sage. (2012).
[27] Boyle GJ. Does item homogeneity indicate internal consistency or item redundancy in psychometric scales? Pers Indiv Differences 1991; 12: 291-294.
[28] Cattell RB. Personality and mood by questionnaire (Table 54, p. 354). San Francisco, CA: Jossey-Bass. (1973).
[29] Standards for educational and psychological testing. Washington, DC: American Educational Research Association (AERA), American Psychological Association (APA), and National Council on Measurement in Education (NCME). (2014).
[30] Boyle GJ. Review of the (1985) ‘Standards for educational and psychological testing: AERA, APA and NCME.’ Aust J Psychol 1987; 39: 235-237.
[31] Boyle GJ, Saklofske DH, Matthews G. Criteria for selection and evaluation of scales/measures. In GJ Boyle, DH Saklofske, G Matthews (Eds.), Measures of personality and social psychological constructs. Amsterdam: Elsevier/Academic. (2015).
[32] Brooks JL. Counterbalancing for serial order carryover effects in experimental condition orders. Psychol Methods 2012; 17: 600-614.
[33] Szucs D, Ioannidis J. When null hypothesis testing is unsuitable for research: a reassessment. Frontiers Hum Neuroscience 2017; 11: 390.
[34] Cohen J. Statistical power analysis for the behavioral sciences. New York: Academic. (1977).
[35] Cohen J. Statistical power analysis (2nd ed.). Hillsdale, NJ: Erlbaum. (1988).
[36] Trafimow D. Using the Coefficient of Confidence to make the philosophical switch from a posteriori to a priori inferential statistics. Educat Psychol Meas 2016; 77: 831-854.
[37] Trafimow D, MacDonald JA. Performing inferential statistics prior to data collection. Educat Psychol Meas 2017; 77: 201-219.
[38] Faul F, Erdfelder E, Buchner A, Lang A-G. Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses. Behav Res Methods 2009; 41: 1149-1160.
[39] Trafimow D. An a priori solution to the replication crisis. Philosophical Psychol (in press).
[40] Trafimow D. Using the Coefficient of Confidence to make the philosophical switch from a posteriori to a priori inferential statistics. Educat Psychol Meas 2017; 77: 853-854.
[41] Trafimow D. Confidence intervals, precision and confounding. New Ideas in Psychol (in press).
[42] AAP Task Force on Circumcision. Male circumcision. Pediatrics 2012; 130: 585-586.
[43] Centers for Disease Control and Prevention. Draft: background materials: background, methods, and synthesis of scientific information used to inform the ‘Recommendations for providers counseling male patients and parents regarding male circumcision and the prevention of HIV infection, STIs, and other health outcomes’. (2014). Docket No. CDC-2014-0012.
[44] Earp BD. Do the benefits of male circumcision outweigh the risks? A critique of the proposed CDC guidelines. Front Pediatr 2015; 3: 18.
[45] Frisch M, Earp BD. Circumcision of male infants and children as a public health measure in developed countries: a critical assessment of recent evidence. Glob Public Health 2018; 13: 626-641.
[46] Taddio A, Goldbach M, Ipp M, Stevens B, Koren G. Effect of neonatal circumcision on pain responses during vaccination in boys. Lancet 1995; 345: 291-292.
[47] Taddio A, Katz J, Ilersich AL, Koren G. Effect of neonatal circumcision on pain response during subsequent routine vaccination. Lancet 1997; 349: 599-603.
[48] Gunnar MR, Porter FL, Wolf CM, Rigatuso J, Larson MC. Neonatal stress reactivity: predictions to later emotional temperament. Child Dev 1995; 66: 1-13.
[49] Slater R, Cornelissen L, Fabrizi L, Patten D, Yoxen J, Worley A, Boyd S, Meek J, Fitzgerald M. Oral sucrose as an analgesic drug for procedural pain in newborn infants: a randomised controlled trial. Lancet 2010; 376: 1225-1232.
[50] Gunnar MR, Connors J, Isensee J, Wall L. Adrenocortical activity and behavioral distress in human newborns. Dev Psychobiol 1988; 21: 297-310.
[51] Porter FL, Wolf CM, Gold J, Lotsoff D, Miller JP. Pain and pain management in newborn infants: a survey of physicians and nurses. Pediatrics 1997; 100: 626-632.
[52] Boyle GJ, Saklofske DH. (Eds.), SAGE benchmarks in psychology: the psychology of individual differences, Vol. 4: clinical and applied research. London: Sage. (2004).
[53] Sorrells ML, Snyder JL, Reiss MD, Eden C, Milos MF, Wilcox N, Van Howe RS. Fine-touch pressure thresholds in the adult penis. BJU Int 2007; 99: 864-869.
[54] Ramos SM, Boyle GJ. Ritual and medical circumcision among Filipino boys: evidence of post-traumatic stress disorder. In GC Denniston, FM Hodges, MF Milos (Eds.), Understanding circumcision: a multi-disciplinary approach to a multi-dimensional problem (pp. 253-270). New York: Kluwer Academic/Plenum. (2001).
[55] Menage J. Post-traumatic stress disorder after genital medical procedures. In GC Denniston, FM Hodges, MF Milos (Eds.), Male and female circumcision: medical, legal, and ethical considerations in pediatric practice (pp. 215-219). New York: Kluwer Academic/Plenum. (1999).
[56] Boyle GJ. El trastorno por estrés postraumático (PTSD) de larga duración como resultado de cirugía genital en menores (Long-term posttraumatic stress disorder (PTSD) resulting from genital surgery in minors). Revista de Psicología de la Universidad de Chile 2002; 11: 17-24.
[57] Bensley GA, Boyle GJ. Physical, sexual, and psychological impact of male infant circumcision: an exploratory survey. In GC Denniston, FM Hodges, MF Milos (Eds.), Understanding circumcision: a multi-disciplinary approach to a multi-dimensional problem (pp. 207-239). New York: Kluwer/Plenum. (2001).
[58] Boyle GJ, Bensley GA. Adverse sexual and psychological effects of male infant circumcision. Psychol Reports 2001; 88: 1105-1106.
[59] Boyle GJ, Goldman R, Svoboda JS, Fernandez E. Male circumcision: pain, trauma and psychosexual sequelae. J Health Psychol 2002; 7: 329-343.
[60] Boyle GJ. Circumcision of infants and children: short-term trauma and long-term psychosexual harm. Advances in Sexual Med 2015; 5: 22-38.
[61] Boyle GJ, Svoboda JS, Price CP, Turner JN. Circumcision of healthy boys: criminal assault? J Law Med 2000; 7: 301-310.
[62] Svoboda JS, Boyle GJ, Price CP. Circumcision of boys: a serious male health problem. Everyman: A Men’s Journal 2000; 43(May/June): 58-62.
[63] Hammond T. Preliminary poll of men circumcised in infancy and childhood. BJU Int 1999; 83: 85-92.
[64] Hammond T, Carmack A. Long-term adverse outcomes from neonatal circumcision reported in a survey of 1,008 men: an overview of health and human rights implications. Int J Hum Rights 2017; 21(2).
[65] Hammond T, Reiss MD. Antecedents of emotional distress and sexual dissatisfaction in circumcised men: previous findings and future directions − comment on Bossio and Pukall (2017). Archives of Sexual Behavior 2018; 1-2.
[66] Earp BD. The need for reporting negative results − a 90-year update. J Clin Transl Res 2017; 3(S2): 1-4.
[67] McBee MT, Matthews MS. Welcoming quality non-significance and replication work, but not the p-values: announcing new policies for quantitative research. J Adv Academics 2014; 25: 68-78.
[68] Boyle GJ, Hill G. Circumcision-generated emotions bias medical literature. BJU Int 2012; 109: E11.
[69] Rosenthal R. Experimenter effects in behavioral research. New York: Wiley. (1976).
[70] Rosenthal R, Rosnow RL. Artifacts in behavioral research: Robert Rosenthal and Ralph L. Rosnow's Classic Books. Oxford University Press. (2009).
[71] Sheldrake R. Experimenter effects in scientific research: how widely are they neglected? J Scientific Exploration 1998; 12: 73-78.
[72] Experimenter Expectancy Effect. In MS Lewis-Beck, A Bryman, T Futing Liao (Eds.), The SAGE encyclopedia of social science research methods. London: Sage. (2004).