Debate

What Makes a Good Psychological Experiment?

9:54 am May 29, 2026

by Azra Hussain

The pursuit of knowledge must never come at the expense of our shared humanity. Science exists to enhance life, not to diminish it, and any research that forgets this principle risks losing sight of its true purpose.

Some of psychology’s most famous experiments, such as the Milgram experiment and the Stanford Prison Experiment, have profoundly influenced the field, but not without controversy. These studies, while yielding fascinating results, have been criticised for their dubious methodologies. Concerns include conditions that cannot be replicated, coercion, deliberate deception, and unscientific techniques. Such issues prompt an essential question: what constitutes a robust and ethical psychological experiment?

Psychology, as a scientific discipline, is relatively young, having existed in its modern form for roughly a century and a half. Its origins remain the subject of debate, though many trace experimental psychology to 1879, when Wilhelm Wundt, a German psychologist, founded the first psychology laboratory in Leipzig. This moment is often cited as the formal beginning of psychology’s journey to establish itself as a rigorous science, distinct from the speculative traditions of metaphysics.

From its inception, psychology has sought to legitimise itself through the application of scientific methods. However, this ambition has not come without challenges. Chief among them is the difficulty of controlling the numerous variables intrinsic to psychological research. Unlike plants or cells, human beings—and by extension, animals—exhibit unpredictable behaviour, particularly in artificial, controlled environments. This variability complicates efforts to ensure consistency and replicability in studies.

Moreover, humans are uniquely responsive to authority, a dynamic vividly illustrated by both the Stanford Prison Experiment and the Milgram experiment. In the former, participants tasked with playing the roles of guards and prisoners quickly adopted extreme behaviours, raising questions about the role of situational pressures. In the latter, participants obeyed instructions to deliver what they believed were harmful electric shocks to others, demonstrating the powerful influence of perceived authority.

These iconic studies underscore the complexity of designing experiments that are not only scientifically valid but also ethically sound. As psychology continues to evolve, the discipline faces the enduring challenge of reconciling its pursuit of knowledge with the imperative to respect and protect the individuals who participate in its inquiries.

What Was The Milgram Experiment?

The Milgram experiment, devised by Yale University psychologist Stanley Milgram, sought to probe the boundaries of human obedience. Conducted against the backdrop of the trial of Nazi war criminal Adolf Eichmann, it was framed by a chilling question: could the atrocities of Nazi soldiers be attributed to their submission to authority rather than their own volition?

The experiment involved three roles: the experimenter, the teacher, and the learner. Crucially, the learner was not a genuine participant but an actor, instructed to feign responses by the experimenter. The teacher, the true subject of the study, was led to believe they were part of an inquiry into the effects of punishment on learning.

In the initial phase, the teacher and learner were shown into a room where the learner was strapped into a chair, ostensibly “to ensure they would not escape.” The teacher was then taken to an adjoining room, unable to see the learner but still able to hear them. The experimenter provided the teacher with a list of word pairs, which the teacher was tasked with teaching to the learner.

The process unfolded as a seemingly straightforward test of memory. The teacher would quiz the learner on the correct word associations. When the learner gave the correct answer, the teacher would proceed to the next question. However, when the learner made an error—deliberately prearranged by the experimenter—the teacher was instructed to press a button on a shock generator.

With each incorrect response, the supposed voltage of the shock rose in 15-volt increments, eventually reaching a maximum of 450 volts. In reality, the learners received no shocks; instead, pre-recorded audio of escalating distress was played to simulate their suffering. The shock levels were labelled with terms ranging from “Slight Shock” to the ominous “Danger: Severe Shock.”

The experiment’s design placed participants under significant psychological strain, testing how far they would go in following orders, even when faced with what appeared to be extreme harm to another human being. It remains one of psychology’s most unsettling and thought-provoking studies, raising enduring questions about authority, morality, and the capacity for human obedience.

As the experiment progressed, the learners’ increasing number of incorrect answers led to higher shock voltages, heightening the discomfort and hesitation of the teachers. The experimenters, dressed in white lab coats to project authority, issued firm instructions and subtle pressure to ensure the teachers continued despite their evident unease. Before the 450-volt maximum was reached, the simulated cries ceased and the learners fell silent, amplifying the psychological tension.

The participants, tasked as teachers, displayed profound discomfort, with some showing visible signs of distress or coming close to breaking down. All questioned the process at some point. According to Milgram, several participants continued administering shocks only after receiving verbal reassurance from the experimenters. Those who delivered the final shocks neither asked for the experiment to be terminated nor sought to check on the learners’ condition after the process concluded.

Beyond ethical concerns, the relevance of the Milgram experiment to events like the Holocaust has been contested. While Milgram framed the study as a means to understand obedience under authority, significant differences undermine its applicability. Participants were assured that the learners would suffer no lasting harm—a stark contrast to the calculated atrocities of genocide. Moreover, the teachers harboured no personal animosity or bias towards the learners, unlike the racial hatred that fuelled Nazi actions. Despite Milgram’s robust defence of the experiment’s ethical framework, scepticism persisted, particularly regarding its psychological impact on participants.

What Was The Stanford Prison Experiment?

The Stanford Prison Experiment was led by Philip Zimbardo, a psychology professor at Stanford University. Funded by the US Office of Naval Research, the study was advertised in The Stanford Daily as a “psychological study of prison life” lasting two weeks. Seventy-five individuals applied, of whom only twenty-four were selected following what researchers described as a rigorous screening process. However, critics have since argued that the study’s premise inherently attracted participants predisposed to certain behaviours, particularly those interested in authority and control.

The selected participants, all male, were randomly assigned roles as either “prisoners” or “guards.” Researchers encouraged them to fully inhabit their roles, with the warden, David Jaffe, playing a key role in shaping behaviour. Jaffe reportedly urged guards to act tougher, ostensibly for the sake of the experiment’s authenticity.

The study was conducted in a section of the psychology building’s basement. This 35-foot-long space was transformed into a makeshift prison, with cells measuring seven by ten feet, each holding three prisoners. Prisoners were given only the barest essentials: a mattress, a sheet, and a pillow. The space also featured a closet for solitary confinement and a more spacious area reserved for the guards and the warden.

From the outset, the experimental conditions mimicked a punitive environment. The prisoners were subjected to strict rules, while the guards—under little oversight—were granted considerable authority. The study quickly spiralled beyond its original intent, raising profound questions about the nature of power, authority, and human behaviour under artificial constraints.

The guards in the Stanford Prison Experiment were issued khaki uniforms, batons, and reflective sunglasses to aid their immersion into the role and provide a sense of anonymity. They were instructed to refer to the prisoners by numbers rather than names, a deliberate strategy that Philip Zimbardo argued stripped the prisoners of their individuality. The guards were repeatedly reminded to act “firmly” and create an atmosphere akin to a real prison. Zimbardo clarified that while the guards were prohibited from inflicting physical harm or withholding food, they were encouraged to establish and maintain control through other means.

The prisoners, by contrast, were dressed in ill-fitting smocks and forced to wear a chain around one ankle. The dynamic between the two groups deteriorated swiftly. On the second day, the prisoners, woken at 2:30 AM, began openly defying the guards by hurling insults. In response, the guards sought to reassert authority by using fire extinguishers against the prisoners, stripping them of their clothing, and confiscating their mattresses as punishment.

The experiment began to unravel on the third day. Douglas Korpi, one of the prisoners, experienced a severe mental breakdown and was released from the study. By the fourth day, another prisoner had become emotionally distressed, crying in his cell and requesting medical assistance. This prompted Zimbardo to intervene personally and remove him from the experiment.

On the fifth day, the situation escalated further. Parents were permitted to visit the prisoners for ten minutes under the strict supervision of a guard. Alarmed by the visible distress of their children and the apparent withholding of necessities, the parents voiced growing concerns.

By the sixth day, the experiment’s facade of control had crumbled. The increasingly harsh behaviour of the guards, the deteriorating mental health of the prisoners, and the protests of the parents compelled Zimbardo to bring the study to an early conclusion. What was initially designed as a two-week simulation of prison life ended abruptly after just six days, leaving in its wake enduring ethical and scientific debates about the limits of psychological research.

After debriefing the participants, Philip Zimbardo documented his findings, which he later expanded upon in his book The Lucifer Effect: Understanding How Good People Turn Evil. He argued that the authoritarian behaviour displayed by the guards was primarily a result of the power they wielded over others. According to Zimbardo, even good individuals could exhibit antisocial tendencies when placed in particular circumstances. This conclusion, however, has been widely criticised as subjective, with many dismissing the study as more of a “case study” than a rigorous experiment.

Concerns over the objectivity of the experiment have persisted, particularly regarding evidence that the guards were influenced to behave in specific ways by the warden, David Jaffe. Erich Fromm, the German-American psychologist, was among the most prominent critics. He condemned the study for its sweeping generalisations and for overlooking the transformative effect of the artificial environment on participants’ behaviour. Fromm pointed out that two-thirds of the guards refrained from displaying any antisocial tendencies, which, in his view, contradicted Zimbardo’s conclusions. Instead, Fromm suggested that the experiment demonstrated how circumstances do not necessarily alter a person’s character.

Ethical questions also overshadow the study’s legacy. The treatment of the prisoners, many of whom experienced severe mental distress, has been described as dehumanising. Instances of mistreatment reportedly violated the terms of the participants’ contracts, raising further doubts about the study’s ethical framework. These issues, combined with the absence of rigorous scientific controls, have led most scholars to deem the experiment both irreproducible and invalid as a piece of psychological research.

Despite its flaws, the Stanford Prison Experiment remains a touchstone in discussions of authority, power, and human behaviour, serving more as a cautionary tale than a definitive scientific contribution.

A Good Experiment

The primary distinction between psychology and the natural sciences lies in their subject matter. While the natural sciences investigate tangible, observable phenomena, psychology grapples with the abstract realm of the mind—if such a construct exists. Unlike a cell or a chemical compound, the mind cannot be placed under a microscope or dissected for closer study. Psychologists must therefore infer its workings through the observation of behaviour. This presents a challenge, as behaviour is extraordinarily susceptible to influence, even by the mere presence of an observer.

Designing a robust psychological experiment requires meticulous adherence not only to the scientific method but also to ethical standards and careful management of variables. A major flaw in the Stanford Prison Experiment was its inability to control variables. Not only were there numerous influencing factors, but the behaviour of participants was reportedly shaped or distorted by the experimenters off the record. Though ostensibly a study of prison life, the experiment instead demonstrated the profound impact of authority on individuals who believe they are serving a higher purpose—in this case, the advancement of science.

The Milgram experiment, by contrast, suffered from an egregious disregard for ethical considerations. Participants were subjected to considerable psychological distress in the name of research. Milgram later asserted that many of those who took part expressed gratitude for their involvement, but this does not mitigate the ethical shortcomings of the study. Such a proposal would almost certainly fail to gain approval from a contemporary ethics board, and for good reason.

Both experiments, despite their significant flaws, have become pivotal reference points in psychology. They highlight the critical importance of balancing the pursuit of knowledge with the responsibility to protect the well-being of participants. Without this balance, the line between scientific inquiry and harm becomes perilously blurred.

The optimal approach to conducting a psychological experiment requires a thorough understanding of the numerous variables at play and a clear strategy for managing them. This includes knowing which variables to manipulate and how to control or neutralise others effectively. One of the most reliable ways to ensure impartiality and scientific validity is to include a control group where these variables remain unaltered, allowing researchers to observe how the situation would unfold under normal conditions.
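To make the idea of a control group concrete, the short Python sketch below shows one common way of forming one: shuffling the participants and splitting them at random into a treatment group and a control group. It is an illustration only, not a procedure taken from either study discussed here. The function name `assign_groups`, the fixed seed, and the 24 numbered participants are all assumptions made for the example.

```python
import random


def assign_groups(participant_ids, seed=0):
    """Randomly split participants into a treatment group and a control group.

    Hypothetical helper for illustration; the seed is fixed only so that
    the allocation can be reproduced and audited later.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)  # chance, not the researcher, decides who goes where
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


# 24 hypothetical participant IDs, numbered 1 to 24
groups = assign_groups(range(1, 25))
print(len(groups["treatment"]), len(groups["control"]))
```

Because chance alone decides who lands in which group, differences between participants that the researcher cannot control tend to be spread evenly across both groups, which makes it more reasonable to attribute any difference in outcomes to the variable being manipulated.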

Above all, it is essential to guarantee that neither human participants nor animals experience psychological or physical harm during the experiment. The process must be conducted as humanely as possible. Disregarding the inherent dignity of living beings in the name of poorly conceived research with negligible results would be a betrayal of the principles of science. Interestingly, while the Milgram experiment yielded impactful findings, its methods starkly illustrate the risks of neglecting these considerations.

Every scientist and psychologist must confront a fundamental question before embarking on an experiment: Does it cause more harm than good? If harm is involved, can its consequences ever be justified by the potential benefits? The pursuit of knowledge must never come at the expense of our shared humanity. Science exists to enhance life, not to diminish it, and any research that forgets this principle risks losing sight of its true purpose.

(The author is a Psychology student interested in philosophy and artificial intelligence. She is an avid reader and writer who explores the connections between psychology, AI, and philosophy. The views expressed are personal.)