University of Idaho Social Psychology
 Lesson 2.2: Transcript
 

Department of Psychology

  © 2004
 
University of Idaho
  All rights reserved.


Transcript of Audio Lecture

Welcome to lesson two, module two. We’re still talking about research methods, and now we’re going to continue with experimental methods.

Let’s move to slide two to begin. Experiments allow for causal inference; that is, by running an experiment, we can determine whether one factor causes another. Remember that with correlations we cannot do this. Experiments allow for causal inference by using random assignment. Random assignment means that as participants show up for an experiment, they are randomly assigned to one condition or another; that is, everyone has an equal chance of being in either the control or the treatment condition. This is not the same as random sampling. Random sampling means that from the population in which we’re interested, we randomly choose people to participate in our experiment.

We also need to know what an independent variable and a dependent variable are. An independent variable is what the experimenter does, that is, what we’re manipulating. For example, if we’re studying aggression, we might choose the temperature of the room as our independent variable. This would be something that the experimenter controls: some participants are randomly assigned to a hot room and others to a cooler room. Our dependent variable might then be how much they aggress, perhaps how frequently they yell at other participants or choose to play very loud music in the presence of another participant. Let’s talk about another study.
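
To make the difference between random sampling and random assignment concrete, here is a minimal sketch in Python. The population, the sample size, and the function names are made up for illustration; the condition labels follow the room temperature example.

```python
import random

def random_sample(population, n, rng):
    """Random sampling: choose n people at random from the population of interest."""
    return rng.sample(population, n)

def random_assign(participants, conditions, rng):
    """Random assignment: shuffle the participants, then deal them out to the conditions in turn."""
    shuffled = participants[:]          # copy so the original list is left untouched
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for i, person in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(person)
    return groups

rng = random.Random(0)                                  # fixed seed so the sketch is repeatable
population = [f"person_{i}" for i in range(1000)]       # hypothetical population
participants = random_sample(population, 20, rng)       # who takes part in the study
groups = random_assign(participants, ["hot room", "cool room"], rng)  # who gets which condition
```

Sampling decides who is in the study at all; assignment decides which condition each of those people experiences. Only the second step is what lets an experiment support causal inference.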

Let’s move on to slide three. Baumeister and colleagues in 1998 were interested in studying willpower. Essentially, they believed that exerting willpower takes cognitive resources: by exerting willpower, that is, trying not to engage in a tempting behavior, your cognitive resources are depleted. In this experiment, people were randomly assigned to one of three conditions: what they called a radish condition, a control condition, or a cookie condition. Participants came into a room, sat down, and were told that they were participating in a taste perception test. In the radish condition, they were shown a bowl of radishes as well as a plate of cookies and were asked to eat two or three of the radishes and none of the cookies. They would not be provided with any water for several minutes. The experimenter then left the room and, upon returning, determined whether the participant had actually eaten the radishes by looking at how many radishes and cookies remained.

In the cookie condition, participants again were shown a bowl of radishes and a plate of cookies. This time they were asked to eat two or three cookies and no radishes. Then the experimenter left the room and, upon returning, examined the cookies and radishes to determine what had been eaten.

In the control condition, there was no food present. Participants in all three conditions were then asked to work on a puzzle task that was actually impossible; that is, it had no solution. The dependent variable here is how long the participants would continue to try to solve the puzzle.

We see that in the radish condition, participants persisted for less than 10 minutes, while in the cookie and control conditions, people worked 15 to 20 minutes on the task. This implies that when we’re eating radishes in the presence of cookies and trying very hard not to eat the cookies, that is, using willpower to eat only radishes, we use up cognitive resources that are then not available to help us persist in a cognitive task.

Let’s move on to slide four: construct versus operation. In an experiment, we have both constructs and operations. Constructs are the broad concepts or ideas that we’re interested in studying. For example, Baumeister and colleagues were interested in studying the broad concept of cognitive resources and the broad concept of willpower. Operations are the specific conceptualizations, or operationalizations, of constructs. What this means is that in the Baumeister study, they operationalized cognitive resource depletion as persistence on an impossible task. The broad concept of willpower was operationalized as one’s ability to refuse cookies, that is, to deny oneself cookies while being asked to eat radishes.

Let’s go on to talk about measures on slide five. When we’re measuring a dependent variable, there are many ways we can do this. One is to use repeated measures. This means that we look at your responses over time, so if we’re interested in measuring your response to attractive faces, instead of showing you one attractive face and measuring your response, we might show you several attractive faces, perhaps 50. This would be a repeated measure: getting your response to each face. We could also use multiple measures. If we’re interested in how you respond to attractiveness, showing you photographs of faces would be one way to assess the construct of attractiveness and your response to it. We could also look at your ability to interact with a person who would be perceived as attractive, or we could provide not only still photos but also videos of attractive people and then look at your reactions to them.

We can also measure dependent variables by other means. One is observation, that is, simply watching people. For example, in Baumeister’s study, the experimenter observed how many cookies had been eaten. They could also have relied on a self-report measure, simply asking people whether they had eaten cookies in the radish condition or radishes in the cookie condition. What they did look at was performance on the cognitive task, that is, how long people persisted and how hard they tried. If they had used a solvable task, they could also have looked at performance in terms of how often people managed to solve the puzzle.

Let’s move on to slide six. In our course, we’re often going to talk about the differences between two groups of people or two conditions. When we do this, we’re talking about mean differences. One example would be that women are shorter than men. This is true: women are shorter than men on average. This does not mean that all women are shorter than all men or that all men are taller than all women. In fact, we all know that there are some men who are shorter than most women. The d statistic is one way we can talk about a mean difference. It is calculated by taking the mean of group one minus the mean of group two and dividing by the pooled standard deviation. For the purposes of this course, you need to understand what the d statistic means. You do not need to be able to calculate a pooled standard deviation.
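
For anyone curious how the d statistic works out in practice, here is a minimal sketch in Python. It assumes the conventional definition of Cohen's d, which divides the mean difference by the pooled standard deviation of the two groups. The heights are invented purely for illustration, and the function name is hypothetical.

```python
import math
import statistics

def cohens_d(group1, group2):
    """Mean of group one minus mean of group two, divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    var1 = statistics.variance(group1)   # sample variance (divides by n - 1)
    var2 = statistics.variance(group2)
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (statistics.mean(group1) - statistics.mean(group2)) / pooled_sd

# Made-up heights in centimeters, not real data.
men_cm = [170, 175, 180, 185, 190]
women_cm = [155, 160, 165, 170, 175]

d = cohens_d(men_cm, women_cm)
print(round(d, 2))
```

Notice that even in this invented data the tallest woman (175 cm) is taller than the shortest man (170 cm). A large d describes the groups on average, not every individual.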

Let’s move on to slide seven. Many times during this course, as you hear about these mean differences, you might want to raise your hand and say, “Well, my Aunt Tilda would never buy that,” or “My Uncle Joe is not attracted to people who have those features.” These people might be outliers. Outliers can influence our findings, but usually they weaken our effect rather than strengthen it. And by randomly assigning people to conditions, we remove the concern that we haven’t encompassed everyone in our study. Remember, our concern in social psychology is with the mean difference, not the individual case. Clinicians, on the other hand, are interested in individual cases. Whether or not that is a good thing is something that should be considered and is often debated.

Let’s move on to slide eight: significance. Many times you’ll hear in this course that there was a significant finding. When researchers say a finding is significant, they typically mean it is statistically significant. We can also discuss findings in terms of their practical significance. Statistical significance is not always practical significance. For example, if I find that there’s a one-point difference in openness on a seven-point scale between men and women, that may be a statistically significant difference if I have enough people in my sample, but it may not translate to a practical difference. That is, the way in which men and women generally interact may not reflect this one-point difference.

There is a need for careful consideration, however, and SAT scores are one example. A one-point difference, where women score one point higher than men on the verbal SAT or men score one point higher on the math section, is one place where we need to be very careful. Such a difference would almost certainly be statistically significant, as hundreds of thousands of people take the SAT every year. Its practical significance lies not necessarily in whether that one point means anything in terms of mental ability, but in the fact that it could mean the difference between getting into a school or not. This is why we need to carefully consider both the way in which we use statistics and the way in which we dismiss them when thinking only about practical significance.
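
The point that a small difference can become statistically significant "if I have enough people in my sample" can be sketched numerically. The following Python sketch uses a simple normal (z) approximation and assumes both groups share the same known standard deviation. The 0.2-point difference, the standard deviation of 1.5, and the sample sizes are invented for illustration.

```python
import math
from statistics import NormalDist

def two_group_p_value(mean_diff, sd, n_per_group):
    """Two-sided p-value for a difference between two group means (normal approximation)."""
    standard_error = sd * math.sqrt(2 / n_per_group)   # shrinks as the groups get larger
    z = mean_diff / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# The same hypothetical 0.2-point difference, tested with small and large groups.
small_study = two_group_p_value(mean_diff=0.2, sd=1.5, n_per_group=20)
large_study = two_group_p_value(mean_diff=0.2, sd=1.5, n_per_group=2000)

print(small_study > 0.05)   # not statistically significant with 20 per group
print(large_study < 0.05)   # statistically significant with 2000 per group
```

The size of the difference never changes here, only the number of people. Whether a 0.2-point difference matters in practice is a separate question that the p-value cannot answer.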

In order for something to be statistically significant, we typically require that p be less than .05. This is only a convention, although it’s very widely accepted. The idea here is that if you have a p less than .05, you would expect to get results like the ones in your experiment simply by random chance fewer than 5 times out of 100. That is, it’s very rare to obtain data like these by chance alone, implying that there is something about the manipulated, or independent, variable that systematically influences the dependent variable.
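
One way to see what "by random chance" means is a permutation test, sketched below in Python. This is an illustrative technique chosen for this sketch, not necessarily the test used in any study mentioned in this lesson. It reshuffles the condition labels many times and counts how often a mean difference at least as large as the observed one turns up. The persistence times are invented.

```python
import random
import statistics

def permutation_p_value(treatment, control, n_shuffles=5000, seed=0):
    """Share of random relabelings whose mean difference is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(treatment) - statistics.mean(control))
    combined = list(treatment) + list(control)
    n_treatment = len(treatment)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(combined)   # pretend the condition labels were handed out by chance alone
        diff = abs(statistics.mean(combined[:n_treatment])
                   - statistics.mean(combined[n_treatment:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_shuffles

# Made-up minutes of persistence on a puzzle, not real data.
radish_minutes = [7, 8, 9, 8, 10, 6, 9, 8]
control_minutes = [18, 21, 20, 19, 22, 17, 21, 20]

p = permutation_p_value(radish_minutes, control_minutes)
print(p < 0.05)
```

If the manipulation had no effect, shuffling the labels should produce differences this large fairly often. When it almost never does, we conclude the independent variable is probably doing something.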

Let’s move on to slide nine. We can also talk about realism in psychological experiments. There are two types: mundane realism and psychological realism. Mundane realism asks how well the lab study reflects the outside world. Is it realistic in terms of everyday experience? Certainly most people don’t find themselves in a psychology lab every day, so lab studies often do not have much mundane realism. However, if we’re interested in how students score on an exam in a room painted pink versus a room painted blue, the study could have a high level of mundane realism. That is, students often take exams in their day-to-day lives.

Psychological realism is much more important in lab studies. That is, while most people don’t often find themselves in a lab study, the types of decisions or activities that we have them do are psychologically very similar to the decisions and activities of the everyday world. For example, choosing between two blenders in a lab setting would involve much the same psychological process you would use if you were at a department store choosing between two blenders.

Let’s go on to slide ten and talk about validity. There are three types of validity with which you need to be concerned. The first is construct validity. This is the idea that operations are good measures of constructs. For example, if we’re interested in the construct of love, we might operationalize showing love as the number of kisses given during the day. This may have good construct validity or it may not, depending on the couple. For example, if one person is ill and contagious, there may be less kissing; therefore, it may not be a very good operation of the construct for that particular couple on that particular day.

The second is internal validity, which concerns establishing causality by ruling out alternative explanations. An experiment is high in internal validity to the extent that we can infer causation and cannot come up with another reason these results might have been obtained. If we go back to our pink and blue rooms in which students take exams, and we find that those in the pink room score much better than those in the blue room, we might be able to say that the study has high internal validity. There doesn’t seem to be any other reason that the people in the pink room would do differently than the people in the blue room, as long as they were randomly assigned. However, internal validity could be questioned if students in the pink room had always been in the pink room and also took their exam there, while students in the blue room were originally in the pink room during the course but were later moved to the blue room to take the exam. In that case, the change of room, rather than its color, could explain the difference.

The third is external validity, which is our ability to generalize the findings. A study has good external validity to the extent that one can say that its findings are true not only in this setting or this lab, but also in other settings and other labs with other types of people.

Let’s move on to slide eleven: reliability. There are three types: interjudge reliability, inter-item reliability, and test/retest reliability. Interjudge reliability matters when we have people code interactions. For example, we might have someone interact with another person, and then we need to know how friendly they were to one another. We would bring in two coders, have them watch the tape, and make several ratings of the friendliness of each person. The extent to which the two judges agree about the friendliness of each individual is the interjudge reliability. If they agree, we have good interjudge reliability; if they disagree, we have poor interjudge reliability.
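
One simple way to put a number on agreement between two coders is the correlation between their ratings, sketched below in Python. Using a Pearson correlation here is an assumption made for this sketch (other agreement indices exist), and the friendliness ratings are invented.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / math.sqrt(spread_x * spread_y)

# Made-up friendliness ratings (1 to 7) of the same seven people by three coders.
coder_a = [1, 2, 3, 4, 5, 6, 7]
coder_b = [2, 2, 3, 5, 5, 6, 7]   # mostly agrees with coder A
coder_c = [7, 1, 6, 2, 5, 3, 4]   # rates very differently from coder A

agreement_ab = pearson_r(coder_a, coder_b)
agreement_ac = pearson_r(coder_a, coder_c)
print(agreement_ab > agreement_ac)
```

A correlation near 1 between two coders suggests good interjudge reliability, while a correlation near 0 suggests the coders are not seeing the same thing in the tape.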

Inter-item reliability means that on any given test, for example a test in this course over this chapter, each item correlates well with the other items. This would be good inter-item reliability. If on the exam there was one question on which performance did not correlate with performance on the remaining items, then the test would have poorer inter-item reliability. That is, all the items on the test should be attempting to measure essentially the same thing: your knowledge of this chapter.
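
A common summary of how well the items on a test hang together is Cronbach's alpha. The lecture does not name this statistic, so treat the Python sketch below as one illustrative way to quantify inter-item reliability. The item scores are invented.

```python
import statistics

def cronbach_alpha(scores):
    """scores: one row per test taker, one column per item. Higher alpha means more consistent items."""
    k = len(scores[0])   # number of items
    item_variances = [statistics.variance([row[i] for row in scores]) for i in range(k)]
    total_variance = statistics.variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Made-up scores for five students on three items.
consistent_test = [
    [5, 5, 4],
    [4, 4, 4],
    [3, 3, 2],
    [2, 2, 2],
    [1, 1, 1],
]
# Same first two items, but the third item is unrelated to the others.
test_with_odd_item = [
    [5, 5, 1],
    [4, 4, 5],
    [3, 3, 2],
    [2, 2, 4],
    [1, 1, 3],
]

alpha_consistent = cronbach_alpha(consistent_test)
alpha_odd_item = cronbach_alpha(test_with_odd_item)
print(alpha_consistent > alpha_odd_item)
```

When every item tracks the same underlying knowledge, alpha is close to 1. A single item that does not correlate with the rest pulls it down.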

Test/retest reliability is frequently used for traits, ideas, and attitudes that we believe remain stable over time. This is not the same as a pretest and a posttest. For example, if I believe that you’re high in need for cognition and that I’ve developed a scale that can accurately measure that, I might give you the test once in January and measure your need for cognition then. Because I expect need for cognition to be a stable trait, I don’t expect it to change when I retest you in May. If I get no change between the test and the retest, then I have good test/retest reliability. If there is a significant change, then I have poor test/retest reliability.

Let’s move on to slide twelve: replication and meta-analysis. Replication is the idea that we need to find the same effect again and again, that is, finding the same effect in a different study or with different people using essentially the same methods as the original study. The more often a study is replicated, the more sure we can be about its findings. Meta-analysis is one way to statistically combine effects from all the studies done on a particular construct. Prior to the development of meta-analysis, this was done via a literature review or the tally box method. In a literature review, researchers read through the literature and then wrote a review in which they summarized the findings and gave their opinion on which effects were most powerful and how large those effects were. A somewhat more sophisticated approach was the tally box method. In this method, researchers went through the literature and made a tally mark in one column for each study that found the effect and a tally mark in a different column for each study that did not. In the end, if five studies found the effect and two did not, the researcher would conclude that the effect was there. That is, more studies found it than not.

However, what these earlier methods do not take into account is the power of each study or the precision of its method. Meta-analysis takes these into account, including how many people were involved in each study and how strong the effect was, that is, if the means were different, how different they were. Meta-analysis results can often be relied upon because they systematically combine the effects from all the available studies rather than leaving some out.
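
The contrast between tallying studies and combining their effects can be sketched in a few lines of Python. This is a deliberately simplified illustration: it weights each study's effect size by its sample size, whereas a full meta-analysis typically uses more refined weights. The seven studies are invented, echoing the "five found it, two did not" example.

```python
def vote_count(studies):
    """Tally method: count studies that found the effect versus those that did not."""
    found = sum(1 for study in studies if study["found_effect"])
    return found, len(studies) - found

def weighted_mean_effect(studies):
    """Simplified meta-analytic summary: effect sizes averaged with sample-size weights."""
    total_n = sum(study["n"] for study in studies)
    return sum(study["n"] * study["d"] for study in studies) / total_n

# Made-up literature: five small studies with a large effect, two large studies with no effect.
studies = (
    [{"d": 0.8, "n": 40, "found_effect": True} for _ in range(5)]
    + [{"d": 0.0, "n": 500, "found_effect": False} for _ in range(2)]
)

tally = vote_count(studies)
combined_d = weighted_mean_effect(studies)
print(tally)
print(round(combined_d, 2))
```

The tally says the effect is there, five studies to two. Once the number of participants in each study is taken into account, the combined effect is much smaller than any of the five small studies suggested on its own.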

Let’s move on to slide thirteen: Zimbardo’s Stanford Prison Experiment. You should visit this website. It marks the beginning of ethics in social psychology. That is, we do not live on an island where we can take people and have them marry or have relationships with people of our choosing just to see what happens in those relationships. We cannot put people on an island and manipulate our independent variables. There are ethical considerations. Serious attention to these ethical considerations began with Zimbardo’s Stanford Prison Experiment. For more information, please visit this website. This material will be covered on the exam.

Let’s move on to slide fourteen to discuss the human subjects board. After Zimbardo’s experiment, most universities developed a human subjects board by which all protocols are reviewed. That is, before a study can be done, the researcher must write a protocol and send it to the human subjects board for review. The human subjects board then determines whether the procedures are ethical, whether the participants are going to be accurately informed, and so on. This has had a broad impact on science. Some students in social psychology courses ask questions that we simply cannot answer due to ethical concerns. It would be wonderful to know what happens when someone marries a person who is less attractive than they are, but because we cannot randomly assign people to get married, we have no way to answer these sorts of questions experimentally.

This concludes the lesson.
