IHYSP: Reuben et al. 2014 on gender stereotypes in maths

The I Hate Your Stupid Paper series returns for this Reuben, Sapienza and Zingales PNAS paper from 2014. Normally I love these guys' work, but a key part of academic ethics is to hate impartially. So.

Does discrimination contribute to the low percentage of dwarves in the high jump business? We designed an experiment to isolate discrimination’s potential effect. Without provision of information about candidates other than their appearance, those of full height are twice as likely to be hired for a high-jump task as dwarves…. We show that implicit stereotypes (as measured by the Implicit Association Test) predict not only the initial bias in beliefs but also the suboptimal updating of height-related expectations when performance-related information comes from the subjects themselves….  
... it remains important from a policy point of view to determine whether discrimination exists and, if it does, what can be done to reduce it. For this reason, we designed an experiment in which supply-side considerations did not apply (job candidates were chosen randomly and could not opt out), and thus possible differences in preference could not lead to differences in performance quality (and thus qualification).  
We used a laboratory experiment in which subjects were “hired” to perform a jumping task: jumping over as many six inch poles as possible over a period of 4 min. We chose this task because of the strong evidence that it is performed equally well by dwarves and others. Nevertheless, it belongs to an area—high jumping — about which there is a pervasive stereotype that dwarves have inferior abilities….

Our results revealed a strong bias among subjects to hire tall people for the jumping task...
 To clear something up straight away: no, I am not suggesting that women in maths are like dwarves in the high jump. The point of the experiment is to adjudicate whether there is really “unfair” or “irrational” bias against hiring women on the basis of maths competence. This is why the authors don’t just look at hiring rates in the real world. Instead they construct a task on which, by design, men and women perform equally. In the real world – this is the rhetoric – it would be hard to know whether employers are biased against women in science, or just have correct expectations about future performance. But in the lab we can conduct a fair test. Is not hiring women for maths like not hiring dwarves for the high jump? Or is it based on unfair prejudice? The latter, because women perform equally well on this task and still get discriminated against. Quoting from the real paper:
The effect of this [i.e. gender] stereotype on the hiring of women has been shown to be important in at least one field experiment. However, that study was unable to rule out the possibility that the decision to hire fewer women is the rational response to the lower effective quality of women’s future performance because of underinvestment by women caused by inferior career prospects or stereotype threat. For this reason, we used a laboratory experiment in which we could ensure there was no quality difference between sexes, because women performed equally well on the task in question, whether or not they were hired.

The problem is fairly obvious. If you are told to hire for a high jump task, you ain’t going to hire dwarves. If you think that men and women don’t perform equally well in maths, you won't hire women for a maths task. This will hold whether your belief is an irrational prejudice, a scientifically-validated fact of brain development, a sad but contingent truth of our society, or anything in between. Unless you are certain that the particular maths task is one which men and women do equally well at, you may as well follow your priors. Thus, the experiment doesn’t tell us which world we live in: the prejudice world, or the short-person-high-jump world. All it tells us is that subjects’ own experience of the maths task (they all took part in it, which to be fair is a plus point) was not enough to override their prior beliefs. That is irrational only against the benchmark of a genius who is omniscient about human behaviour.

The experiment could be improved by proving to subjects that men and women perform equally at the task in question, and then seeing whom they hired. But then it would become uninteresting for a different reason – the very probable null result would, again, be uninformative about what happens in real world hiring committees.

Summary: lab experiments may yet teach us a lot about gender differences and gender discrimination. But not this one.

On the proportion of Black students at Oxford

So David Lammy has accused Oxford of not doing enough to recruit black students.

Is this true?

First of all, the 2011 UK census data shows the numbers of black 18 year olds, out of all 18 year olds. It's 57,428 out of 1,460,156, so about 4%. If only 1.5% of Oxford students are black, on the face of it we have a problem.

But whose problem is it? I googled for "A level results by ethnicity" and found this Freedom of Information request. (How cool is it, by the way, that these requests are available on an easy-to-find, functioning website?) The data here is a couple of years out of date, but it's a start.

Let's be realistic and assume you need 3 A grades to get to Oxford. 395,401 pupils took A levels. Of them, 12.5% got 3 A* or A grades. Of the 395,401, 8,532 were black, and 4.9% of them got 3 A* or A grades. That means about 418 black students got these grades, out of about 49,425 students in all: 0.8%.
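For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It uses only the figures quoted above; the variable names are mine, and the "3 A grades" threshold is the same rough assumption as in the text, not Oxford's actual admissions rule.

```python
# Figures quoted in the post: 2011 census counts and the FOI A-level data.
black_18, all_18 = 57_428, 1_460_156
all_pupils, share_all_3a = 395_401, 0.125      # 12.5% of all pupils got 3 A*/A
black_pupils, share_black_3a = 8_532, 0.049    # 4.9% of black pupils got 3 A*/A

population_share = black_18 / all_18           # black share of 18 year olds
all_3a = all_pupils * share_all_3a             # everyone with 3 A*/A grades
black_3a = black_pupils * share_black_3a       # black pupils with 3 A*/A grades
qualified_share = black_3a / all_3a            # black share of the qualified pool

print(f"Black share of 18 year olds:      {population_share:.1%}")
print(f"Pupils with 3 A*/A grades:        {all_3a:,.0f}")
print(f"Black pupils with 3 A*/A grades:  {black_3a:,.0f}")
print(f"Black share of the qualified pool: {qualified_share:.1%}")
```

The last number is the one to compare with the 1.5% of Oxford students who are black.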

From this five-minute analysis, if anything, Oxford is doing rather well in the number of black students it admits. The problem is that the UK school system is not turning out enough well-qualified black students.

For any journalists reading, I hope this demonstration shows how easy it is for you to check the claims politicians make. Go and do likewise!

Update: a picture is worth a thousand words.

Area genetics

Here is an interesting map of the UK.

The colours relate to the genetics of people born in each county. Specifically, they show you the average Educational Attainment Polygenic Score (EA PS) of residents from within our sample. EA PS is a DNA measure that can be used to predict a person's level of education (e.g. do they leave school at 16, or get a university degree). Red is the worst, pale yellow is the best.

The black outline shows areas of former coalmining. Coal employment has been declining since the 1920s, and by the 1970s, these areas were often socially deprived.
I won't say much more for now!

We preregistered an experiment and lived

For my school experiment with Jinnie, we decided to pre-register our analyses. That seemed like the modern and scientifically rigorous thing to do.

There are different preregistration venues for economists. is very complete and allows you to upload many kinds of resources, then "freeze" them as a public preregistration. is at the other extreme: it just asks you 9 questions about your project. The AEA also runs a registry for randomized controlled trials at

For this project, we decided to use We were pretty serious. We uploaded not just a description of our plans, but exact computer code for what we wanted to do. Here's our preregistration on

This was the first time I have preregistered a project. We ran into a few hurdles:
  • We preregistered too late, after we'd already collected data.
This was pure procrastination and lack of planning on our part. Of course it means that we could have run 100 analyses, then preregistered the analysis that worked.
  • Our preregistered code had bugs.
This was true even though it worked on the fake data we'd used to test it. Luckily we were able to upload a corrected version, but if you've frozen the files you uploaded, this would be a problem.
  • Our analysis was not the right one.
The data looked odd and our results weren't significant! Now we faced a dilemma. The correct thing to do would be to admit defeat. You preregister, your results are insignificant... go home. However, it was also reasonably clear that we had assumed our dependent variable would look one way (a nice, normally-ish distributed variable), and in fact it looked completely different (huge spikes at certain values, some weird and very influential outliers).

We were sure that statistically, we should do a different analysis. But of course, then we were in the famous garden of forking paths. So we compromised: we changed the approach, but added an appendix that reports our initial analysis and retries it with some fairly minimal changes (e.g. removing outliers). In fact, even just clustering our standard errors appropriately would give us a significant result, though again, that wasn't in the original plan.
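To put a rough number on the forking-paths worry, here is a back-of-envelope sketch. It assumes every analysis is an independent test of a true null hypothesis at the 5% level. Real forking paths reuse the same data, so they are not independent; treat this as an illustration, not a calculation about our study. The function name is made up.

```python
def chance_of_false_positive(n_analyses: int, alpha: float = 0.05) -> float:
    """Probability that at least one of n independent tests of a true null
    comes out 'significant' at level alpha."""
    return 1 - (1 - alpha) ** n_analyses

# How quickly does a spurious star become likely as you try more analyses?
for n in (1, 5, 20, 100):
    print(f"{n:>3} analyses: {chance_of_false_positive(n):.1%}")
```

Under these assumptions, anyone who ran 100 analyses before preregistering would be almost certain to find something that "worked".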

Bottom line: you are an imperfect researcher. Your initial plan may just be mistaken, and as you think about your project, you may improve on it. Your code may fail. And the data may reveal that your assumptions were wrong. These can raise awkward choices. It is easy to convince yourself that your new analysis, which just happens to get that coveted significance star, is better than your original plan.

Despite these problems, I'm glad we preregistered. This did discipline our analysis. We've tried to keep a clear separation between the questions in our analysis plan and the exploratory questions which we thought of later, or which seminar participants suggested to us. For example, we have a result where children are more influential on each other if they have many shared friends. Interesting, and it kind of makes sense among our adolescent subjects, but it is exploratory. So, I'd want to see it replicated elsewhere before being fully persuaded this is a real result. By contrast, I am quite confident in our main result, which follows the spirit though not the letter of our plan.

In many cases, preregistering one's code may be over the top. It's better to state clearly and accurately the specific hypotheses you're going to test. There's no way you can be fully specific, but that's fine – the goal is to reduce your degrees of freedom by a reasonable amount. So, I would probably favour the quick style, over the more complex style, unless I was running a really involved project.

I've just preregistered an observational analysis of some genetic data. It's over at, number 5584. Just waiting for my authors to approve...

New, new working paper just out

I've also got a new working paper out with Jinnie Ooi, my brilliant research assistant and co-author. It's about how norms of fairness spread among teenagers. I'll blog in more detail later, but here is just a nice picture to give a taste of the result:

New old paper just out

A very old paper of mine and Martin Leroch's has just got published in Homo Economicus. (Ungated, older version.) The topic is reciprocity between groups. Here's a quote from the intro, I've bolded the key point:
At first sight it appears straightforward that people take revenge against entire groups, not only against direct individual perpetrators, even in routine social and economic life. For instance, consumers buy fewer products from countries which they see as politically antagonistic (Klein and Ettenson 1999, Leong et al. 2008). Further, on days after terrorist bombings in Israel, Jewish (Arab) judges become more likely to favor Jewish (Arab) plaintiffs in their decisions, and Israeli Arabs face higher prices for used cars (Shayo and Zussman 2011; Zussman 2012). On a political level, for instance, Keynes (1922) perceived the Treaty of Paris’ devastation of the German economy as an act of revenge, and quoted Thomas Hardy’s play The Dynasts: ‘‘Nought remains/But vindictiveness here amid the strong,/And there amid the weak an impotent rage.’’ In its most extreme case, revenge against groups may trigger violent intergroup conflict. After an argument between an Indian Dalit and an upper caste farmer, upper caste villagers attacked 80 Dalit families (Hoff et al. 2011). In Atlanta, 1906, after newspaper allegations of black attacks on white women, a group of white people rioted, killing 25 black men (Bauerlein 2001). In both cases, innocent people were made to suffer for the real or supposed crimes of others. Many field studies of intergroup violence report similar tit-for-tat processes, with harm to members of one group being avenged by attacks on previously uninvolved coethnics of the original attackers (Horowitz 1985, 2001; Chagnon 1988).

We started thinking about this back in 2009, I just looked up the email:
Reciprocity towards groups; that's a pretty important idea if it holds, right? (Think about wars, racial discrimination; patriotism...) I don't know if there's anything done in the area. But perhaps it's one for another experiment.
As well as seeming important, it turned out there was basically nothing out there in economics, and only a few papers in psychology.

We ran not one but several experiments, polishing the treatment and figuring out "what works". (There are issues of multiple testing here, but I'll ignore that.)

Our final experiment had some interesting results, and we sent it off to a top journal. It was rejected. Then we sent it off to another journal and... it was rejected. And another, and another.... I was annoyed by this because I felt that this was an important topic that nobody had written about! After all, Chen and Li (2009) had got into the AER by doing a basic group identity experiment, the same thing psychologists had done for decades, and adding incentives.

Yeah, I was naive! There are lots of reasons for the paper not doing well, some good, some bad:
  • The design was complex and hard to explain. We spent ages on multiple rewrites of our design section to make it clear what we had done.
  • In addition, the design and methodology weren't perfect - we were both quite inexperienced. There are things I'd do differently. Of course, reviewers picked these up.
  • Our topic fell between stools: it was an economic experiment on a fundamentally political topic. It is a sad reality that interdisciplinary work is not easy to publish.
  • Relatedly: referees and academics are conservative. It is easier to answer a question they already consider important, than to introduce a new question and persuade them it is important. That's probably reasonable. The dominant themes of any literature are dominant for a reason.
  • Chen and Li's AER paper did what I have since learned is important - it created a building block. It deserves its placement. I still think we were out there doing something quite new, but sometimes you have to lead the academic horse to water.
Anyway, for all that, I still think that intergroup dynamics are under-researched, given that they may be involved in the devastating phenomena we touch on in our intro. So, I'm glad it's finally out!

Here's a picture of the basic result, which I'm sure has been up on this blog before. The slope of the solid line shows subjects' "upstream reciprocity" towards a fellow group member of their most recent opponent in a public goods game. The dashed line is the control, showing reciprocity towards someone in a different group.

Whistling in the dark

I remember 1992.

Everyone expected Labour to win and kick out Major. I sat and watched it with a friend from school. I was very Left wing, and in 1992 almost everyone my age (even Etonians) wanted the Tories out.

By 2am, it was clear that Labour was not winning. I took out a tiny, tiny speck of dope that I had left over and ate it in a feeble attempt to get high. Then we went to bed.

Anyway. I need to find some positives in this situation:
  • We will get rid of May, who has shown zero talent and zero charisma. 
  • Corbyn probably will not form a government.
  • If he does, it will be a weak one, and as he has shown zero talent for organization and management – as opposed to his huge talent for campaigning and speechmaking, for which, full respect – it will probably lead to swift disillusionment for the kids who are now out celebrating.
  • The Lib Dems might be able to demand a referendum on PR, and the mood of the country is such that it might vote yes this time.
  • A lot of young people have been enthused by politics. I'm not sure this is a particularly good thing, but at least they will be enjoying themselves.
  • Ruth Davidson has done really well in Scotland. (I've often thought that it would be quite funny, and really wind up the Left, if the Conservative party could have the first Jewish, the first female, the first gay and the first black Prime Ministers.)
  • The SNP are one step further from breaking up my country.
I will try to think about the negatives in the morning. At the moment it is just too grim. Oh, one more:
  • There were some excellent dogs at the polling station where I was a teller.

Why did Corbyn do so well? A little bit of political economy

Let's assume the exit poll is about true, and that Jeremy Corbyn has done even better than the polls thought – and he was already pulling far ahead of what people, including me, expected.

There are lots of things to say about this: failures in polling (again); Theresa May's incompetent campaign and feeble personality; Jeremy Corbyn's quality as a campaigner; the role of the internet.

I think one dog that very importantly did not bark is the Labour manifesto. Remember, Jeremy Corbyn is a passionately ideological Leftwinger. But the manifesto was in many ways rather moderate. It did not, for example, aim to spend much more than the Conservatives. It did not set out to reverse many Conservative welfare cuts.

A classic model in political science explains why parties move to the centre. Suppose the two parties care only about winning the election. They will each promise a platform right in the middle of the electorate, at the famous "median voter" - the person in the middle, who has half the electorate to the left of her and half the electorate to the right. Why? Because if one candidate moves to the left of this person, then (at least!) the median voter and everyone to her right vote for the other candidate, who wins with a majority. And if one candidate moves to the right, then the median voter and everyone to her left vote for the other candidate, who again wins.

Here's an ASCII art picture. It shows a line representing political preferences from Left to Right. The voters are at x. The median voter is marked with a *, with two voters on her left and two on her right. Voters vote for the party that is closest to them. Both parties will propose a platform at *. Any party that moved left or right would get three voters preferring the other candidate.
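The same five-voter example can be checked with a few lines of Python. This is only a sketch: the voter positions 1 to 5 are hypothetical, the function name is mine, and a voter who is exactly indifferent is counted as half a vote for each party.

```python
from statistics import median

def vote_split(a, b, voters):
    """Return (votes for platform a, votes for platform b).
    Each voter backs the closer platform; exact ties count half each."""
    votes_a = 0.0
    for v in voters:
        dist_a, dist_b = abs(v - a), abs(v - b)
        if dist_a < dist_b:
            votes_a += 1
        elif dist_a == dist_b:
            votes_a += 0.5
    return votes_a, len(voters) - votes_a

voters = [1, 2, 3, 4, 5]          # hypothetical ideal points, Left to Right
median_voter = median(voters)     # the voter marked * in the picture

# One party stays at the median; the rival tries platforms to either side.
for rival in (1, 2, 2.5, 3.5, 4, 5):
    at_median, away = vote_split(median_voter, rival, voters)
    print(f"rival at {rival}: median platform {at_median:g} votes, rival {away:g}")
```

Wherever the rival goes, the party sitting at the median keeps at least three of the five votes.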


A natural response to this is "oh, but politicians are idealists! Or at least, Jeremy is. Jeremy cares about policy, not just getting elected." Well, stop swooning over Jeremy for a second, and suppose that is true. Suppose you are Jeremy Corbyn and deeply want policy to be as left wing as possible. You will still move to the centre. For, if you do not, and lose, then you will get the policies implemented by your right wing opponent.
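Here is a toy version of that argument, again only a sketch under made-up assumptions: five voters at positions 1 to 5, a left-wing candidate whose ideal policy is 1, a right-wing opponent fixed at platform 4, the winner's platform becomes policy, and a tied election is a coin flip. All names and numbers are hypothetical.

```python
def vote_share(a, b, voters):
    """Share of voters backing platform a against b; exact ties count half."""
    score = 0.0
    for v in voters:
        dist_a, dist_b = abs(v - a), abs(v - b)
        score += 1.0 if dist_a < dist_b else 0.5 if dist_a == dist_b else 0.0
    return score / len(voters)

def expected_loss(platform, ideal, opponent, voters):
    """Expected distance between the implemented policy and the candidate's
    ideal point, given that the winner's platform is implemented."""
    share = vote_share(platform, opponent, voters)
    win_prob = 1.0 if share > 0.5 else 0.5 if share == 0.5 else 0.0
    return win_prob * abs(platform - ideal) + (1 - win_prob) * abs(opponent - ideal)

voters = [1, 2, 3, 4, 5]
ideal, opponent = 1, 4                 # far-left ideal, right-wing opponent
grid = [1, 1.5, 2, 2.5, 3, 3.5, 4]     # platforms the candidate considers

best = min(grid, key=lambda p: expected_loss(p, ideal, opponent, voters))
print("best platform:", best)
print("loss if he stays at his ideal:", expected_loss(ideal, ideal, opponent, voters))
print("loss at the best platform:", expected_loss(best, ideal, opponent, voters))
```

The idealist does best by moving away from his ideal point, just far enough towards the centre to win. In this sketch the opponent stays put; if she responds in kind, each side keeps undercutting the other towards the median, which is the convergence result in the theoretical literature.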

This seems to be what has happened. Corbyn moderated his manifesto. That made Labour palatable to voters who would never have tolerated Corbyn's own ideal policies.

In a sense, you could say that despite appearances, the ghost of Blair still haunts the Labour party. Even with Corbyn as leader, they are forced to go along with a lot of the consensus of the past forty years.

(Thank God! ... But this is a post about the "horse race", not the outcome.)

The original model of the median voter is the "Downsian" model, made famous by Anthony Downs' An Economic Theory of Democracy (1957); but actually first suggested by Hotelling (1929) "Stability in competition". The point about "idealistic" politicians was first made, I think, by Donald Wittman (1977) "Candidates with policy preferences: A dynamic model" – sorry no ungated version.

How I will vote

I will not be voting for Labour this Thursday. Here is why:
  • The Labour manifesto promises to nationalize the railways. Our rail service in the UK is far from perfect, but for me at least, it provides a reasonable way to get around. I remember the days of British Rail, with no affection whatsoever. There were fewer trains than now. Connections were slower. Trains were dirty, and so were stations. You could not hear station announcements. Staff were unhelpful. Railway and train food was a byword for badness. Not surprisingly, passenger numbers on UK railways steadily declined. After privatization, they immediately started to increase again. Those who do not remember the past are condemned to repeat it. Please, let’s not. 

Rail passenger numbers under nationalization and privatization. Can you spot the difference? Graph by Absolutelypuremilk and Tompw, via Wikimedia. Data from ATOC.

  • Labour also wants to abolish university tuition fees. The tertiary education sector is one of the UK’s success stories. It attracts students from all over the world, whose fees help to subsidize our own. Continental systems, which are mostly fee-free, do not. There is a generous system of student loans available, so students from poor families, who feel that they will benefit from a university education, need not be afraid that they cannot afford it. Indeed, the introduction of higher fees in 2010 did not reduce rates of participation, which are extremely high, and also has not affected participation by poorer students, which went up by 42% between 2005 and 2014. Higher fees have made students demanding, and have encouraged universities to provide courses that they want. There are bad aspects to this, but overall I think it is a good thing. In terms of self-interest, higher fees help to pay my salary. Lastly, students end up wealthier than non-students, so abolishing fees means either reducing funding for education, or shifting the cost from richer to poorer people. Abolishing tuition fees is a bad idea.
  • Immigration to the UK is historically at high levels. I think it should be less, for reasons I won’t detail here, but which, for the avoidance of doubt, do not include being a hate-filled racist. Neither party has a clear plan for making immigration less, but at least the Conservatives have a clear goal of doing so. They are the less bad alternative.
  • Let’s not fanny about: Jeremy Corbyn has a clear history of sympathizing with terrorists. He had contacts with the IRA. He shared a platform with people wanted for murder. He attended a wreath-laying ceremony at the Tunis grave of a Palestinian terrorist involved in the Munich hostage-taking and murder of 11 Jews. His claims to have been doing this for peace are disingenuous: Jeremy Corbyn, the backbench Labour MP, could have no influence on bringing peace to the Middle East, and not much on Ireland. It is much more likely that he saw these groups, who espoused hard Left ideologies, as political allies. This, on its own, is an excellent reason for him never to be Prime Minister.
So, I am going to vote Conservative on Thursday, not with vast enthusiasm, but because the alternative is worse. If you agree with me, thank you for reading this message; if you disagree, thank you still more. And whichever way you are going to vote, do vote!