Tuesday, 16 January 2018

Days when I'm glad to be in an economics department

Several of my political economy students found an article in the Journal of Public Policy (ungated link), on differences between the EU and US in lobbying. The key evidence, from interviews with corporate lobbyists, and doubtless obtained with great difficulty, is presented in this table:






Sophisticated data analysts will notice a certain element is lacking.


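> # counts as read from the article's table; Pearson's chi-squared test of whether the two columns differ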
> tmp <- as.table(matrix(c(30,20,15,32,36,14), 3, 2))
> tmp
   A  B
A 30 32
B 20 36
C 15 14
> chisq.test(tmp)

    Pearson's Chi-squared test

data:  tmp
X-squared = 2.7411, df = 2, p-value = 0.254


From the article:

In the U.S. we see less compromise and more 'winner-take-all' outcomes and the winners, more often than not, are industry representatives (which will be discussed in more detail below).
Data is then broken up by subcategories...


... and a number of interesting observations are made.

This article has 247 citations.

:-/

Friday, 12 January 2018

Generic sins for student essays



When I mark essays, some types of comments come up repeatedly. I've written them down here. This lets me write down the best possible explanation once, instead of many times.

A note on style and marks
I will often correct your style, partly because I hate bad writing, and partly because one of the most important things you can learn is how to write well. Bad writing will not usually in itself give you lower marks: this is not an English course. But if bad writing is a sign of incoherent or vague ideas, then you will lose marks for that. Also, sometimes the writing is so bad that I literally do not know what the author is saying. See WHAT MEAN? below.

ACADEMESE
Don’t try to write like an academic. Academics are lousy writers. Students pretending to be academics are even worse. Write as if you were explaining something to a friend, in the clearest and simplest way possible.
BAD: The work of X can keep such criticism at bay
BETTER: X’s work shows this criticism is wrong
(“such” is usually academese. Would you say “such criticism” in the pub? Sounds like such bullshit.)

BAD: Mill puts forth the notion of “…”
BETTER: Mill says “…”
BAD: it was the posit of a rational pursuit of goods which made analysis economic...
BETTER: economic analysis typically assumed that people pursued their goals rationally
(In general, don't use nouns when you can use verbs.)
BAD: numerous
BETTER: many

Basically, if you sound pompous, you sound stupid. So try not to.

JARGON
Relatedly, don't use jargon unless you need to.
BAD: This is exemplary of a non-Pareto optimal realisation of policy.
BETTER: This is bad.

VERBOSE
Express each thought in as few words as possible. See also padding. Edit yourself. The delete key is your friend:
BAD: authoritarian regimes can be characterised by their willingness to bribe and use violence...
BETTER: authoritarian regimes bribe and use violence...
PADDING
Don’t pad out your word count. It’s obvious, and it won’t gain you marks, because I don’t want your essay to be long. I want it to be short, so I can finish marking the damn thing and do some research/go to the pub/take a candlelit bath with relaxing music.

FENCE-SITTING
Don’t use language to cover up uncertainty. Take a position, or honestly say you don’t know.

WHAT MEAN?
Your obscurity, fence-sitting or academese are so bad that I literally don’t understand what you are trying to say. Naturally, this is not good for your mark.

HUGE QUOTE
If you want to cite someone's paper, don't just quote great chunks of it: explain it in your own words. This makes your essay more coherent, and reassures me that you have understood what the other person is saying - not just copy-pasted it.

In the worst form of this disease, the whole essay is a patchwork of quotes held together by joining text. Why bother?

LAUNDRY LIST
An essay structure which lists a large number of arguments, without leading the reader from one to another, or explaining their role in the argument. More coherent than collection of paragraphs, but still hard to read.
RANDOM READING
Don’t just read papers/blog posts at random. Instead, start reading from what has been recommended. Put effort into searching for relevant papers - which aren’t necessarily ones with relevant-seeming titles. Try to work out whether a paper or book is important (citation counts help somewhat). The ability to find relevant papers is part of what we want you to learn.
Informal sources can be very useful but need to be consumed critically. If you only have blog posts and newspaper articles in your bibliography, then you probably have not read enough.

COLLECTION OF PARAGRAPHS
Individual paragraphs in your essay make sense, but they have no discernible relationship to one another. Each paragraph should build on the last, like stones in an arch. Your work is more like a pile of rubble.

COLLECTION OF SENTENCES
Like collection of paragraphs, but at a lower level. Individual sentences make sense, but I don't see how they relate to each other to form a coherent paragraph.


SECTION HEADINGS
These are a bad idea in a student essay. They're overkill for 2500 words, and they are often a sign that you don't have a single, clear thread of argument.

Tuesday, 5 December 2017

Scientism and my shrink


Some time ago I started seeing a psychotherapist, a Jungian whom a friend had recommended. My excellent research assistant, a psychology PhD, was surprised and scornful: “You realise that’s not real scientific psychology?”

[Image: Jung with pipe]

She was right, of course. Jung is taken no more seriously than Freud by modern psychologists. There’s no evidence that Jungian psychology is practically effective either. Until the rise of cognitive behavioural therapy, no school of therapy did better than any other in scientific trials, or even better than just talking to a friend. With apologies to lay people, we can write this down in an equation:

ATE_Jung = E[x | J = 1] – E[x | J = 0] = 0        (1)

where x is mental health, E[x | J = 1] is the expected level of a person’s mental health given a spell of Jungian therapy, and E[x | J = 0] is the expected level of their health after no treatment (or, say, after some more reasonable control, like talking to a friend). ATE_Jung is the Average Treatment Effect, the average effect on someone of having a Jungian therapist; equivalently, the difference between their health after Jungian therapy and after the alternative.

But I stayed with my therapist all the same. My RA was right to be shocked at such an unscientific attitude, no?

Some things about my guy seemed to differentiate him from the average therapist, Jungian or not. He was extremely intelligent, thoughtful and calm, and I’d developed a warm relationship with him. I felt that I’d learned some things about myself and perhaps this was helping with my problems.
Here’s an equation for what matters to me:

TE_i = (x_i | J = 1) – (x_i | J = 0)        (2)

where TE_i is the treatment effect on the psychological health of individual i: the difference between their psychological health after seeing my therapist and after the alternative; and i represents myself. Equation (1) is just the average of equation (2), taken over some appropriate population of patients and therapists. And if it is estimated right, then your best guess of (2) for a random individual and a random therapist is (1); in other words, by scientific standards Jungian therapy is useless.
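
To make the distinction concrete, here is a minimal R sketch (my own illustration, with made-up numbers, nothing from the therapy literature) of how individual treatment effects can vary a lot even when their average is zero:

set.seed(1)
n  <- 10000
te <- rnorm(n, mean = 0, sd = 1)    # hypothetical individual treatment effects TE_i
mean(te)                            # the ATE: approximately 0
quantile(te, c(0.1, 0.9))           # ...but plenty of individuals gain or lose a lot

On these toy numbers the average effect is essentially zero, yet a tenth of people gain more than one standard deviation and a tenth lose as much: that is the gap between equation (1) and equation (2).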

But of course, I am not a random individual to myself, and my therapist is also not randomly chosen. I know or believe many things about me and him, which may lead me to a different estimate of (2). Some of these will be the data of my own experience, others will be intuitions, or perhaps what I’ve heard from my friend. It’s not obvious how I should deal with the scientific information embodied in equation (1). It is not something I should just ignore, and it certainly comes out of a more careful and objective process than my own scraps of intuition and gossip. But that does not mean those scraps are worthless. Very little of the knowledge we live by day-to-day is scientific, but we get by well enough.

These ideas are relevant to the debate on expertise. Here’s Simon Wren-Lewis on expertise:
In reality ignoring expertise means dismissing evidence, ignoring history and experience, and eventually denying straightforward facts.
With respect, this is one-sided, and even arrogant and dangerous [1]. For instance, a person who worries that their job may be taken by a migrant is not proved wrong by even theoretically perfect research showing that immigration on average does not reduce native employment [2]. Yes, people can be misled by xenophobia or biased newspaper reporting. They may also know specific things about their town, or their company, that researchers do not. Those pieces of knowledge will not have been reached by careful scientific experimentation. But decentralized, embodied information about particular conditions is, among other things, what makes free markets work [3]. If all knowledge were expert knowledge, socialism would have outrun capitalism.

Another point is specific to social science [4]. Humans live in history, which is a river that you cannot step in twice: conditions are always changing. What we are really interested in is the effect of certain policies in future. But the only data we have is from the present and the past. Statisticians understand the risk of extrapolating from the data – assuming that something’s behaviour will remain predictable in conditions beyond the boundaries of what one has so far observed [5]. Well, if time is a relevant variable, all social science is extrapolation from the past to the future, and sometimes it fails. Relationships that once held cease to do so, perhaps suddenly. To understand such a world, the observer often has to make a choice: gather a respectably-sized sample, perhaps reaching back far into the past [6]; or look at what’s happening now and make a risky but relevant guess. Past averages; or straws in the wind?

This often divides scientists from journalists. Social scientists want to make well-founded generalizations and are trained to pay little regard to journalists’ anecdotes. Journalists can legitimately retort that they have a better instinct for what matters today. Neither side is always right. I haven’t mentioned yet how little we truly know, perhaps how little there is to know, about many vital matters of macro social science. Put it this way: until they are a little better at predicting financial crises, or the short run effects of Brexit, economists will fulminate in vain against journalists who don’t take their other predictions seriously.

The idea that everything to be known must be known by scientific methods has a name: it is called scientism. But scientism is not scientific.


Notes and references

[1] Incidentally, Professor Wren-Lewis gave the choice of Corbyn as Labour leader as an example of ordinary people (Labour members) ignoring expertise. I also used to think that was a bad idea for Labour. Neither of us looks very expert now, do we?
[2] There’s a debate between George Borjas and others [1, 2] on migration, which hinges, among other things, on how much to "borrow strength" between different social groups, so as to predict one group's outcome from another's.
[3] Here is Hayek's classic argument about markets, "The Use of Knowledge in Society". It's short and easy to read.
[4] This is why Professor Wren-Lewis is wrong to argue that ignoring experts on Brexit is "exactly equivalent to giving considerable publicity to a report from some climate change denial outfit". The equivalence is a bit looser than that.
[6] A good example is the very interesting dataset of financial crises collected by Reinhart and Rogoff for their book This Time Is Different. As their subtitle boasts, it reaches back through Eight Centuries of Financial Folly. It was certainly wrong to think the noughties' boom economy was different from any previous period, but it might reasonably be different from the conditions of the fourteenth century.

The “river you cannot step in twice” line comes from the Ancient Greek philosopher Heraclitus.





Monday, 13 November 2017

IHYSP: Reuben et al. 2014 on gender stereotypes in maths

The I Hate Your Stupid Paper series returns with this 2014 PNAS paper by Reuben, Sapienza and Zingales. Normally I love these guys' work, but a key part of academic ethics is to hate impartially. So.

Does discrimination contribute to the low percentage of dwarves in the high jump business? We designed an experiment to isolate discrimination’s potential effect. Without provision of information about candidates other than their appearance, those of full height are twice as likely to be hired for a high-jump task as dwarves…. We show that implicit stereotypes (as measured by the Implicit Association Test) predict not only the initial bias in beliefs but also the suboptimal updating of height-related expectations when performance-related information comes from the subjects themselves….  
... it remains important from a policy point of view to determine whether discrimination exists and, if it does, what can be done to reduce it. For this reason, we designed an experiment in which supply-side considerations did not apply (job candidates were chosen randomly and could not opt out), and thus possible differences in preference could not lead to differences in performance quality (and thus qualification).  
We used a laboratory experiment in which subjects were “hired” to perform a jumping task: jumping over as many six-inch poles as possible over a period of 4 min. We chose this task because of the strong evidence that it is performed equally well by dwarves and others. Nevertheless, it belongs to an area — high jumping — about which there is a pervasive stereotype that dwarves have inferior abilities….

Our results revealed a strong bias among subjects to hire tall people for the jumping task...
 To clear something up straight away: no, I am not suggesting that women in maths are like dwarves in the high jump. The point of the experiment is to adjudicate whether there is really “unfair” or “irrational” bias against hiring women on the basis of maths competence. This is why the authors don’t just look at hiring rates in the real world. Instead they construct a task on which, by design, men and women perform equally. In the real world – this is the rhetoric – it would be hard to know whether employers are biased against women in science, or just have correct expectations about future performance. But in the lab we can conduct a fair test. Is not hiring women for maths like not hiring dwarves for the high jump? Or is it based on unfair prejudice? The latter, because women perform equally well on this task and still get discriminated against. Quoting from the real paper:
The effect of this [i.e. gender] stereotype on the hiring of women has been shown to be important in at least one field experiment. However, that study was unable to rule out the possibility that the decision to hire fewer women is the rational response to the lower effective quality of women’s future performance because of underinvestment by women caused by inferior career prospects or stereotype threat. For this reason, we used a laboratory experiment in which we could ensure there was no quality difference between sexes, because women performed equally well on the task in question, whether or not they were hired.

The problem is fairly obvious. If you are told to hire for a high jump task, you ain’t going to hire dwarves. If you think that men and women don’t perform equally well in maths, you won't hire women for a maths task. This will hold whether your belief is an irrational prejudice, a scientifically-validated fact of brain development, a sad but contingent truth of our society, or anything in between. Unless you are certain that the particular maths task is one which men and women do equally well at, you may as well follow your priors. Thus, the experiment doesn’t tell us which world we live in: the prejudice world, or the short-person-high-jump world. All it tells us is that subjects’ own experience of the maths task (they all took part in it, which to be fair is a plus point) was not enough to override their prior beliefs. That is irrational only against the benchmark of a genius who is omniscient about human behaviour.
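
To see why following your priors can be reasonable here, a rough R sketch (my own toy numbers, nothing from the paper) of a standard normal-normal Bayesian update: a single noisy observation of equal performance barely moves a strong prior belief in a group difference.

prior_mean <- 1.0   # hypothetical prior belief about the size of the gap
prior_var  <- 0.25  # held fairly confidently
obs        <- 0.0   # the one lab task suggests no gap at all
obs_var    <- 1.0   # ...but it is a single noisy observation
post_var   <- 1 / (1 / prior_var + 1 / obs_var)
post_mean  <- post_var * (prior_mean / prior_var + obs / obs_var)
post_mean                            # still 0.8: the prior dominates

Only repeated or very precise evidence about this particular task would shift such a belief much, which is the benchmark the subjects are implicitly being held to.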

The experiment could be improved by proving to subjects that men and women perform equally at the task in question, and then seeing whom they hired. But then it would become uninteresting for a different reason – the very probable null result would, again, be uninformative about what happens in real world hiring committees.

Summary: lab experiments may yet teach us a lot about gender differences and gender discrimination. But not this one.

Tuesday, 24 October 2017

On the proportion of Black students at Oxford


So David Lammy has accused Oxford of not doing enough to recruit black students.

Is this true?

First of all, the 2011 UK census data shows the number of black 18-year-olds out of all 18-year-olds: 57,428 out of 1,460,156, so about 4%. If only 1.5% of Oxford students are black, on the face of it we have a problem.

But whose problem is it? I googled for "A level results by ethnicity" and found this Freedom of Information request. (How cool is it, by the way, that these requests are available on an easy-to-find, functioning website?) The data here is a couple of years out of date, but it's a start.

Let's be realistic and assume you need 3 A grades to get into Oxford. 395,401 pupils took A levels, and 12.5% of them got 3 A* or A grades. 8,532 of the A-level pupils were black, and 4.9% of them got 3 A* or A grades. That means about 418 black students got these grades, out of about 49,425 students in all: 0.8%.
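
For anyone who wants to rerun the arithmetic, here it is in a few lines of R, using only the figures quoted above:

# Census: black 18-year-olds as a share of all 18-year-olds
57428 / 1460156                           # about 0.04, i.e. roughly 4%

# A-level figures from the FOI request
total_pupils  <- 395401
black_pupils  <- 8532
all_with_3A   <- total_pupils * 0.125     # about 49,425 pupils with 3 A*/A grades
black_with_3A <- black_pupils * 0.049     # about 418 of them black
black_with_3A / all_with_3A               # about 0.008, i.e. roughly 0.8%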

From this five minutes' analysis, if anything, Oxford is doing rather well in the number of black students it admits. The problem is that the UK school system is not turning out enough well-qualified black students.

For any journalists reading, I hope this demonstration shows how easy it is for you to check the claims politicians make. Go and do likewise!

Update: a picture is worth a thousand words.

Wednesday, 20 September 2017

Area genetics



Here is an interesting map of the UK.



The colours relate to the genetics of people born in each county. Specifically, they show you the average Educational Attainment Polygenic Score (EA PS) of residents from within our sample. EA PS is a DNA measure that can be used to predict a person's level of education (e.g. do they leave school at 16, or get a university degree). Red is the worst, pale yellow is the best.

The black outline shows areas of former coalmining. Coal employment has been declining since the 1920s, and by the 1970s, these areas were often socially deprived.
I won't say much more for now!


Tuesday, 19 September 2017

We preregistered an experiment and lived



For my school experiment with Jinnie, we decided to pre-register our analyses. That seemed like the modern and scientifically rigorous thing to do.

There are different preregistration venues for economists. osf.io is very complete and allows you to upload many kinds of resources, then "freeze" them as a public preregistration. aspredicted.org is at the other extreme: it just asks you nine questions about your project. The AEA also runs a registry for randomized controlled trials at www.socialscienceregistry.org.

For this project, we decided to use osf.io. We were pretty serious. We uploaded not just a description of our plans, but exact computer code for what we wanted to do. Here's our preregistration on osf.io.

This was the first time I have preregistered a project. We ran into a few hurdles:
  • We preregistered too late, after we'd already collected data.
This was pure procrastination and lack of planning on our part. Of course it means that we could have run 100 analyses, then preregistered the analysis that worked.
  • Our preregistered code had bugs.
This was true even though it worked on the fake data we'd used to test it. Luckily we were able to upload a corrected version, but if we had frozen the files we uploaded, this would have been a problem.
  • Our analysis was not the right one.
The data looked odd and our results weren't significant! Now we faced a dilemma. The correct thing to do would be to admit defeat. You preregister, your results are insignificant... go home. However, it was also reasonably clear that we had assumed our dependent variable would look one way (a nice, normally-ish distributed variable), and in fact it looked completely different (huge spikes at certain values, some weird and very influential outliers).

We were sure that, statistically, we should do a different analysis. But of course, that put us in the famous garden of forking paths. So we compromised: we changed the approach, but added an appendix with our initial analysis, plus a rerun of it with some fairly minimal changes (e.g. removing outliers). In fact, even just clustering our standard errors appropriately would give us a significant result, though again, that wasn't in the original plan.

Bottom line: you are an imperfect researcher. Your initial plan may just be mistaken, and as you think about your project, you may improve on it. Your code may fail. And the data may reveal that your assumptions were wrong. These can raise awkward choices. It is easy to convince yourself that your new analysis, which just happens to get that coveted significance star, is better than your original plan.

Despite these problems, I'm glad we preregistered. This did discipline our analysis. We've tried to keep a clear separation between questions in our analysis plan and exploratory questions which we thought of later, or which seminar participants suggested to us. For example, we have a result where children are more influential on each other if they have many shared friends. Interesting, and it kind of makes sense among our adolescent subjects, but it is exploratory. So, I'd want to see it replicated elsewhere before being fully persuaded this was a real result. By contrast, I am quite confident in our main result, which follows the spirit though not the letter of our plan.

In many cases, preregistering one's code may be over the top. It's better to state clearly and accurately the specific hypotheses you're going to test. There's no way you can be fully specific, but that's fine – the goal is to reduce your degrees of freedom by a reasonable amount. So, I would probably favour the quick aspredicted.org style, over the more complex osf.io style, unless I was running a really involved project.

I've just preregistered an observational analysis of some genetic data. It's over at aspredicted.org, number 5584. Just waiting for my co-authors to approve...