Wednesday, 28 June 2006

‘Big Brother’ eyes make us act more honestly

From the article:
"We all know the scene: the departmental coffee room, with the price list for tea and coffee on the wall and the “honesty box” where you pay for your drinks – or not, because no one is watching.
"In a finding that will have office managers everywhere scurrying for the photocopier, researchers have discovered that merely a picture of watching eyes nearly trebled the amount of money put in the box....
"... In previous experiments, people consistently appeared to behave more generously than they needed to for their own self-interest, even when told their actions were anonymous. This has led an influential school of economists to argue that altruism in humans is innate, rather than being based on cynical self-interest.
"But if just a photocopied pair of eyes can treble honesty, the Newcastle team suspects that these previous experiments may somehow have been spoiled by subliminal cues that made people feel they were being watched. "

Sunday, 25 June 2006

One more juicy R tip

If you're like me, you get tired of writing:

mydata[complex.row.condition, complex.column.condition] <- mydata[complex.row.condition, complex.column.condition]+1

or similar. One solution is:

crc <- complex.row.condition
ccc <- complex.column.condition

which saves writing out the complex conditions more than once. But it clutters your script and makes it less clear what you are doing.

My alternative solution is to place this function definition in your file:

inplace <- function (f, arg=1) eval.parent(call("<-", substitute(f)[[arg+1]], f), 2)

Now instead of writing, e.g.

foo[bar,baz] <- foo[bar,baz]*2

you can just write

inplace(foo[bar,baz] *2)

or instead of

foo[bar,baz] <- paste(foo[bar,baz], 1:10)

you can write

inplace(paste(foo[bar,baz], 1:10))

The second argument of the inplace function allows you to use it when your target for assignment is not the first argument of your inner function. For example:

inplace(sub("old", "new", foo[bar,baz]), 3)

is the same as

foo[bar,baz] <- sub("old", "new", foo[bar,baz])
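To see what a call to inplace expands to, here is a self-contained sketch; the matrix foo and its values are invented for illustration.

```r
inplace <- function (f, arg=1) eval.parent(call("<-", substitute(f)[[arg+1]], f), 2)

# A toy matrix to modify
foo <- matrix(1:4, nrow=2)

# Equivalent to writing: foo[1, ] <- foo[1, ] * 10
inplace(foo[1, ] * 10)
foo[1, ]   # now 10 and 30
```

substitute(f) captures the unevaluated expression, and its second element is the subsetting expression foo[1, ], which becomes the assignment target.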

What remains valid in Marxist economics?

Tyler Cowen has a go at answering this question and comes up with five ideas. They are a bit focused on early Marx for me. I would suggest:

  1. His focus on explaining institutions rather than just assuming them. This is still key to modern political economy.
  2. His class-based framework for understanding politics. This is quite a simple schema: the bourgeoisie can ally with the aristocracy or the proletariat. It's still enormously productive. Modern political science, since Olson, has been much less happy with the assumption that classes, which are groups, can automatically solve their collective action problem and come together to defend their interests. But humans seem to be better at this than theory would predict: in particular, they're willing to punish each other to maintain group norms, even at cost to themselves.
  3. Ideology. To find out why an idea is popular it is always worth asking: whose interest is it in?

Tuesday, 13 June 2006

copied and pasted from the UCEA website

I know I'm slightly behind the times with this. The recent AUT strike has interested me, partly because my landlord is a keen supporter. I wanted to find proper time-series data on university lecturers' pay scales, but it would just take too long to put it together - unfortunately, the national statistics web interface is not much use.

Anyway, this is just nicked from a UCEA (employers') press release, so if you disagree, feel free to say why. My italics.

11. During the 1980s and 1990s, declining university funding did not allow for real terms
improvements in academic pay. However, since 2000/01, the sector’s finances have
started to improve. New pay negotiating machinery was introduced at the same time
and average earnings for academic staff have increased by 20.3% since 2001. Over
the same period, the unit of funding per student (representing the income universities
and colleges of higher education get per student they teach) has increased by just
under 16%.
12. In real terms (deflated by the RPI), since 2001 the pay of higher education teaching
professionals has increased by 8.7% compared with 3.9% for all employees, and 6.2%
for all professional occupations.
13. According to the Office of National Statistics (ONS), the average annual earnings of
full-time ‘higher education teaching professionals’ were £40,657 in 2005, placing
academics in the top 20% of earners in the UK. That compares with national average
earnings of £28,210 for all full-time employees, and £36,894 for all professional
occupations. Employers also contribute additional sums – equivalent to 14% of
salaries – to generous final salary pension schemes.

At Essex, the Trots or someone have posted up posters representing the lecturers as Oliver Twist and the Vice-Chancellor as a ruthless workhouse manager: "MORE PAY! I promised you more, but now I say NO!!"

I thought point 13 in particular might provide a little context.

I hate to disturb all you bleeding heart liberals, but there is a simple statistical fact here: for three people to commit suicide on the same night, out of spontaneous, independent desperation, is a blinding coincidence. It seems pretty clear that these suicides were planned.

Having said that, the US rhetoric about an "act of war", driven by a half-conscious analogy with suicide-bombing, is disingenuous. You might as well call Jan Palach a terrorist. Perhaps the three deaths were planned as a protest against the Guantanamo regime. If so, as holding people indefinitely without trial is indeed unjust, cruel and totalitarian, it is hard not to sympathize with their point.

Monday, 12 June 2006

R tips

I've been working with large datasets in R and thought I would share some tips. Of interest to statisticians only!

1. On Windows, get Tinn-R. You don't want to be working with the basic R editor. On Linux, use Emacs or vi as you prefer - Emacs is supposedly pretty good.

2. Record everything you do - don't rely on saving the history for this, as it will be indecipherably messy. The ideal is that anyone should be able to replicate your results, from publicly available data, just by running your file of commands. It also helps a lot when you lose data and have to go back and redo it.

3. You will probably end up with a lot of temporary variables. To know which variables you can safely delete, give temporary variables a dot at the end of their name, like:

for (yr. in dataset$years)

4. Save shortened versions of often-used commands in your Rprofile file (on Windows, this will be in c:\program files\R\R-\etc\).

For example, suppose I want to type hs as a shorthand for a function from the utils package (say, head). If I just put

hs <- head

in, I will get an error on startup. This is because head is in the utils package, which is not loaded until after the Rprofile has been executed. So you have to be a little sneaky:

setHook(packageEvent("utils", "onLoad"), function (...) {
hs <<- head
})

This means that when utils gets loaded, the function gets assigned to hs. The double-headed arrow, by the way, is for global assignment. Otherwise, the assignment would only happen in the body of our function, which would be useless.

5. You can create new operators. For example, I like to be able to type
1:5 %-% 3
and get c(1, 2, 4, 5), i.e. the numbers from 1 to 5 with 3 removed. Another line in my Rprofile:

"%-%" <<- setdiff

The setdiff function does what I want, but with %-% it's quicker and more intuitive to read. Similarly

"%like%" <<- function(x,y) grep(y,x, perl=T)
means I can type state[state$name %like% "Al.*",] and get data for Alabama and Alaska.
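Both operators can be tried out in a fresh session. A minimal sketch, using ordinary <- since at top level it does the same job as <<-:

```r
# Set difference as an infix operator
"%-%" <- setdiff

# Pattern matching as an infix operator: returns the matching indices
"%like%" <- function (x, y) grep(y, x, perl=TRUE)

1:5 %-% 3                             # c(1, 2, 4, 5)
month.name[month.name %like% "^J"]    # "January" "June" "July"
```

Because %like% returns indices, it can be used directly inside the row position of a data frame subset, as in the state example above.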

6. Statisticians tend to want to put everything in one huge table. So for example, if they have 50000 Eurobarometer respondents and they want to use respondent's nation's GDP as an independent variable, they'll create a big table with 50000 rows:
name | nation | GDP | ... other national variables
This is fine until you want to take the log of GDP and it takes the computer five minutes to create 50000 new variables, most of which are duplicates. Take a hint from database administration: keep national variables in a separate data frame, with one row for each nation. Then merge them once you have created all your independent variables, before you start running regressions.

(If you want to know more about how to create good databases, look for a guide to database design and normalization.)
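A toy version of this layout, with invented nations and GDP figures:

```r
# Individual-level data: one row per respondent
respondents <- data.frame(id = 1:6,
                          nation = rep(c("DE", "FR", "IT"), each=2))

# Nation-level data: one row per nation (GDP figures invented)
nations <- data.frame(nation = c("DE", "FR", "IT"),
                      gdp = c(2800, 2100, 1800))

# Derived national variables are created on 3 rows, not 50000
nations$log.gdp <- log(nations$gdp)

# Merge once, just before running regressions
full <- merge(respondents, nations, by="nation")
nrow(full)   # 6
```

The merge repeats each nation's row for every respondent from that nation, so the final table has one row per respondent, as the regressions require.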

7. Tired of typing brackets all the time to run simple commands? Here's a neat hack:
print.command <- function (x) {
default.args <- attr(x, "default.args")
if (! length(default.args)) default.args <- list()
print((x, default.args, envir=parent.frame()))
}

class(ls) <- c("command", class(ls))
class(search) <- c("command", class(search))

Now you can type ls or search at the command line without brackets. The magic here is that the end result of any command line is printed, i.e. the print method is called on the object. If we give ls a class of "command", the function print.command gets called when we evaluate ls. This then runs the function in the command line environment. To set up default arguments, do, e.g.:

attr(ls, "default.args") <- list(all=T)

It would be nice to be able to type fix foo instead of fix(foo), but I don't think it's possible. Correct me if you know better.
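To try point 7 in a fresh session, here is the trick in one self-contained piece. I use search as the example because its output does not depend on the calling frame, and work on a copy called srch so the sketch leaves the real search untouched:

```r
# print method for objects of class "command": call the function and print the result
print.command <- function (x) {
    default.args <- attr(x, "default.args")
    if (! length(default.args)) default.args <- list()
    print((x, default.args, envir=parent.frame()))
}

# A copy of search, so the original is unchanged
srch <- search
class(srch) <- c("command", class(srch))

print(srch)   # same output as print(search())
```

At an interactive prompt, typing srch on its own triggers the same print method via autoprinting, which is what makes the bracket-free commands work.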

NEW 8. You don't have to save everything in one workspace. This is the easiest way to go at first, but when your data becomes large and takes minutes to load, you can separate it into different workspaces and load only the bits you need. To do this, instead of save.image, use

save(foo1, file="foo1.RData")
save(foo2, file="foo2.RData")

et cetera. You then load these in the normal way.
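A sketch of the same idea, writing to a temporary directory so it can run anywhere; the object names foo1 and foo2 are as above:

```r
# Two objects saved to separate workspace files
foo1 <- 1:10
foo2 <- letters
f1 <- file.path(tempdir(), "foo1.RData")
f2 <- file.path(tempdir(), "foo2.RData")
save(foo1, file=f1)
save(foo2, file=f2)

# Later, load only the piece you need
rm(foo1, foo2)
load(f1)
exists("foo1")   # TRUE
exists("foo2")   # FALSE
```

Only the file that is loaded comes back into the workspace, which is the point: big objects stay on disk until a session actually needs them.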

That's your lot! For more, check out other collections of R tips online. And of course, the occasionally grumpy but always enlightening R-help mailing list.


A friend of mine is involved in Backlash, which was set up to combat the UK Government's proposed legislation banning some kinds of extreme pornography. The site has some punchy arguments.