An Interesting Experiment
How do people succeed in academia?
I have notebooks filled with theories about this question, but I’ve increasingly come to realize that insights of this type — built on gut instinct, not data — are close to worthless. Most knowledge work fields are complex. Breaking into their upper levels requires a deliberate effort and precision that is poorly matched to the blunt, feel-good plans we devise in bouts of blog-inspired reflection.
This was on my mind when, earlier this week, I went seeking empirical insight into the above prompt, and ended up designing a simple experiment:
- I started by identifying well-known professors in my particular niche of theoretical computer science.
- For each such professor, I studied their former graduate students. I was looking for pairs of students who earned their PhD around the same time and went on to research positions, but then experienced markedly different levels of success in the field.
- Once I had identified such a pair, I studied the first four years of their CVs (the crucial pre-tenure period), measuring the following variables: the number of publications, the venues in which they appeared, and the citations earned by work published in that period.
Each such pair provided an example of a successful and non-successful early academic career. Because both students in a pair had the same adviser and graduated around the same time, I could control for variables that are largely outside the control of a graduate student, but that can have a huge impact on their eventual success, including: school connections, quality of research group, and the value of the adviser’s research focus.
The difference in each pair’s performance, therefore, should be due to differences in their own strategy once they graduated. It was these strategy nuances I wanted to understand better.
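To make the matching procedure concrete, the pair selection amounts to something like the following sketch. The two-year graduation window and the binary success label are illustrative assumptions, not details from my study:

```python
# Hypothetical sketch of the matched-pair selection; the field names,
# the two-year cohort window, and the boolean success label are assumptions.
from itertools import combinations

def matched_pairs(students):
    """students: list of dicts with 'adviser', 'grad_year', and 'successful' keys.
    Returns pairs who shared an adviser and cohort but diverged in outcome."""
    pairs = []
    for a, b in combinations(students, 2):
        same_cohort = (a["adviser"] == b["adviser"]
                       and abs(a["grad_year"] - b["grad_year"]) <= 2)
        if same_cohort and a["successful"] != b["successful"]:
            pairs.append((a, b))
    return pairs
```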
Here’s what I found…
- The successful young professors published a lot. On average, they published 25 conference papers during their first four years. The non-successful professors published only 10. (Recall, in computer science, it’s competitive conference publications, not journal publications, that matter.) There was, however, high variance in these numbers. I was struck more by the floor function: the successful professors all published at least 4 conference papers a year (with some, but not all, publishing quite a bit more).
- Neither the successful nor non-successful professors strayed far from the key conferences in their niche. In theoretical computer science, each niche has its own publication venues, arranged in tiers of quality. There are also a small number of more general venues, which cover all of theoretical computer science, and which are quite competitive and prestigious. Neither of the groups I studied published much in the elite general venues. Both groups published mainly in the quality venues within their niche.
- The biggest differentiating factor between the two groups was citations. For each professor, I counted the citations for their five most cited papers published during their first four years (according to Google Scholar). The difference was staggering. The successful professors’ most cited papers from this period received, on average, over 1000 citations. For the non-successful professors, the number was closer to 60.
As mentioned, I have notebooks filled with different strategies for succeeding in my research, with each such strategy focusing on a different element that struck me as important at the time.
My above experiment sweeps these compelling-sounding ideas off the proverbial table and replaces them with an approach backed by data. What matters, it tells me, is something we can call quality cited papers. In more detail: how many papers per year are you publishing that (a) appear in quality venues and (b) attract citations?
This metric can tell me if I’m improving or not from year to year. Similarly, it provides clear feedback on which of my research directions should be dropped and which emphasized. When deciding whether to join a project, for example, I should start by estimating the expected impact on my quality cited papers value for the year. When deciding whether to apply for a particular grant, the same question should guide the decision.
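To make this concrete, here is a minimal sketch of how the metric might be tallied for a given year. The Paper record, the venue set, and the citation cutoff are illustrative assumptions, not values from the experiment:

```python
# A minimal sketch of the "quality cited papers" metric; the venue set
# and citation cutoff below are hypothetical placeholders.
from dataclasses import dataclass

@dataclass
class Paper:
    title: str
    year: int
    venue: str
    citations: int

QUALITY_VENUES = {"PODC", "DISC", "SODA"}  # hypothetical niche venues
CITATION_CUTOFF = 10                       # hypothetical bar for "attracting citations"

def quality_cited_papers(papers, year):
    """Count papers from the given year that are in a quality venue
    and attracting citations."""
    return sum(1 for p in papers
               if p.year == year
               and p.venue in QUALITY_VENUES
               and p.citations >= CITATION_CUTOFF)
```

A rising value year over year would indicate improvement; a flat or falling value flags research directions worth dropping.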
This metric, in other words, plays the role for a young professor that batting average plays for a young baseball player. You might not like what it has to say, but it’s saying what you need to hear.
Quantitative Career Planning
The above experiment is a case study of a bigger idea that intrigues me. In knowledge work, we spend shockingly little time trying to understand the reality of how people in our positions succeed. Perhaps, as I’ve argued recently, we prefer our own answers to the truth, as our answers tend to sidestep any efforts that are too hard.
But it’s also possible that we simply need a better method for seeking these insights. The process above, which we can call quantitative career planning (a reference to the quantified self movement that encapsulates it), is an example of what these better methods might look like.
Isn’t it interesting that this was pretty much how Google’s PageRank algorithm started out? In hindsight it makes sense that the more you’re referenced, the more relevant, useful and interesting you are in your field. The only issue I see is being able to predict this score ahead of time, since it seems to be more of a post-hoc metric rather than something that can be directly controlled.
Rather than attributing the difference to poor planning, couldn’t it be that the non-successful folks just weren’t that good at writing papers for conferences in their niche, thus causing them not to advance?
> different levels of success
how did you measure their success? number of cited quality papers? 😉
Speed to tenure, if I’m reading the post correctly
This is a great post Cal! Instead of just talking to the experts in my field about what makes them successful, I should also try to see the progressions of their careers to learn it for myself.
I’m going to do a similar study for my own field.
Thanks Cal and great response Gopi.
I’m currently pursuing this alternative way of being successful simply by reading bios, papers/books written etc… by successful folks (including Cal Newport) instead of meeting them thanks to the internet. What I found so simple but key is: They applied Discipline in everything they did to reach where they are now. You can take someone to lunch, etc…but IF the individual does Not have the Discipline to get things done, it all comes down to day dreaming and talk.
This is interesting work. I’m curious if you found significant outliers? People with low publications who were successful nonetheless. Also I’m slightly afraid that chasing high reference counts might be misleading in the long run. It seems like it would be too easy to start pursuing research fads — areas that seem hot and get lots of interest (publications and citations) for a few years and then fall away with little lasting impact. While this might boost your career enough to get tenure, you may end up regretting it later. How do you avoid becoming too fashion driven? (I believe this was something Feynman warned against)
Isn’t this just Hirsch’s h-index rejigged for your specific field, i.e., looking at conference papers rather than journal papers?
It also doesn’t provide insight on HOW to get from A to B. I might aim to publish 4+ high-impact papers in top-tier journals each year, and for them to be cited a lot, but the $64K question is how to actually achieve those goals.
Picking research topics that have a better chance of being accepted for publication, and where the results are likely to be significant enough to be cited a lot, may be a necessary condition. But it’s insufficient. As EJ points out, the ability to write good papers is also a requisite, as is the ability to successfully conduct such significant research.
But it does help avoid career-limiting mistakes, such as picking research topics that are too ho-hum to get published in top journals/conferences or to be referenced by others. It does provide a minimal publication-rate target. It also provides a simple, quantitative measure of whether or not you are ‘making the grade’ as an early-career academic.
ps. For those of us not in the computer science field, it should be noted that the average h-index for top researchers varies quite a lot across different fields.
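For readers who haven’t met it, the h-index is the largest h such that at least h of your papers each have at least h citations. A minimal sketch, with made-up numbers:

```python
def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    counts = sorted(citations, reverse=True)
    h = 0
    for rank, c in enumerate(counts, start=1):
        if c >= rank:
            h = rank
        else:
            break
    return h

print(h_index([1000, 300, 60, 12, 5, 2]))  # -> 5
```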
Are you sure you’ve managed to avoid post hoc ergo propter hoc? If you haven’t, then trying to achieve academic success by working to increase the number of quality cited papers could be a poor, ineffective, or even, in theory, completely counter-productive strategy.
I’m wondering how you determined “successful” and “non-successful” up front (had tenure, didn’t, got grants, didn’t, were paid more, or more well respected). What is the result of the citations? They are prolific and cited, but I’m curious what the impact of this is. Could it be that a person who did a smaller, more focused project had a bigger impact in their field, in their lab, etc.?

There is no indicator here of what “success” is. Is success defined as “people know your research”? It could be viewed as a bit circular, and it raises many questions about the relevance of academic research and how we quantify it. Did these people make a product? File a patent? How does one measure the impact of research? Is this related to “success”? Since tenure is based on publications, those who want it are going to try for this. Is this the only marker of success?

Reminds me of the earlier article on those who got their papers in Science or a more public venue, which seems, qualitatively, to make more sense in terms of impact. How do these people define success, and is it something we agree with? Thanks for the info.
Now I see tenure is in the article title.
The cause and effect could run the other way, i.e., the author being successful may cause their papers to be cited more.
There could also be hidden variables, i.e., certain traits that cause both high citation counts and academic success.
It’s good to see you becoming more empirical – I always thought your insights on having impact in research were fascinating, but I was suspicious since they seemed to be based on intuition or anecdote. The field of scientometrics may have more insights to add here.
I’m not sure that you’ve found out why some people succeed and others don’t – your result is more descriptive than explanatory. For example, perhaps the less successful academics tried to get lots of articles published in top conferences in their niche, but they just didn’t have the ability for some reason.
All the same, you’ve now got a nice metric to test different strategies against, and that’s really useful.
Cal,
Notice the parallels between your professional mission (applying distributed algorithm theory to new settings) and the direction of your writing mission (applying scientific reasoning to decode “patterns of success”)?
Just an outsider’s casual, spur of the moment thought.
Your metric is actually remarkably close to the definition of “impact factor” for journals.
I like the idea to calculate your own impact factor.
Cal, have you decoded the formula for visibility in academia, more specifically, what do you think are the factors that make a blockbuster paper?
The question is: how do you get highly cited papers (in any field)? Unfortunately, this can lead science down very dangerous paths. Possible ways of getting lots of citations:
(i) Write brilliant original papers that everyone admires
(ii) Write papers in areas where lots of people are working so you have more chance of getting cited
(iii) Work hard to publicise your papers at conferences
(iv) Schmooze people so they all know who you are and cite your work
Obviously, other than (i), all of these strategies have major drawbacks:
(ii) encourages derivative work that closely resembles what many other researchers are doing, and encourages back-patting with little innovation
(iii) discriminates against workers with (e.g.) young families or from countries where there is limited budget to access top conferences
(iv) discriminates against people who are ‘different’ and do not fit into scientific society, or do not want to/are bad at playing the political game.
What we should really be asking is how we can ensure that as many scientists as possible hit criterion (i). If you can do that, you are guaranteed tenure, and scientific progress will be all the better for it.
David,
Helen Sword in “Stylish Academic Writing” explores the writing style of academics in leading publications that are named as being especially clear writers by their colleagues. She got numbers. This might be one starting point.
On a broader level, one of our senior faculty suggested that a blockbuster paper is one that is framed in such a way as to be of use outside the direct niche (quote: “Having an accessible ontology”).
For a detailed research study (and this might first be done field-by-field), it might be worth looking at a highly cited journal or conference article, its editor (as in “the gatekeeper”), and its specific scientific, semantic, and linguistic merits; this might then be contrasted with the counterfactual (same journal, editor, even same issue).
One must be aware, though, of clear nepotism in, for example, grant applications, which is considered to be an issue according to a paper in (I believe) Nature, the author and title of which I don’t remember.
This is definitely possible. It’s hard to tell whether the non-successful folks deployed bad strategy or just simply didn’t have the skill level or will to succeed with the right strategy.
Good question. My main criterion was the speed with which they obtained tenure (and then full professorship after that), combined with some standard distinctions in my field.
Part of what makes citations a nice metric is that they’re a proxy for a lot of important things about your research. For example, if you chase an established fad and add an incremental improvement, you won’t earn many citations. Higher citation counts seem to require either: (a) a substantial advancement in an established direction; or (b) a new direction that has obvious importance.
These are both much harder than simply working backwards from what you already know how to do, a default behavior I find to be dangerously alluring. The hard citation metric protects you from this trap.
I would describe my writing mission as trying to figure out how to succeed more in my professional mission!
It’s hard to predict. There’s research, for example, showing that the best predictor of blockbuster papers is the total number of papers the researcher publishes. Frans Johansson, in his new book (which is where I heard about this study), attributes this finding to the need to make lots of bets to figure out what will work. I tend to agree. If you’re actively trying to increase your quality cited paper count, you will produce a larger number of big papers. But you won’t know in advance which papers these will be.
Not to be harsh, but when I hear people complain that writing successful papers requires shifty shortcuts, it sounds to me like sour grapes. In my experience, it really boils down to hard work. To make an impact, first of all, you have to really understand what the best people in your field are doing. This is really hard. Then you have to work on lots of projects in parallel, continually trying to mix and match your collections of hard problems and promising techniques. For a lot of star scientists, there is an inflection point where their toolbox of techniques, and knowledge of the field, is large enough that they suddenly start making lots of breakthroughs. This can seem prodigious, but there’s almost always this foundation of years of hard work behind it.
I think this answer still misses the point the question raised. I don’t think he is questioning the hard work, etc., simply noting that the approach has inherent pitfalls that can lead to many citations for derivative work, or reward networking skills, etc.
I’m interested in how you got the data and converted it into numeric variables.
Did you download their CV websites and use regular expressions to extract the information?
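The post doesn’t say how the data was gathered, but one plausible approach would be to pull publication entries out of plain-text CVs with a regular expression. A purely hypothetical sketch; the entry format and sample lines are made up:

```python
import re

# Made-up CV excerpt in a hypothetical "<year>. <title>. <venue>." format.
CV_TEXT = """
2010. Fast gossip in radio networks. PODC.
2011. Leader election in dual graph models. DISC.
"""

ENTRY = re.compile(r"(?P<year>\d{4})\.\s+(?P<title>[^.]+)\.\s+(?P<venue>[A-Z]+)\.")

for m in ENTRY.finditer(CV_TEXT):
    print(m.group("year"), m.group("venue"), "-", m.group("title"))
```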
Kudos to you for keeping a steady blog post rate while being a new dad. Your fixed schedule must come in very handy these days!
Cal,
Perhaps not in your field, but in many others, one way to get cited a lot is to write a survey of all the work in the field.
Cheers,
Mary
More and more journals are counting the number of times an article is accessed online. This number might be useful, as it would respond much more quickly than citations do.
On the one hand, good research; on the other hand, I’ve been aware of this (as have the professors I’ve known…and I was a faculty brat, so that’s probably as many as were in your sample) for over 50 years now. I would hardly qualify it as ground-breaking.
Of course, it’s also worth remembering that you can be cited as the exemplar of bad work in a field. As long as you’re cited…
Cal–
I’m curious what subset of schools these professors were at? In my field, the selection of ‘success’ happens much earlier than starting an independent position.
Cal, I think it may be as you say, especially in computer science or mathematics, where you have rigorous standards for what works and what does not. But in plenty of other fields (economics, psychology…), standards are not that objective, so there the dangers that DKS mentioned are very real. Remember behaviorism and Freudian psychology? These days, just look into, say, academic sociology and you’ll see the same thing.
PageRank strikes again
This could also be relevant. “How Many Ph.D.’s Actually Get to Become College Professors?” https://www.theatlantic.com/business/archive/2013/02/how-many-phds-actually-get-to-become-college-professors/273434/
Hey Cal,
Great post. I’m a MD-PhD student beginning my PhD portion of training. I want to be systematic about how I attack problems in the lab. Could you elaborate more on the notebooks you kept with successful strategies? Perhaps even post a sample in a future post?
Isn’t #citations pretty explicitly the criterion that tenure committees use to evaluate assistant professors, and also the one that study sections use to score grants? I don’t know, it all seems a little circular somehow. The point about overall productivity predicting the big wins is more interesting (and contradicts a lot of fashionable grumpiness about how professors should strive to publish fewer, better papers).
Perhaps this paper might be of some help: Predicting Citation Counts Using Text and Graph Mining.
Has anyone had first-year professors? I struggled but was doing well, sitting at an A going into the final. It is very difficult because she hasn’t been consistent with her tests; she changes them all up. This final was super detailed and I did not expect this. How do you prepare for new professors and prepare for the unexpected?
I feel like I am always unprepared for things on my tests. Do you just make sure you understand everything said in lecture?
I’m late to this discussion but have thought about this topic also. I think Cal gives some brutally honest and very helpful career advice. On the other hand, I can’t help but think of Gregor Mendel’s long-ignored paper that is the foundation of genetics today. Or Darwin taking 20 years to write that one dang paper — but it was a real doozy, wasn’t it? Or Higgs talking about how nowadays he would have been under too much publication pressure to come up w/ his BIG (as in, Nobel big) idea.
Point is that this is great career advice but let’s not confuse it with a strategy for scientific breakthroughs.
The problem is that you cannot predict which papers will be highly cited, so to end up with those highly cited papers, more is usually better than less. Unfortunately (or fortunately), this is still not enough. I have seen many people “producing” many papers that, in the end, were not cited.
Personally, I prefer people who publish less but can prove that those few papers matter. As a rule of thumb, for every 10 papers you should expect at least one that is cited >50 times after 5 years (this depends on the discipline, but it is easy to set the threshold if you have some insight into your field; say, the h-index of the 5% best-cited authors).
Work more on what you can control: publish more, and citations will add up over time. If you can, pay for open access, which can increase citations!
Cal,
Your profile says that you have “Over 60 peer-reviewed publications”. That is super-impressive! However, doing a Google Scholar search with your name does not return any journal articles (although it does show all of your books). Could you please provide 5 or 6 examples/titles of your peer-reviewed articles? It would be very insightful to see the type of articles that prolific researchers like yourself publish. Thanks in advance.
My academic computer science papers are published under my full name: Calvin Newport