<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Paul Litvak</title>
	<atom:link href="http://www.paullitvak.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.paullitvak.com</link>
	<description>Postprandial otter working at Google; Slow takes from the shallow water</description>
	<lastBuildDate>Thu, 16 May 2013 21:53:35 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Why social science grad students would make great product managers</title>
		<link>http://www.paullitvak.com/2013/05/16/why-social-science-grad-students-would-make-great-product-managers/</link>
		<comments>http://www.paullitvak.com/2013/05/16/why-social-science-grad-students-would-make-great-product-managers/#comments</comments>
		<pubDate>Thu, 16 May 2013 21:53:35 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[technology]]></category>

		<guid isPermaLink="false">http://www.paullitvak.com/?p=111</guid>
		<description><![CDATA[After my interview with InDecision Blog, a number of graduate students emailed asking me about careers in technology (hey, I asked for it). They were a very impressive lot from top universities, but their programming skills varied quite a bit. Some less technically minded folks were looking at careers in technology aside from data scientist. [...]]]></description>
				<content:encoded><![CDATA[<p>After my <a href="http://indecisionblog.com/2013/03/25/into-the-wild-paul-litvak/">interview</a> with InDecision Blog, a number of graduate students emailed asking me about careers in technology (hey, I asked for it). They were a very impressive lot from top universities, but their programming skills varied quite a bit. Some less technically minded folks were looking at careers in technology aside from data scientist. Enough of them asked specifically about product management, so I thought I would combine my answers for others who might be interested.</p>
<p><strong>What does a product manager do?</strong><br />
<a href="https://www.kennethnorton.com/essays/leading-cross-functional-teams.html">Brings the donuts.</a> The nice thing about social science grad students for whom reading about product managers is news is that we can skip over the aggrandized misconceptions about product management that many more familiar with the technology space might harbor. The product manager is the person (or persons) that stands at the interface between an engineering team building a product and the outside world (here includes not only the customers/users of the product, but also the other teams within a given company who might be working on related products). The product manager is in charge of protecting the &#8220;vision&#8221; of the product. Sometimes they come up with that vision, but more often than not, the scope of what the product should be and what features it needs to have today, next week, or next year is something that emerges out of interactions between the engineers, the engineers&#8217; manager, the product manager, company executives, etc etc. The product manager is really just the locus of where that battle plays out. So obviously there is a great need for politicking at times as well. </p>
<p>But wait, there&#8217;s more! Once the product is actually launched, it is typically still worked on and improved (or fixed). So the product manager is also the person that gets to figure out how to prioritize the various additional work that could be done. But how do they figure out what needs to be changed or fixed? This is one of the places where research comes in! So someone like me might do analysis on the data of people&#8217;s actual usage of the product (the product manager prioritized getting the recording of people&#8217;s actions properly instrumented, right? RIGHT?). Or a qualitative researcher might conduct interviews of users in the field and try and abstract an understanding from that. Either way, the product manager has to make sense of all this incoming information and figure out how to allocate resources accordingly.  </p>
<p><strong>Why would social science graduate students be good at that?</strong><br />
Perhaps you can see where I&#8217;m going with this. Products are increasing in scope. Even a simple app has potentially tens of thousands of users. Quantitative methods are becoming increasingly important for understanding what customers do. In such an environment, being savvy about data is hugely advantageous. In the same way that many product managers benefit from computer science degrees without coding on a daily basis, product managers will benefit from knowing statistics, along with domain expertise in psychology, sociology, or anthropology, even if they aren&#8217;t the ones collecting and analyzing the data themselves. It will help them ask the right questions and know when to trust results and when to be more skeptical. It will help them operationalize their measures of success more intelligently.</p>
<p>The soft skills of graduate school also translate more nicely. Replace &#8220;crazy advisor&#8221; with &#8220;manager&#8221; (hopefully a good one) and replace &#8220;fellow graduate students&#8221; with &#8220;other product managers&#8221; and many of the lessons apply. Many graduate social scientists will have plenty of experience with being part of a lab and engaging in large-scale collaborative projects. Just like in graduate school, a typical product manager will spend hours fine tuning slide decks and giving high stakes presentations meant to convince skeptical elders of the merit of a certain course of research (replace with: feature, product, or strategy).</p>
<p>Finally, building technology products is a kind of applied social science. You start with a hypothesis about a problem that people are having that you can solve. Of course, as a social scientist, the typical grad student understands just how fraught this is! Anthropologist readers of James Scott and Jane Jacobs and economists who love their Hayek will have a keen appreciation for spontaneous order (&#8220;look! users are using this feature in a totally unexpected way!&#8221;), as well as the difficulties of a priori theories of users&#8217; problems or competencies. In fact, careful reading of social science should make a fledgling PM pretty skeptical of grand theories. For instance&#8211;should interfaces be simpler or more complicated? How efficient should we make it to do some set of common actions? If everything is easily accessible from one click on the front page, will there be an overload of buttons? Is that simpler or more complicated? These sorts of debates, much like debates about the function of particular social institutions or legal proscriptions, are not easily solved with simple bromides like &#8220;less is always better&#8221;, or &#8220;more clear rules, less discretion&#8221; (I am reading <a href="http://www.amazon.com/gp/product/1476726590/ref=as_li_ss_tl?ie=UTF8&#038;camp=1789&#038;creative=390957&#038;creativeASIN=1476726590&#038;linkCode=as2&#038;tag=paulitsblo-20">Simpler: The Future of Government</a><img src="http://www.assoc-amazon.com/e/ir?t=paulitsblo-20&#038;l=as2&#038;o=1&#038;a=1476726590" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /> by Cass Sunstein right now, and he makes this point very well with respect to regulations). The ethos of the empirical social scientist is to look for incremental improvements bringing all of our particularist knowledge to bear on a problem, not to solve everything with one sweeping gesture. 
This openness is exactly the right mentality for a product manager, in my opinion. </p>
<p><strong>Conclusion</strong><br />
I hope I have at least partially convinced you that as an empirical social scientist, you would make a great product manager. Now the question is, how do I convince someone in technology of that?  The short and most truthful answer is, I&#8217;m not 100% certain. It might take some work to break into product management, but I see lots of people with humanities backgrounds doing it, so it can&#8217;t be that hard (one of my favorite Google PMs is an English PhD). One thing I would suggest is carefully framing your resume to emphasize your PM-pertinent skills&#8211;things like group project management, public speaking experience, making high stakes presentations, etc. You might also consider making a small persuasive deck to show as a portfolio example of a situation where you convinced someone of something (your dissertation proposal could work?). This would be a great start. Another thing is to consider more junior PM roles initially&#8211;as a PhD coming out of grad school you are still going to make a fine salary as an entry-level product manager. If you apply these principles I have no doubt that you will quickly move up. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.paullitvak.com/2013/05/16/why-social-science-grad-students-would-make-great-product-managers/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Shameless self promotion</title>
		<link>http://www.paullitvak.com/2013/03/25/shameless-self-promotion/</link>
		<comments>http://www.paullitvak.com/2013/03/25/shameless-self-promotion/#comments</comments>
		<pubDate>Mon, 25 Mar 2013 14:14:04 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.paullitvak.com/?p=106</guid>
		<description><![CDATA[An interview with me, about life in business vs. the academy&#8230; http://indecisionblog.com/2013/03/25/into-the-wild-paul-litvak/]]></description>
				<content:encoded><![CDATA[<p>An interview with me, about life in business vs. the academy&#8230;</p>
<p><a href="http://indecisionblog.com/2013/03/25/into-the-wild-paul-litvak/">http://indecisionblog.com/2013/03/25/into-the-wild-paul-litvak/</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.paullitvak.com/2013/03/25/shameless-self-promotion/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Why a focus on p-hacking is misplaced, or the coming co-evolution</title>
		<link>http://www.paullitvak.com/2013/03/14/why-a-focus-on-p-hacking-is-misplaced-or-the-coming-co-evolution/</link>
		<comments>http://www.paullitvak.com/2013/03/14/why-a-focus-on-p-hacking-is-misplaced-or-the-coming-co-evolution/#comments</comments>
		<pubDate>Thu, 14 Mar 2013 19:13:46 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[academia]]></category>
		<category><![CDATA[evolution]]></category>
		<category><![CDATA[meritocracy]]></category>

		<guid isPermaLink="false">http://www.paullitvak.com/?p=99</guid>
		<description><![CDATA[There has been a lot of recent work on p-hacking (making things statistically significant through taking advantage of analysis degrees-of-freedom), which I think is good (it&#8217;s starting to make people aware of the scope of the problem facing social psychology and related fields); however, I think people are missing something fundamental. As Tal Yarkoni recently [...]]]></description>
				<content:encoded><![CDATA[<p>There has been a lot of recent work on p-hacking (making things statistically significant through taking advantage of analysis degrees-of-freedom), which I think is good (it&#8217;s starting to make people aware of the scope of the problem facing social psychology and related fields); however, I think people are missing something fundamental. </p>
<p>As Tal Yarkoni recently <a href="http://www.talyarkoni.org/blog/2013/03/12/the-truth-is-not-optional-five-bad-reasons-and-one-mediocre-one-for-defending-the-status-quo/" title="the truth is not optional: five bad reasons (and one mediocre one) for defending the status quo" target="_blank">pointed</a> out (and as I pointed out in a previous blog post), the incentives in the academy are messed up. Success in funding, in getting a job, etc, all hinges on your ability to produce positive results. When your livelihood literally depends on getting a positive result, it&#8217;s very hard to avoid putting your thumb on that scale.</p>
<p>So the solutions thus far proffered involve things like &#8220;publishing your data&#8221; and other such controls that will purport to &#8220;solve&#8221; this problem. However, the deep problem with this can be illustrated with a hypothetical computer program called &#8220;the Fake-ulator&#8221; (I thought about actually writing this program&#8211;but I think the thought experiment is enough for now). Version 1 is just a beta, so it only works for Likert scales. But the idea is simple enough&#8211;if we scour the literature for Likert scale data and effects we quickly realize that simple random draws from a response distribution will be easy to spot. Humans have lots of unique biases that lead to systematic patterns in response data like Likert scale data. So, the authors of the Fake-ulator have scoured the literature and have built a random data generator that generates data that looks indistinguishable statistically from real human response data! Better yet, you can input an effect size and generate beautiful (but not too beautiful) data that is statistically significant. You can even generate a fake file drawer, since many of these fake experiments will be &#8220;failures&#8221;! But hey, since your fake effect is positive, random fake experiments on average will find your effect. So with a computer program like this, you could easily imagine someone faking all of their data in a way that no one would ever notice.</p>
<p>Now what keeps me up at night is, does this computer program already exist? Did we only catch the really dumb fakers who didn&#8217;t take the time to do it the right way? One objection might be that anyone smart enough to do this will just run the studies&#8211;I think this is wrong. Actually running the studies leaves things up to chance. If you really want a 6-figure tenure track job at Harvard or Princeton, real data just won&#8217;t do! </p>
<p>The point of this is just to say that we need more than just clever statistics and safeguards&#8211;until we fundamentally change the incentives of science to reward process instead of outcome, we aren&#8217;t going to solve this problem. We are only going to make it much harder to determine if something is real or not. The adaptations are already upon us!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.paullitvak.com/2013/03/14/why-a-focus-on-p-hacking-is-misplaced-or-the-coming-co-evolution/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>In which, to my surprise, I agree with Megan McArdle and comment on meritocracy</title>
		<link>http://www.paullitvak.com/2013/02/22/in-which-to-my-surprise-i-agree-with-megan-mcardle-and-comment-on-meritocracy/</link>
		<comments>http://www.paullitvak.com/2013/02/22/in-which-to-my-surprise-i-agree-with-megan-mcardle-and-comment-on-meritocracy/#comments</comments>
		<pubDate>Fri, 22 Feb 2013 22:21:30 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[academia]]></category>
		<category><![CDATA[diversity]]></category>
		<category><![CDATA[meritocracy]]></category>

		<guid isPermaLink="false">http://www.paullitvak.com/?p=90</guid>
		<description><![CDATA[This link, which of course touches on many of the same themes as Chris Hayes&#8217; Twilight of the Elites, points out that an increasingly metrics focused way of weeding out potential candidates for some elite group leads to a narrowing of the backgrounds and viewpoints of that elite. This happens as applicants increasingly narrow their [...]]]></description>
				<content:encoded><![CDATA[<p><a href="http://www.thedailybeast.com/articles/2013/02/21/america-s-new-mandarins.html">This</a> link, which of course touches on many of the same themes as Chris Hayes&#8217; <a href="http://www.amazon.com/Twilight-Elites-America-after-Meritocracy/dp/0307720454" title="Twilight of the Elites" target="_blank">Twilight of the Elites</a>, points out that an increasingly metrics focused way of weeding out potential candidates for some elite group leads to a narrowing of the backgrounds and viewpoints of that elite. This happens as applicants increasingly narrow their focus of study to optimize their chances of success (the gaokao in China is another modern day example of this&#8211;there are many).</p>
<p>This connects up to a comment that <a href="https://pinboard.in/u:cshalizi/public/" title="Shalizi's links" target="_blank">Cosma Shalizi</a> made regarding my previous post on <a href="http://www.paullitvak.com/2013/02/18/simgradschool-a-study-in-new-faculty-hiring-practices/">SimGradSchool</a>, objecting that &#8220;I like this a lot, but suspect the assumption of a unidimensional ability score misses a lot of why shit is fucked up and bullshit in the current academic job market.&#8221; I think I understand Cosma&#8217;s objection more broadly, and it connects directly to the notion of cognitive diversity.</p>
<p>If you read Scott Page&#8217;s terrific book on diversity, <a href="http://press.princeton.edu/titles/8353.html">The Difference</a>, he utilizes simulation to compellingly argue that the key to solving difficult problems is having a diversity of viewpoints drawn from a large pool of possible ways of thinking. Cosma and Henry Farrell have made a similar <a href="http://crookedtimber.org/2012/05/23/cognitive-democracy/">argument</a> for the benefits of democracy&#8211;that the voting mechanism of democracy is the best way to solve the problem of aggregating preferences and solving complex coordination problems among agents. </p>
<p>So, I think these arguments point to another deeper problem for a unidimensional perspective on research ability. Discovery in science requires a diversity of viewpoints to make progress. If we make all the undergrads come from the same background (e.g. research assistant at a top lab from the beginning of undergrad, poster presentations at relevant conferences, etc.), or new faculty (come from these 10 schools and have 2 JPSPs / psych science journal articles), the problem is that we are going to get too narrow of a pool of potential researchers. One of the unique strengths of my graduate program at CMU was that they took students from many different backgrounds (I basically did a psych/econ grad degree with 0 econ classes, 2 psych classes and a philosophy/cs major). I think it definitely gave us a unique perspective. More broadly, I worry about whether a grades/test scores focused society is going to quash the very creativity that has been so central to innovation. Imagine Steve Jobs trying to get a job today in tech as a dropout from Reed with some calligraphy coursework and no technical major&#8211;not happening. </p>
<p>Of course, the problem remains&#8211;what do you do with the flood of applicants? You still have a sorting problem. How do you select for cognitive diversity in the right way? This has become an increasingly large <a href="http://www.nytimes.com/2013/01/28/business/employers-increasingly-rely-on-internal-referrals-in-hiring.html?pagewanted=2&#038;hp&#038;_r=0&#038;pagewanted=all">problem</a> at tech companies which are leaning on referrals even more than before. I have a few thoughts about this that I will share in an upcoming blog post.</p>
<p>Now the only problem is I probably took the wind out of Cosma&#8217;s sails and he won&#8217;t blog about me anymore <img src='http://www.paullitvak.com/wp-includes/images/smilies/icon_sad.gif' alt=':-(' class='wp-smiley' />  </p>
]]></content:encoded>
			<wfw:commentRss>http://www.paullitvak.com/2013/02/22/in-which-to-my-surprise-i-agree-with-megan-mcardle-and-comment-on-meritocracy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SimGradSchool, a study in new faculty hiring practices</title>
		<link>http://www.paullitvak.com/2013/02/18/simgradschool-a-study-in-new-faculty-hiring-practices/</link>
		<comments>http://www.paullitvak.com/2013/02/18/simgradschool-a-study-in-new-faculty-hiring-practices/#comments</comments>
		<pubDate>Mon, 18 Feb 2013 16:31:16 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[academia]]></category>
		<category><![CDATA[simulation]]></category>

		<guid isPermaLink="false">http://www.paullitvak.com/?p=16</guid>
		<description><![CDATA[[Attention conservation notice: 2000+ words about hiring in academia including an overly complex numerical model. Navel gazing surely to follow.] Like many other social science graduate students who have graduated in the last 5-10 years, I have experienced the ratcheting up of competition. It&#8217;s interesting to think about the changing landscape of the competition for [...]]]></description>
				<content:encoded><![CDATA[<p><strong>[Attention conservation notice: 2000+ words about hiring in academia including an overly complex numerical model. Navel gazing surely to follow.]</strong> <a class="simple-footnote" title="You might have noticed me copying a few tropes from another, much more intelligent blogger than me. Well, imitation flattery blah blah." id="return-note-16-1" href="#note-16-1"><sup>1</sup></a></p>
<p>Like many other social science graduate students who have graduated in the last 5-10 years <a class="simple-footnote" title="You might ask, &#8220;why are you writing this, Paul? Are you actually embittered about academia given your current life?&#8221; Well, overly invasive individual, I am not trying to make some grand claim about how I was jobbed. For one thing, I didn&#8217;t even go on the job market. For another, if I had, I probably would have benefited at least partially from some of the effects I discuss. The issue for me had more to do with student debt, being an immigrant with no assets, socio-economic class, being genuinely passionate about technology from a young age, and discovering an interest in business as well as the desire to have a wider impact on people&#8217;s lives." id="return-note-16-2" href="#note-16-2"><sup>2</sup></a>, I have experienced <a title="Game's the same, just got more fierce" href="http://www.youtube.com/watch?v=XozWJgzgnT8">the ratcheting up of competition</a>. It&#8217;s interesting to think about the changing landscape of the competition for scarce tenure-track faculty positions in the last 30 years. On the one hand, the successes of identity politics have brought us an increasingly diverse (in terms of race, gender, nationality) academy. On the other, at least in the behavioral sciences, the sheer number of competitors and the need for impartial measures of candidate quality have narrowed the range of what top candidates&#8217; dossiers look like <a class="simple-footnote" title="I would also guess it has affected the SES of successful academics&#8211;if you are carrying a lot of student debt, it makes the choice to become an academic very difficult. I don&#8217;t know of any research on this, however." id="return-note-16-3" href="#note-16-3"><sup>3</sup></a>. 
To be clear, I am not advocating for us to go back to the good old days where knowing the right person and a pat on the back were all that mattered <a class="simple-footnote" title="Of course, in some ways it still very much matters. I&#8217;m not aware of any systematic study, but I&#8217;ve observed that the children of professors have an extremely high success rate in academia. Part of that is surely being well taught and genetically more likely to be intelligent. But there is another factor&#8211;children of academics seem more savvy about the game, are always well positioned with the right advisor, in the right program, the right research, etc. Hard to say which of these factors is most important." id="return-note-16-4" href="#note-16-4"><sup>4</sup></a>. But I do think that what we have now is also far from the optimal strategy in selecting the most promising behavioral scientists and giving them the best resources to succeed.</p>
<p>So what does the hiring game today look like? A few recent papers give us some clues. First, there&#8217;s <a href="http://gppreview.com/2012/12/03/superpowers-the-american-academic-elite/">this</a> analysis of the political science hiring market, finding that the graduates of top programs (Harvard, Stanford, etc) dominate the job market. So much of the hiring game is already decided when you go through graduate admissions at the outset. In my experience advisor choice definitely is key&#8211;pick a good advisor and your chances increase. Another key factor is publication&#8211;it used to be the case that a top grad student from a top department maybe had one top tier journal article to their name. Now you typically see top candidates with 3-4 such papers, plus a host of &#8220;secondary&#8221; publications (like textbooks or edited volumes, which because they are not peer-reviewed, are typically worth less). </p>
<p>As readers of this are likely aware, the behavioral sciences (and biological sciences, to a lesser extent) are facing something of a replicability crisis, with many key findings in our field being <a href="http://www.nature.com/polopoly_fs/7.6716.1349271308!/suppinfoFile/Kahneman%20Letter.pdf">called</a> into question. Did these games exist before? Yes, in all likelihood. However, what has changed is the environment of hyper-motivation. Loewenstein and Rick coined this term to describe the feeling of being &#8220;in a hole&#8221; or disadvantaged in a competition. The desire to &#8220;level the playing field&#8221; or &#8220;get back to even&#8221; is shown to be a prime motive in cheating behavior <a class="simple-footnote" title="Incidentally, you can see an instance of this in a recent paper on teacher incentives. The incentive that worked the best to motivate teachers was giving them a bonus and then threatening to take it away unless students&#8217; scores increased. Of course, they increased. The lesson of hyper-motivation is that we shouldn&#8217;t be so quick to employ incentives such as these&#8211;they may be more likely to lead to cheating." id="return-note-16-5" href="#note-16-5"><sup>5</sup></a>. So in a world where you constantly feel like you don&#8217;t have enough papers to succeed and get a top job and will be consigned to some far corner of the country or a low paying post-doc, you are all the more likely to cheat. Oh, and lest you think that whether a promising researcher gets into a top tier school or a second tier school doesn&#8217;t really matter&#8211;there is strong evidence that it does. See <a href="http://papers.nber.org/papers/w12157">this</a> paper that shows that an exogenous shock to hiring (a recession) affects the long term productivity of economists that are hired. So people are legitimately trying to maximize their career opportunities when they cut corners to get a few more papers out.</p>
<p><b>A Simple Model of Graduate School and Hiring</b><br />
Struck by this system in which your success is determined by your pedigree and the number of papers you publish, I set out to construct a simple numerical model of graduate school. I wanted to see how your underlying quality as a researcher correlated with hiring outcomes in a stylized environment. I posted the R simulation code that I wrote below so that other people can examine and even extend my model if they wish. Note that what follows is a high level summary of how the model works and my general findings from playing with it. Readers interested in the details of how it works are advised to check out the code, which is fairly well-commented. </p>
<p>(<b>TL;DR Summary</b><br />
The number of papers has at best a moderate correlation with your underlying quality as a researcher. The lion&#8217;s share of selection comes from advisors picking the best students. If graduate school sorting is reasonably good, hiring will be reasonably good.)</p>
<p>The basic idea of the model is that each graduate student is represented by an underlying quality parameter. This parameter determines the average true effect sizes of experiments they run. So in this representation, a better researcher is someone who proposes more effects with larger average effect size. I realize this is a bit of an oversimplification, but setting things up this way has some nice properties. Essentially we can generate some proposed effects from a pool of researchers, set up some very simple well-powered or under-powered studies  to test those effects and simulate how many positive results they get. </p>
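<p>To make the difference between a well-powered and an under-powered study concrete, here is a minimal Python sketch (it is not the R code referred to in this post). It assumes a two-condition design with unit-variance normal outcomes and a pooled-variance t statistic. The function names, the example effect size of 0.5, and the hard-coded approximate critical values are illustrative assumptions, not details of the original model.</p>

```python
import math
import random

# Approximate two-sided 5% critical values of Student's t (df = 2n - 2)
# for the two per-condition sample sizes used here. Hard-coded assumption
# so the sketch needs only the standard library.
T_CRIT = {25: 2.011, 50: 1.984}


def study_is_significant(effect_size, n, rng):
    """Simulate one two-condition study; True if |t| exceeds the cutoff."""
    control = [rng.gauss(0.0, 1.0) for _ in range(n)]
    treated = [rng.gauss(effect_size, 1.0) for _ in range(n)]
    mean_c = sum(control) / n
    mean_t = sum(treated) / n
    var_c = sum((x - mean_c) ** 2 for x in control) / (n - 1)
    var_t = sum((x - mean_t) ** 2 for x in treated) / (n - 1)
    # Pooled-variance t statistic; with equal group sizes the standard
    # error simplifies to sqrt((var_c + var_t) / n).
    t_stat = (mean_t - mean_c) / math.sqrt((var_c + var_t) / n)
    return abs(t_stat) > T_CRIT[n]


def estimated_power(effect_size, n, trials=2000, seed=1):
    """Share of simulated studies that come out significant."""
    rng = random.Random(seed)
    hits = sum(study_is_significant(effect_size, n, rng) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for n in (25, 50):
        print(n, estimated_power(0.5, n))
```

<p>Running it for the same true effect at 25 and 50 participants per condition shows how much the chance of a positive result depends on sample size alone, independent of the researcher's underlying quality.</p>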
<p>The way I set up the course of graduate school then consisted of two parts. First, graduate students are assigned an advisor of some varying quality score. I had an external parameter modify the degree of correlation between the advisor quality and the underlying graduate student quality. So either advisors are great judges of talent, or not so good (I leave it to the reader to decide what the underlying value for the parameter should be). Graduate students then either run somewhat under-powered or somewhat well-powered studies, the number of which range over the number someone might expect to run in a typical grad student career (This can range from just a few studies all the way up to 100. Yes, there are definitely graduate students who run 100 studies over the course of graduate school.). So there are a number of different factors that contribute to the number of successful (aka t-test reveals p < .05) studies a graduate student in my simulation ends up with--their sample sizes, the number of studies they manage to pull off, and of course, their underlying quality. From there, I created a weighted average of the advisor's quality and the number of successful studies each graduate student had. This weighted score is a measure of job market desirability. In a crude way, it measures how well each student would fare on the market. </p>
<p>Assuming you believe this is a reasonable model (see objections below if/when you don't), you can then correlate the weighted hiring score to the original quality score and ask, under a different set of model parameters, what is the correlation between hiring score and quality score? If the given hiring procedure were doing a reasonable job, then you would expect the correlation would be high. If bad, then low.</p>
<p>What do actual runs of the model yield? As you can see from the code below, I set up a pool of 1000 graduate students of different quality, assigned them to advisors with fairly high correlation between student and advisor quality (I set it at .5 and .7 for the various models), and randomized how many studies they ran and with what sample sizes (25 or 50 per condition). Basically what you find is a moderate correlation between hiring score and quality score (~.6). However, interestingly, most of that is driven by advisor selection. The correlation between positive results and your quality as a student was significantly lower (~.35). Now, you might think that's pretty good (that is a medium effect size as psychologists typically define it). But the question you should be asking yourself is--relative to what? Also, this model assumes no p-hacking or file-drawer effects. The effects posited by experimenters are assumed to be independent so dropping failed studies isn't a problem. In real life, the correlation is likely to be much smaller, not larger.</p>
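<p>For readers who want to experiment, the pipeline described above can be roughly sketched in Python. This is an illustrative reconstruction, not the R code referred to in this post, and it is not expected to reproduce the reported correlations. Several choices are assumptions made for the sketch: standard-normal quality scores, the linear mapping from quality to mean effect size, the per-study spread of effects, standardizing both components before weighting, and all function and parameter names.</p>

```python
import math
import random

# Approximate two-sided 5% t cutoffs for df = 2n - 2 (hard-coded assumption).
T_CRIT = {25: 2.011, 50: 1.984}


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def zscores(xs):
    """Standardize a list to mean 0 and sample standard deviation 1."""
    n = len(xs)
    m = sum(xs) / n
    sd = math.sqrt(sum((x - m) ** 2 for x in xs) / (n - 1))
    return [(x - m) / sd for x in xs]


def count_successes(mean_effect, n, n_studies, rng):
    """Run n_studies two-condition studies; count those with |t| above cutoff."""
    wins = 0
    for _ in range(n_studies):
        # Assumption: each study's true effect varies around the student's mean.
        effect = rng.gauss(mean_effect, 0.1)
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(effect, 1.0) for _ in range(n)]
        ma, mb = sum(a) / n, sum(b) / n
        va = sum((x - ma) ** 2 for x in a) / (n - 1)
        vb = sum((x - mb) ** 2 for x in b) / (n - 1)
        if abs(mb - ma) / math.sqrt((va + vb) / n) > T_CRIT[n]:
            wins += 1
    return wins


def simulate(n_students=1000, rho=0.7, advisor_weight=0.5, seed=42):
    """Return correlations of hiring score, successes, and advisor with quality."""
    rng = random.Random(seed)
    quality, advisor, successes = [], [], []
    for _ in range(n_students):
        q = rng.gauss(0.0, 1.0)
        # Advisor quality correlated with student quality at roughly rho.
        adv = rho * q + math.sqrt(1 - rho ** 2) * rng.gauss(0.0, 1.0)
        mean_effect = max(0.0, 0.3 + 0.15 * q)  # assumed quality-to-effect map
        n = rng.choice([25, 50])                # per-condition sample size
        n_studies = rng.randint(3, 100)         # "a few" up to 100 studies
        quality.append(q)
        advisor.append(adv)
        successes.append(count_successes(mean_effect, n, n_studies, rng))
    z_adv, z_succ = zscores(advisor), zscores(successes)
    hiring = [advisor_weight * a + (1 - advisor_weight) * s
              for a, s in zip(z_adv, z_succ)]
    return {
        "hiring_vs_quality": pearson(hiring, quality),
        "successes_vs_quality": pearson(successes, quality),
        "advisor_vs_quality": pearson(advisor, quality),
    }


if __name__ == "__main__":
    print(simulate(n_students=300))
```

<p>Varying <code>rho</code> and <code>advisor_weight</code> is the quickest way to see how much of the hiring score's relationship to quality comes from advisor matching rather than from the count of positive results.</p>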
<p><strong>Some Obvious Objections</strong><br />
Now one obvious objection is that this is a gross oversimplification of the job market. There is an extensive interview process, so if someone&#8217;s research does not adequately reflect their skills, then the interview process would correct for that. Notice, however, that who a typical institution chooses to interview in the first place is largely determined by a weighted average of the papers and advisor&#8211;no one is interviewing a candidate from University of Random with no papers. So even if there are other factors that affect selection and are more closely correlated with underlying quality, those factors do not come into play until the end of the process, when arguably the other promising students have already been weeded out. Also, there is a huge hindsight bias coming into play&#8211;you already know that this person has successful research, so you are more likely to believe that what they are saying is plausible. </p>
<p>Another objection is that we actually do want to reward the people who run more studies&#8211;we want to reward some element of &#8220;can do gumption&#8221; beyond just researcher skill, and your positive result count is a nice indicator of that (since it partly reflects the sheer amount of work you put in). I have some sympathy for this argument, but given what I wrote above about hypermotivation and the various file drawer issues, I&#8217;m not sure we want to reward people just for running a lot of studies. Also, it&#8217;s not clear to me that the difference between someone who ran &#8220;many&#8221; and &#8220;few&#8221; studies translates into differences in effort expended&#8211;you might run more studies because you have more grant money at a prestigious university, or a better set up subject pool, or better undergrad research experience, or a very organized advisor&#8211;many reasons that have nothing to do with your overall quality as a researcher, even assuming that, ceteris paribus, more studies = harder worker.</p>
<p><strong>Conclusions</strong><br />
So improvements in our hiring algorithm could potentially increase the quality of scientific output. Arguably there is a Moneyball opportunity here&#8211;the correlation between IQ and SAT/ACT scores <a href="http://www.iapsych.com/iqmr/koening2008.pdf">ranges</a> from around .6 to .8 <a class="simple-footnote" title="I&#8217;m just using IQ as an example. Not that I believe in IQ as a great measure of intelligence. See another brilliant post by Shalizi here on this topic." id="return-note-16-6" href="#note-16-6"><sup>6</sup></a>. So one might imagine a hiring process built around evaluating research before it&#8217;s executed, or a general purpose test meant to measure knowledge of the social sciences, or your ability to think about and present a research problem and solution with a few days&#8217; notice. Surely a battery of tests that attempted to measure the quality of research hypotheses a priori could be much more successful at identifying and promoting the best scientists than relying on the small sample of results produced at the start of their careers. </p>
<p><code></p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">library</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;plyr&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #228B22;"># Indicates the number of graduate students in the simulation.</span>
num.<span style="">students</span> <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">1000</span>
&nbsp;
<span style="color: #228B22;"># Student quality is drawn from real values from 1-5 which correspond</span>
<span style="color: #228B22;"># to the alpha parameter in a beta distribution with the</span>
<span style="color: #228B22;"># beta parameter set to 10. The mean of this distribution is</span>
<span style="color: #228B22;"># alpha / (alpha + beta) and indicates the average true effect size</span>
<span style="color: #228B22;"># of any experiment that a student runs. </span>
students <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">runif</span><span style="color: #080;">&#40;</span>num.<span style="">students</span><span style="color: #080;">&#41;</span> <span style="color: #080;">*</span> <span style="color: #ff0000;">4</span> <span style="color: #080;">+</span> <span style="color: #ff0000;">1</span>
&nbsp;
<span style="color: #228B22;"># Experiments denotes how many experiments the students run</span>
<span style="color: #228B22;"># On average, a student runs 18 experiments in graduate school.</span>
<span style="color: #228B22;"># Because the distribution is a log-normal, there are a few</span>
<span style="color: #228B22;"># outliers who run a lot of studies (~100). This lines up roughly</span>
<span style="color: #228B22;"># with the author's observations and conversation with other</span>
<span style="color: #228B22;"># students.</span>
experiments <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">floor</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rlnorm</span><span style="color: #080;">&#40;</span>num.<span style="">students</span>, <span style="color: #ff0000;">2.75</span>, .65<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #228B22;"># Subjects denotes how many subjects the students run per</span>
<span style="color: #228B22;"># condition in their experiments. For simplicity, students</span>
<span style="color: #228B22;"># can either run a large amount per condition (50), or a</span>
<span style="color: #228B22;"># small amount (25).</span>
subjects <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">sample</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">25</span>,<span style="color: #ff0000;">50</span><span style="color: #080;">&#41;</span>, num.<span style="">students</span>, <span style="color: #0000FF; font-weight: bold;">replace</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">T</span><span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #228B22;"># Inputs</span>
<span style="color: #228B22;"># ss: vector of students</span>
<span style="color: #228B22;"># r: quality of admissions (correlation between the quality of the </span>
<span style="color: #228B22;">#    students and the quality of the advisors)</span>
<span style="color: #228B22;"># Outputs</span>
<span style="color: #228B22;"># vector of scores which represent advisor quality, correlated with</span>
<span style="color: #228B22;"># student quality at roughly r</span>
run.<span style="">admissions</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>ss, r<span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span>
  admit.<span style="">scores</span> <span style="color: #080;">&lt;-</span> <span style="color: #080;">&#40;</span>ss <span style="color: #080;">*</span> r<span style="color: #080;">&#41;</span> <span style="color: #080;">+</span> <span style="color: #080;">&#40;</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">runif</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span>ss<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #080;">*</span> <span style="color: #ff0000;">4</span> <span style="color: #080;">+</span> <span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span> <span style="color: #080;">*</span>
                  <span style="color: #0000FF; font-weight: bold;">sqrt</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">-</span><span style="color: #080;">&#40;</span>r<span style="color: #080;">^</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
  admit.<span style="">scores</span>
<span style="color: #080;">&#125;</span>
&nbsp;
<span style="color: #228B22;"># Inputs</span>
<span style="color: #228B22;"># df: data.frame consisting of a vector for student quality, a vector for</span>
<span style="color: #228B22;"># the number of experiments each student runs in graduate school, and a</span>
<span style="color: #228B22;"># vector for the number of subjects per condition the student runs.</span>
<span style="color: #228B22;"># Outputs</span>
<span style="color: #228B22;"># The number of successful experiments each student has at the end</span>
<span style="color: #228B22;"># of grad school </span>
run.<span style="">grad</span>.<span style="">school</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">df</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span>
  out <span style="color: #080;">&lt;-</span> adply<span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">df</span>, <span style="color: #ff0000;">1</span>, <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span> run.<span style="">research</span><span style="color: #080;">&#40;</span>s <span style="color: #080;">=</span> x$quality, 
                                               n <span style="color: #080;">=</span> x$num.<span style="">exp</span>,
                                               p <span style="color: #080;">=</span> x$num.<span style="">subjects</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
  <span style="color: #0000FF; font-weight: bold;">names</span><span style="color: #080;">&#40;</span>out<span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span>out<span style="color: #080;">&#41;</span><span style="color: #080;">&#93;</span> <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">&quot;num.positives&quot;</span>
  out <span style="color: #080;">&lt;-</span> out<span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">order</span><span style="color: #080;">&#40;</span>out$quality, decreasing <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">T</span><span style="color: #080;">&#41;</span>,<span style="color: #080;">&#93;</span>
  out
<span style="color: #080;">&#125;</span>
&nbsp;
<span style="color: #228B22;"># Inputs</span>
<span style="color: #228B22;"># s: avg true effect size of a particular student</span>
<span style="color: #228B22;"># p: number of subjects per condition</span>
<span style="color: #228B22;"># Outputs</span>
<span style="color: #228B22;"># 1: indicates a successful study, 0: indicates a failure</span>
run.<span style="">study</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>s, p<span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span>
  m <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rbeta</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span>,s,<span style="color: #ff0000;">10</span><span style="color: #080;">&#41;</span>
  x <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span>p<span style="color: #080;">&#41;</span>
  y <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span>p,<span style="color: #0000FF; font-weight: bold;">mean</span> <span style="color: #080;">=</span> m<span style="color: #080;">&#41;</span>
  p.<span style="">value</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">t.<span style="">test</span></span><span style="color: #080;">&#40;</span>x,y<span style="color: #080;">&#41;</span>$p.<span style="">value</span>
  out <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">ifelse</span><span style="color: #080;">&#40;</span>p.<span style="">value</span> <span style="color: #080;">&lt;=</span> .05, <span style="color: #ff0000;">1</span>, <span style="color: #ff0000;">0</span><span style="color: #080;">&#41;</span>
  out
<span style="color: #080;">&#125;</span>
&nbsp;
<span style="color: #228B22;"># Inputs</span>
<span style="color: #228B22;"># s: avg effect size represents student quality</span>
<span style="color: #228B22;"># n: number of experiments run in grad school</span>
<span style="color: #228B22;"># p: number of subjects per condition in each study  </span>
<span style="color: #228B22;"># Outputs</span>
<span style="color: #228B22;"># number of successful results (i.e. papers published)</span>
run.<span style="">research</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>s, n, p<span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span>
  out <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">sum</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">unlist</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">lapply</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">seq_len</span><span style="color: #080;">&#40;</span>n<span style="color: #080;">&#41;</span>, FUN <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span> run.<span style="">study</span><span style="color: #080;">&#40;</span>s, p<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
  out
<span style="color: #080;">&#125;</span>
&nbsp;
<span style="color: #228B22;"># Inputs</span>
<span style="color: #228B22;"># df: dataframe containing advisors' quality as well as the number</span>
<span style="color: #228B22;"># of positive results that the student obtained.</span>
run.<span style="">hiring</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">df</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span>
  <span style="color: #0000FF; font-weight: bold;">df</span>$z.<span style="">exp</span> <span style="color: #080;">&lt;-</span> <span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">df</span>$num.<span style="">positives</span> <span style="color: #080;">-</span>  <span style="color: #0000FF; font-weight: bold;">mean</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">df</span>$num.<span style="">positives</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #080;">/</span> <span style="color: #0000FF; font-weight: bold;">sd</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">df</span>$num.<span style="">positives</span><span style="color: #080;">&#41;</span>
  <span style="color: #0000FF; font-weight: bold;">df</span>$z.<span style="">adv</span> <span style="color: #080;">&lt;-</span> <span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">df</span>$advisor <span style="color: #080;">-</span> <span style="color: #0000FF; font-weight: bold;">mean</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">df</span>$advisor<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #080;">/</span> <span style="color: #0000FF; font-weight: bold;">sd</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">df</span>$advisor<span style="color: #080;">&#41;</span>
  <span style="color: #0000FF; font-weight: bold;">df</span>$total_score <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">df</span>$z.<span style="">exp</span> <span style="color: #080;">+</span> <span style="color: #0000FF; font-weight: bold;">df</span>$z.<span style="">adv</span>
  <span style="color: #0000FF; font-weight: bold;">df</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">df</span><span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">order</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">df</span>$total_score, decreasing <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">T</span><span style="color: #080;">&#41;</span>,<span style="color: #080;">&#93;</span>
  <span style="color: #0000FF; font-weight: bold;">df</span>
<span style="color: #080;">&#125;</span> 
&nbsp;
<span style="color: #228B22;">### Simulation Code Here</span>
advisors <span style="color: #080;">&lt;-</span> run.<span style="">admissions</span><span style="color: #080;">&#40;</span>students, .5<span style="color: #080;">&#41;</span>
s.<span style="">df</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span><span style="color: #080;">&#40;</span>quality <span style="color: #080;">=</span> students, num.<span style="">exp</span> <span style="color: #080;">=</span> experiments,
                   num.<span style="">subjects</span> <span style="color: #080;">=</span> subjects, advisor <span style="color: #080;">=</span> advisors<span style="color: #080;">&#41;</span>
out.<span style="">df</span> <span style="color: #080;">&lt;-</span> run.<span style="">grad</span>.<span style="">school</span><span style="color: #080;">&#40;</span>s.<span style="">df</span><span style="color: #080;">&#41;</span>
hired.<span style="">df</span> <span style="color: #080;">&lt;-</span> run.<span style="">hiring</span><span style="color: #080;">&#40;</span>out.<span style="">df</span><span style="color: #080;">&#41;</span></pre></td></tr></table></div>

<p></code></p>
<div class="simple-footnotes"><p class="notes">Notes:</p><ol><li id="note-16-1">You might have noticed me copying a few tropes from another, much more intelligent <a href="http://vserver1.cscs.lsa.umich.edu/~crshalizi/weblog/">blogger</a> than me. Well, imitation flattery blah blah. <a href="#return-note-16-1">&#8617;</a></li><li id="note-16-2">You might ask, &#8220;why are you writing this, Paul? Are you actually embittered about academia given your current life?&#8221; Well, overly invasive individual, I am not trying to make some grand claim about how I was jobbed. For one thing, I didn&#8217;t even go on the job market. For another, if I had, I probably would have benefited at least partially from some of the effects I discuss. The issue for me had more to do with student debt, being an immigrant with no assets, socio-economic class, being genuinely passionate about technology from a young age, and discovering an interest in business as well as the desire to have a wider impact on people&#8217;s lives. <a href="#return-note-16-2">&#8617;</a></li><li id="note-16-3">I would also guess it has affected the SES of successful academics&#8211;if you are carrying a lot of student debt, it makes the choice to become an academic very difficult. I don&#8217;t know of any research on this, however. <a href="#return-note-16-3">&#8617;</a></li><li id="note-16-4">Of course, in some ways it still very much matters. I&#8217;m not aware of any systematic study, but I&#8217;ve observed that the children of professors have an extremely high success rate in academia. Part of that is surely being well taught and genetically more likely to be intelligent. But there is another factor&#8211;children of academics seem more savvy about the game, are always well positioned with the right advisor, in the right program, the right research, etc. Hard to say which of these factors is most important. 
<a href="#return-note-16-4">&#8617;</a></li><li id="note-16-5">Incidentally, you can see an instance of this in a recent <a href="http://www.economics.harvard.edu/faculty/fryer/files/enhancing_teacher_incentives.pdf">paper</a> on teacher incentives. The incentive that worked the best to motivate teachers was giving them a bonus and then threatening to take it away unless students&#8217; scores increased. Of course, they increased. The lesson of hyper-motivation is that we shouldn&#8217;t be so quick to employ incentives such as these&#8211;they may be more likely to lead to cheating. <a href="#return-note-16-5">&#8617;</a></li><li id="note-16-6">I&#8217;m just using IQ as an example. Not that I believe in IQ as a great measure of intelligence. See another brilliant post by Shalizi <a href="http://vserver1.cscs.lsa.umich.edu/~crshalizi/weblog/523.html">here</a> on this topic. <a href="#return-note-16-6">&#8617;</a></li></ol></div>]]></content:encoded>
			<wfw:commentRss>http://www.paullitvak.com/2013/02/18/simgradschool-a-study-in-new-faculty-hiring-practices/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>First post</title>
		<link>http://www.paullitvak.com/2010/01/04/hello-world/</link>
		<comments>http://www.paullitvak.com/2010/01/04/hello-world/#comments</comments>
		<pubDate>Mon, 04 Jan 2010 22:53:04 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.paullitvak.com/?p=1</guid>
		<description><![CDATA[After a long hiatus I set this up so I could post random musings on topics I couldn&#8217;t put anywhere else. Coming very soon, a project I have been working on&#8211; SimGradSchool. &#160;]]></description>
				<content:encoded><![CDATA[<p>After a long hiatus I set this up so I could post random musings on topics I couldn&#8217;t put anywhere else. Coming very soon, a project I have been working on&#8211; SimGradSchool.</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.paullitvak.com/2010/01/04/hello-world/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
