<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-8377255335801820601</id><updated>2012-02-19T11:00:42.058-08:00</updated><title type='text'>kvh</title><subtitle type='html'>my blog, wherein i take extreme positions on loaded issues just for page views. just kidding, it's mostly boring math essays.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://www.kenvanharen.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8377255335801820601/posts/default'/><link rel='alternate' type='text/html' href='http://www.kenvanharen.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>kvh</name><uri>http://www.blogger.com/profile/05988298721002241103</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>6</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-8377255335801820601.post-2112898033356502973</id><published>2012-02-17T10:24:00.000-08:00</published><updated>2012-02-17T10:24:15.877-08:00</updated><title type='text'>What's the best machine learning algorithm?</title><content type='html'>&lt;div&gt;&lt;span style="font-family: Arial; font-size: 15px; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"&gt;People often want to know what "the best" machine learning 
algorithm is. This kind of question has the same answer in every field: “it depends”. If we could shelve our pedantry for a moment though, and give an answer, that answer would be &lt;b&gt;Random Forests&lt;/b&gt;. A Random Forest is a machine learning procedure that trains and aggregates a large number of individual &lt;a href="http://en.wikipedia.org/wiki/Decision_tree"&gt;decision trees&lt;/a&gt;. It works for any generic classification or regression problem; is robust to different variable input types, missing data, and outliers; &lt;a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.149.4944"&gt;has been shown to perform extremely well across large classes of data&lt;/a&gt;; and scales reasonably well computationally (it’s also &lt;a href="https://cwiki.apache.org/MAHOUT/random-forests.html"&gt;map-reducible&lt;/a&gt;). Perhaps best of all, it requires little tuning to get good results. Robustness and ease-of-use are not appreciated as often as they should be in machine learning (not to the extent cool-sounding names are, anyways), and it's hard to beat tree ensembles, and Random Forests in particular, on these dimensions.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Arial;"&gt;&lt;span style="font-size: 15px; white-space: pre-wrap;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Arial; font-size: 15px; white-space: pre-wrap;"&gt;Random forests work by generating many (typically hundreds of) decision trees in a specific random way such that each is &lt;i&gt;de-correlated&lt;/i&gt; from the others. Since each decision tree is a low-bias, high-variance estimator, and each is relatively uncorrelated with the others, when we aggregate their predictions we get a final prediction with low bias AND low variance. &lt;a href="http://nlp.stanford.edu/IR-book/html/htmledition/the-bias-variance-tradeoff-1.html"&gt;Magic&lt;/a&gt;. The trick is in getting trees trained on the same dataset to be uncorrelated. 
This is accomplished by using randomly sampled subsets of features for evaluation at each node in each tree and a randomly sampled subset (bootstrap) of data points to train each tree.&lt;/span&gt;&lt;span style="font-family: Arial;"&gt;&lt;span style="font-size: 15px; white-space: pre-wrap;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family: Arial; font-size: 15px; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Arial; font-size: 15px; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"&gt;Put simply, if you have a machine learning problem and you don’t know what to use, you should use random forests. Here, in table form (courtesy of &lt;a href="http://www-stat.stanford.edu/~tibs/ElemStatLearn/"&gt;Hastie, Tibshirani and Friedman&lt;/a&gt;), is why:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Arial; font-size: 15px; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-aixI6Uwci8E/TxYFS5JMpfI/AAAAAAAAAaI/43DzKHSIbro/s1600/Screen+shot+2012-01-17+at+3.32.42+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="283" src="http://4.bp.blogspot.com/-aixI6Uwci8E/TxYFS5JMpfI/AAAAAAAAAaI/43DzKHSIbro/s320/Screen+shot+2012-01-17+at+3.32.42+PM.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;span style="font-family: Arial; font-size: 15px; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"&gt;&lt;br /&gt;Random forests inherit most of the good attributes of "Trees" in the above chart, but in addition also have state-of-the-art predictive power. 
Their main drawbacks are a lack of good interpretability, something that most other highly predictive algorithms do even worse on; and computational performance -- if you need something for real-time production, it could be hard to justify using random forests and spending the time to evaluate hundreds or thousands of trees.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Arial;"&gt;&lt;span style="font-size: 15px; white-space: pre-wrap;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Arial;"&gt;&lt;span style="font-size: 15px; white-space: pre-wrap;"&gt;If you are interested in playing around, grab the &lt;a href="http://cran.r-project.org/web/packages/randomForest/index.html"&gt;R package&lt;/a&gt;.&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Arial;"&gt;&lt;span style="font-size: 15px; white-space: pre-wrap;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Arial; font-size: 15px; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"&gt;I recently heard the president of &lt;a href="http://www.kaggle.com/"&gt;Kaggle&lt;/a&gt;, Jeremy Howard,&lt;/span&gt;&lt;span style="font-family: Arial; font-size: 15px; white-space: pre-wrap;"&gt; mention that Random Forests seem to show up in a disproportionate number of winning entries in their data mining competitions.&amp;nbsp;&lt;/span&gt;&lt;span style="font-family: Arial; font-size: 15px; white-space: pre-wrap;"&gt;Cross-validation, I call that.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Arial; font-size: 15px; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Arial; font-size: 15px; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"&gt;More reading:&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Arial; font-size: 15px; text-decoration: 
none; vertical-align: baseline; white-space: pre-wrap;"&gt;&lt;a href="http://cs.ecs.baylor.edu/~hamerly/courses/5325_10s/papers/decision_trees/banfield07ensemble.pdf" style="font-family: Times; font-size: medium; white-space: normal;"&gt;A Comparison of Decision Tree Ensemble&amp;nbsp;Creation Techniques&lt;/a&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://www.cs.cornell.edu/~caruana/ctp/ct.papers/caruana.icml06.pdf"&gt;An Empirical Comparison of Supervised Learning Algorithms&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Arial;"&gt;&lt;span style="font-size: 15px; white-space: pre-wrap;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family: Arial; font-size: 15px; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: Arial; font-size: 15px; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Arial;"&gt;&lt;span style="font-size: 15px; white-space: pre-wrap;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family: Arial; font-size: 15px; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8377255335801820601-2112898033356502973?l=www.kenvanharen.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.kenvanharen.com/feeds/2112898033356502973/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.kenvanharen.com/2012/02/whats-best-machine-learning-algorithm.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8377255335801820601/posts/default/2112898033356502973'/><link rel='self' 
type='application/atom+xml' href='http://www.blogger.com/feeds/8377255335801820601/posts/default/2112898033356502973'/><link rel='alternate' type='text/html' href='http://www.kenvanharen.com/2012/02/whats-best-machine-learning-algorithm.html' title='What&apos;s the best machine learning algorithm?'/><author><name>kvh</name><uri>http://www.blogger.com/profile/05988298721002241103</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-aixI6Uwci8E/TxYFS5JMpfI/AAAAAAAAAaI/43DzKHSIbro/s72-c/Screen+shot+2012-01-17+at+3.32.42+PM.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8377255335801820601.post-7762587470270580520</id><published>2012-02-15T15:43:00.000-08:00</published><updated>2012-02-16T12:15:26.406-08:00</updated><title type='text'>Recurrent -- A python library for natural language parsing of recurring events</title><content type='html'>For a project I'm working on I needed the ability to turn a natural language phrase like "every other saturday starting next month" into &lt;a href="http://www.kanzaki.com/docs/ical/rrule.html"&gt;iCalendar-standard RRULEs&lt;/a&gt;. I couldn't find a python library that implemented this, so I built it. 
Check it out on &lt;a href="https://github.com/kvh/recurrent"&gt;github&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Here are some example input phrases and output recurrence rules:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;'on weekdays' =&amp;gt;&amp;nbsp;'RRULE:BYDAY=MO,TU,WE,TH,FR;INTERVAL=1;FREQ=WEEKLY'&lt;/li&gt;&lt;li&gt;'daily starting march 3rd until april 5th' =&amp;gt; 'DTSTART:20120303\nRRULE:FREQ=DAILY;INTERVAL=1;UNTIL=20120405'&lt;/li&gt;&lt;li&gt;'the first and third friday of every month' =&amp;gt;&amp;nbsp;'RRULE:BYDAY=1FR,3FR;INTERVAL=1;FREQ=MONTHLY'&lt;/li&gt;&lt;li&gt;'once a year&amp;nbsp;on the fourth thursday in november' =&amp;gt;&amp;nbsp;'RRULE:BYMONTH=11;BYDAY=4TH;INTERVAL=1;FREQ=YEARLY'&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;It's an alpha release currently, so please submit any issues you find.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8377255335801820601-7762587470270580520?l=www.kenvanharen.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.kenvanharen.com/feeds/7762587470270580520/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.kenvanharen.com/2012/02/recurrent-python-library-for-natural.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8377255335801820601/posts/default/7762587470270580520'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8377255335801820601/posts/default/7762587470270580520'/><link rel='alternate' type='text/html' href='http://www.kenvanharen.com/2012/02/recurrent-python-library-for-natural.html' title='Recurrent -- A python library for natural language parsing of recurring events'/><author><name>kvh</name><uri>http://www.blogger.com/profile/05988298721002241103</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8377255335801820601.post-7620610579933453500</id><published>2012-01-05T13:47:00.000-08:00</published><updated>2012-01-31T13:50:53.389-08:00</updated><title type='text'>Basic Income Guarantee</title><content type='html'>Job losses from this recession are staggering.&amp;nbsp;&lt;a href="http://www.calculatedriskblog.com/2012/01/seasonal-retail-hiring-duration-of.html"&gt;calculated risk&lt;/a&gt; has a nice visualization of the magnitude of man hours lost&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-EmiWHMve7Tc/TwcK3pIAWGI/AAAAAAAALzs/bN-MhoCYXs0/s1600/EmployRecessAlignedDec2011.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="207" src="http://3.bp.blogspot.com/-EmiWHMve7Tc/TwcK3pIAWGI/AAAAAAAALzs/bN-MhoCYXs0/s320/EmployRecessAlignedDec2011.jpg" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;These losses have not been spread evenly of course -- the less educated have taken the brunt&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-8W0Yxa_IArM/TwcsTmkEX1I/AAAAAAAAL0U/7TgSXMAIfKw/s1600/EducationUnemploymentDec2011.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="227" src="http://1.bp.blogspot.com/-8W0Yxa_IArM/TwcsTmkEX1I/AAAAAAAAL0U/7TgSXMAIfKw/s320/EducationUnemploymentDec2011.jpg" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Even under optimistic scenarios (which we are not currently experiencing), it will take 5 to 10 years to get these jobs back. If they ever come back.&lt;br /&gt;&lt;br /&gt;This means a lot of people without work and income. The question then is: what should we do about this? 
Should we do anything?&lt;br /&gt;&lt;br /&gt;One option is to give everyone a basic income. The &lt;a href="http://en.wikipedia.org/wiki/Basic_income_guarantee"&gt;basic income guarantee&lt;/a&gt; is an&lt;span style="background-color: white; font-family: sans-serif; line-height: 19px;"&gt;&amp;nbsp;&lt;/span&gt;&lt;span style="font-family: inherit;"&gt;&lt;span style="background-color: white; line-height: 19px;"&gt;"&lt;/span&gt;&lt;i style="background-color: white; line-height: 19px;"&gt;unconditional&lt;/i&gt;&lt;span style="background-color: white; line-height: 19px;"&gt;, government-insured guarantee that all citizens will have enough income to meet their basic needs." It differs specifically from a negative income tax, like our current social programs, in that &lt;i&gt;every &lt;/i&gt;individual (not household) receives the benefit, regardless of wealth or income, unconditionally.&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Why would this work? There are a bunch of reasons, which I'll get into, but more pressing is the one big reason why people think it &lt;i&gt;won't&lt;/i&gt; work: people will lose the motivation to work. I think this belief represents a misunderstanding of human motivation. People work harder -- go to college, put in the hours -- because of innate desires for status and social approval, not out of terror of not meeting their basic needs. 
In fact, under pressure of homelessness or starvation, people will often be forced to make sub-optimal long-term decisions, like forgoing investment&amp;nbsp;(schooling)&amp;nbsp;in themselves or their children.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Anyways, here are the pros and cons as I see them, and my rebuttals.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: large;"&gt;Cons&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Reduced incentive to work&lt;/b&gt;&lt;br /&gt;I addressed this above, but there is hard evidence on this.&amp;nbsp;Manitoba, Canada, actually tried a basic income program for 5 years, known as&amp;nbsp;&lt;a href="http://en.wikipedia.org/wiki/Mincome"&gt;Mincome&lt;/a&gt;. The &lt;a href="http://www.irpp.org/po/archive/jan01/hum.pdf"&gt;results&lt;/a&gt;:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;blockquote class="tr_bq"&gt;On the whole, the research results were&amp;nbsp;encouraging to those who favour a GAI [Guaranteed Annual Income]. The&amp;nbsp;reduction in work effort was modest: about one&amp;nbsp;per cent for men, three per cent for wives, and&amp;nbsp;five per cent for unmarried women. These are&amp;nbsp;small effects in absolute terms.&lt;/blockquote&gt;&lt;/blockquote&gt;Smaller and shorter-term studies in the US have found slightly larger effects, around a 5% total drop in employment. I think this is a real drawback. It's easy to see how demand for certain low-paying jobs would decrease, raising prices and reducing output for those goods below an optimal level. The good part about basic income is that its distortion of incentives is very simple and fairly predictable, unlike most federal social programs that target specific actions or demographics and create myriad twisted incentives and unintended consequences.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Redistribution&lt;/b&gt;&lt;br /&gt;You may or may not see this as a con, but in the end a basic income is a redistribution of wealth from the rich to the poor. 
(Yes, the "rich" get the income too, but they are paying for their own, as well as the poor's, in taxes.) You can make a moral argument against redistribution, but I think it is a weak one. Luck plays an enormous role in any one person's outcome. To the extent we can smooth out luck, I think we should. A basic income guarantee seems like a reasonable step in the right direction.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Inflation&lt;/b&gt;&lt;br /&gt;Some people argue a basic income would simply raise prices. Again, this money is not just printed; it comes from a (presumably progressive) tax, so any effect on the money supply would come from marginal consumption differences at high and low income levels, which do exist empirically, but the effect would be small and predictable (i.e. preventable).&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Political hazard&lt;/b&gt;&lt;br /&gt;It would be hard politically to ever reduce the basic income, and very politically profitable to increase it. This would quickly result in another runaway social program (we have enough of those already). I think this is very easily solved by tying the basic income amount to hard economic and government indicators (inflation, GDP, debt ratio) in a way that guarantees both the solvency of the state and utility of the income. The rate setting should never be something that politicians can easily change.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: large;"&gt;Pros&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Simplification&lt;/b&gt;&lt;br /&gt;I am a HUGE proponent of &lt;i&gt;simplified&lt;/i&gt;&amp;nbsp;(not smaller) government. 
Currently we have&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Medicaid&lt;/li&gt;&lt;li&gt;Medicare&lt;/li&gt;&lt;li&gt;CHIP&lt;/li&gt;&lt;li&gt;Social Security&lt;/li&gt;&lt;li&gt;food stamps&lt;/li&gt;&lt;li&gt;unemployment benefits&lt;/li&gt;&lt;li&gt;Section 8 housing&lt;/li&gt;&lt;li&gt;TANF&lt;/li&gt;&lt;li&gt;and more...&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;All of these could be replaced by a &lt;i&gt;dead simple&lt;/i&gt;&amp;nbsp;basic income program: cut everyone over 18 with a social security number a check. The opportunities for the government to screw it up are minimal; the bureaucracy would be tiny.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Increased education&lt;/b&gt;&lt;/div&gt;Research on the Mincome program found significant increases in education.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Reduced poverty, homelessness&lt;/b&gt;&lt;br /&gt;This is of course the big benefit. The negative externalities of poverty are large (&lt;a href="http://npc.umich.edu/publications/u/working_paper06-42.pdf"&gt;cite&lt;/a&gt;, &lt;a href="http://www.americanprogress.org/issues/2007/01/pdf/poverty_report.pdf"&gt;cite&lt;/a&gt;). Reducing poverty benefits everyone. Even if you are rich, have no feelings of empathy for the poor, and only want the best for yourself (in short, you are an asshole), I still think it is in your best interest, economically and politically, to keep poverty low.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;I was prompted to write about this by my growing belief that human labor, as a factor of production, will eventually and inevitably be priced out of&amp;nbsp;&lt;i&gt;most&lt;/i&gt;&amp;nbsp;of the economy. Machines have been producing, and will continue to produce, more and more of our goods without our input. This is an &lt;i&gt;awesome&lt;/i&gt; thing, and it doesn't mean humans won't find other productive work to fill their time; it just means that, as a proportion of our total output, our contribution will be continually less significant. 
This means a lot of surplus value, value that could be productively put towards a basic income instead of, say, capitalists' (bulging) pockets.&lt;br /&gt;&lt;br /&gt;Now I sound like a Marxist.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8377255335801820601-7620610579933453500?l=www.kenvanharen.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.kenvanharen.com/feeds/7620610579933453500/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.kenvanharen.com/2012/01/basic-income-guarantee.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8377255335801820601/posts/default/7620610579933453500'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8377255335801820601/posts/default/7620610579933453500'/><link rel='alternate' type='text/html' href='http://www.kenvanharen.com/2012/01/basic-income-guarantee.html' title='Basic Income Guarantee'/><author><name>kvh</name><uri>http://www.blogger.com/profile/05988298721002241103</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-EmiWHMve7Tc/TwcK3pIAWGI/AAAAAAAALzs/bN-MhoCYXs0/s72-c/EmployRecessAlignedDec2011.jpg' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8377255335801820601.post-4394140198462757258</id><published>2011-12-16T15:28:00.000-08:00</published><updated>2012-01-31T22:52:51.611-08:00</updated><title type='text'>How to save Hacker News</title><content type='html'>There seems to be increasing agreement that HN is slowly descending into another reddit (no offense reddit, you're great). 
I'm sure someone has suggested this before, but I think the fix is simple: split upvoting into two distinct actions:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;I &lt;b&gt;agree&lt;/b&gt;&amp;nbsp;with this comment/post&lt;/li&gt;&lt;li&gt;This comment/post was &lt;b&gt;informative&lt;/b&gt; and &lt;b&gt;high-quality&lt;/b&gt;&lt;/li&gt;&lt;/ol&gt;Here is a poorly conceived implementation:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-U8amJ8Z6iQQ/TxYU1Iu0nvI/AAAAAAAAAaQ/Ulf_V1ByLvc/s1600/Screen+shot+2012-01-17+at+4.38.42+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/-U8amJ8Z6iQQ/TxYU1Iu0nvI/AAAAAAAAAaQ/Ulf_V1ByLvc/s1600/Screen+shot+2012-01-17+at+4.38.42+PM.png" /&gt;&lt;/a&gt;&lt;/div&gt;(this comment "informs" me, this comment "conforms" with my beliefs/values)&lt;br /&gt;&lt;br /&gt;The ranking (on the front page and for comments, or just for comments) could then be some mix of the two votes with options for sorting by different mixes: "most informative", "most conforming". The popularity of the single binary voting dimension on aggregation sites is no doubt due to simplicity and the maximum mental effort a community member is willing to expend on evaluating any given link/comment. &lt;i&gt;But&lt;/i&gt;, I don't think this is too much to ask. Especially for a community like HN. 
I think this would have a tremendous impact on the quality of comments (and links).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8377255335801820601-4394140198462757258?l=www.kenvanharen.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.kenvanharen.com/feeds/4394140198462757258/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.kenvanharen.com/2011/12/how-to-save-hacker-news.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8377255335801820601/posts/default/4394140198462757258'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8377255335801820601/posts/default/4394140198462757258'/><link rel='alternate' type='text/html' href='http://www.kenvanharen.com/2011/12/how-to-save-hacker-news.html' title='How to save Hacker News'/><author><name>kvh</name><uri>http://www.blogger.com/profile/05988298721002241103</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-U8amJ8Z6iQQ/TxYU1Iu0nvI/AAAAAAAAAaQ/Ulf_V1ByLvc/s72-c/Screen+shot+2012-01-17+at+4.38.42+PM.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8377255335801820601.post-1937288011214850981</id><published>2011-11-20T17:44:00.000-08:00</published><updated>2012-01-17T17:45:17.818-08:00</updated><title type='text'>Introducing Scientifiqa, the science-based Q&amp;A site</title><content type='html'>I just launched &lt;a href="http://scientifiqa.com/"&gt;Scientifiqa&lt;/a&gt;, yet another general Q&amp;amp;A site (and stackexchange clone). What makes this one different? 
&lt;b&gt;Every answer must be a citation (and summary) of a peer-reviewed academic article or survey&lt;/b&gt;. The hope is that this requirement will ensure high quality discussions and give people an easy and quick way to come up to speed on current scientific understanding in a particular area.&lt;br /&gt;&lt;br /&gt;Some questions already posted:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://scientifiqa.com/questions/9/where-did-the-1s-new-wealth-come-from"&gt;Where did the 1%'s new wealth come from?&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://scientifiqa.com/questions/15/do-antioxidants-reduce-the-risk-of-cancer"&gt;Do antioxidants reduce the risk of cancer?&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;Or &lt;a href="http://www.scientifiqa.com/questions/ask/"&gt;ask your own question&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Please share any suggestions or feedback!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8377255335801820601-1937288011214850981?l=www.kenvanharen.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.kenvanharen.com/feeds/1937288011214850981/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.kenvanharen.com/2012/01/introducing-scientifiqa-science-based-q.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8377255335801820601/posts/default/1937288011214850981'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8377255335801820601/posts/default/1937288011214850981'/><link rel='alternate' type='text/html' href='http://www.kenvanharen.com/2012/01/introducing-scientifiqa-science-based-q.html' title='Introducing Scientifiqa, the science-based Q&amp;A 
site'/><author><name>kvh</name><uri>http://www.blogger.com/profile/05988298721002241103</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8377255335801820601.post-8623704577486291795</id><published>2011-10-03T13:38:00.000-07:00</published><updated>2012-02-06T11:04:23.556-08:00</updated><title type='text'>A better arXiv</title><content type='html'>&lt;span id="internal-source-marker_0.7660917197354138"&gt;&lt;span style="font-family: inherit;"&gt;&lt;span style="font-weight: normal; vertical-align: baseline; white-space: pre-wrap;"&gt;Imagine if arXiv was a fully functional online community.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: normal; vertical-align: baseline; white-space: pre-wrap;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: normal; vertical-align: baseline; white-space: pre-wrap;"&gt;Here’s one vision:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: inherit;"&gt;&lt;span style="font-weight: normal; vertical-align: baseline; white-space: pre-wrap;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: inherit; white-space: pre-wrap;"&gt;&lt;b&gt;Member profiles&lt;/b&gt;&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: inherit; white-space: pre-wrap;"&gt;The community would be open to anyone (possibly contingent upon approval by existing members.) 
Profiles would hook into LinkedIn, Mendeley, ResearchGate, etc. to pull in your education, institutions and publications.&lt;/span&gt;&lt;br /&gt;&lt;span style="white-space: pre-wrap;"&gt; &lt;/span&gt;&lt;br /&gt;&lt;span style="white-space: pre-wrap;"&gt;&lt;b&gt;Reputation&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: inherit; white-space: pre-wrap;"&gt;Every member of the community would have a (private) reputation score based on their “impact” in their field. All actions on the site would be weighted by a member’s reputation, which would be calculated from:&lt;/span&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: inherit; white-space: pre-wrap;"&gt;published papers (traditional impact scores)&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: inherit; white-space: pre-wrap;"&gt;education, employment and institutions (verified by community)&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: inherit; white-space: pre-wrap;"&gt;actions within the arXiv community (peer reviews, upvoted comments, well-rated submissions)&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;&lt;span style="white-space: pre-wrap;"&gt;This would obviously be controversial, and would have to be handled openly and with care, but I think it is necessary in order to keep quality high and to entice well-respected academics to engage in an online forum.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="white-space: pre-wrap;"&gt; &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="white-space: pre-wrap;"&gt;&lt;b&gt;Peer review&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: inherit; white-space: pre-wrap;"&gt;Members could write and request (public or private) reviews of submissions. This would include “rating” the submission across various dimensions. 
Possible dimensions:&lt;/span&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: inherit; white-space: pre-wrap;"&gt;methodology&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: inherit; white-space: pre-wrap;"&gt;novelty&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: inherit; white-space: pre-wrap;"&gt;expected impact&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="white-space: pre-wrap;"&gt;&lt;b&gt;Comments&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="white-space: pre-wrap;"&gt;There would also be an outlet for less formal discussion, with upvoting and threading.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="white-space: pre-wrap;"&gt; &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="white-space: pre-wrap;"&gt;&lt;b&gt;Article discovery&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="white-space: pre-wrap;"&gt;There would be support for sorting &lt;/span&gt;&lt;span style="white-space: pre-wrap;"&gt;and filtering submissions by ratings and reviews. There could be a published “journal” of best articles every month, possibly selected by weighted vote, if people felt the need for a formal publication.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="white-space: pre-wrap;"&gt;In addition there would be a personalized feed of submissions, based on topic filters and a recommender system.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="white-space: pre-wrap;"&gt; &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="white-space: pre-wrap;"&gt;&lt;b&gt;Open access&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="white-space: pre-wrap;"&gt;Most importantly, &lt;/span&gt;&lt;b style="white-space: pre-wrap;"&gt;ANYONE &lt;/b&gt;&lt;span style="white-space: pre-wrap;"&gt;could read the output of this community. 
Even the taxpayer/college student who funded it all...&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="white-space: pre-wrap;"&gt; &lt;/span&gt;&lt;/div&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="font-family: inherit;"&gt;In addition to these community features, there could be added capabilities for submitting papers. This could be simple things like support for attaching code, data or arbitrary media; or a&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family: inherit; white-space: pre-wrap;"&gt; more significant overhaul of a “paper”, turning it into what is essentially a web page, with inline links for references, expandable sections, embedded media, etc. Unfortunately, there is no standardized typesetting solution for the web yet, so some work might be required to make this happen.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: inherit;"&gt; &lt;span style="font-weight: normal; vertical-align: baseline; white-space: pre-wrap;"&gt;If this got built and people adopted it, I think &lt;/span&gt;&lt;/span&gt;&lt;span style="white-space: pre-wrap;"&gt;it could deliver a swift, fatal blow to the academic publishing industry -- something it desperately needs. (When I say swift, I mean swift in academic terms.)&lt;/span&gt;&lt;span style="font-family: inherit;"&gt;&lt;br /&gt;&lt;span style="font-weight: normal; vertical-align: baseline; white-space: pre-wrap;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: normal; vertical-align: baseline; white-space: pre-wrap;"&gt;I built &lt;/span&gt;&lt;a href="http://science.io/" style="font-weight: bold;"&gt;&lt;span style="color: #000099; font-weight: normal; vertical-align: baseline; white-space: pre-wrap;"&gt;science.io&lt;/span&gt;&lt;/a&gt;&lt;span style="font-weight: normal; vertical-align: baseline; white-space: pre-wrap;"&gt; with this end vision in mind. If you are involved with arXiv or are interested in making this or something like it happen, I’d love to chat. 
Get in touch at &lt;a href="mailto:kvh@science.io"&gt;kvh@science.io&lt;/a&gt;.&lt;/span&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8377255335801820601-8623704577486291795?l=www.kenvanharen.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.kenvanharen.com/feeds/8623704577486291795/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.kenvanharen.com/2011/10/better-arxiv.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8377255335801820601/posts/default/8623704577486291795'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8377255335801820601/posts/default/8623704577486291795'/><link rel='alternate' type='text/html' href='http://www.kenvanharen.com/2011/10/better-arxiv.html' title='A better arXiv'/><author><name>kvh</name><uri>http://www.blogger.com/profile/05988298721002241103</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>
