<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Aug 26, 2017 at 7:18 AM, Thomas Levine <span dir="ltr"><<a href="mailto:_@thomaslevine.com" target="_blank">_@thomaslevine.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">I don't follow your use case chart; could you give an example?<br>
<br>
It sounds, though, like you and your friends are talking past each other.<br>
They discuss job requirements, and you discuss software design. Dijkstra<br>
commented on this common conflict.<br>
<br>
Simplicity is a great virtue but it requires hard work to achieve it<br>
and education to appreciate it. And to make matters worse: complexity<br>
sells better.<br>
<a href="https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD896.html" rel="noreferrer" target="_blank">https://www.cs.utexas.edu/<wbr>users/EWD/transcriptions/<wbr>EWD08xx/EWD896.html</a><br>
<br>
Ignoring the need to sell newfangled complexity, I find generic search<br>
engines, strict text parsers, and summary statistics to be far more<br>
practical, effective, and reliable than the overwhelming majority of<br>
branded machine learning products. This is just like how I prefer the<br>
reliability and ease-of-use of OpenBSD over the novelty, complexity, and<br>
opacity of most of the other popular contemporary operating systems.<br>
<br>
</blockquote></div><br></div><div class="gmail_extra"><span style="font-size:12.8px">One of the reasons machine learning is becoming popular is that big data is a commodity now. Users can store vast amounts of data in "the cloud": BigQuery, Hadoop, Redshift, Oracle RAQ, etc. Ten years ago the dominant challenge was</span><span style="font-size:12.8px"> cheaply storing and querying data, but now that is becoming a commodity. An enterprise can buy Tableau, use Amazon Redshift, and have an analyst (or a non-technical product manager) give them summary statistics until all parties are blue in the face, with 500 scheduled reports running a day. </span><span style="font-size:12.8px"><br></span></div><div class="gmail_extra"><span style="font-size:12.8px"><br></span></div><div class="gmail_extra"><span style="font-size:12.8px">You cannot replace summary statistics with machine learning. </span><span style="font-size:12.8px">A classic machine learning tool is linear regression, which you can use to make predictions. 
</span><span style="font-size:12.8px">You take a dataset and you train a model. That model can then be used to make predictions. </span></div><div class="gmail_extra"><span style="font-size:12.8px"><br></span></div><div class="gmail_extra"><span style="font-size:12.8px">For example, given users with tens/hundreds/thousands of attributes (age, gender,...) and a bid request with </span><span style="font-size:12.8px">tens/hundreds/thousands of attributes (time of day, url,...), which attributes can be used to predict the final bid price? Running one process (linear regression) that tells you which attribute or combination of attributes predicts the price COULD BE easier and simpler than having humans attempt to figure it out by producing different sets of summary statistics, collectively deciding what to optimize on, and constantly re-evaluating the rules as the landscape changes. </span></div><div class="gmail_extra"><span style="font-size:12.8px"><br></span></div><div class="gmail_extra"><span style="font-size:12.8px">Obviously there is a hype cycle; not every problem needs machine learning. But get readdyyy for a ::shocker:: not everything is solved by the BSD ports tree.</span></div><div class="gmail_extra"><br></div></div>
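To make the bid-price example concrete, here is a rough sketch of "take a dataset, train a model, make predictions" using ordinary least squares. Everything specific in it is an assumption for illustration: the three attributes (hour_of_day, user_age, is_mobile), the six synthetic rows, and the pricing formula used to generate the targets are all made up, and the pure-Python normal-equations solver is just one simple way to fit the model, not anything from a real bidding system.

```python
# Ordinary least squares via the normal equations, standard library only.
# All attribute names, rows, and coefficients are hypothetical.

def fit_linear_regression(rows, targets):
    """Return [intercept, w1, w2, ...] minimizing the squared error."""
    X = [[1.0] + [float(v) for v in row] for row in rows]  # intercept column
    n = len(X[0])
    # Augmented system (X^T X | X^T y).
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot_row = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot_row][col]) < 1e-12:
            raise ValueError("attributes are linearly dependent")
        A[col], A[pivot_row] = A[pivot_row], A[col]
        pivot = A[col][col]
        A[col] = [v / pivot for v in A[col]]
        for r in range(n):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


def predict(weights, row):
    """Apply fitted weights to one row of attributes."""
    return weights[0] + sum(w * float(v) for w, v in zip(weights[1:], row))


# Synthetic training data: (hour_of_day, user_age, is_mobile).
rows = [(9, 25, 1), (14, 40, 0), (20, 31, 1), (3, 52, 0), (11, 19, 1), (17, 45, 0)]

def made_up_price(hour, age, mobile):
    # Hypothetical ground truth, used only to generate noise-free targets.
    return 0.50 + 0.02 * hour + 0.01 * age + 0.30 * mobile

prices = [made_up_price(*row) for row in rows]

weights = fit_linear_regression(rows, prices)
print("fitted weights:", weights)
print("predicted price:", predict(weights, (12, 30, 1)))
```

Because the synthetic prices contain no noise, the fitted weights should land very close to the made-up coefficients; with real bid data the interesting part is reading off which weights are large, and that only works when the relationship is roughly linear and the attributes are not redundant.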