This past weekend I was at Kalamazoo College for my five-year college reunion. I mentally prepared myself to feel really old but came out of it feeling young and refreshed instead. Funny how that works out.

Senior year housemates!

All weekend, I was continually reminded of how sharp and engaged K students are. In one conversation with a fellow Spanish major, we discussed data science, science and technology policy, Buenos Aires, and the works of Gabriel García Márquez (that last one in Spanish, of course). I got to talk about geothermal heating for buildings, facilitating social change in Michigan, film production in NYC, the challenges of living and working on a farm, and streaking the Quad. It was great! We’re the product of an intense liberal arts education, and it shows: we’re documentary filmmakers, organic farmers, Googlers, librarians, pharmacists, energy efficiency consultants, women’s rights advocates, public servants, and scientists, all of us endlessly interested in the world around us.

But enough. Why am I bringing this up in a data blog, if not just to boast about my alma mater? Well, in an older post I summarized what data science is all about, and one of the recurring themes I found was that hacking and statistics skills only get you so far (essentially, that’s machine learning). To be a successful data scientist, you also need domain expertise. To put it another way, you need to know how to ask good questions, which demands either existing familiarity with the topic or the ability to familiarize yourself quickly. This, I feel, is where a liberal arts education proves its worth. Even if you don’t know much about a topic, you have a broad base of knowledge from which you can make comparisons and inferences and a well-trained ability to bring yourself up to speed.

Domain expertise vs. machine learning was the topic of an interesting debate at the Strata Conference back in February (when I was writing my dissertation and not paying attention to such things, so it’s new to me) that featured many big names in data science. Here’s a summary if you don’t want to watch the whole thing. They don’t conclude one way or another on which skill set is most useful for data scientists (as usual, it depends), and they certainly don’t discuss the benefits of a K College education as applied to this issue, but if nothing else, it’s good food for thought. I’ve been chewing on it all weekend. To close, here’s a quote from LinkedIn’s data science job application, way back in 2008:

We’re looking for superb analytical minds of all levels to expand our small team that will build some of the most innovative products at LinkedIn. No specific technical skills are required. We will help you learn SQL, Python, and R. You should be extremely intelligent, have a quantitative background, and be able to learn quickly and work independently. This is a perfect job for somebody who is really smart, driven, and extremely skilled at creatively solving problems. You’ll learn statistics, data mining, programming, and product design. But, you’ve got to start with something we can’t teach: intellectual sharpness and creativity.