Data Science of Wine. Why?

The wine industry is awash with data. At any one time, there are hundreds of thousands of wines across the globe, each associated with features such as varietal, price, year, ratings, reviews, and yield, as well as some of the underlying drivers such as climate and soil. It is a rich data set reflecting a complex, fickle, and sensitive product across space and time. However, I see relatively little data science happening, at least in public forums.

Sure, there’s some simple exploratory data analysis, such as vintage charts that tally a mysterious and subjective rating over time. The UCI wine quality data set is a common sight in data science courses. And there are a few people doing some interesting things with visualization (such as this and this) and recommendation engines. However, it is surprisingly little given the size of the industry and the global breadth and appeal of wine.

This blog is an attempt to satisfy my own curiosity, develop some predictive models and data-driven insights, and teach myself a few new machine learning tricks. I love wine. I love data science. So, why not?
