Contemporary Analysis

Data Science

Nate Watson

March Machine Learning Mayhem

Machine Learning and the NCAA Men’s Basketball Tournament Methodology

 <<This article is meant to be the technical document following the above article. Please read the following article before continuing.>>

“The past may not be the best predictor of the future, but it is really the only tool we have”

 

Tadd Wood

When should you Update Predictive Models?

New clients often ask why and how frequently CAN needs to update their predictive models. Predictive models need to be updated because new data is created every day. For example, your customers are buying more, subscribing, or unsubscribing. The environment is constantly changing. While predictive models can handle a lot of new data, over time environmental changes build up, causing predictive models to lose their effectiveness. After a month, a quarter, or a year, it is necessary to update predictive models with new data.

As these new patterns emerge, it's important to periodically take time to investigate your data, update your models, and challenge your assumptions about your business. But how often should you do this?

Tadd Wood

How Much Data Do I Need For Predictive Analytics?

Before beginning any predictive analytics project, it's essential to investigate the breadth and depth of data available. However, at what point is it acceptable to say you have enough data to start?

The politically correct answer to this question is that it depends. Depends on what though?

Well, for starters, certain types of data science and predictive analysis projects have more specific data requirements. In an extreme case, predicting survival rates of people or machines may require data spanning their entire lifespan. However, in most cases, data requirements are less stringent.

In most cases, taking a snapshot of 3 to 5 years' worth of data can yield a breadth of patterns surrounding consumer and business behavior. Why?

Tadd Wood

What Is Data?

We get asked all the time at CAN, "What is data?" "Data" is a term to describe facts, processes, or events that can be recorded and measured. Whether descriptive or quantitative, nearly anything can be converted into data. Facebook profiles, sales numbers, interest rates, zip codes, Twitter tweets, emails, DNA sequences, and flight tracking information are all examples of data, and we have a lot of it. Data is collected from many different places, and while humans can collect data, machines and technology can collect far more and do it quicker. Computing systems are designed to collect massive amounts of data on the processes they observe or facilitate, yet most of this is never used. Data sits idle because no one has figured out how to use it. Technology on the processing side has nearly caught up with the collecting side, and this is starting to make all the difference.

Thanks to these advances in computer processing power and storage capacity, 90% of the data available to humankind did not exist 2 years ago. Think about that for a minute. In other words, data is this age's most abundant raw material.
