History of data science

Written by Ryan Morrison

History of data science: pre-20th century

Gary Mulder

As a Data Scientist at TruNarrative, my goal is to tell the story of fraud through the lens of Data Science. We have created a suite of products and services that inform organisations and people what a true fraudster or criminal is, how he or she behaves, and what a genuine customer relationship based on trust and mutual benefit looks like.

A key differentiator between nimble companies using state-of-the-art information technologies and the traditional financial institutions of the 20th century is insightful, immediate and actionable knowledge about a company’s customers. Data has many stories to tell about your customers, and my task is to share TruNarrative’s data stories with our partners in preventing fraud and providing best-of-breed customer experiences.

In this 3-part blog series, I will explore what data science is and take you through the history of the practice, which ultimately leads to today’s machine learning capabilities. But data science actually began much earlier than you might think…

 

What is Data Science?

Data Science is, in the simplest of terms, the study of data through model building. Data is anything that we can measure, so as soon as we invented counting and numbers we started measuring the world and creating number-based models of the world.

One can imagine a hunter notching a tree every time she made a kill. The notches on the tree are a simple model of her hunting success. Our hunter could extend her model by using the length of the notch to represent the size of the kill. Now she has a more complex and possibly more useful model of her hunting success.

A model of successful hunting

 

Sumerian clay tablets and accounting

Jumping forward quite a few millennia, the invention of bookkeeping was a major advance in economics. The majority of clay tablets recovered from ancient Sumer, some dating back over 5,000 years, were accounting-related. As human society grew increasingly complex, the amount of data and the models built from it grew correspondingly complex.

Much like the notches on a tree would tell the story of a hunter’s success, clay tablets would tell the story of how much grain was collected from the people of Sumer. Accurate clay tablet record keeping now allowed the high priests of Sumer to know how much tax to collect.

This explosion in data enabled the farming revolution, cities, nation states and, inevitably, capital gains tax. As Benjamin Franklin once quipped, “in this world nothing can be said to be certain, except death and taxes”. The Sumerian peasants would likely have wholly agreed.

Sumer clay tablets were used for taxation

 

Babbage’s Analytical Engine

In the 19th century, Charles Babbage’s Difference Engine was the first modern computing machine designed for computing tables of numbers. Such tables were used for the accurate navigation of English shipping.

Babbage’s subsequent design for his Analytical Engine represents the first programmable mechanical computer which, in theory, was capable of computing any model imaginable. It was Ada Lovelace, daughter of Lord Byron, who saw the potential of the universal computer in the design of the Analytical Engine, and she is widely regarded as the first computer programmer and, by extension, data scientist.

A fully working reproduction of Babbage’s Difference Engine

This takes us to the end of the 19th century. The next revolution in data science would come soon after World War II, when the inventor of the modern computer set his sights on weather modelling.

Keep a lookout for the 2nd blog of this 3-part series, ‘History of data science: 20th century and the modern computer’, coming soon.