The “science” in data science for business doesn’t exactly match how the word is used in physics, chemistry, the social sciences or other academic disciplines.
Data scientists do a good job of designing experiments on data to come up with predictive models using statistically proven techniques. However, the analogy to the physical and social sciences breaks down for most applications to business processes. It is extremely difficult to design scientifically valid experiments with customers, order processing, warehouse management and so on. Management does not like to hear that you will be experimenting with customers! Marketing may be an exception, but order processing, sales and finance definitely are not.
Data science is more akin to the observational side of the sciences. Think about geology and how a geologist goes about interpreting layers of different rocks. There is no experiment they can conduct to validate their theories of how the different types of rock came to be where they are. Instead, the geologist has to create a theory of what happened in the past and then make a prediction that can be confirmed or refuted in another, as yet unseen, layer or at a different location. The geologist cannot re-create millions of years of rock formation across vast areas.
(I know there are parts of geology that are experimental, but I am focusing on the observational aspects.)
This is what data science in a real business looks like. You take part of the historical data and build a model that either tells you something (insight) or makes a prediction about the future that you can act upon. You then use a different slice of the historical data to test whether the model holds up. Both parts rely on historical data already present in the enterprise databases.
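To make the two slices concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the records are synthetic, the "support tickets predict churn" relationship is invented, and the model is just a single threshold. The point is only the shape of the workflow: fit on the older slice, check on the newer one.

```python
import random

def make_history(n=400, seed=0):
    """Synthetic stand-in for historical records: (month, tickets, churned)."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        month = i * 24 // n          # months 0..23, oldest first
        tickets = rng.randint(0, 10)
        # Hypothetical relationship: more tickets, more likely to churn.
        churned = rng.random() < tickets / 12
        rows.append((month, tickets, churned))
    return rows

def accuracy(rows, threshold):
    """Share of rows where 'tickets >= threshold' matches the churn outcome."""
    hits = sum((tickets >= threshold) == churned for _, tickets, churned in rows)
    return hits / len(rows)

def fit_threshold(rows):
    """Pick the ticket threshold with the best accuracy on the given rows."""
    return max(range(0, 12), key=lambda t: accuracy(rows, t))

history = make_history()
train = [r for r in history if r[0] < 18]     # older slice: build the model
holdout = [r for r in history if r[0] >= 18]  # newer slice: test the model

threshold = fit_threshold(train)
print(f"threshold={threshold}")
print(f"train accuracy:   {accuracy(train, threshold):.2f}")
print(f"holdout accuracy: {accuracy(holdout, threshold):.2f}")
```

Splitting by time rather than at random mirrors the geologist's situation: the model is built from what has already been examined and then checked against a layer it has not seen.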
What moves data science in business further outside the realm of regular science is that its predictions and insights change future processes and behaviors in the business. That is why you are doing data science over business processes: to reduce costs and increase revenue.
Your employer does not want to wait two more years to see whether your data science model correctly predicts which customers are going to stay and which are going to leave for competitors, for example. The business wants that information now so it can act upon it. Acting on the prediction changes the future. This, of course, changes the data used for future models. The result is a very unscientific situation.
However, that is exactly the kind of positive influence that a business wants to get out of its predictive models.