Big data demands nonstop experimentation

The universe is an immensely tangled knot of correlations. Science is humanity’s tool for identifying which correlations reveal the deep, dark, dense knots of causation at the heart of it all.
Scientific models — also known as laws, theories, and hypotheses — are highly simplified tools for untangling threads from the correlation knot and testing their causal plausibility. Data science is the art of using statistical models to identify and validate the correlative factors at work. However, statistical models may lull data scientists into a false sense of validation, insofar as the models may fit the observational data closely but still miss the larger causative factors at work. When that happens, the model gives the illusion of insight but lacks predictive validity. It becomes a hindsight optimization tool.

Leave a Comment

Your email address will not be published.

You may also like

Crayon Yoda

Pin It on Pinterest