Definition:Exploratory Data Analysis

From ProofWiki

Definition

Exploratory data analysis is the process of performing a preliminary examination of raw data in order to gain insight into their general structure and characteristics, and often to identify outliers.
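
As an illustration of identifying outliers during such a preliminary examination, the following is a minimal sketch using the quartile-based "fence" rule commonly attributed to Tukey. The function name tukey_outliers, the multiplier k = 1.5 and the sample values are assumptions made for this example and do not come from the definition above.

```python
import statistics


def tukey_outliers(values, k=1.5):
    """Return (lower_fence, upper_fence, outliers) for a list of numbers.

    A value is flagged when it lies more than k interquartile ranges
    below the first quartile or above the third quartile.
    """
    # Quartiles of the data; "inclusive" treats the data as the whole population.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return lower, upper, outliers


# Made-up sample with one suspiciously large observation.
sample = [2.1, 2.4, 2.5, 2.7, 2.8, 3.0, 3.1, 3.3, 9.6]
lower, upper, outliers = tukey_outliers(sample)
print(f"fences: [{lower:.2f}, {upper:.2f}], outliers: {outliers}")
```

Flagged values are only candidates for further inspection; whether to keep, correct or discard them is a judgement that the rule itself does not make.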

Tools used in this activity include:

For multivariate data or particularly large data sets, kernel density estimation methods can be used.
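
To give a flavour of kernel density estimation, the following is a minimal one-dimensional sketch using a Gaussian kernel; the multivariate case is not shown. The function name gaussian_kde, the default bandwidth (a common normal-reference rule of thumb, 1.06 times the sample standard deviation times n to the power -1/5) and the sample values are assumptions made for this example.

```python
import math
import statistics


def gaussian_kde(sample, bandwidth=None):
    """Return a function estimating the density of a 1-D sample.

    Each observation contributes a Gaussian bump of width `bandwidth`;
    the estimate at x is the average of those bumps.
    """
    n = len(sample)
    if bandwidth is None:
        # Rule-of-thumb default (an assumption of this sketch).
        bandwidth = 1.06 * statistics.stdev(sample) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample
        )

    return density


# Made-up sample with two clusters, near 1 and near 5.
sample = [1.0, 1.2, 1.1, 0.9, 5.0, 5.2, 4.9]
density = gaussian_kde(sample, bandwidth=0.5)
for x in (1.0, 3.0, 5.0):
    print(f"estimated density at {x}: {density(x):.4f}")
```

The estimate is sensitive to the choice of bandwidth: too small a value produces a spiky curve, too large a value smooths away genuine structure such as the two clusters here.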

Having performed an exploratory data analysis, the data engineer may be better placed to decide which formal statistical methods are then appropriate.


Also known as

Exploratory data analysis is often referred to by its abbreviation EDA.


Also see

  • Results about exploratory data analysis can be found here.


Historical Note

The discipline of exploratory data analysis was formally expounded in $1977$ by John Wilder Tukey in his book Exploratory Data Analysis.


Sources