Chapter 7. Statistics and modeling: concepts and foundations

 

This chapter covers

  • Statistical modeling as a core concept in data science
  • Mathematics as a foundation of statistics
  • Other useful statistical methods such as clustering and machine learning

Figure 7.1 shows where we are in the data science process: statistical analysis of data. Statistical methods are often considered to make up nearly one half, or at least one third, of the skills and knowledge needed to do good data science. The other large piece is software development and/or application, and the remaining, smaller piece is subject matter or domain expertise. Statistical theory and methods are hugely important to data science, but I’ve said relatively little about them so far in this book. In this chapter, I attempt to present a grand overview.

Figure 7.1. An important aspect of the build phase of the data science process: statistical data analysis

7.1. How I think about statistics

7.2. Statistics: the field as it relates to data science

7.3. Mathematics

7.4. Statistical modeling and inference

7.5. Miscellaneous statistical methods

Exercises

Summary
