Data Science in Julia

by Christopher Pyles

Welcome! This online book introduces the forms and packages of the open-source language Julia that are useful in data science. It is not a thorough introduction to the Julia language itself; instead, it builds on a good understanding of Julia's basic structures and syntax. If you need to learn the language first, the Julia website has some great resources to help you get started.

Why Julia?

This book focuses on Julia because it gives access to many of Python's great packages within a model that allows for efficient computing and memory use. It borrows some syntax from the proprietary language MATLAB and combines it with more streamlined code and a package community much like Python's.


This book covers a range of topics related to the practice of data science in Julia. While it covers the implementation of many statistical models and other complex algorithms, presenting and proving the theory behind these models is beyond its scope. It is mainly intended as a way for individuals already experienced in the field to learn Julia. Topics covered include, but are not limited to:

  • Julia’s DataFrames Package
  • Pandas in Julia
  • Plotting in PyPlot and Seaborn
  • Linear Algebra
  • Implementing Models
  • (Basic) Machine Learning with ScikitLearn