Data Science in Julia

by Christopher Pyles

Welcome! This online book is intended to give you an introduction to the language features and packages in the open source language Julia that are useful in data science. It begins with a (short) introduction to the language’s basic types and syntax, before moving on to Julia’s own packages for working with rectangular data and Julia implementations of Python packages.

Why Julia?

This book focuses on Julia because it integrates many of the great packages of Python with a model that allows for efficient computing and memory use. It borrows some syntax from the proprietary language MATLAB and combines it with more streamlined code and access to Python’s extensive package ecosystem.


This book covers a range of topics related to the practice of data science in Julia. While it covers the implementation of many statistical models and other complex algorithms, presenting and proving the theory behind these models is beyond its scope. It is mainly intended as a way for individuals already experienced in the field to learn Julia. Topics include, but are not limited to:

  • Julia’s DataFrames package
  • Pandas in Julia
  • Plotting in PyPlot and Seaborn
  • Linear Algebra
  • Implementing Models
  • (Basic) Machine Learning with ScikitLearn