Course Description
In this course, the student will learn to identify,
evaluate, and capture business analytic opportunities that
create value for an organization. Theoretical data analytics
methods, as well as case studies on successful analytics
applications, will be covered. Basic descriptive analytics
methods are reviewed, along with a quick introduction to
using Python to analyze large data sets. Predictive
analytics techniques, including clustering, classification,
and regression, are covered in detail. Prescriptive
analytics applications that use simulation and
optimization over large data sets to improve business decisions
are presented. Case studies emphasize financial applications
such as portfolio management and automated trading.
Pre-requisites
- By Course: STAT 230, INDE 301, INDE 302, INDE 303,
INDE 504
- By Topic: programming, probability and statistics,
engineering economy, optimization theory, stochastic
processes, Monte Carlo simulations
Course Objectives
- A practical ability to carry out data analysis using
computational tools on applied problems.
- An understanding of the importance of data analytics in
the decision-making process of modern organizations.
- An appreciation of the challenges in applying data
analytics in practice.
- An exposure to modern applications of data analytics,
especially in finance.
- An overview of the main predictive analytics tools, such
as regression, classification, and clustering.
Learning Outcomes
- Analyze data sets with Python and perform basic
descriptive analytics.
- Identify suitable data analytics tools that assist
organizations in making data-driven decisions.
- Understand and implement basic linear and nonlinear
regression, clustering, classification, and other
predictive analytics techniques.
- Apply familiar prescriptive analytics tools such as
simulation and optimization in large-data contexts.
- Utilize predictive and prescriptive analytics in modern
applications, especially financial planning and trading.
Topics Covered
- What is Statistical Learning, Assessing Model Accuracy
- Simple and Multiple Linear Regression
- Classification (Logistic Regression and Linear
Discriminant Analysis)
- Re-sampling Methods (Cross-validation and Bootstrapping)
- Linear Model Selection and Regularization
- Tree-Based Methods
- Support Vector Machines
- Unsupervised Learning (K-means and Hierarchical
Clustering, Principal Component Analysis)
- Case studies and applications
Software and Coding
- The class will mostly use Python 2.7, a powerful
programming language commonly used in academia and
business.
- This course is not a Python programming class. We will
provide limited software instruction, in-class
demonstrations, and code to accompany lectures and
assignments. Like any programming language, Python is best
learned through practice.
- You should install Python as soon as possible and
familiarize yourself with basic operations. We recommend
installing Anaconda, a distribution containing Python and
most of the packages required for data analysis. You can
download it here. The more proficiency you gain prior to
class, the more you will get out of the sessions. There are
many tutorials online, such as this one.
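As a rough self-check of the basic operations mentioned above, the following minimal sketch computes two descriptive statistics and fits a simple linear regression by ordinary least squares. It is only an illustration, not course material: the data are made up, and the helper names (mean, sample_variance, fit_simple_linear_regression) are hypothetical. It uses only built-in Python, so it should run on Python 2.7 as well as Python 3.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / float(len(values))


def sample_variance(values):
    """Unbiased sample variance (divides by n - 1)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / float(len(values) - 1)


def fit_simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up illustrative data (assumption, not taken from the course)
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]

intercept, slope = fit_simple_linear_regression(ad_spend, sales)
print("mean sales: %.2f" % mean(sales))
print("sample variance of sales: %.2f" % sample_variance(sales))
print("fitted line: sales = %.2f + %.2f * ad_spend" % (intercept, slope))
```

If you can read and modify a script like this comfortably, you likely have enough Python to follow the in-class demonstrations. In practice, the data analysis packages bundled with Anaconda would replace these hand-written helpers.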
Schedule
- Every Thursday, between 2pm and 5pm
- One lecture and one lab session each week.
Evaluation Method
- Exam: 40%
- Labs / Project: 40%
- In-class Quizzes: 15%
- Moodle Questionnaires: 5%
Contacts:
teaching@algotraders.org