Course module: 2IMW30
Foundations of data mining
Course info
Course module: 2IMW30
Credits (ECTS): 5
Category: -
Course type: Graduate School
Language of instruction: English
Offered by: Eindhoven University of Technology; Mathematics and Computer Science; Computer Science
Is part of: Computer Science and Engineering
Contact person: dr.ir. J. Vanschoren
Telephone: 8638
E-mail: j.vanschoren@tue.nl
Lecturer(s)
Co-lecturer: dr. A. Driemel
Co-lecturer: prof.dr. M. Pechenizkiy
Responsible lecturer: dr.ir. J. Vanschoren
Contact person for the course: dr.ir. J. Vanschoren
Academic year: 2016
Period: 3 (06/02/2017 to 23/04/2017)
Starting block: 3
Timeslot: A: A - Mo 1-4, We 9-10, Th 5-8
Course mode: Fulltime
Remarks: This course was last taught in 2016/2017.
Registration open: from 15/06/2016 up to and including 15/01/2017
Application procedure: You apply via OSIRIS Student
Explanation: -
Registration using OSIRIS: Yes
Registration open for students from other department(s): Yes
Pre-registration: No
Waiting list: No
Number of insufficient tests: -
Number of groups of preference: 0
Learning objectives
Short description: Machine learning is the science of making computers act without being explicitly programmed. Instead, algorithms are used to find patterns in data. It is so pervasive today that you probably use it dozens of times a day without knowing it, for instance in web search, speech recognition, and (soon) self-driving cars. It is also a crucial component of data-driven industry (Big Data), scientific discovery, and modern healthcare.
Learning objectives: In this class, you will learn the foundations of how data mining and machine learning work internally, understand when and how to use key concepts and techniques, and gain hands-on experience in getting them to work for yourself. You'll learn about the theoretical underpinnings of data analysis and use them to tackle new problems quickly and effectively.
Upon completion of this course you will be able to:
  • Identify and classify data mining problems.
  • Understand the theoretical foundations of data analysis.
  • Build and evaluate predictive models and clusterings.
  • Use data mining tools such as R and Python to build machine learning systems.
    Content
    This course provides a broad introduction to machine learning, data mining, and statistical pattern recognition. Topics include:
    • Similarity and distances for non-Euclidean data (e.g., text documents)
    • Efficient nearest neighbor search in non-Euclidean spaces
    • Challenges of high-dimensional data analysis; metric embeddings and dimensionality reduction
    • Unsupervised learning (clustering, hierarchical clustering, clustering in metric spaces)
    • Supervised learning (classification, decision trees, Bayesian learners, support vector machines, kernel methods, ensemble methods)
    • Evaluation of predictive models (cross-validation, overfitting, ROC space, bias/variance theory)
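To give a flavor of the first two topics, here is a minimal Python sketch of a distance for text documents and a brute-force nearest neighbor search. It is an illustration only, not course material: the function names `jaccard_distance` and `nearest_neighbor` and the sample documents are hypothetical, and Jaccard distance on word sets is just one of several possible choices for comparing texts.

```python
def jaccard_distance(a: str, b: str) -> float:
    """Jaccard distance between the word sets of two documents.

    Returns 0.0 for identical word sets and 1.0 for disjoint ones.
    """
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    union = set_a | set_b
    if not union:
        # Two empty documents are treated as identical (an assumption).
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


def nearest_neighbor(query: str, documents: list[str]) -> str:
    """Brute-force search: compare the query against every document."""
    return min(documents, key=lambda doc: jaccard_distance(query, doc))


# Hypothetical sample documents, made up for this example.
docs = [
    "decision trees split the data on feature values",
    "support vector machines maximize the margin",
    "hierarchical clustering merges the closest clusters",
]

print(nearest_neighbor("clustering merges clusters", docs))
```

The brute-force search compares the query with every document, so its cost grows linearly with the collection size; the course topic on efficient nearest neighbor search concerns ways to avoid this full scan.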
    Entrance requirements
    Entrance requirements tests
    -
    Assumed previous knowledge
    While there are no strict requirements, it is highly recommended to have a working knowledge of statistics, and to have programming experience. Programming is part of the assignments. The course will mostly feature examples from R and Python.
    Previous knowledge can be gained by
    -
    Resources for self study
    -
    Bachelor College or Graduate School
    Graduate School
    URL study guide
    http://www.win.tue.nl/~jvanscho/index.html#datamining
    Required materials
    -
    Recommended materials
    Charu Aggarwal: Data Mining - The Textbook
    Course materials will be provided via Sakai throughout the course.
    Hopcroft, Kannan: Computer Science Theory for the Information Age
    Matousek: Embedding Finite Metric Spaces into Normed Spaces, from Lectures on Discrete Geometry.
    Peter Flach: Machine Learning. https://www.cs.bris.ac.uk/~flach/mlbook/
    Instructional modes
    Lecture
    General: -
    Remark: -
    Lecture with notebook / PC
    General: -
    Remark: -
    Tests
    Assignment(s)
    Test weight: 100
    Minimum grade: 6
    Test type: Assignment(s)
    Number of opportunities: 1
    Opportunities: Block 3
    Test duration in minutes: -
    Assessment: -
    Remark: Graded individual assignments and written tests.
