
COMP7410 Data Mining and Matching

Offered By School of Computer Science
Academic Career Graduate Coursework
Course Subject Computer Science
Offered in Spring Session 2010
Unit Value 3 units
Course Description

Public and private organisations and research projects are collecting increasingly large amounts of data. The Internet also provides a very large source of information about almost every aspect of human life and society.

This course provides an overview of the technologies and concepts used for data mining and matching. It focuses on the practical aspects of these techniques rather than their mathematical and statistical foundations.

More information is on the course webpage: http://cs.anu.edu.au/courses/COMP7410/.

Learning Outcomes

On completion of this module, participants should have gained an understanding of the basic concepts and techniques used in data mining and data matching, including:

  1. the data mining process, how data mining is defined, application areas, disciplines involved, and the major challenges in data mining;
  2. data issues relevant to data mining (size, complexity, types and formats), data warehousing, data cleaning and pre-processing;
  3. unsupervised learning techniques such as cluster analysis and association rule mining (including the basic methods used);
  4. supervised learning techniques (classification and prediction), including the basics of decision tree induction and how to measure classifier accuracy;
  5. schema integration, data matching (deterministic and probabilistic linkage), the importance of data cleaning, deduplication and geocoding.
Indicative Assessment

10% Online quizzes

20% Online discussion

40% Exam

30% Case Study Project 

Course Classification(s) Transitional
Transitional courses are designed for students from a broad range of backgrounds and learning achievements, which provide for the acquisition of generic skills; or an informed understanding of contemporary issues; or fundamental knowledge for transition to Advanced or Specialist courses.
Areas of Interest Business Information Systems, Computer Science, and Information Technology
Requisite Statement

Incompatible with COMP8400 Algorithms and Techniques for Data Mining

Recommended Courses

Assumed knowledge:

  • Basic understanding of spreadsheets and databases (tables, attributes, records).
  • Knowledge of working with Windows-based computer systems.
Consent Required Consent is required prior to enrolling in this course.
Academic Contact peter.christen@anu.edu.au

The information published on the Study at ANU 2010 website applies to the 2010 academic year only. All information provided on this website replaces the information contained in the Study at ANU 2009 website.

Updated: 13 Nov 2015 / Responsible Officer: The Registrar / Page Contact: Student Business Solutions