Outstanding Foreign Textbooks in Information Science and Technology Series: Data Mining: Concepts and Techniques (Reprint Edition)

Foreword
Preface
Chapter 1 Introduction
1.1 What Motivated Data Mining? Why Is It Important?
1.2 So, What Is Data Mining?
1.3 Data Mining: On What Kind of Data?
1.3.1 Relational Databases
1.3.2 Data Warehouses
1.3.3 Transactional Databases
1.3.4 Advanced Database Systems and Advanced Database Applications
1.4 Data Mining Functionalities: What Kinds of Patterns Can Be Mined?
1.4.1 Concept/Class Description: Characterization and Discrimination
1.4.2 Association Analysis
1.4.3 Classification and Prediction
1.4.4 Cluster Analysis
1.4.5 Outlier Analysis
1.4.6 Evolution Analysis
1.5 Are All of the Patterns Interesting?
1.6 Classification of Data Mining Systems
1.7 Major Issues in Data Mining
1.8 Summary
Exercises
Bibliographic Notes
Chapter 2 Data Warehouse and OLAP Technology for Data Mining
2.1 What is a Data Warehouse?
2.1.1 Differences between Operational Database Systems and Data Warehouses
2.1.2 But, Why Have a Separate Data Warehouse?
2.2 A Multidimensional Data Model
2.2.1 From Tables and Spreadsheets to Data Cubes
2.2.2 Stars, Snowflakes, and Fact Constellations: Schemas for Multidimensional Databases
2.2.3 Examples for Defining Star, Snowflake, and Fact Constellation Schemas
2.2.4 Measures: Their Categorization and Computation
2.2.5 Introducing Concept Hierarchies
2.2.6 OLAP Operations in the Multidimensional Data Model
2.2.7 A Starnet Query Model for Querying Multidimensional Databases
2.3 Data Warehouse Architecture
2.3.1 Steps for the Design and Construction of Data Warehouses
2.3.2 A Three-Tier Data Warehouse Architecture
2.3.3 Types of OLAP Servers: ROLAP versus MOLAP versus HOLAP
2.4 Data Warehouse Implementation
2.4.1 Efficient Computation of Data Cubes
2.4.2 Indexing OLAP Data
2.4.3 Efficient Processing of OLAP Queries
2.4.4 Metadata Repository
2.4.5 Data Warehouse Back-End Tools and Utilities
2.5 Further Development of Data Cube Technology
2.5.1 Discovery-Driven Exploration of Data Cubes
2.5.2 Complex Aggregation at Multiple Granularities: Multifeature Cubes
2.5.3 Other Developments
2.6 From Data Warehousing to Data Mining
2.6.1 Data Warehouse Usage
2.6.2 From On-Line Analytical Processing to On-Line Analytical Mining
2.7 Summary
Exercises
Bibliographic Notes
Chapter 3 Data Preprocessing
3.1 Why Preprocess the Data?
3.2 Data Cleaning
3.2.1 Missing Values
3.2.2 Noisy Data
3.2.3 Inconsistent Data
3.3 Data Integration and Transformation
3.3.1 Data Integration
3.3.2 Data Transformation
3.4 Data Reduction
3.4.1 Data Cube Aggregation
3.4.2 Dimensionality Reduction
3.4.3 Data Compression
3.4.4 Numerosity Reduction
3.5 Discretization and Concept Hierarchy Generation
3.5.1 Discretization and Concept Hierarchy Generation for Numeric Data
3.5.2 Concept Hierarchy Generation for Categorical Data
3.6 Summary
Exercises
Bibliographic Notes
Chapter 4 Data Mining Primitives, Languages, and System Architectures
4.1 Data Mining Primitives: What Defines a Data Mining Task?
4.1.1 Task-Relevant Data
4.1.2 The Kind of Knowledge to be Mined
4.1.3 Background Knowledge: Concept Hierarchies
4.1.4 Interestingness Measures
4.1.5 Presentation and Visualization of Discovered Patterns
4.2 A Data Mining Query Language
4.2.1 Syntax for Task-Relevant Data Specification
4.2.2 Syntax for Specifying the Kind of Knowledge to be Mined
4.2.3 Syntax for Concept Hierarchy Specification
4.2.4 Syntax for Interestingness Measure Specification
4.2.5 Syntax for Pattern Presentation and Visualization Specification
4.2.6 Putting It All Together: An Example of a DMQL Query
4.2.7 Other Data Mining Languages and the Standardization of Data Mining Primitives
4.3 Designing Graphical User Interfaces Based on a Data Mining Query Language
4.4 Architectures of Data Mining Systems
4.5 Summary
Exercises
Bibliographic Notes
Chapter 5 Concept Description: Characterization and Comparison
5.1 What is Concept Description?
5.2 Data Generalization and Summarization-Based Characterization
5.2.1 Attribute-Oriented Induction
5.2.2 Efficient Implementation of Attribute-Oriented Induction
5.2.3 Presentation of the Derived Generalization
5.3 Analytical Characterization: Analysis of Attribute Relevance
5.3.1 Why Perform Attribute Relevance Analysis?
5.3.2 Methods of Attribute Relevance Analysis
5.3.3 Analytical Characterization: An Example
5.4 Mining Class Comparisons: Discriminating between Different Classes
5.4.1 Class Comparison Methods and Implementations
5.4.2 Presentation of Class Comparison Descriptions
5.4.3 Class Description: Presentation of Both Characterization and Comparison
5.5 Mining Descriptive Statistical Measures in Large Databases
5.5.1 Measuring the Central Tendency
5.5.2 Measuring the Dispersion of Data
5.5.3 Graph Displays of Basic Statistical Class Descriptions
5.6 Discussion
5.6.1 Concept Description: A Comparison with Typical Machine Learning Methods
5.6.2 Incremental and Parallel Mining of Concept Description
5.7 Summary
Exercises
Bibliographic Notes
Chapter 6 Mining Association Rules in Large Databases
6.1 Association Rule Mining
6.1.1 Market Basket Analysis: A Motivating Example for Association Rule Mining
6.1.2 Basic Concepts
6.1.3 Association Rule Mining: A Road Map
6.2 Mining Single-Dimensional Boolean Association Rules from Transactional Databases
6.2.1 The Apriori Algorithm: Finding Frequent Itemsets Using Candidate Generation
6.2.2 Generating Association Rules from Frequent Itemsets
6.2.3 Improving the Efficiency of Apriori
6.2.4 Mining Frequent Itemsets without Candidate Generation
6.2.5 Iceberg Queries
6.3 Mining Multilevel Association Rules from Transaction Databases
Chapter 7 Classification and Prediction
Chapter 8 Cluster Analysis
Chapter 9 Mining Complex Types of Data
Chapter 10 Applications and Trends in Data Mining
Appendix A An Introduction to Microsoft's OLE DB for Data Mining
Appendix B An Introduction to DBMiner
Jiawei Han is director of the Intelligent Database Systems Research Laboratory and professor in the School of Computing Science at Simon Fraser University. Well known for his research in the areas of data mining and database systems, he has served on program committees for dozens of international conferences and workshops and on editorial boards for several journals, including IEEE Transactions on Knowledge and Data Engineering and Data Mining and Knowledge Discovery.
Micheline Kamber is a researcher and freelance technical writer with an M.S. in computer science. She is a member of the Intelligent Database Systems Research Laboratory at Simon Fraser University.
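This book explains the concepts, methods, and applications of data mining (often called knowledge discovery in databases). Starting from an emphasis on data analysis, it introduces the concepts of databases and data mining, describes data mining as the automated or convenient extraction of patterns representing knowledge from large databases, data warehouses, and other large information repositories, and reviews currently available commercial products within a general framework. Data mining is an interdisciplinary field that draws on database technology, artificial intelligence, machine learning, neural networks, statistics, pattern recognition, knowledge-based systems, knowledge acquisition, information retrieval, high-performance computing, and data visualization. Written from a database perspective, the book describes the primitives, architectures, characteristics, and methods of data mining systems, with a focus on the feasibility, usefulness, effectiveness, and scalability of pattern discovery in large databases. Chapter by chapter, it covers the concepts and techniques of data classification, prediction, association, and clustering. Each topic is illustrated with examples, the best algorithms for each class of problem are presented, and practice-tested rules of thumb are given for applying the techniques. This approach makes the book highly readable and lets readers learn the field of data mining and follow the latest industry developments. The book is suitable for computer science students, application developers, business professionals, and researchers in related fields.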
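Contents: 1. Introduction to Data Mining 2. Data Warehouse and OLAP Technology for Data Mining 3. Data Preprocessing 4. Data Mining Primitives, Languages, and System Architectures 5. Concept Description: Characterization and Comparison 6. Mining Association Rules in Large Databases 7. Classification and Prediction 8. Cluster Analysis 9. Mining Complex Types of Data 10. Applications and Trends in Data Mining Appendix A An Introduction to Microsoft's OLE DB for Data Mining Appendix B An Introduction to DBMiner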