Search Engines: Information Retrieval in Practice (English Edition)

Authors: Croft et al. (USA)
Publisher: China Machine Press, October 2009
Series: Classic Original Editions Library (经典原版书库)
Pages: 520
Price: CNY 45.00
ISBN-13: 9787111282471
ISBN-10: 7111282477

　　1 Search Engines and Information Retrieval
　　1.1 What Is Information Retrieval?
　　1.2 The Big Issues
　　1.3 Search Engines
　　1.4 Search Engineers
　　2 Architecture of a Search Engine
　　2.1 What Is an Architecture?
　　2.2 Basic Building Blocks
　　2.3 Breaking It Down
　　2.3.1 Text Acquisition
　　2.3.2 Text Transformation
　　2.3.3 Index Creation
　　2.3.4 User Interaction
　　2.3.5 Ranking
　　2.3.6 Evaluation
　　2.4 How Does It Really Work?
　　3 Crawls and Feeds
　　3.1 Deciding What to Search
　　3.2 Crawling the Web
　　3.2.1 Retrieving Web Pages
　　3.2.2 The Web Crawler
　　3.2.3 Freshness
　　3.2.4 Focused Crawling
　　3.2.5 Deep Web
　　3.2.6 Sitemaps
　　3.2.7 Distributed Crawling
　　3.3 Crawling Documents and Email
　　3.4 Document Feeds
　　3.5 The Conversion Problem
　　3.5.1 Character Encodings
　　3.6 Storing the Documents
　　3.6.1 Using a Database System
　　3.6.2 Random Access
　　3.6.3 Compression and Large Files
　　3.6.4 Update
　　3.6.5 BigTable
　　3.7 Detecting Duplicates
　　3.8 Removing Noise
　　4 Processing Text
　　4.1 From Words to Terms
　　4.2 Text Statistics
　　4.2.1 Vocabulary Growth
　　4.2.2 Estimating Collection and Result Set Sizes
　　4.3 Document Parsing
　　4.3.1 Overview
　　4.3.2 Tokenizing
　　4.3.3 Stopping
　　4.3.4 Stemming
　　4.3.5 Phrases and N-grams
　　4.4 Document Structure and Markup
　　4.5 Link Analysis
　　4.5.1 Anchor Text
　　4.5.2 PageRank
　　4.5.3 Link Quality
　　4.6 Information Extraction
　　4.6.1 Hidden Markov Models for Extraction
　　4.7 Internationalization
　　5 Ranking with Indexes
　　5.1 Overview
　　5.2 Abstract Model of Ranking
　　5.3 Inverted Indexes
　　5.3.1 Documents
　　5.3.2 Counts
　　5.3.3 Positions
　　5.3.4 Fields and Extents
　　5.3.5 Scores
　　5.3.6 Ordering
　　5.4 Compression
　　5.4.1 Entropy and Ambiguity
　　5.4.2 Delta Encoding
　　5.4.3 Bit-Aligned Codes
　　5.4.4 Byte-Aligned Codes
　　5.4.5 Compression in Practice
　　5.4.6 Looking Ahead
　　5.4.7 Skipping and Skip Pointers
　　5.5 Auxiliary Structures
　　5.6 Index Construction
　　5.6.1 Simple Construction
　　5.6.2 Merging
　　5.6.3 Parallelism and Distribution
　　5.6.4 Update
　　5.7 Query Processing
　　5.7.1 Document-at-a-time Evaluation
　　5.7.2 Term-at-a-time Evaluation
　　5.7.3 Optimization Techniques
　　5.7.4 Structured Queries
　　5.7.5 Distributed Evaluation
　　5.7.6 Caching
　　6 Queries and Interfaces
　　6.1 Information Needs and Queries
　　6.2 Query Transformation and Refinement
　　6.2.1 Stopping and Stemming Revisited
　　6.2.2 Spell Checking and Suggestions
　　6.2.3 Query Expansion
　　6.2.4 Relevance Feedback
　　6.2.5 Context and Personalization
　　6.3 Showing the Results
　　6.3.1 Result Pages and Snippets
　　6.3.2 Advertising and Search
　　6.3.3 Clustering the Results
　　6.4 Cross-Language Search
　　7 Retrieval Models
　　7.1 Overview of Retrieval Models
　　7.1.1 Boolean Retrieval
　　7.1.2 The Vector Space Model
　　7.2 Probabilistic Models
　　7.2.1 Information Retrieval as Classification
　　7.2.2 The BM25 Ranking Algorithm
　　7.3 Ranking Based on Language Models
　　7.3.1 Query Likelihood Ranking
　　7.3.2 Relevance Models and Pseudo-Relevance Feedback
　　7.4 Complex Queries and Combining Evidence
　　7.4.1 The Inference Network Model
　　7.4.2 The Galago Query Language
　　7.5 Web Search
　　7.6 Machine Learning and Information Retrieval
　　7.6.1 Learning to Rank
　　7.6.2 Topic Models and Vocabulary Mismatch
　　7.7 Application-Based Models
　　8 Evaluating Search Engines
　　8.1 Why Evaluate?
　　8.2 The Evaluation Corpus
　　8.3 Logging
　　8.4 Effectiveness Metrics
　　8.4.1 Recall and Precision
　　8.4.2 Averaging and Interpolation
　　8.4.3 Focusing on the Top Documents
　　8.4.4 Using Preferences
　　……
　　9 Classification and Clustering
　　10 Social Search
　　11 Beyond Bag of Words
　　References
　　Index

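　　Search Engines: Information Retrieval in Practice (English Edition) introduces the key issues in information retrieval (IR) and how those issues affect the design and implementation of search engines, and it reinforces important concepts with mathematical models. On the important topic of web search engines, the book mainly covers search techniques that are widely used on the web.
　　The book is suited to undergraduate and graduate students in computer science or computer engineering, and it also serves as an ideal introductory text for professionals.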
