National Formosa University

Exploiting the power of group differences : using patterns to solve data analysis problems

Record type:	Bibliographic, language material, printed : Monograph/item
Title/Author:	Exploiting the power of group differences / Guozhu Dong.
Other title:	using patterns to solve data analysis problems
Author:	Dong, Guozhu
Physical description:	1 PDF (xv, 130 pages) : illustrations.
Note:	Part of: Synthesis digital library of engineering and computer science.
Subject:	Group theory.
Electronic resource:	https://ieeexplore.ieee.org/servlet/opac?bknumber=8653552
ISBN:	9781681735030

Dong, Guozhu, 1957-

Exploiting the power of group differences : using patterns to solve data analysis problems / Guozhu Dong. - 1 PDF (xv, 130 pages) : illustrations. - Synthesis lectures on data mining and knowledge discovery, 2151-0075 ; # 16. - Synthesis digital library of engineering and computer science.

Includes bibliographical references (pages 101-123) and index.

1. Introduction and overview -- 1.1 Importance of group differences -- 1.2 Summary of chapters -- 1.2.1 Reading order of the chapters -- 1.3 Known uses of group differences via emerging patterns -- 1.4 Unique properties of emerging pattern based methods -- 1.5 Scenarios where emerging patterns are especially useful -- 1.6 Related topics not covered in this book --

Abstract freely available; full-text restricted to subscribers or individual document purchasers.

Compendex

This book presents pattern-based problem-solving methods for a variety of machine learning and data analysis problems. The methods are all based on techniques that exploit the power of group differences. They make use of group differences represented using emerging patterns (aka contrast patterns), which are patterns that match significantly different numbers of instances in different data groups. A large number of applications outside of the computing discipline are also included. Emerging patterns (EPs) are useful in many ways. EPs can be used as features, as simple classifiers, as subpopulation signatures/characterizations, and as triggering conditions for alerts. EPs can be used in gene ranking for complex diseases since they capture multi-factor interactions. The length of EPs can be used to detect anomalies, outliers, and novelties. Emerging/contrast pattern-based methods for clustering analysis and outlier detection do not need distance metrics, avoiding pitfalls of the latter in exploratory analysis of high dimensional data. EP-based classifiers can achieve good accuracy even when the training datasets are tiny, making them useful for exploratory compound selection in drug design. EPs can serve as opportunities in opportunity-focused boosting and are useful for constructing powerful conditional ensembles. EP-based methods often produce interpretable models and results. In general, EPs are useful for classification, clustering, outlier detection, gene ranking for complex diseases, prediction model analysis and improvement, and so on. EPs are useful for many tasks because they represent group differences, which have extraordinary power. Moreover, EPs represent multi-factor interactions, whose effective handling is of vital importance and is a major challenge in many disciplines. Based on the results presented in this book, one can clearly say that patterns are useful, especially when they are linked to issues of interest. 
We believe that many effective ways to exploit group differences' power still remain to be discovered. Hopefully this book will inspire readers to discover such new ways, besides showing them existing ways, to solve various challenging problems.
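The abstract's definition of an emerging pattern (a pattern that matches significantly different numbers of instances in different data groups) can be illustrated with a minimal Python sketch. It is not taken from the book: the function names, the toy data, and the `min_growth` threshold of 2.0 are assumptions made here for illustration. It uses the ratio of supports (the growth rate) as the measure of group difference.

```python
from typing import FrozenSet, List

# An instance and a pattern are both modeled as sets of "attribute=value" items.
# This representation is an assumption of this sketch.
Instance = FrozenSet[str]
Pattern = FrozenSet[str]


def support(pattern: Pattern, group: List[Instance]) -> float:
    """Fraction of instances in `group` that contain every item of `pattern`."""
    if not group:
        return 0.0
    matches = sum(1 for instance in group if pattern <= instance)
    return matches / len(group)


def growth_rate(pattern: Pattern, background: List[Instance],
                target: List[Instance]) -> float:
    """Ratio of the pattern's support in `target` to its support in `background`."""
    s_background = support(pattern, background)
    s_target = support(pattern, target)
    if s_background == 0.0:
        # The pattern never occurs in the background group.
        return float("inf") if s_target > 0.0 else 0.0
    return s_target / s_background


def is_emerging(pattern: Pattern, background: List[Instance],
                target: List[Instance], min_growth: float = 2.0) -> bool:
    """True if the growth rate reaches the (hypothetical) threshold."""
    return growth_rate(pattern, background, target) >= min_growth


# Made-up toy groups, for illustration only.
GROUP_A: List[Instance] = [
    frozenset({"x=low", "y=no"}),
    frozenset({"x=high", "y=no"}),
    frozenset({"x=low", "y=yes"}),
    frozenset({"x=low", "y=no"}),
]
GROUP_B: List[Instance] = [
    frozenset({"x=high", "y=yes"}),
    frozenset({"x=high", "y=yes"}),
    frozenset({"x=low", "y=yes"}),
    frozenset({"x=high", "y=no"}),
]

if __name__ == "__main__":
    single = frozenset({"y=yes"})
    pair = frozenset({"x=high", "y=yes"})
    print(growth_rate(single, GROUP_A, GROUP_B))
    print(growth_rate(pair, GROUP_A, GROUP_B))
```

In this toy data the two-item pattern occurs only in `GROUP_B`, so its growth rate is infinite; patterns of that kind are what the table of contents calls jumping emerging patterns.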

Mode of access: World Wide Web.

Standard No.: 10.2200/S00897ED1V01Y201901DMK016 (doi)

Subjects--Topical Terms:

527791
Group theory.
Subjects--Index Terms:

intrusion detection

LC Class. No.: QA174.7.D36 / D654 2019

Dewey Class. No.: 512.2

LDR:11238nam 2200925 i 4500
001 959750
003 IEEE
005 20190227152032.0
006 m eo d
007 cr cn |||m|||a
008 201209s2019 caua foab 001 0 eng d
020 $a 9781681735030 $q ebook
020 $z 9781681735047 $q hardcover
020 $z 9781681735023 $q paperback
024 7 $a 10.2200/S00897ED1V01Y201901DMK016 $2 doi
035 $a (CaBNVSL)swl000409039
035 $a (OCoLC)1088564473
035 $a 8653552
040 $a CaBNVSL $b eng $e rda $c CaBNVSL $d CaBNVSL
050 4 $a QA174.7.D36 $b D654 2019
082 0 4 $a 512.2 $2 23
100 1 $a Dong, Guozhu, $d 1957-, $e author. $3 1253063
245 1 0 $a Exploiting the power of group differences : $b using patterns to solve data analysis problems / $c Guozhu Dong.
264 1 $a [San Rafael, California] : $b Morgan & Claypool, $c 2019.
300 $a 1 PDF (xv, 130 pages) : $b illustrations.
336 $a text $2 rdacontent
337 $a electronic $2 isbdmedia
338 $a online resource $2 rdacarrier
490 1 $a Synthesis lectures on data mining and knowledge discovery, $x 2151-0075 ; $v # 16
500 $a Part of: Synthesis digital library of engineering and computer science.
504 $a Includes bibliographical references (pages 101-123) and index.
505 0 $a 1. Introduction and overview -- 1.1 Importance of group differences -- 1.2 Summary of chapters -- 1.2.1 Reading order of the chapters -- 1.3 Known uses of group differences via emerging patterns -- 1.4 Unique properties of emerging pattern based methods -- 1.5 Scenarios where emerging patterns are especially useful -- 1.6 Related topics not covered in this book --
505 8 $a 2. General preliminaries -- 2.1 Attributes, features, and variables -- 2.2 Data instances and datasets -- 2.3 Attribute binning and discretization -- 2.4 Patterns, matching datasets, supports, and frequent patterns -- 2.5 Equivalence classes, closed patterns, minimal generators, and borders -- 2.6 Illustrating examples --
505 8 $a 3. Emerging patterns and a flexible mining algorithm -- 3.1 Setting for group difference analysis -- 3.2 Basics of emerging patterns -- 3.3 BorderDiff: a simple, flexible emerging pattern mining algorithm -- 3.4 What emerging patterns can represent -- 3.5 Comparison with association rules, confidence, and odds ratio -- 3.6 Pointers to sections illustrating uses of emerging patterns -- 3.7 Traditional analysis of group differences -- 3.8 Discussion of related issues --
505 8 $a 4. CAEP: classification by aggregating multiple matching emerging patterns -- 4.1 Background materials on classification -- 4.2 The CAEP approach -- 4.2.1 CAEP's class-likelihood computation -- 4.2.2 CAEP's likelihood normalization -- 4.2.3 Emerging pattern set selection -- 4.2.4 The CAEP training and testing algorithms -- 4.3 A small illustrating example -- 4.4 Experiments and applications by other researchers -- 4.5 Strengths and uniqueness of CAEP -- 4.5.1 Strengths of CAEP -- 4.5.2 Uniqueness of CAEP -- 4.6 DeEPs: instance-based classification using emerging patterns -- 4.7 Relationship with other rule/pattern-based classifiers -- 4.8 Discussion --
505 8 $a 5. CAEP for classification on tiny training datasets, compound selection, and instance selection -- 5.1 CAEP performs well on tiny training data -- 5.1.1 Details on data used for compound selection -- 5.2 Using CAEP for compound selection -- 5.3 Iterative algorithm for extreme instance selection -- 5.4 Semi-supervised extreme instance selection vs. semi-supervised learning --
505 8 $a 6. OCLEP: one-class intrusion detection and anomaly detection -- 6.1 Background on intrusion detection, anomaly detection, and outlier detection -- 6.2 OCLEP: emerging pattern length-based intrusion detection -- 6.2.1 An observation on emerging pattern's length -- 6.2.2 What emerging patterns to use and their mining -- 6.2.3 OCLEP's training and testing algorithms -- 6.3 Experimental evaluation of OCLEP -- 6.3.1 Details of the NSL-KDD dataset -- 6.3.2 Intrusion detection on the NSL-KDD dataset -- 6.3.3 Masquerader detection on command sequences -- 6.4 Discussion --
505 8 $a 7. CPCQ: contrast pattern based clustering-quality evaluation -- 7.1 Background on clustering-quality evaluation -- 7.2 CPCQ's rationale -- 7.3 Measuring quality of CPs -- 7.4 Measuring diversity of high-quality CPs -- 7.5 Defining CPCQ -- 7.6 Mining CPs and computing the best N groups of CPs to maximize CPCQ values -- 7.7 Experimental evaluation of CPCQ -- 7.8 Discussion --
505 8 $a 8. CPC: pattern-based clustering maximizing CPCQ -- 8.1 Notations -- 8.2 Background on clustering and clustering evaluation -- 8.3 Problem setting and guiding ideas for CPC -- 8.4 Main technical measures -- 8.4.1 MPQ between two patterns -- 8.4.2 MPQ between a pattern and a pattern set -- 8.5 The CPC algorithm -- 8.6 General experimental evaluation of CPC -- 8.7 Text data analysis on blogs using CPC -- 8.8 Discussion --
505 8 $a 9. IBIG: ranking genes and attributes for complex diseases and complex problems -- 9.1 Basics of the gene-ranking problem -- 9.2 Background on complex diseases -- 9.3 Capturing interactions using jumping emerging patterns -- 9.4 The IBIG approach -- 9.4.1 High-level view of the IBIG approach -- 9.4.2 IBIG gene ranking based on a set of emerging patterns -- 9.4.3 Gene clubs and computing gene clubs -- 9.4.4 The iterative IBIG algorithm: IBIGi -- 9.5 Experimental findings on IBIG on colon cancer data -- 9.5.1 High-quality JEPs often involve lowly IG-ranked genes and IBIGi can find many of them -- 9.5.2 Significant gene-rank differences between IG and IBIG -- 9.6 Discussion --
505 8 $a 10. CPXR and CPXC: pattern aided prediction modeling and prediction model analysis -- 10.1 Background materials -- 10.2 Pattern aided prediction models -- 10.2.1 Fitting local models for logical subpopulations -- 10.2.2 Pattern aided prediction models -- 10.3 CPXP: contrast pattern aided prediction -- 10.4 Relationship with boosting and ensemble member selection -- 10.5 Diverse predictor-response relationships -- 10.6 Uses of CPXR and CPXC in experiments -- 10.6.1 Experiments on commonly used datasets -- 10.6.2 Applications for agriculture and healthcare predictions -- 10.7 Subpopulationwise conditional correlation analysis -- 10.8 Discussion --
505 8 $a 11. Other approaches and applications using emerging patterns -- 11.1 Compound activity analysis -- 11.2 Structure-activity relationship exploration and analysis -- 11.3 Metabolite biomarker discovery -- 11.4 Structural alerts for molecular toxicity -- 11.5 Identifying disease subtypes, and disease treatment planning -- 11.6 Safety and street crime analysis -- 11.7 Characterizing music families -- 11.8 Identifying interaction terms: adverse drug reaction analysis -- 11.9 Coupled hidden Markov model for critical patient care -- 11.10 Pose-based human activity recognition -- 11.11 Protein complex detection -- 11.12 Inhibitor prediction combining FCA and JEP -- 11.13 Instant activity recognition in video sequences -- 11.14 Birth defect detection -- 11.15 Surgery stage identification and feedback delivery -- 11.16 Sensor-based activity recognition -- 11.17 Online banking fraud detection -- 11.18 Other EP-based classification approaches and studies -- 11.19 Emerging patterns for classification over streaming data -- 11.20 Other studies and applications -- 11.21 Summary of uses: application domain perspective -- 11.22 Discussion --
505 8 $a Bibliography -- Author's biography -- Index.
506 $a Abstract freely available; full-text restricted to subscribers or individual document purchasers.
510 0 $a Compendex
510 0 $a INSPEC
510 0 $a Google scholar
510 0 $a Google book search
520 3 $a This book presents pattern-based problem-solving methods for a variety of machine learning and data analysis problems. The methods are all based on techniques that exploit the power of group differences. They make use of group differences represented using emerging patterns (aka contrast patterns), which are patterns that match significantly different numbers of instances in different data groups. A large number of applications outside of the computing discipline are also included. Emerging patterns (EPs) are useful in many ways. EPs can be used as features, as simple classifiers, as subpopulation signatures/characterizations, and as triggering conditions for alerts. EPs can be used in gene ranking for complex diseases since they capture multi-factor interactions. The length of EPs can be used to detect anomalies, outliers, and novelties. Emerging/contrast pattern-based methods for clustering analysis and outlier detection do not need distance metrics, avoiding pitfalls of the latter in exploratory analysis of high dimensional data. EP-based classifiers can achieve good accuracy even when the training datasets are tiny, making them useful for exploratory compound selection in drug design. EPs can serve as opportunities in opportunity-focused boosting and are useful for constructing powerful conditional ensembles. EP-based methods often produce interpretable models and results. In general, EPs are useful for classification, clustering, outlier detection, gene ranking for complex diseases, prediction model analysis and improvement, and so on. EPs are useful for many tasks because they represent group differences, which have extraordinary power. Moreover, EPs represent multi-factor interactions, whose effective handling is of vital importance and is a major challenge in many disciplines. Based on the results presented in this book, one can clearly say that patterns are useful, especially when they are linked to issues of interest. 
We believe that many effective ways to exploit group differences' power still remain to be discovered. Hopefully this book will inspire readers to discover such new ways, besides showing them existing ways, to solve various challenging problems.
530 $a Also available in print.
538 $a Mode of access: World Wide Web.
538 $a System requirements: Adobe Acrobat Reader.
588 $a Title from PDF title page (viewed on February 27, 2019).
650 0 $a Group theory. $3 527791
650 0 $a Pattern perception $x Data processing. $3 561255
650 0 $a Quantitative research. $3 635913
650 0 $a Data mining. $3 528622
650 0 $a Machine learning. $3 561253
653 $a intrusion detection
653 $a compound selection
653 $a complex disease analysis
653 $a extreme instance selection
653 $a factor ranking
653 $a prediction model analysis
653 $a group difference analysis
653 $a feature
653 $a multifactor interaction
653 $a diverse relationship
653 $a heterogeneity
653 $a boosting
653 $a ensemble
653 $a association rule
653 $a emerging pattern
653 $a contrast pattern
653 $a frequent pattern
653 $a distance metric
653 $a interpretability
653 $a data mining
653 $a machine learning
653 $a data analytic
653 $a classification
653 $a regression
653 $a clustering
653 $a anomaly detection
653 $a outlier detection
776 0 8 $i Print version: $z 9781681735047 $z 9781681735023
830 0 $a Synthesis digital library of engineering and computer science. $3 598254
830 0 $a Synthesis lectures on data mining and knowledge discovery ; $v # 16. $x 2151-0075 $3 1253064
856 4 2 $3 Abstract with links to resource $u https://ieeexplore.ieee.org/servlet/opac?bknumber=8653552