Advances in Knowledge Discovery and Data Mining: 15th Pacific-Asia Conference, PAKDD 2011, Shenzhen, China, May 24-27, 2011, Proceedings, Part I

By Bi-Ru Dai, Shu-Ming Hsu (auth.), Joshua Zhexue Huang, Longbing Cao, Jaideep Srivastava (eds.)

The two-volume set LNAI 6634 and 6635 constitutes the refereed proceedings of the 15th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2011, held in Shenzhen, China, in May 2011.

A total of 32 revised full papers and 58 revised short papers were carefully reviewed and selected from 331 submissions. The papers present new ideas, original research results, and practical development experiences from all KDD-related areas, including data mining, machine learning, artificial intelligence and pattern recognition, data warehousing and databases, statistics, knowledge engineering, behavior sciences, visualization, and emerging areas such as social network analysis.





Sample text

Logically, when K is small (few features have been selected), the number of features remaining in a document will be even fewer. A document becomes empty (i.e., its size is zero) if none of its features are found in the top K features. The number of usable training examples shrinks as the number of zero-size documents grows. Intuitively, a growing number of zero-size documents will affect classification accuracy negatively, because these documents cannot contribute to the classifier training process (they contain no features).
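The effect described above can be sketched in a few lines. This is a hypothetical illustration, not code from the paper; the function name, toy corpus, and feature lists are all made up for the example.

```python
# Sketch: after keeping only the top-K features, count the documents
# that become "zero-size" (no surviving features) and thus cannot
# contribute to classifier training.

def zero_size_fraction(docs, top_k_features):
    """Fraction of documents with no features left after top-K filtering."""
    keep = set(top_k_features)
    empty = sum(1 for doc in docs if not (set(doc) & keep))
    return empty / len(docs)

# Toy corpus: each document is a list of feature (term) identifiers.
docs = [["apple", "banana"], ["cherry"], ["banana", "date"], ["egg"]]

# A smaller K leaves more documents with no features at all.
print(zero_size_fraction(docs, ["banana"]))                    # 0.5
print(zero_size_fraction(docs, ["banana", "cherry", "egg"]))   # 0.0
```

With K = 1 half of the toy corpus becomes zero-size, while K = 3 keeps every document non-empty, mirroring the trend the excerpt describes.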

Then, the size of the RNN set of M is not larger than one, so M will not be selected as a sample either. Therefore, there is only one sample, J, in class 3. The final sample set of RNNR-AL1 is shown in Table 1.

3 The Choosing Strategy of RNNR-L1 (Selecting All, Larger Than One)

To observe how the absorption strategy affects classification accuracy, RNNR-L1 does not adopt the absorption strategy but selects as a sample each instance whose RNN set is larger than one. In the general case, most instances have zero or one member in their RNN sets, so the reduction rate of RNNR-L1 is still decent.
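The RNNR-L1 rule just described can be sketched as follows. This is an illustrative assumption, not the authors' implementation: the 1-D points, the Euclidean distance, and all names are made up, and only the selection rule (keep instances whose reverse-nearest-neighbor set has more than one member) comes from the excerpt.

```python
# Sketch of the RNNR-L1 selection rule: keep every training instance
# whose reverse-nearest-neighbor (RNN) set has more than one member.

def rnn_sets(points):
    """Map each point index to the indices of points whose nearest neighbor it is."""
    rnn = {i: [] for i in range(len(points))}
    for i, p in enumerate(points):
        # Nearest neighbor of p among the other points (1-D Euclidean distance).
        nn = min((j for j in range(len(points)) if j != i),
                 key=lambda j: abs(points[j] - p))
        rnn[nn].append(i)
    return rnn

def rnnr_l1(points):
    """Indices selected by the RNNR-L1 rule: |RNN set| > 1."""
    return [i for i, members in rnn_sets(points).items() if len(members) > 1]

points = [0.0, 0.1, 0.2, 5.0, 5.1, 9.0]
print(rnnr_l1(points))  # [1, 4]
```

On this toy data only the two "central" points of each cluster are reverse nearest neighbors of more than one instance, so only they are kept, which is why most instances (with RNN sets of size zero or one) are discarded and the reduction rate stays decent.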

[Figure 1: (a) no. of features selected vs. accuracy; (b) no. of features selected vs. percentage of documents with non-zero size; (c) percentage of documents with non-zero size vs. accuracy.] In this case, it will yield a very sparse document representation. According to Figure 1(a), the effectiveness of the scoring functions is roughly GSS > OR > IG > χ2 > IDF, and the number of non-zero-size documents (Figure 1(b)) follows the same order.

