splog

(redirected from Spam blog)

(SPam bLOG) A blog site set up for the sole purpose of increasing page ranking in Google. The splog may contain very little content of value, but provides a link to the website or sites that it is promoting. See splogosphere, sping, Google and blog.
Copyright © 1981-2025 by The Computer Language Company Inc. All Rights reserved. THIS DEFINITION IS FOR PERSONAL USE ONLY. All other reproduction is strictly prohibited without permission from the publisher.
References in periodicals archive
To investigate filtering precision, 20 samples from the determined spam blogs (selected from the first day of each month) were checked manually and judged to be spam.
In the case of R = 0.8, mutual detection cannot detect the estimated number of spam blogs (defined by S) from updated blog data, due to the narrow condition of R.
In the case of C = 0.05, mutual detection cannot detect the estimated number of spam blogs (defined by S) from updated blog data, due to the narrow condition of C.
Parameter Adjustment for Change in Number of Spam Blogs
To calculate the number of spam blogs in one day's data on August 20th, 2008, 100 samples were selected from the data and 14 spam blogs were counted, which suggests that the estimated spam rate is 14%.
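The estimate in the excerpt above is a plain sample proportion: spam blogs counted divided by blogs sampled. A minimal Python sketch of that arithmetic follows. The function name is hypothetical, and the standard-error figure is an added illustration of sampling uncertainty, not something the cited study reports.

```python
import math

def estimate_spam_rate(spam_in_sample: int, sample_size: int) -> tuple[float, float]:
    """Return (estimated spam rate, standard error) for a manually checked sample."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    rate = spam_in_sample / sample_size
    # Standard error of a sample proportion; an added illustration, not from the source.
    std_err = math.sqrt(rate * (1 - rate) / sample_size)
    return rate, std_err

# Figures from the excerpt: 14 spam blogs found among 100 sampled blogs.
rate, std_err = estimate_spam_rate(14, 100)
print(f"estimated spam rate: {rate:.0%}")  # prints "estimated spam rate: 14%"
```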
To illustrate the advantages and disadvantages of machine learning, SVM is employed to filter spam blogs in the data set for August 20th, 2008, which was also processed in section 3.
Hence, the estimated number of spam blogs is approximately 4175.
To select words as features of data for SVM, document frequency (DF) is employed to find which words are effective for filtering spam blogs. Employing these measures on the spam data set, the top 100 words are selected as features to represent characteristics of each blog.
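Document frequency counts, for each word, the number of documents that contain it at least once. The sketch below shows one way the top-k selection described above could look in Python. The whitespace tokenizer, the function name, and the toy posts are all assumptions made for illustration; the excerpt does not say how the study tokenized text or broke ties.

```python
from collections import Counter

def top_words_by_document_frequency(documents: list[str], k: int = 100) -> list[str]:
    """Return the k words that occur in the largest number of documents."""
    df = Counter()
    for text in documents:
        # A set counts each word once per document, which is what DF measures.
        df.update(set(text.lower().split()))
    # Sort by descending DF, then alphabetically so ties are deterministic.
    ranked = sorted(df.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]

# Hypothetical toy posts standing in for the spam data set.
spam_posts = [
    "cheap pills buy now",
    "buy cheap watches now",
    "win money now",
]
features = top_words_by_document_frequency(spam_posts, k=3)
print(features)  # prints ['now', 'buy', 'cheap']
```

Each blog could then be represented as a vector recording which of the selected words it contains, which is the kind of input an SVM expects.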
To prepare a training data set for SVM learning, which contains 2643 spam and 2643 non-spam blogs, 2643 blogs are randomly selected as non-spam blogs from the unknown data set of 205,702 blogs, although some of the selected blogs may be spam blogs. The selected blogs are added to the subset of the spam data set for training SVM.
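The balanced training set described above can be sketched as follows, assuming blogs are represented by simple identifiers. The function name, the fixed seed, and the toy sizes are hypothetical; the excerpt's actual figures are 2643 spam blogs and 205,702 unlabeled blogs.

```python
import random

def build_training_set(spam_blogs, unlabeled_blogs, seed=0):
    """Pair every known spam blog with an equal number of randomly drawn unlabeled blogs."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    # Unlabeled blogs are treated as non-spam; some may in fact be spam (label noise).
    assumed_non_spam = rng.sample(unlabeled_blogs, k=len(spam_blogs))
    labeled = [(blog, 1) for blog in spam_blogs]
    labeled += [(blog, 0) for blog in assumed_non_spam]
    rng.shuffle(labeled)
    return labeled

# Hypothetical identifiers; the sizes are toy stand-ins for the excerpt's data.
spam = [f"spam-{i}" for i in range(5)]
unknown = [f"blog-{i}" for i in range(50)]
training = build_training_set(spam, unknown)
```

Sampling without replacement keeps the two classes the same size, at the cost of the label noise the excerpt itself acknowledges.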
The estimated number of spam blogs in the test data set is 2089 and that of non-spam blogs is 555, based on the spam distribution in fig.
The results show that 13 of the 20 blogs are spam blogs. Therefore, the precision of SVM is 65%.
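Precision here is the share of blogs flagged as spam that were confirmed as spam on manual inspection. A minimal sketch of the calculation, with a hypothetical function name:

```python
def precision(confirmed_spam: int, flagged_as_spam: int) -> float:
    """Fraction of flagged items that are confirmed positives."""
    if flagged_as_spam <= 0:
        raise ValueError("flagged_as_spam must be positive")
    return confirmed_spam / flagged_as_spam

# Figures from the excerpt: 13 of the 20 checked blogs were spam.
svm_precision = precision(13, 20)
print(f"precision: {svm_precision:.0%}")  # prints "precision: 65%"
```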
As a result, SVM trained using the training set can separate spam blogs in the test set at high precision.
Copyright © 2003-2025 Farlex, Inc.