Rule-based Automatic Criteria Detection for Assessing Quality of Online Health Information

Type: Article
Conference: An International Conference Addressing Information Technology and Communications in Health (ITCH), February 15-18, 2007, Victoria, British Columbia, Canada
Abstract: Automatically assessing the quality of health-related Web pages is an emerging method for helping consumers evaluate online health information. In this paper, we propose a rule-based method for detecting technical criteria for automatic quality assessment. First, we define measurable indicators for each criterion, each with an indicator value and an expected location. Candidate lines that may contain indicators are then extracted by matching the indicator values against the content of a Web page. The actual location of each candidate line is detected by analyzing the Web page's DOM tree, and its expression pattern is identified using regular expressions. Each candidate line is classified into a criterion according to rules that match location and expression patterns. The occurrences of criteria on a Web page are summarized from the results of line classification. The rule-based criteria detection method is tested on two data sets and compared with a direct criteria detection method. The results show that the overall accuracy of the rule-based method is higher than that of the direct detection method. Some criteria that were difficult to detect using the direct detection method, such as author's name, author's credentials and author's affiliation, can be detected effectively based on location and expression patterns. The rule-based approach to detecting criteria for assessing the quality of health Web pages is effective. Automatic detection of technical criteria is complementary to the evaluation of content quality, and it can contribute to assessing the comprehensive quality of health-related Web sites.
Publication date
Language: English
Affiliation: NRC Institute for Information Technology; National Research Council Canada
Peer reviewed: No
NRC number: 48803
NPARC number: 8913818
Record identifier: 94d4a383-a300-4822-97cd-64d1c5c9eb02
Record created: 2009-04-22
Record modified: 2016-05-09
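
The detection steps summarized in the abstract (indicator matching, location detection from the DOM tree, expression patterns, rule-based classification) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the rule table, the class names used as locations ("footer", "byline"), the regular expressions and the sample page are all hypothetical and chosen only to show the shape of the idea.

```python
import re
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link", "area", "base",
             "col", "embed", "source", "track", "wbr"}

# Hypothetical rules: each criterion has an indicator (finds candidate lines),
# expected locations (ancestor tag or class names), and an expression pattern.
RULES = {
    "author_name": {
        "indicator": re.compile(r"\b(?:written by|author)\b", re.I),
        "locations": {"footer", "byline", "address"},
        "pattern": re.compile(
            r"(?:[Ww]ritten by|[Aa]uthor:?)\s+([A-Z][a-z]+(?:\s[A-Z][a-z]+)+)"),
    },
    "author_credential": {
        "indicator": re.compile(r"\b(?:MD|PhD|RN|MPH)\b"),
        "locations": {"footer", "byline"},
        "pattern": re.compile(r"[A-Z][a-z]+,\s*(MD|PhD|RN|MPH)\b"),
    },
    "date_updated": {
        "indicator": re.compile(r"last (?:updated|modified|reviewed)", re.I),
        "locations": {"footer"},
        "pattern": re.compile(r"(\d{4}-\d{2}-\d{2})"),
    },
}


class LineCollector(HTMLParser):
    """Collects each text line together with its ancestor tags and classes."""

    def __init__(self):
        super().__init__()
        self.stack = []   # open elements as (tag, class attribute)
        self.lines = []   # (text, set of ancestor tag and class names)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        self.stack.append((tag, dict(attrs).get("class") or ""))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed children.
        if tag in [t for t, _ in self.stack]:
            while self.stack.pop()[0] != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            location = {t for t, _ in self.stack}
            location |= {name for _, c in self.stack for name in c.split()}
            self.lines.append((text, location))


def detect_criteria(html):
    """Return {criterion: [extracted values]} for the given HTML string."""
    collector = LineCollector()
    collector.feed(html)
    collector.close()
    found = {}
    for text, location in collector.lines:
        for criterion, rule in RULES.items():
            if not rule["indicator"].search(text):
                continue              # not a candidate line
            if not rule["locations"] & location:
                continue              # candidate is in an unexpected location
            match = rule["pattern"].search(text)
            if match:                 # expression pattern confirms the criterion
                found.setdefault(criterion, []).append(match.group(1))
    return found


# Made-up sample page for illustration only.
SAMPLE_HTML = """
<html><head><title>Managing Asthma</title></head><body>
<div id="content"><p>This leaflet was written by Health Team volunteers.</p></div>
<div class="footer"><p>Written by Jane Smith, MD</p>
<p>Last updated: 2007-02-15</p></div>
</body></html>
"""

if __name__ == "__main__":
    print(detect_criteria(SAMPLE_HTML))
```

In this sketch the sentence in the main content area contains the "written by" indicator but is rejected because it does not sit in an expected location, which is the kind of case where matching on indicator values alone would presumably produce a false positive.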