Institutional Repository of Peking University: Cardiovascular Risk Prediction Method Based on CFS Subset Evaluation and Random Forest Classification Framework

Title	Cardiovascular Risk Prediction Method Based on CFS Subset Evaluation and Random Forest Classification Framework
Authors	Xu, Shan; Zhang, Zhen; Wang, Daoxian; Hu, Junfeng; Duan, Xiaohui; Zhu, Tiangang
Affiliation	China Acad Informat Commun Technol, Beijing, Peoples R China. Peking Univ, Sch Elect Engn & Comp Sci, Beijing, Peoples R China. Peking Univ, Peoples Hosp, Beijing, Peoples R China.
Keywords	Cardiovascular disease (CVD); risk prediction; data mining; feature selection; random forest; HEART-DISEASE SYSTEM
Issue Date	2017
Publisher	2017 IEEE 2ND INTERNATIONAL CONFERENCE ON BIG DATA ANALYSIS (ICBDA)
Citation	2017 IEEE 2ND INTERNATIONAL CONFERENCE ON BIG DATA ANALYSIS (ICBDA). 2017, 233-237.
Abstract	Cardiovascular disease (CVD) is a major contributor to loss of quality and length of life worldwide. Early detection and risk prediction are very important for patients' treatment and doctors' diagnosis. This paper focuses on establishing a more accurate and practical risk prediction system, based on data mining techniques, to provide auxiliary medical services. To be practical for collecting and analyzing patient data in healthcare settings, the system consists of four parts: data interface, data preprocessing, feature selection, and classification. The data interface is responsible for obtaining raw data from hospitals; data preprocessing handles data integration, data cleaning, rating mapping, etc. Key features are then selected by CFS Subset Evaluation combined with the Best-First Search method to reduce dimensionality. Random forest is introduced as the base classifier to identify the risk level, which is an early trial in the CVD risk prediction field. Both the Cleveland Heart-Disease Database (CHDD) and a cardiology inpatient dataset from PKU People's Hospital were tested to confirm accuracy and practicality. In the CHDD test, our system achieves an accuracy of 91.6%, significantly higher than other methods. In the People's Hospital dataset test, it achieves an accuracy of 97%, better than most other classifiers except SVM (98.9%); however, random forest takes only half the time of SVM. Taken together, the risk prediction system shows considerable value in accuracy and practical use for patients' treatment and doctors' diagnosis.
URI	http://hdl.handle.net/20.500.11897/504986
DOI	10.1109/ICBDA.2017.8078813
Indexed	EI CPCI-S(ISTP)
Appears in Collections:	School of Electronics Engineering and Computer Science; People's Hospital
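
The feature selection step named in the abstract (CFS Subset Evaluation with a search over feature subsets) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes absolute Pearson correlation as the correlation measure (the record does not state which measure the paper uses), and it substitutes a plain greedy forward search for Best-First Search, which would additionally allow backtracking. The toy columns and the function names `pearson`, `cfs_merit`, and `forward_select` are hypothetical.

```python
import math


def pearson(x, y):
    """Absolute Pearson correlation; 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy / math.sqrt(sxx * syy))


def cfs_merit(columns, target, subset):
    """CFS merit of a subset: k * r_cf / sqrt(k + k * (k - 1) * r_ff).

    r_cf is the mean feature-target correlation and r_ff the mean
    feature-feature correlation, so redundant features lower the merit.
    """
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(pearson(columns[i], target) for i in subset) / k
    pairs = [(i, j) for pos, i in enumerate(subset) for j in subset[pos + 1:]]
    r_ff = sum(pearson(columns[i], columns[j]) for i, j in pairs) / len(pairs) if pairs else 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def forward_select(columns, target, tol=1e-12):
    """Greedy forward search (a simplified stand-in for Best-First Search)."""
    selected, best = [], 0.0
    remaining = list(range(len(columns)))
    while remaining:
        scored = [(cfs_merit(columns, target, selected + [i]), i) for i in remaining]
        # Highest merit wins; ties go to the lowest column index.
        merit, idx = max(scored, key=lambda t: (t[0], -t[1]))
        if merit <= best + tol:  # stop when no candidate improves the merit
            break
        selected.append(idx)
        remaining.remove(idx)
        best = merit
    return selected, best


# Made-up toy data: columns 0 and 2 are informative, column 1 duplicates
# column 0, and column 3 is unrelated to the target.
a = [1, 1, 1, 1, -1, -1, -1, -1]
b = [1, 1, -1, -1, 1, 1, -1, -1]
c = [1, -1, 1, -1, 1, -1, 1, -1]
columns = [a, list(a), b, c]
target = [u + v for u, v in zip(a, b)]

selected, merit = forward_select(columns, target)
print(selected, round(merit, 4))
```

On this toy input the search keeps columns 0 and 2 and rejects both the duplicate and the unrelated column. In a full pipeline, the selected columns would then be passed to a classifier such as a random forest.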

Files in This Work

There are no files associated with this item.

License: See PKU IR operational policies.