Journal of Central South University (Natural Science Edition)

Vol. 50, No. 5 (Serial No. 297), May 2019

Article ID: 1672-7207(2019)05-1112-07
Identically distributed multi-decision tree based on reinforcement learning and its application in imbalanced data sets
JIAO Jiangli, ZHANG Xueying, LI Fenglian, NIU Zhuang

(College of Information and Computer, Taiyuan University of Technology, Taiyuan 030024, Shanxi, China)

Abstract: Conventional decision trees classify the minority class of imbalanced data sets poorly. To address this, an improved identically distributed multi-decision tree approach is proposed, in which the attribute selection strategy is optimized using the cumulative reward of reinforcement learning. First, the majority class samples of the imbalanced data set are randomly sampled by the identically distributed random sampling approach. A single decision tree is then built on each resulting subset, and together these trees form the multi-decision tree. Each single decision tree is constructed with the classification and regression tree (CART) algorithm, and the reinforcement learning cumulative reward mechanism is used to optimize its attribute selection strategy. The results show that the proposed attribute optimization strategy effectively improves the probability that minority class samples are correctly classified. The identically distributed multi-decision tree method effectively improves overall prediction performance on imbalanced data sets, and both the positive rate and the geometric mean of the positive and negative rates increase.
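The sampling step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so the following is an assumption: the majority class is shuffled and split into disjoint random chunks of roughly minority-class size, so that every chunk follows the same distribution as the full majority class, and each chunk is paired with all minority samples. The function name `make_balanced_subsets`, the choice of the number of subsets, and the toy data are hypothetical and not taken from the paper.

```python
import random


def make_balanced_subsets(majority, minority, seed=0):
    """Pair random, disjoint chunks of the majority class with all minority samples.

    Assumption (not from the paper): the number of subsets is the imbalance
    ratio rounded down, so each chunk is about as large as the minority class.
    """
    if not majority or not minority:
        raise ValueError("both classes must be non-empty")
    rng = random.Random(seed)
    shuffled = list(majority)
    rng.shuffle(shuffled)
    k = max(1, len(shuffled) // len(minority))
    # Striding over a shuffled list yields k disjoint, uniformly random chunks.
    chunks = [shuffled[i::k] for i in range(k)]
    return [chunk + list(minority) for chunk in chunks]


# Toy data: 20 majority samples (label 0) and 5 minority samples (label 1).
majority = [(x, 0) for x in range(20)]
minority = [(x, 1) for x in range(100, 105)]
subsets = make_balanced_subsets(majority, minority)
print(len(subsets), [len(s) for s in subsets])  # 4 [10, 10, 10, 10]
```

One decision tree would then be trained on each subset. How the paper combines the trees and how the cumulative reward enters attribute selection are not specified in the abstract, so they are left out of this sketch.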

 

Key words: imbalanced data sets; multi-decision tree; attribute selection strategy based on cumulative reward mechanism; identically distributed random sampling; reinforcement learning
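The geometric mean of the positive and negative rates reported in the abstract can be computed as in the sketch below. It assumes the usual convention that the minority class is the positive class, that the positive rate is the true positive rate (TPR) and the negative rate is the true negative rate (TNR), and that the geometric mean is sqrt(TPR × TNR). The function name `g_mean` and the toy labels are hypothetical.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Return (TPR, TNR, G-mean) for binary labels."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    # Guard against a class that is absent from y_true.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return tpr, tnr, math.sqrt(tpr * tnr)


# Toy case: 4 minority (positive) and 16 majority (negative) samples.
# The classifier finds only one of the four positives.
y_true = [1] * 4 + [0] * 16
y_pred = [1, 0, 0, 0] + [0] * 16
tpr, tnr, g = g_mean(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(tpr, tnr, g, accuracy)  # 0.25 1.0 0.5 0.85
```

In this toy case accuracy is 0.85 while the geometric mean is only 0.5, which shows why a metric that weighs both classes equally is preferred on imbalanced data.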

Journal of Central South University (Natural Science Edition): ISSN 1672-7207; CN 43-1426/N; ZDXZAC
Journal of Central South University (English Edition): ISSN 2095-2899; CN 43-1516/TB; JCSTFT