AI-based data analysis for text classification and document summarization

Yuefeng Li

Abstract

Over the years, businesses have accumulated very large and complex data collections, and it has become increasingly difficult to process this big data using traditional techniques. A major challenge is that most big data is unlabelled and unstructured (that is, it does not follow a pre-defined format). Recently, AI (Artificial Intelligence) based techniques have been used to address this challenge, e.g., understanding a firm’s reputation from online customer reviews, or retrieving training samples from unlabelled tweets. This talk discusses how AI techniques contribute to text classification and document summarization when only limited user feedback about relevance is available. It first discusses the principle of a new classification methodology, “three-way decision based binary classification”, which addresses the hard problem of the uncertain boundary between the positive class and the negative class. It then extends the application of three-way decisions from text classification to document summarization and sentiment analysis. The talk presents new experimental results on several popular data collections, such as RCV1, Reuters-21578, Tweets2011 and Tweets2013, DUC 2006 and 2007, and Amazon review data collections. It also discusses advanced techniques for obtaining more knowledge about relevance from big data, in order to help people create effective machine learning systems for processing big data, along with several open issues in AI-based data analysis for text, Web and media data.
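
The abstract does not give the details of the three-way decision method, so the following is only a minimal sketch of the general idea, not the speaker's implementation. It assumes each document already has a relevance score in [0, 1] and uses two hypothetical thresholds, alpha and beta, to split documents into a positive region, a negative region, and an uncertain boundary region that is left for further analysis.

```python
from typing import List, Sequence, Tuple


def three_way_split(
    scores: Sequence[float],
    alpha: float = 0.7,  # hypothetical acceptance threshold (assumption)
    beta: float = 0.3,   # hypothetical rejection threshold (assumption)
) -> Tuple[List[int], List[int], List[int]]:
    """Split document indices into positive, boundary and negative regions.

    A score >= alpha is accepted as positive, a score <= beta is rejected
    as negative, and anything in between is deferred to the boundary.
    """
    if not 0.0 <= beta < alpha <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= beta < alpha <= 1")

    positive: List[int] = []
    boundary: List[int] = []
    negative: List[int] = []
    for index, score in enumerate(scores):
        if score >= alpha:
            positive.append(index)
        elif score <= beta:
            negative.append(index)
        else:
            boundary.append(index)
    return positive, boundary, negative


if __name__ == "__main__":
    # Illustrative scores only; these are not results from the talk.
    pos, bnd, neg = three_way_split([0.9, 0.5, 0.1, 0.7, 0.3])
    print("positive:", pos)
    print("boundary:", bnd)
    print("negative:", neg)
```

To obtain a final binary classification, the boundary documents would need a second decision step, for example a classifier trained on the confidently assigned positive and negative regions. How that step is carried out in the talk is not stated in the abstract.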

Journal of Computer Science & Systems Biology