The Pulse of News in Social Media: Forecasting Popularity
Roja Bandari*
Sitaram Asur†
Bernardo Huberman‡
Abstract
News articles are extremely time-sensitive by nature. There
is also intense competition among news items to propagate
as widely as possible. Hence, the task of predicting the
popularity of news items on the social web is both interesting
and challenging. Prior research has dealt with predicting
eventual online popularity based on early popularity. It is
most desirable, however, to predict the popularity of items
prior to their release, fostering the possibility of appropriate
decision making to modify an article and the manner of its
publication. In this paper, we construct a multi-dimensional
feature space derived from properties of an article and
evaluate the efficacy of these features to serve as predictors of
online popularity. We examine both regression and
classification algorithms and demonstrate that despite randomness
in human behavior, it is possible to predict ranges of
popularity on Twitter with an overall 84% accuracy. Our study
also serves to illustrate the differences between traditionally
prominent sources and those immensely popular on the
social web.
1 Introduction
News articles are very dynamic due to their relation to
continuously developing events that typically have short
lifespans. For a news article to be popular, it is essential
for it to propagate to a large number of readers within
a short time. Hence there exists a competition among
different sources to generate content which is relevant
to a large subset of the population and becomes virally
popular.
Traditionally, news reporting and broadcasting has
been costly, which meant that large news agencies
dominated the competition. But the ease and low cost of
online content creation and sharing have recently changed
the traditional rules of competition for public attention.
News sources now concentrate a large portion of their
attention on online media where they can
disseminate their news effectively and to a large population. It
is therefore common for almost all major news sources to
have active accounts in social media services like Twitter
to take advantage of the enormous reach these services
provide.
* UCLA.
† HP Labs.
‡ HP Labs.
Due to the time-sensitive aspect and the intense
competition for attention, accurately estimating the
extent to which a news article will spread on the web
is extremely valuable to journalists, content providers,
advertisers, and news recommendation systems. This
is also important for activists and politicians who are
increasingly using the web to influence public
opinion.
However, predicting online popularity of news
articles is a challenging task. First, context outside the web
is often not readily accessible and elements such as local
and geographical conditions and various circumstances
that affect the population make this prediction difficult.
Furthermore, network properties such as the structure
of social networks that are propagating the news,
influence variations among members, and interplay between
different sections of the web add other layers of
complexity to this problem. Most significantly, intuition
suggests that the content of an article must play a
crucial role in its popularity. Content that resonates with
a majority of the readers such as a major world-wide
event can be expected to garner wide attention while
specific content relevant only to a few may not be as
successful.
Given the complexity of the problem due to the
above mentioned factors, a growing number of recent
studies [1], [2], [3], [4], [5] make use of early
measurements of an item’s popularity to predict its future
success. In the present work we investigate a more difficult
problem, which is prediction of social popularity
without using early popularity measurements, by instead
solely considering features of a news article prior to its
publication. We focus this work on observable features
in the content of an article as well as its source of
publication. Our goal is to discover if any predictors relevant
only to the content exist and if it is possible to make a
reasonable forecast of the spread of an article based on
content features.
The news data for our study was collected from
Feedzilla, a news feed aggregator, and measurements
of the spread are performed on Twitter, an immensely