Wednesday, July 10, 2013
The Physics arXiv Blog
One of the curious features of human behaviour is that it is predictable in certain circumstances but not in others. Knowing the difference is a fantastically valuable skill.
That's why social-media researchers the world over are scrutinising social networks for clues they can use to predict people's behaviour on scales that have never before been achieved. On this blog, we've looked at various attempts to show social media can be used to predict, with varying degrees of success, people's buying habits, movie tastes, and even future stock market purchases.
Today, Petko Bogdanov at the University of California, Santa Barbara, and a few pals take a new, genetically inspired approach to this task. They say every person has a fixed set of interests, called their social-media genotype, which determines their pattern of behaviour on networks such as Twitter. What's more, they say that once these genotypes have been discovered, they can be used to predict an individual's future behaviour.
These guys start with the hypothesis that each individual has a stable, pre-existing interest in certain topics and that this determines his or her pattern of behaviour on Twitter. One of the goals of their research is to determine whether it's possible to extract an individual's genotype from the firehose of data that Twitter produces. To that end, they analysed one data set consisting of 467 million tweets sent by 42 million users in 2009, and another consisting of 14.5 million tweets sent by nine thousand users in 2012.
They grouped the hashtags from these data sets into five topics—sport, business, celebrities, politics, and science/technology. “Our hypothesis is that individual users exhibit consistent behavior of adopting and using hashtags (stable genotype) within a known topic,” say Bogdanov and co.
And they say their analysis confirms exactly this: users show a stable pattern of hashtag adoption across these topics. This pattern is their social-media genotype.
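To make the idea concrete, here is a minimal Python sketch of one way a genotype could be represented: as the share of a user's hashtags that falls into each of the five topics. This is an illustration only, not the method from the paper. The `HASHTAG_TOPIC` lookup, the function names, and the use of cosine similarity as a stability check are all assumptions made for this example.

```python
from collections import Counter
from math import sqrt

# The five topics named in the article.
TOPICS = ["sport", "business", "celebrities", "politics", "science/technology"]

# Hypothetical hashtag-to-topic lookup, invented for this example.
HASHTAG_TOPIC = {
    "#nba": "sport",
    "#worldcup": "sport",
    "#stocks": "business",
    "#oscars": "celebrities",
    "#election": "politics",
    "#physics": "science/technology",
}


def genotype(hashtags):
    """Return the fraction of a user's known hashtags in each topic."""
    counts = Counter(HASHTAG_TOPIC[h] for h in hashtags if h in HASHTAG_TOPIC)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(TOPICS)
    return [counts[t] / total for t in TOPICS]


def cosine(a, b):
    """Cosine similarity between two genotype vectors (0.0 if either is empty)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Compare the same (made-up) user's hashtags in two time windows.
early = genotype(["#nba", "#nba", "#physics", "#unknown"])
late = genotype(["#worldcup", "#nba", "#physics"])
stability = cosine(early, late)
```

Under these assumptions, a similarity close to 1 between the two windows would suggest a stable genotype, while a value near 0 would suggest the user's interests have shifted.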
Bogdanov and co go on to identify substructures within Twitter through which hashtags of a certain topic tend to flow. They call these structures “topical influence backbones” and say that they are subtly different from the well-known followee/follower networks that exist on Twitter.
Finally, Bogdanov and co say that given a user's social-media genotype and their relationship to topical influence backbones, it is possible to predict their likely reaction to a hashtag on any given topic from somebody they follow. They go on to show how this works using their data sets.
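As a rough illustration of how the two ingredients might combine, the toy Python sketch below multiplies a user's interest in a topic by the strength of the influence link from the person they follow. The scoring rule, the threshold, the numbers, and every name here are assumptions invented for this example; the article does not describe the actual prediction model.

```python
def predict_adoption(genotype, edge_weights, topic, threshold=0.25):
    """Toy rule: score = interest in topic * influence of followee on that topic.

    genotype:     dict mapping topic -> user's share of hashtags in that topic
    edge_weights: dict mapping topic -> hypothetical backbone weight (0 to 1)
                  for one followee-to-follower link
    Returns (score, adopts), where adopts is True if score >= threshold.
    """
    score = genotype.get(topic, 0.0) * edge_weights.get(topic, 0.0)
    return score, score >= threshold


# Made-up values for a single user and one account they follow.
user_genotype = {"sport": 0.7, "politics": 0.3}
followee_influence = {"sport": 0.5, "politics": 0.1}

sport_score, adopts_sport = predict_adoption(user_genotype, followee_influence, "sport")
politics_score, adopts_politics = predict_adoption(user_genotype, followee_influence, "politics")
```

With these made-up numbers, the user is predicted to pick up a sport hashtag from this followee but not a politics one, because both the interest and the influence are weaker for politics.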
That's a powerful trick to pull off. The ability to predict the behaviour of individuals on Twitter is worth its weight in gold.
But as other groups have found, it's one thing to demonstrate this predictability on a test data set. It's quite another to do it with a live feed from Twitter. The acid test will be whether Bogdanov and co can predict a real individual's behaviour tomorrow with the data they gather today.
If they can, the road to the future will be paved with gold. We look forward to seeing the evidence in the not too distant future.
Ref: arxiv.org/abs/1307.0309: The Social Media Genome: Modeling Individual Topic-Specific Behavior in Social Media