Though I don’t use Facebook, recently I have been studying how social networks can be exploited to influence a large population of people. It’s fascinating to see how many people will buy a product just because someone they admire and follow on social media promoted it. In the 90’s, it was Michael Jordan and Nike, now it’s all the Youtube vloggers and Instagram celebs. If you are a marketer, this is a golder opportunity to promote your product instantly. If you are a user, it’s the time to ask, am I getting influenced by people I follow?

In recent years, the popularity of social networks such as Facebook, Instagram, Twitter has skyrocketed. These tools play a fundamental role as a medium for the spread of information, ideas and influence. The primary reason for the success of social media tools is that they do the job of connecting people exceptionally well. Moreover, they are also becoming very popular as a marketing platform, which allows the information and ideas to influence a large population in a short period of time. As there are millions of people using these social media websites, they have become an attractive medium for marketers for product promotion.

A very effective way of promoting a product is to give free samples to certain users, resulting in them promoting the product on their social media profile,which reaches to more users. If done well, this can result in viral marketing, where a huge number of population can be reached in a relatively short period of time.

Given the huge size of these networks, ranging from millions of users and billions of edges, there are some challenges that have to be met for effective promotion. A marketer can not just randomly pick users to hand out free samples, due to time and cost constraints. These users need to be picked strategically, so that using as few users as possible, the product can reach to as many people as possible. Hence this becomes an optimization problem.

**Influence maximization** is the problem of finding a small subset of seed nodes in a network which, if activated influence the maximum number of nodes in the graph. This problem has been widely researched in the academia. It can be formally defined as follows:

Assume the standard information diffusion model, called the Independent Cascade(IC) model, formulated by Kempe[1]. In this model, we have a directed graph `G`

, which represents the social network. Influence spreads via a random process that begins at a set `S`

of seed nodes. Each node, once infected,has a chance of subsequently influencing its neighbours.

`G= (V,E,p)`

: a directed influence graph, where`V`

is a vertex set of size`n`

,`E`

is an edge set of size`m`

,`p:E→[0,1]`

is a propagation probability function that maps a probability value between a pair of vertices.

`S`

: set of seeds.`σ(S)`

: influence spread of a seed set`S`

under the IC model, is defined as the expected total number of influenced vertices for`S`

.

** Problem 1 (Influence estimation) **: Given a graph `G= (V,E,p)`

and a vertex set `S⊆V`

, compute the influence spread `σ(S)`

of `S`

.

** Problem 2 (Influence maximization)**: Given a graph `G= (V,E,p)`

and an integer `k`

, find a vertex set `S⊆V`

of size `k`

that maximizes `σ(S)`

.

References:

[1] D. Kempe, J. Kleinberg, and E ́. Tardos. Maximizing the spread of influence through a social network. In KDD, pages 137–146, 2003.