Category Archives: digital information

Companies using social capital data for betting on people’s lives

Flickr photo by idletype

The Wall Street Journal recently noted how insurance companies (Aviva PLC, Prudential Financial, AIG) use data mining to bet on whom to insure and at what rates. Much of the info gleaned from online purchases and other digital traces concerns lifestyle: is the insurance applicant an athlete? a TV addict? a hunter?

But some of the information is social capital-related:

Increasingly, some gather online information, including from social-networking sites. Acxiom Corp., one of the biggest data firms, says it acquires a limited amount of “public” information from social-networking sites, helping “our clients to identify active social-media users, their favorite networks, how socially active they are versus the norm, and on what kind of fan pages they participate.”

For insurers and data-sellers alike, the new techniques could open up a regulatory can of worms. The information sold by marketing-database firms is lightly regulated. But using it in the life-insurance application process would “raise questions” about whether the data would be subject to the federal Fair Credit Reporting Act, says Rebecca Kuehn of the Federal Trade Commission’s division of privacy and identity protection. The law’s provisions kick in when “adverse action” is taken against a person, such as a decision to deny insurance or increase rates. The law requires that people be notified of any adverse action and be allowed to dispute the accuracy or completeness of data, according to the FTC.

The article also notes that Celent, an insurance consulting division of Marsh & McLennan, indicates that such online social-network data could be mined to police fraud and to inform pricing decisions: “A life insurer might want to scrutinize an applicant who reports no family history of cancer, but indicates online an affinity with a cancer-research group, says Mike Fitzgerald, a Celent senior analyst. ‘Whether people actually realize it or not, they are significantly increasing their personal transparency,’ he says. ‘It’s all public, and it’s electronically mineable.’”

We’ve written earlier about other life insurers using social capital data in making insurance decisions, but in those cases, the individual was being asked directly about his social and civic involvement.  [See also this blog post about social capital and healthcare.]

We applaud the life insurers for belatedly realizing that social capital is strongly related to health, but we firmly believe they should be more transparent about what they are doing. Greater transparency would ease the privacy concerns and would have the added benefit of making the insured more aware of the positive health impact of being more involved civically and socially, which might actually induce those who are less engaged to become more so.

See earlier blog post on loss of digital privacy and digital traces left online.

Read “Insurers Test Data Profiles to Identify Risky Clients” (Wall St. Journal, 11/17/2010, by Leslie Scism and Mark Maremont)


The Friendship Paradox: using social networks to predict spread of epidemics

Nick Christakis and James Fowler (whose research we’ve previously highlighted) are back with research that shows how one can easily use “sensors” in a network to track, and get early warning of, the spread of epidemics.

They took advantage of the “friendship paradox” to do so. In any real-life network, our friends are, on average, more popular than we are. [This is true mathematically in any group with some loners and some social butterflies. If you poll members of the group about their friendships, far more of the friends who are reported are going to be the social butterflies. If far more people reported friendships with the loners, they wouldn’t be loners. See discussion here.]
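The bracketed argument can be checked on a toy network. The sketch below is our own illustration, not anything from the study: the names and friendships are hypothetical, and the only thing it shows is that when everyone lists their friends, popular people get listed more often, which pulls up the average popularity of "a friend".

```python
from statistics import mean

# Hypothetical toy network: one social butterfly ("ann") and several
# less-connected people. Each pair is a mutual friendship.
friendships = [
    ("ann", "bob"), ("ann", "cat"), ("ann", "dan"), ("ann", "eve"),
    ("bob", "cat"), ("eve", "fay"),
]

def build_network(edges):
    """Return a dict mapping each person to the set of their friends."""
    network = {}
    for a, b in edges:
        network.setdefault(a, set()).add(b)
        network.setdefault(b, set()).add(a)
    return network

def friendship_paradox(network):
    """Return (average friend count, average friend count of a named friend)."""
    degree = {person: len(friends) for person, friends in network.items()}
    avg_friends = mean(degree.values())
    # Each person lists all of their friends, so someone with k friends
    # shows up k times in this list of nominations.
    nominated = [degree[f] for friends in network.values() for f in friends]
    return avg_friends, mean(nominated)

network = build_network(friendships)
avg_own, avg_of_friends = friendship_paradox(network)
print(f"average number of friends: {avg_own:.2f}")                   # 2.00
print(f"average number of friends of a named friend: {avg_of_friends:.2f}")  # 2.50
```

In this small group the average person has 2 friends, while the average named friend has 2.5, because "ann" is named four times and the loners only once each.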

Thus by asking random people in a network, in this case Harvard students, to name their friends, researchers know that the named friends are likely to be more centrally located in the network. One can then track behavior among the random group and their friends, in this case the spread of H1N1 flu (swine flu) among 744 Harvard students in 2009.

Those more central in these networks (the “friend” group) got the flu a full 16-47 days earlier than the random group. Thus, for public authorities, monitoring such a “friend” group could give early indication of a spreading epidemic; they could serve as “canaries in the coal mine”. If the spreading is person-to-person rather than through exposure to some impersonal source (a website or a broadcast), one could also track the difference between a random group and a friend group to predict other, more positive epidemics, like the spread of information, the diffusion of a product, or a social norm.
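To see why a “friend” group tends to fall ill earlier, here is a deliberately simplified sketch. It is not the researchers’ method: the network is hypothetical, and we assume (unrealistically) that every sick person infects all of their friends exactly one day later, so the day someone falls ill is just their distance from the first case. We also assume every person names all of their friends.

```python
from collections import deque
from statistics import mean

# Hypothetical network: "hub" is well connected, everyone else is not.
friendships = [
    ("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"),
    ("a", "a1"), ("b", "b1"), ("c", "c1"), ("d", "d1"),
    ("a1", "a2"),
]

def build_network(edges):
    """Return a dict mapping each person to the set of their friends."""
    network = {}
    for a, b in edges:
        network.setdefault(a, set()).add(b)
        network.setdefault(b, set()).add(a)
    return network

def infection_day(network, first_case):
    """Day each person falls ill, assuming one-day person-to-person spread."""
    day = {first_case: 0}
    queue = deque([first_case])
    while queue:
        person = queue.popleft()
        for friend in network[person]:
            if friend not in day:
                day[friend] = day[person] + 1
                queue.append(friend)
    return day

network = build_network(friendships)
# The first case is on the periphery of the network, far from the hub.
day = infection_day(network, first_case="a2")

# Random group: everyone is equally likely to be sampled.
random_group_avg = mean(day[p] for p in network)
# Friend group: everyone names all their friends, so popular people
# are counted once per friendship.
friend_group_avg = mean(day[f] for friends in network.values() for f in friends)
lead_time = random_group_avg - friend_group_avg

print(f"random group falls ill on day {random_group_avg:.2f} on average")
print(f"friend group falls ill on day {friend_group_avg:.2f} on average")
```

Even with the first case on the edge of this toy network, the named friends fall ill slightly earlier on average (about day 3.17 versus day 3.3), because the well-connected hub is over-represented among them.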

We write in general on this blog about the positive benefits of social ties (social capital), but Fowler and Christakis’ study also shows that having friends and being centrally located has its costs: in this case getting the flu faster. [In some ways, this is analogous to Gladwell’s discussion in The Tipping Point of how Mavens, Connectors and Salesmen may be disproportionately influential in the spread of ideas through networks, although Fowler and Christakis are far more mathematical in identifying who these central folks are.]

The “friend” group manifested the flu roughly two weeks prior to the random group using one method of detection, and a full 46 days prior to the epidemic peak using another method.

‘We think this may have significant implications for public health,’ said Christakis. ‘Public health officials often track epidemics by following random samples of people or monitoring people after they get sick. But that approach only provides a snapshot of what’s currently happening. By simply asking members of the random group to name friends, and then tracking and comparing both groups, we can predict epidemics before they strike the population at large. This would allow an earlier, more vigorous, and more effective response.’

‘If you want a crystal ball for finding out which parts of the country are going to get the flu first, then this may be the most effective method we have now,’ said Fowler. ‘Currently used methods are based on statistics that lag the real world – or, at best, are contemporaneous with it. We show a way you can get ahead of an epidemic of flu, or potentially anything else that spreads in networks.’

Christakis also notes that giving a random 30% of a population immunity to a flu doesn’t protect the greater public. But if you took a random 30% of the population, asked them to name their friends, and then immunized those friends, in a typical network the “friend” immunization strategy would protect the entire network as well as giving 96% of the population immunity shots, but at less than 1/3 the cost.
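A toy calculation shows the intuition behind immunizing named friends rather than random people. This is our own sketch under loud assumptions: a hypothetical ten-person network, a single dose of vaccine, and “damage” measured crudely as the largest group of unprotected people who are still connected to one another. It does not reproduce the 30% and 96% figures above.

```python
from statistics import mean

# Hypothetical network: "hub" ties several small clusters together.
friendships = [
    ("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"),
    ("a", "a1"), ("b", "b1"), ("c", "c1"), ("d", "d1"),
    ("a1", "a2"),
]

def build_network(edges):
    """Return a dict mapping each person to the set of their friends."""
    network = {}
    for a, b in edges:
        network.setdefault(a, set()).add(b)
        network.setdefault(b, set()).add(a)
    return network

def largest_outbreak(network, immune):
    """Size of the largest connected group of non-immune people."""
    seen = set(immune)
    largest = 0
    for start in network:
        if start in seen:
            continue
        # Walk everyone reachable from `start` without passing through
        # an immune person.
        seen.add(start)
        stack, size = [start], 0
        while stack:
            person = stack.pop()
            size += 1
            for friend in network[person]:
                if friend not in seen:
                    seen.add(friend)
                    stack.append(friend)
        largest = max(largest, size)
    return largest

network = build_network(friendships)

# Strategy 1: immunize one person chosen at random (average over everyone).
random_avg = mean(largest_outbreak(network, {p}) for p in network)
# Strategy 2: immunize a friend named by a random person (average over
# every person-names-friend pair, so popular people are picked more often).
friend_avg = mean(
    largest_outbreak(network, {f}) for friends in network.values() for f in friends
)

print(f"random immunization leaves a cluster of {random_avg:.1f} on average")  # 7.8
print(f"friend immunization leaves a cluster of {friend_avg:.1f} on average")  # 7.0
```

With the same single dose, the friend strategy leaves a smaller connected cluster on average, because it is more likely to land on the hub that holds the network together.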

The following video shows how the nodes that light up first (markers for getting the flu) are more central and far less likely to be at the periphery of the social network.  The red dots are people getting the flu; the yellow dots are friends of people with the flu and the size of the dot is proportional to how many of their friends have the flu.

Good summary of this research and its implications here: Nick Christakis TED talk (June 2010) – How social networks predict spread of flu. Nick also discusses some of the implications of computational social science, which we’ve previously discussed here under the heading of digital traces. Nick discusses how one could use data gathered from these networks (either passively or actively) to predict recessions from patterns of fuel consumption by truckers; to warn drivers of traffic jams ahead of them (by monitoring how rapidly cell phone users on the road ahead are changing cell phone towers); or to ask those central in a mobile cellphone network (easily mappable today) to text their daily temperature (to monitor for impending flu epidemics). Obviously these raise issues of privacy, which Nick does not discuss.

News release of study

Academic article in PLoS ONE

James Fowler on The Colbert Report discussing the book by Fowler and Christakis called Connected.

Nick Christakis presenting a talk at TED – The Hidden Influence of Social Networks (February 2010). In the talk he notes that while almost half of the variation in our number of friends is genetically based (46%), another equally large portion (47%) of the variation in whether your friends know each other is a function of whether your friends are the type that introduce (“knit”) their friends together or keep them apart (what they call “transitivity”). About a third of whether you are in the center of social networks or not is genetically inherited. Christakis believes that these social networks are critically important to transmitting ideas, kindness, information and goodness; and if society realized how valuable these networks were, we’d focus far more of our time, energy and resources on helping these networks to flourish.

Honest signals: our hidden, influential patterns of communication

(photo by shadowplay)

Interesting lunchtime talk by Alex (Sandy) Pentland about honest signals, sponsored by the Program on Networked Governance at Harvard’s Kennedy School.

Sandy’s theory is that 50,000-100,000 years ago, humans lacked language, yet still managed to communicate with each other through “honest signals” (ancient primate signaling efforts which developed biologically to communicate our intentions, our trustworthiness, our suitability as a collaborator, whether we were bluffing, etc.). When language was introduced, it didn’t overwrite or eliminate these honest signals but evolved to be synergistic with them. While we focus much more on language, these signals are measurable (Sandy’s group developed machines to read them) and often equally or more effective than language at predicting various behaviors. Sandy’s research aims to shine a light on this powerful channel that we know less about.

Sandy notes that data from electronic ID badges (sociometers) and specially-programmed smart phones can give us a “god’s eye” view of how the people in organizations interact, and let us observe the “rhythms of interaction for everyone in a city”.

What are such behaviors?

Sandy’s group at the MIT Media Lab focuses on 4 of them, although there are probably others (laughter, yawning, etc.).

  1. INTEREST, shown by activity. An autonomic response. For example, in children this is evinced by jumping up and down, or in dogs by barking or tail-wagging.
  2. ATTENTION, shown by influence. Evidence of thalamic attention. Sandy observes that people actively following a conversation break in faster than they could with normal attention spans. This shows that they are processing the conversation as it goes along and predicting the right time to break in.
  3. EMPATHY, shown by mimicry. This is evinced by mirror neurons, which are observable in infants as young as 3 hours old who can imitate a mother sticking her tongue out. People who evince higher levels of mimicry are seen as more empathic and more trustworthy. For example, they had computerized agents trying to sell an unpopular policy to students; in the cases where the computerized agent mimicked the body movements of the experimental subject with a 4 second delay, the agent was 20% more successful in selling the policy, and the subject was unaware that he/she was being mimicked.
  4. EXPERTISE, shown by consistency. This is a function of the cerebellar motor system. We assume that people who can do things more smoothly are more expert because of the number of actions that need to be simultaneously coordinated.

What do these honest signals predict?
These are only some of the examples:
- Computers attentive to these honest signals (and ignoring the content) were successful in predicting, from pitches by entrepreneurs, which business plans business school students would judge as successful.
- Effective sales pitches: by listening to the tone, timing, etc. of the first few seconds of a telephone sales pitch (without listening to the language), the computer could predict with 80% accuracy which calls would be successful.
- Success in speed dating: monitoring the female’s signals predicted 35% of the variation in which couples exchanged their phone numbers, and this was significantly higher than any other factor researchers could find. Interestingly, the men’s signals were not predictive, but somehow men must have been able to subconsciously pick up on the women’s signals, because in almost all cases the men didn’t ask for phone numbers where it wasn’t reciprocated by women.
- They also found that honest signals predicted depression, who was likely to be successful in negotiating for a pay raise or in job interviews, who was bluffing at poker, etc.

Successful individual-level traits: they found that the most successful folks with these “honest signals” were ones who were high in activity, high in influence (others were more likely to mirror their communication styles than they were likely to mirror others’), high in “variable prosody” (their pitch varied and they sounded open to ideas), and high in body language dominance (i.e., they were more likely to directly face another person and others were more likely to not face them square on). They were often far more successful in these “honest signals” than they were aware of.

Organizational effectiveness

Sandy notes that, unlike with an MRI, one can hook up an entire organization to these sociometers and absorb data microsecond by microsecond, and the results are highly predictive. But the challenge is that while the people who exhibit these highly successful individual traits are useful to organizations, they are usually in “connector” roles, with star-shaped patterns of communication, where ideas flow through these individuals. While this speeds up the decision-making process, it actually impairs the brainstorming process. Sandy’s group is experimenting with devices to see if making participants aware of the dynamics of a team can influence their behavior in a positive manner. They have shown with some experiments (Japanese-American teams designing Rube Goldberg-type projects, and distance teams) that it can. The challenge will be to see if the group’s behavior can be more connected at the brainstorming phase and more “star-shaped” at the decision-making stage.

Sandy noted that they have been able to extract many properties of the social networks using smart phones: from a combination of where people are (GPS), when, and communication flows (who they talk to and when). He noted some interesting experiments to observe the flow of nurses in a nursing ward, or the flow of taxis in San Francisco, or communication (e-mail and face-to-face) between departments in a German bank. They are now at the stage of trying to get whole dormitories or parts of the city of Boston using these smart phones to try to track social networks and patterns in these data. (I’ve written about digital traces before.)

How could these flows of people be used?

Traffic: one could monitor, for example, delivery vans coursing through the road networks and by observing flows slower than typical, spot emerging traffic problems.
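As a rough sketch of what “flows slower than typical” could mean in practice, one can compare the recent average speed of vans on each road segment with that segment’s usual speed. Everything here is hypothetical: the segment names, the speeds, and the 60% cutoff are illustrative assumptions, not anything Sandy described.

```python
from statistics import mean

# Hypothetical typical speeds (km/h) for each road segment at this time of day.
typical_speed = {"Main St": 40.0, "River Rd": 60.0, "Harbor Bridge": 50.0}

# Hypothetical speeds reported by delivery vans over the last few minutes.
recent_van_speeds = {
    "Main St": [38, 42, 41],
    "River Rd": [22, 30, 26],
    "Harbor Bridge": [49, 52],
}

def emerging_slowdowns(typical, recent, ratio=0.6):
    """Return segments whose recent average speed is below `ratio` times typical."""
    flagged = []
    for segment, speeds in recent.items():
        if not speeds or segment not in typical:
            continue  # no vans on this segment, or no baseline to compare with
        if mean(speeds) < ratio * typical[segment]:
            flagged.append(segment)
    return flagged

slow = emerging_slowdowns(typical_speed, recent_van_speeds)
print(slow)  # ['River Rd']
```

Only River Rd is flagged: vans there average 26 km/h against a typical 60 km/h, while the other two segments are moving at roughly their usual pace.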

Urban tribes: Sandy noted that by monitoring flows of taxis, you can distill separate patterns of interconnected places. In other words, people who live in this neighborhood, work in this area, go to these restaurants, go to these nightclubs. (You are not actually monitoring individual people but patterns of association. This is equivalent to Netflix telling you that people who like “The Firm” also like “Michael Clayton”.) Or one can even find sub-patterns in a neighborhood: e.g., locations from which people regularly are returning from nightclubs at 3 or 4 AM.

-You can then use these patterns to “find people like me”: based on your own patterns (where you work, where you live, etc.), the system could tell you where many people in your neighborhood shop, go to dinner, or hear music.
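A “find people like me” feature along these lines could be as simple as counting co-occurrences, much like the Netflix comparison above. The sketch below is our own illustration with made-up place names; it assumes the system already holds anonymized sets of places that tend to be visited together, and it scores each unfamiliar place by how much the patterns containing it overlap with your own.

```python
from collections import Counter

# Hypothetical anonymized patterns: each set is a group of places that
# one (unidentified) person regularly visits.
visit_patterns = [
    {"Elm Street homes", "Downtown offices", "Sushi Bar", "Jazz Club"},
    {"Elm Street homes", "Downtown offices", "Sushi Bar"},
    {"Elm Street homes", "Downtown offices", "Sushi Bar", "Jazz Club"},
    {"Mill Road homes", "Tech Park", "Taqueria", "Rock Club"},
    {"Mill Road homes", "Tech Park", "Taqueria"},
]

def places_for_people_like_me(my_places, patterns, top=2):
    """Rank places I don't already visit by how similar their visitors are to me."""
    scores = Counter()
    for pattern in patterns:
        overlap = len(my_places & pattern)
        if overlap == 0:
            continue  # this pattern shares nothing with mine
        for place in pattern - my_places:
            scores[place] += overlap
    return scores.most_common(top)

suggestions = places_for_people_like_me(
    {"Elm Street homes", "Downtown offices"}, visit_patterns
)
print(suggestions)  # [('Sushi Bar', 6), ('Jazz Club', 4)]
```

Someone who lives on Elm Street and works downtown is pointed to the Sushi Bar and the Jazz Club, and never sees the Taqueria, because none of the patterns that include it look like theirs.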

Lending: one major bank told Sandy that credit scores are not very good (except at the high end) in predicting repayment rates on loans. Banks would love to use behavioral information (who is at nightclubs late at night, who goes to work early) to predict repayment rates.

Health insurance: similarly one could imagine rates tied to activity levels (who was jogging or getting enough sleep or…)

Germs: they want to use these devices to watch the spread of germs through social networks.

Privacy issues

The above examples of health insurance and lending make clear why there are privacy implications. Do we want banks or health insurers knowing what we are doing (going to nightclubs) to set our rates? Will this be used to impose behavioral bases for “redlining”, where people in certain areas (like the old redlined areas) don’t get loans because of some behavior of theirs that is correlated with low repayment rates? Does it make any difference if these people can supposedly change their behavior?
-Sandy thinks we should move from a model in which the company owns the personal data (sharing it with no one, or sharing it only if the individual didn’t say it was confidential) to one in which the person owns the data and can decide how it gets used and whether he or she gets compensated for such use.
-There are clearly issues here about how the decision is framed. Does the individual truly understand why certain marginal information is so useful to a bank or insurer? And there may be negative externalities for all, even if you don’t choose to share your information with these companies.

Sandy’s research also raises questions about what happens when you start incentivizing people in companies based on these behaviors, or you start teaching people about these hidden “honest signals”. Do people start learning how to display these honest signals and dupe people who are not as aware of them (e.g., mimicking others to increase sales or do better in negotiations)? If so, do people start focusing on these behaviors (like mimicry) and consciously teach themselves not to be swayed by them? Do companies find that people who pretend to be connectors (to get a pay raise) are actually less valuable to companies than the people who do it naturally (and are unaware they are doing this)?

See previews of Sandy’s book Honest Signals here.

Buy Alex (Sandy) Pentland’s Honest Signals here.

See interesting related story in NYT, “You’re Leaving a Digital Trail. Should You Care?” (John Markoff, 11/30/08), mentioning Alex Pentland’s work among others and discussing the SF taxi example.