That demo is here, in case you want to check it out before reading further. Go on, this article will still be here when you finish.
Andrew Schwartz, an assistant professor of computer science at Stony Brook University and one of the authors of the analysis, said that the researchers can predict a user's gender "simply from their language" more than 90 percent of the time, despite the many, many similarities between how men and women speak. "The question is, how do they differ?" he added.
To figure that out, Schwartz said, the team used an open-vocabulary approach. Instead of deciding in advance which words to look at, they gathered the data, observed which words naturally clustered together, and then worked out how each resulting topic related to gender.
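To get a feel for how this kind of prediction works, here is a minimal sketch in Python. It is not the paper's actual method (the study used far larger models and topic clustering); it just illustrates the open-vocabulary idea with a toy, made-up dataset: the word list comes from the data itself rather than a predefined lexicon, and a new post is scored by how strongly its words lean toward one group's usage.

```python
import re
from collections import Counter
from math import log

# Toy labeled posts (hypothetical stand-ins for the Facebook data in the study).
posts = [
    ("so excited for the party, amazing friends, thankful", "f"),
    ("soooo happy to learn with my wonderful family", "f"),
    ("government shutdown debate, freedom and taxes", "m"),
    ("won the video game tournament, xbox victory", "m"),
]

def tokens(text):
    """Lowercase and strip punctuation so 'party,' and 'party' match."""
    return re.findall(r"[a-z']+", text.lower())

# Open-vocabulary step: count every word that appears, per group,
# instead of starting from a fixed list of "male" or "female" words.
counts = {"f": Counter(), "m": Counter()}
for text, label in posts:
    counts[label].update(tokens(text))

vocab = set(counts["f"]) | set(counts["m"])

def log_odds(word):
    """Log-odds of a word appearing in female vs. male posts,
    with add-one smoothing so unseen words don't break the math."""
    f_total = sum(counts["f"].values()) + len(vocab)
    m_total = sum(counts["m"].values()) + len(vocab)
    return log((counts["f"][word] + 1) / f_total) - log((counts["m"][word] + 1) / m_total)

def predict(text):
    """Sum each known word's lean; positive means female-linked usage."""
    score = sum(log_odds(w) for w in tokens(text) if w in vocab)
    return "f" if score > 0 else "m"

print(predict("party was amazing, thankful"))   # → f
print(predict("new video game tournament"))     # → m
```

On real data, each word's log-odds would be estimated from millions of posts, which is how word-level differences too subtle to notice by eye add up to 90-percent-plus accuracy.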
Women were more inclined to use words like "party," "amazing," "thankful," and "learn," and were particularly associated with words describing positive emotions, social relationships, and intensive adverbs (like "soooo"). Men, meanwhile, were strongly associated with words pertaining to government, competition, and specific hobbies such as video games.
(Yes, I’ve covered up a couple of words in the “Arrogant-calculating” section, because they are swears that are no-nos for The Washington Post. They are also highly male-linked, in this study.)
One somewhat surprising finding, visible in the chart above, comes from the team's evaluation of how assertive these words are. Previous research suggested that men would use more assertive words in their Facebook posts; in fact, the team found the opposite. The difference was small, but women were actually slightly more assertive than men.
The potential applications for this sort of thing, as you might have guessed, are numerous. On Facebook, many users have already publicly provided information like age and gender, but not every social platform collects it. With a clearer picture of those subtle differences between how men and women write online, it's possible to develop better algorithms to predict such attributes where they aren't disclosed.
"There is a lot of interest in using such algorithms in all sorts of ways, but I do have concerns of things being applied too quickly," said Margaret Kern, a senior lecturer at the Centre for Positive Psychology at the University of Melbourne in Australia and one of the paper's authors. "For instance, if an algorithm classifies a person as being depressed, and sends a warning to the person or the friend network, what if the person is not?"