Snark Bite: Like an AI Could Ever Spot Sarcasm

by Jamie Beckett

I never forget a face, but in your case I’ll be glad to make an exception.
Groucho Marx

Comedians dine on sarcasm — the ironic, mocking remarks that say one thing on the surface but cut much deeper.

Could a computer learn to detect this nuanced form of expression? Pushpak Bhattacharyya says it can, and he’s got the algorithms to prove it.

Bhattacharyya — director of the Indian Institute of Technology (IIT), Patna, and a professor at IIT, Bombay — has dedicated the past few years to using GPU-powered deep learning to spot sarcasm online.

“We found lots of tweets, especially in politics, to be sarcastic,” he said, in what may be the biggest understatement so far of 2018.

Sarcasm Research Is No Joke

Bhattacharyya’s not kidding when he says there are sound reasons to study sarcasm. Politicians, heads of state, businesses and even celebrities concerned with protecting their reputations monitor Twitter and other social media to assess public opinion.

But the methods they use for what’s known as “sentiment analysis” fall short when it comes to sarcasm, Bhattacharyya said.

“Sarcasm sheds light on how the human mind operates,” he said. So it’s no wonder it’s hard for computers to catch.

Snark Alert

The word “sarcasm” comes from the Greek “sarkasmós,” meaning “tear flesh with teeth,” which pretty much describes its purpose: to ridicule or show contempt.

Sometimes snark comes with a clear signal — a tweet tagged “#sarcasm” or phrases like “as if” or “like you care.” Multiple exclamation marks, capital letters, emoticons and #LOL are other frequent flags for mockery.

Well, that was fun. More cold, snow and ice please. #sarcasm
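For the curious, the obvious flags listed above are easy to check by rule. What follows is only a toy sketch in Python: the cue lists, the emoticon pattern and the `surface_flags` helper are assumptions made up for illustration, not anything taken from the researchers’ system.

```python
import re

# Cue lists taken from the examples in the text. They are illustrative,
# not the feature set the researchers used.
CUE_HASHTAGS = {"#sarcasm", "#lol"}
CUE_PHRASES = ("as if", "like you care")
# A deliberately tiny emoticon pattern, e.g. :) ;-) :D :P
EMOTICON = re.compile(r"[:;]-?[)(DP]")


def surface_flags(tweet: str) -> list[str]:
    """Return the names of the surface cues found in a tweet."""
    lowered = tweet.lower()
    flags = []
    # Explicit hashtags such as #sarcasm or #LOL (case-insensitive).
    if CUE_HASHTAGS & set(re.findall(r"#\w+", lowered)):
        flags.append("hashtag")
    # Stock phrases, matched on word boundaries so "was if" does not count.
    if any(re.search(rf"\b{re.escape(p)}\b", lowered) for p in CUE_PHRASES):
        flags.append("cue_phrase")
    # Two or more exclamation marks in a row.
    if re.search(r"!{2,}", tweet):
        flags.append("multiple_exclamation")
    # A word of three or more capital letters.
    if re.search(r"\b[A-Z]{3,}\b", tweet):
        flags.append("all_caps")
    if EMOTICON.search(tweet):
        flags.append("emoticon")
    return flags


print(surface_flags("Well, that was fun. More cold, snow and ice please. #sarcasm"))
```

Rules like these only catch snark that announces itself, which is exactly why the harder cases need more than surface cues.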

More often, though, sarcasm isn’t obvious on the surface. Positive statements often hide negative meanings, Bhattacharyya said. He uses the phrase, “I love being ignored” to illustrate his point.

“When you look at the phrase, ‘I love being,’ you expect to see it followed by something positive like ‘rewarded’ or ‘appreciated,’” he said. “Then you see ‘ignored,’ and you understand that it’s sarcasm.”

I love when customers cough in my face.
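The expectation Bhattacharyya describes, a positive opener followed by a negative situation, can be mimicked with two small word lists. This is a rough sketch under loud assumptions: `POSITIVE_OPENERS`, `NEGATIVE_SITUATIONS` and `is_incongruous` are hypothetical names with hand-picked entries, and the researchers trained neural networks rather than writing rules like this.

```python
import re

# Hypothetical, hand-picked word lists for illustration only.
POSITIVE_OPENERS = ("i love", "i enjoy", "i adore")
NEGATIVE_SITUATIONS = {"ignored", "cough", "late", "stuck"}


def is_incongruous(sentence: str) -> bool:
    """True if a positive opener is followed by a negative situation word."""
    lowered = sentence.lower()
    for opener in POSITIVE_OPENERS:
        start = lowered.find(opener)
        if start == -1:
            continue
        # Look only at the words that come after the positive opener.
        tail = re.findall(r"[a-z']+", lowered[start + len(opener):])
        if any(word in NEGATIVE_SITUATIONS for word in tail):
            return True
    return False


for text in ("I love being ignored",
             "I love being appreciated",
             "I love when customers cough in my face."):
    print(text, "->", is_incongruous(text))
```

A fixed list breaks down as soon as the negative situation is one nobody thought to write down, which is part of the case for learning the pattern from data instead.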

As if That Wasn’t Hard Enough

Other times, knowing when a remark is tongue-in-cheek depends on understanding context or knowing something about the world. If someone says, “Phone battery lasts two hours. Awesome,” the AI would have to understand that a two-hour battery life isn’t a good thing.

That points to another problem. Nearly one-fifth of sarcastic tweets relate to numbers, but general sentiment analysis doesn’t pick them up, Bhattacharyya said.

Well 3 hrs of sleep was lovely. It's going to be a long day.
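Why do numbers trip up sentiment analysis? Every word in that tweet is neutral or positive; the complaint lives entirely in the size of the number. A minimal sketch of the idea follows, assuming a hypothetical table of “reasonable minimum” hours per topic. The thresholds, word lists and the `numeric_sarcasm` helper are invented for illustration and are not the researchers’ method.

```python
import re

# Hypothetical "reasonable minimum" in hours for a few topics.
# The values are illustrative guesses, not measured data.
MIN_REASONABLE_HOURS = {"sleep": 6.0, "battery": 8.0}
POSITIVE_WORDS = {"lovely", "awesome", "great", "fun"}
# Matches quantities written with digits, e.g. "3 hrs" or "2 hours".
HOURS = re.compile(r"(\d+(?:\.\d+)?)\s*(?:hours?|hrs?)\b")


def numeric_sarcasm(text: str) -> bool:
    """True if positive wording accompanies an implausibly small number of hours."""
    lowered = text.lower()
    words = set(re.findall(r"[a-z]+", lowered))
    if not words & POSITIVE_WORDS:
        return False
    match = HOURS.search(lowered)
    if match is None:
        return False
    hours = float(match.group(1))
    # Sarcasm is suspected when the number falls below the topic's minimum.
    return any(topic in words and hours < minimum
               for topic, minimum in MIN_REASONABLE_HOURS.items())


print(numeric_sarcasm("Well 3 hrs of sleep was lovely. It's going to be a long day."))
```

The sketch reads only digits, so a spelled-out “two hours” slips past it, and it knows nothing about any topic missing from its table. That dependence on world knowledge is the hard part the article describes.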


Getting Serious About Sarcasm

When Bhattacharyya and his team — his students, several linguists and psychologists — dug further, they found the key to separating the praise from the put-downs.

“Incongruity is at the heart of sarcasm,” Bhattacharyya said. “That’s our main contribution.”

Using the cuDNN-accelerated TensorFlow deep learning framework and our GeForce GTX 1080 Ti GPUs, the researchers trained neural networks to recognize that incongruity in datasets that included everything from tweets to movie reviews to dialogue from the 1990s TV sitcom “Friends.”

In numerous studies, their algorithms could detect sarcasm more accurately than existing methods. That was especially true for tweets that include numbers; their accuracy rate of 80 percent more than triples previous efforts.

Bhattacharyya worked with several of his doctoral students on this project: Aditya Joshi, Mark Carman, Abhijit Mishra and Raksha Sharma.

Suite on Sarcasm

Building on earlier work, the researchers created a browser-based engine that detects sarcasm. Called Sarcasm Suite, it consists of two trained modules that researchers anywhere can use for snark spotting.

The modules detect sarcasm by evaluating people’s historical tweets, as well as identifying such textual elements as contradictory words and incongruous emotional expressions. To enrich their analysis, the researchers’ algorithms also incorporate factors like word intensity — the difference between “good” and “excellent,” for example — and use data from eye-tracking studies.

The tool also includes code for SarcasmBot, a chatbot that generates sarcasm on its own in response to a question. In one sample of its output, the user asks SarcasmBot, “What do you think of Greg?” The answer: “Well … I like Greg. The way I absolutely appreciate zero accountability people.”

So now, computers can be as sarcastic as the rest of us. That sounds just great.

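For more information, see the following papers by Bhattacharyya and his team:

Automatic Sarcasm Detection: A Survey
Sarcasm Suite: A Browser-Based Engine for Sarcasm Detection and Generation
“Having 2 hours to write a paper is fun!”: Detecting Sarcasm in Numerical Portions of Text
Are Word Embedding-based Features Useful for Sarcasm Detection?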

  • http://mg.to/ Michael Geary

    Like a human being could ever read that text.

    If you’re having as much trouble reading it as I was, open the developer tools in your browser and on the body style, remove “DINPro” from the font-family, turn off the font-weight: 300 or change it to 400, and increase the font-size to 16px or whatever size you like.

    If the AIs ever do take over, my one request is that they use plenty of sarcasm when they talk about graphic designers who think unreadable text with substandard font-weight is a good thing.

    Of course, being AIs, maybe they could care less.

  • Brian Caulfield

    Just a quick question: is it only the font you don’t like (I don’t either), or do you have other challenges reading this?

  • http://mg.to/ Michael Geary

    It’s just the font I have trouble with, but it’s all three aspects of the font that I mentioned: the small size, the unusual “DINPro” font, and the font-weight: 300. Changing any of those back to more standard settings helps, but changing all three makes it perfectly readable.

    What’s especially funny is that these Disqus comments are nice and easy to read, just like the blog post would have been if the web designer had used standard fonts instead of meddling with the font settings.

  • Brian Caulfield

    Easier to read now? (But seriously, thanks for the feedback!)

    I never forget a face, but in your case I’ll be glad to make an exception.
    — Groucho Marx

    Comedians dine on sarcasm — the ironic, mocking remarks that say one thing on the surface but cut much deeper.

    Could a computer learn to detect this nuanced form of expression? Pushpak Bhattacharyya says it can, and he’s got the algorithms to prove it.

    Bhattacharyya — director of the Indian Institute of Technology (IIT), Patna, and a professor at IIT, Bombay — has dedicated the past few years to using GPU-powered deep learning to spot sarcasm online.

    “We found lots of tweets, especially in politics, to be sarcastic,” he said, in what may be the biggest understatement so far of 2018.

    Sarcasm Research Is No Joke
    Bhattacharyya’s not kidding when he says there are sound reasons to study sarcasm. Politicians, heads of state, businesses and even celebrities concerned with protecting their reputations monitor Twitter and other social media to assess public opinion.

    But the methods they use for what’s known as “sentiment analysis” fall short when it comes to sarcasm, Bhattacharyya said.

    “Sarcasm sheds light on how the human mind operates,” he said. So it’s no wonder it’s hard for computers to catch.

    Snark Alert
    The word “sarcasm” comes from the Greek “sarkasmós,” meaning “tear flesh with teeth,” which pretty much describes its purpose: to ridicule or show contempt.

    Sometimes snark comes with a clear signal — a tweet tagged “#sarcasm” or phrases like “as if” or “like you care.” Multiple exclamation marks, capital letters, emoticons and #LOL are other frequent flags for mockery.

    Well, that was fun. More cold, snow and ice please. #sarcasm

    More often, though, sarcasm isn’t obvious on the surface. Positive statements often hide negative meanings, Bhattacharyya said. He uses the phrase, “I love being ignored” to illustrate his point.

    “When you look at the phrase, ‘I love being,’ you expect to see it followed by something positive like ‘rewarded’ or ‘appreciated,’” he said. “Then you see ‘ignored,’ and you understand that it’s sarcasm.”

    I love when customers cough in my face.

    As if That Wasn’t Hard Enough
    Other times, knowing when a remark is tongue-in-cheek depends on understanding context or knowing something about the world. If someone says, “Phone battery lasts two hours. Awesome,” the AI would have to understand that a two-hour battery life isn’t a good thing.

    That points to another problem. Nearly one-fifth of sarcastic tweets relate to numbers, but general sentiment analysis doesn’t pick them up, Bhattacharyya said.

    Well 3 hrs of sleep was lovely. It’s going to be a long day.

    Getting Serious About Sarcasm
    When Bhattacharyya and his team — his students, several linguists and psychologists — dug further, they found the key to separating the praise from the put-downs.

    “Incongruity is at the heart of sarcasm,” Bhattacharyya said. “That’s our main contribution.”

    Using the cuDNN-accelerated TensorFlow deep learning framework and our GeForce GTX 1080 Ti GPUs, the researchers trained neural networks to recognize that incongruity in datasets that included everything from tweets to movie reviews to dialogue from the 1990s TV sitcom “Friends.”

    In numerous studies, their algorithms could detect sarcasm more accurately than existing methods. That was especially true for tweets that include numbers; their accuracy rate of 80 percent more than triples previous efforts.

    Bhattacharyya worked with several of his doctoral students on this project: Aditya Joshi, Mark Carman, Abhijit Mishra and Raksha Sharma.

    Suite on Sarcasm
    Building on earlier work, the researchers created a browser-based engine that detects sarcasm. Called Sarcasm Suite, it consists of two trained modules that researchers anywhere can use for snark spotting.

    The modules detect sarcasm by evaluating people’s historical tweets, as well as identifying such textual elements as contradictory words and incongruous emotional expressions. To enrich their analysis, the researchers’ algorithms also incorporate factors like word intensity — the difference between “good” and “excellent,” for example — and use data from eye-tracking studies.

    The tool also includes code for SarcasmBot, a chatbot that generates sarcasm on its own in response to a question. In one sample of its output, the user asks SarcasmBot, “What do you think of Greg?” The answer: “Well … I like Greg. The way I absolutely appreciate zero accountability people.”

    So now, computers can be as sarcastic as the rest of us. That sounds just great.

    For more information, see the following papers by Bhattacharyya and his team:

    Automatic Sarcasm Detection: A Survey
    Sarcasm Suite: A Browser-Based Engine for Sarcasm Detection and Generation
    “Having 2 hours to write a paper is fun!”: Detecting Sarcasm in Numerical Portions of Text
    Are Word Embedding-based Features Useful for Sarcasm Detection?

  • http://mg.to/ Michael Geary

    Hey Brian, now there’s a concept: you could implement an entire blog as Disqus comments! 😉

    In any case, I figured something out. I have two high-DPI displays on my ThinkPad: the internal WQHD display and an external 24″ 4K/UHD in portrait mode next to the ThinkPad. I run both displays with 225% scaling.

    I was reading the blog post on the 24″ monitor. Other people on HN reported more readable text than I was seeing, on similar sized monitors.

    Then I happened to move the Chrome window over to the internal ThinkPad display, and the text got bigger! The font was still unpleasant, especially with the 300 font-weight, but at least I could read it.

    So I took a look in the developer tools and saw that the page has a @media selector with min-width:1024px that bumps the font-size from 13px to 16.25px (close to the size I suggested earlier) and also bumps the line-height from 22.5px to 24.375px.

    My portrait mode display has a resolution of 2160×3840 (note the order), so with 225% scaling the logical width is 960px. That’s less than the 1024px in the media selector, so the override doesn’t happen and it uses 13px text with 22.5px line height. That’s why I was seeing such tiny text. The internal display is 2560×1440, so with the 225% scaling the logical width is about 1137px, wide enough to trigger the media override on a maximized window and get the 16.25px font size.
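    The arithmetic in that paragraph can be checked with a few lines of Python. This is a simplified sketch: `logical_width` and `font_size_px` are made-up helper names, the window is assumed to be maximized, and scrollbars and window borders are ignored.

```python
def logical_width(physical_width_px: float, scale_percent: float) -> float:
    """Width in CSS pixels for a display of the given physical width and OS scaling."""
    return physical_width_px / (scale_percent / 100)


def font_size_px(logical_width_px: float, breakpoint_px: float = 1024) -> float:
    """Font size this page would pick, using the values reported in the comment."""
    # min-width: 1024px applies when the logical width is at least 1024px.
    return 16.25 if logical_width_px >= breakpoint_px else 13.0


# Displays described in the comment: (name, physical width in px, scaling in %).
for name, width, scale in (("external 4K, portrait", 2160, 225),
                           ("internal WQHD", 2560, 225)):
    css_width = logical_width(width, scale)
    print(f"{name}: {css_width:.0f}px wide -> {font_size_px(css_width)}px text")
```

    The portrait display comes out at exactly 960px, below the breakpoint, while the internal display lands near 1138px, above it.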

    You don’t need a display setup like mine to see this in action; simply make your browser window wider and narrower, and when it crosses the 1024px logical width you will see the text get smaller or larger (along with other layout changes).

    My feeling is that a media selector should never change the font size like this. People don’t expect the font size to change merely because they changed the width of their browser window. On Windows in particular, it’s pretty common to use Windows+Left or Windows+Right (or just drag the window) to make a window take only half the screen width. When you do that on a typical display, the media selector on this blog will drop out and the font size will drop from 16.25px to 13px, which is too small for good readability.