Black Hat Tool SNAP_R for Twitter Spear Phishing Campaign

October 14, 2016

Twitter offers adversaries an enormous pool of resources: extensive personal data, a bot-friendly API, colloquial syntax, and ubiquitous short URLs. Together these make it easy to machine-generate malicious content aimed at victims. In response, Baltimore cybersecurity firm ZeroFox created the SNAP_R Twitter bot as a proof of concept for the next generation of phishing techniques, explaining its methods in a whitepaper released at the recent Black Hat security conference in August.

SNAP_R is essentially a spear phishing tool that automates the creation of fake tweets, each carrying a malicious short link and a message written to make a clickthrough more likely. It uses clustering to identify high-value targets based on social engagement, such as followers and retweets, and then uses a recurrent neural network to churn through a victim’s tweets and those of their followers to produce a dynamic message relevant to the victim’s interests. SNAP_R writes the tweet, loads it with a shortened link to a site containing malware, and sends it. The bot’s success is measured by tracking clickthrough rates on the IP-tracked links.

John Seymour and Philip Tully, the researchers behind SNAP_R, state that phishing campaigns are more effective on Twitter. Writing a single spear phishing email takes five to 10 minutes on average, whereas a spear phishing tweet can be generated in a matter of seconds, although the exact time depends heavily on how much data the model takes in.

Methodology

To make machine-generated tweets sound authentic and believable to humans, the tool relies on machine learning models that can deal with human language. Natural Language Processing (NLP) is the area of machine learning in which raw text is the data source from which patterns are extracted.

However, the challenge with NLP lies in the intricacies of human language. Human communication can be vague because of frequent colloquialisms, abbreviations, and misspellings. These inconsistencies make computer analysis of natural language difficult at best, but NLP as a field has progressed considerably in the last decade.

This is where phishing campaigns come in: a model trained on a user’s text can generate sentences designed to entice that unsuspecting user into clicking a link.

SNAP_R generates the fake tweets using two types of models: 1) Markov models and 2) Long Short-Term Memory (LSTM) recurrent neural networks.

Markov models are used for automated text generation: they string words into sentences based on how often one word follows another. When SNAP_R is given a target Twitter profile as training data, the target’s tweets are used to describe the possible next words in a sequence, and the probability of each word depends only on the state reached at the previous step. Generating a tweet with this model is nearly instantaneous, taking only a few seconds. However, because the model depends entirely on the exact word sequences it has already seen, and therefore on how long the user has held the account, the sentences it generates can sound nonsensical.
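The next-word idea described above can be illustrated with a minimal first-order Markov chain over words. This is only a sketch under stated assumptions, not SNAP_R's actual code: the function names `build_chain` and `generate`, the sample posts, and the fixed random seed are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def build_chain(posts):
    """Map each word to the list of words observed directly after it."""
    chain = defaultdict(list)
    for post in posts:
        words = post.split()
        for current_word, next_word in zip(words, words[1:]):
            chain[current_word].append(next_word)
    return dict(chain)


def generate(chain, seed_word, max_words=12, rng=None):
    """Walk the chain; each next word depends only on the current word."""
    rng = rng or random.Random()
    words = [seed_word]
    # Stop at the length cap or when the current word has no known successor.
    while len(words) < max_words and words[-1] in chain:
        words.append(rng.choice(chain[words[-1]]))
    return " ".join(words)


# Hypothetical training text standing in for a timeline of posts.
sample_posts = [
    "great talk on machine learning today",
    "machine learning talk was great",
    "reading about neural networks today",
]
chain = build_chain(sample_posts)
text = generate(chain, "machine", max_words=8, rng=random.Random(0))
print(text)
```

Because every step looks only at the current word, the output is locally plausible but can drift into nonsense over a full sentence, which matches the weakness described above.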

An LSTM is a more advanced model: a neural network with a deeper representation of natural language. LSTMs differ from Markov models in that they can generalize from the context of a sentence when predicting the next word, learning which earlier words to keep and which to discard. LSTMs have been used extensively for NLP because language is naturally sequential and words that are separated by a large distance may still be related to each other. Applied to SNAP_R, LSTMs have a greater chance of generating a fake tweet with a higher clickthrough rate. However, this increase in accuracy comes at a cost: LSTMs need more time, on the order of days, and more data to train.
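How an LSTM keeps or discards earlier context can be sketched with a single LSTM unit using the standard gate equations. This is a toy illustration, not a trained model and not SNAP_R's implementation: the scalar weights in `keep_weights` and `forget_weights` are hypothetical values picked by hand to push the gates toward fully open or fully closed.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One time step of a single-unit LSTM with scalar input and state."""
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])    # forget gate
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # input gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate value
    c = f * c_prev + i * g   # new cell state: old memory kept plus new input
    h = o * math.tanh(c)     # new hidden state exposed to the next step
    return h, c


# Hand-picked (hypothetical) weights: forget gate near 1, input gate near 0,
# so the old cell state is carried forward almost unchanged.
keep_weights = dict(wf=0.0, uf=0.0, bf=10.0, wi=0.0, ui=0.0, bi=-10.0,
                    wo=0.0, uo=0.0, bo=0.0, wg=1.0, ug=0.0, bg=0.0)
# The opposite setting: old memory is dropped and replaced by the new input.
forget_weights = dict(keep_weights, bf=-10.0, bi=10.0)

_, c_kept = lstm_step(0.5, 0.0, 1.0, keep_weights)
_, c_forgot = lstm_step(0.5, 0.0, 1.0, forget_weights)
print(c_kept, c_forgot)
```

In a trained network the gate weights are learned from data rather than set by hand, which is why such models need far more training time and text than a Markov chain.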

Recurrent neural networks, as opposed to traditional neural networks, are well suited to NLP problems, that is, to interactions between computers and human (natural) languages. These networks have a loop structure that lets information persist from one step to the next, which helps them keep wording consistent across a generated sentence.

As speech tagging in neural networks improves over time, so can potential phishing campaigns. Twitter is an ideal use case because of the short 140-character limit for “tweets,” the use of shortened links to stay within that limit, the bot-friendly API, and the wide acceptance of shorthand and broken English. As a result, SNAP_R boasts a success rate above 30%, compared with 5-14% for mostly automated phishing and 25% for spear phishing. Because of the speed and volume of tweets generated, SNAP_R’s net return is significantly higher.
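The net-return claim is a matter of throughput multiplied by success rate, which a short calculation can make concrete. The rates and the five to 10 minutes per email are the figures quoted in this article; `ASSUMED_SECONDS_PER_TWEET` is a hypothetical placeholder for "a matter of seconds", and the sketch ignores real-world limits such as API rate limits or account suspension.

```python
def expected_clicks_per_hour(seconds_per_message, click_rate):
    """Expected clicks per hour = messages sent per hour x success rate."""
    messages_per_hour = 3600 / seconds_per_message
    return messages_per_hour * click_rate


# Manual spear phishing: five to 10 minutes per message at a 25% rate.
manual_slow = expected_clicks_per_hour(10 * 60, 0.25)
manual_fast = expected_clicks_per_hour(5 * 60, 0.25)

# Automated tweets: the per-tweet time is an assumption, not a measured value.
ASSUMED_SECONDS_PER_TWEET = 5
automated = expected_clicks_per_hour(ASSUMED_SECONDS_PER_TWEET, 0.30)

print(f"manual: {manual_slow:.1f} to {manual_fast:.1f} clicks per hour")
print(f"automated (assumed speed): {automated:.0f} clicks per hour")
```

Even with a similar per-message success rate, the difference in messages per hour dominates the result.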

Takeaway

The researchers describe SNAP_R as “the world’s first automated end to end spear phishing campaign generator for Twitter.” It is intended primarily for educational use and as a security assessment tool for internal pentesting, staff recruitment, or social engagement. Because Twitter is used as a public space, its users can be more susceptible to phishing there than to email phishing campaigns.

By using the same attack methods as actual adversaries, the research provides insight into how machine learning can be used offensively to automate spear phishing campaigns. The abundance of data available on social media networks can be manipulated for social engineering. As a white hat experiment, SNAP_R has a larger implication: this type of phishing could be replicated across other social media networks just as easily if adversaries invest the time and training data themselves.