Detection: Web Fraud - Account Harvesting

DEPRECATED DETECTION

This detection has been marked as deprecated by the Splunk Threat Research team. This means that it will no longer be maintained or supported. If you have any questions or concerns, please reach out to us at research@splunk.com.

Description

This search is used to identify the creation of multiple user accounts using the same email domain name.

1`stream_http` http_content_type=text* uri="/magento2/customer/account/loginPost/" 
2| rex field=cookie "form_key=(?<SessionID>\w+)" 
3| rex field=form_data "login\[username\]=(?<Username>[^&
4|^$]+)" 
5| search Username=* 
6| rex field=Username "@(?<email_domain>.*)" 
7| stats dc(Username) as UniqueUsernames list(Username) as src_user by email_domain 
8| where UniqueUsernames> 25 
9| `web_fraud___account_harvesting_filter`

Data Source

Name Platform Sourcetype Source Supported App
N/A N/A N/A N/A N/A

Macros Used

Name Value
stream_http sourcetype=stream:http
web_fraud___account_harvesting_filter search *
web_fraud___account_harvesting_filter is an empty macro by default. It allows the user to filter out any results (false positives) without editing the SPL.

Annotations

- MITRE ATT&CK
+ Kill Chain Phases
+ NIST
+ CIS
- Threat Actors
ID Technique Tactic
T1136 Create Account Persistence
KillChainPhase.INSTALLATION
NistCategory.DE_CM
Cis18Value.CIS_10
Indrik Spider
Scattered Spider

Default Configuration

This detection is configured by default in Splunk Enterprise Security to run with the following settings:

Setting Value
Disabled true
Cron Schedule 0 * * * *
Earliest Time -70m@m
Latest Time -10m@m
Schedule Window auto
Creates Notable Yes
Rule Title %name%
Rule Description %description%
Notable Event Fields user, dest
Creates Risk Event True
This configuration file applies to all detections of type TTP. These detections will use Risk Based Alerting and generate Notable Events.

Implementation

We start with a dataset that provides visibility into the email address used for the account creation. In this example, we are narrowing our search down to the single web page that hosts the Magento2 e-commerce platform (via URI) used for account creation, the single http content-type to grab only the user's clicks, and the http field that provides the username (form_data), for performance reasons. After we have the username and email domain, we look for numerous account creations per email domain. Common data sources used for this detection are customized Apache logs or Splunk Stream.

Known False Positives

As is common with many fraud-related searches, we are usually looking to attribute risk or synthesize relevant context with loosely written detections that simply detect anamolous behavior. This search will need to be customized to fit your environment—improving its fidelity by counting based on something much more specific, such as a device ID that may be present in your dataset. Consideration for whether the large number of registrations are occuring from a first-time seen domain may also be important. Extending the search window to look further back in time, or even calculating the average per hour/day for each email domain to look for an anomalous spikes, will improve this search. You can also use Shannon entropy or Levenshtein Distance (both courtesy of URL Toolbox) to consider the randomness or similarity of the email name or email domain, as the names are often machine-generated.

Associated Analytic Story

Risk Based Analytics (RBA)

Risk Message Risk Score Impact Confidence
tbd 25 50 50
The Risk Score is calculated by the following formula: Risk Score = (Impact * Confidence/100). Initial Confidence and Impact is set by the analytic author.

References

Detection Testing

Test Type Status Dataset Source Sourcetype
Validation Not Applicable N/A N/A N/A
Unit ❌ Failing N/A N/A N/A
Integration ❌ Failing N/A N/A N/A

Replay any dataset to Splunk Enterprise by using our replay.py tool or the UI. Alternatively you can replay a dataset into a Splunk Attack Range


Source: GitHub | Version: 2