Detection: Windows File Collection Via Copy Utilities

Description

The following analytic detects the use of Windows command-line copy utilities, such as xcopy, to systematically collect files from user directories and consolidate them into a centralized location on the system. This activity is often indicative of malicious behavior, as threat actors frequently use such commands to gather sensitive information, including documents with .doc, .docx, and .pdf extensions. The detection focuses on identifying recursive copy operations targeting user folders, such as Documents, Desktop, or other directories that commonly store personal or organizational files. Malware that performs this behavior typically attempts to evade detection by using legitimate Windows utilities, executing commands through cmd.exe or other scripting hosts, and writing the collected files to directories like C:\ProgramData or temporary storage locations. Once collected, the information may be staged for exfiltration, used for lateral movement, or leveraged for further compromise of the environment. By monitoring for these types of file collection patterns, security teams can identify suspicious activity early, differentiate between normal administrative tasks and potentially malicious scripts, and prevent sensitive data from being exfiltrated. This analytic is particularly relevant for environments where confidential documents are present and attackers may attempt to harvest them using built-in Windows tools.

1
2| tstats `security_content_summariesonly` count min(_time) as firstTime max(_time) as lastTime from datamodel=Endpoint.Processes where `process_copy` Processes.process IN ("*.doc*","*.docx*","*.xls*","*.xlsx*","*.ppt*","*.pptx*","*.log*","*.txt*","*.db*","*.7z*","*.zip*","*.rar*","*.tar*","*.gz*","*.jpg*","*.gif*","*.png*","*.bmp*","*.pdf*","*.rtf*") by Processes.action Processes.dest Processes.original_file_name Processes.parent_process Processes.parent_process_exec Processes.parent_process_guid Processes.parent_process_id Processes.parent_process_name Processes.parent_process_path Processes.process Processes.process_exec Processes.process_guid Processes.process_hash Processes.process_id Processes.process_integrity_level Processes.process_name Processes.process_path Processes.user Processes.user_id Processes.vendor_product 
3| `drop_dm_object_name(Processes)` 
4| `security_content_ctime(firstTime)` 
5| `security_content_ctime(lastTime)` 
6| `windows_file_collection_via_copy_utilities_filter`

Data Source

Name Platform Sourcetype Source
CrowdStrike ProcessRollup2 N/A 'crowdstrike:events:sensor' 'crowdstrike'
Sysmon EventID 1 Windows icon Windows 'XmlWinEventLog' 'XmlWinEventLog:Microsoft-Windows-Sysmon/Operational'
Windows Event Log Security 4688 Windows icon Windows 'XmlWinEventLog' 'XmlWinEventLog:Security'

Macros Used

Name Value
process_copy (Processes.process_name=copy.exe OR Processes.original_file_name=copy.exe OR Processes.process_name=xcopy.exe OR Processes.original_file_name=xcopy.exe)
windows_file_collection_via_copy_utilities_filter search *
windows_file_collection_via_copy_utilities_filter is an empty macro by default. It allows the user to filter out any results (false positives) without editing the SPL.

Annotations

- MITRE ATT&CK
+ Kill Chain Phases
+ NIST
+ CIS
- Threat Actors
ID Technique Tactic
T1119 Automated Collection Collection
Exploitation
DE.AE
CIS 10

Default Configuration

This detection is configured by default in Splunk Enterprise Security to run with the following settings:

Setting Value
Disabled true
Cron Schedule 0 * * * *
Earliest Time -70m@m
Latest Time -10m@m
Schedule Window auto
Creates Risk Event True
This configuration file applies to all detections of type anomaly. These detections will use Risk Based Alerting.

Implementation

The detection is based on data that originates from Endpoint Detection and Response (EDR) agents. These agents are designed to provide security-related telemetry from the endpoints where the agent is installed. To implement this search, you must ingest logs that contain the process GUID, process name, and parent process. Additionally, you must ingest complete command-line executions. These logs must be processed using the appropriate Splunk Technology Add-ons that are specific to the EDR product. The logs must also be mapped to the Processes node of the Endpoint data model. Use the Splunk Common Information Model (CIM) to normalize the field names and speed up the data modeling process.

Known False Positives

Administrators may execute this command for testing or auditing.

Associated Analytic Story

Risk Based Analytics (RBA)

Risk Message:

An instance of $parent_process_name$ spawning $process_name$ was identified on endpoint $dest$ by user $user$ attempting to collect documents..

Risk Object Risk Object Type Risk Score Threat Objects
dest system 5 parent_process_name, process_name
user user 5 parent_process_name, process_name

References

Detection Testing

Test Type Status Dataset Source Sourcetype
Validation Passing N/A N/A N/A
Unit Passing Dataset XmlWinEventLog:Microsoft-Windows-Sysmon/Operational XmlWinEventLog
Integration ✅ Passing Dataset XmlWinEventLog:Microsoft-Windows-Sysmon/Operational XmlWinEventLog

Replay any dataset to Splunk Enterprise by using our replay.py tool or the UI. Alternatively you can replay a dataset into a Splunk Attack Range


Source: GitHub | Version: 1