Last updated: May 31, 2024, 3:15 p.m. PT
We have learned that a threat actor is using stolen customer credentials to target organizations that use Snowflake databases. The campaign continues to evolve; see below for the latest information.
Background
- The threat actor is running a data theft and extortion campaign against organizations that use Snowflake databases.
- The threat actor primarily targeted environments lacking two-factor authentication (2FA), and the attacks originated from commercial VPN IP addresses.
- An attack tool named “rapeflake” has been identified in these incidents, though detailed information about the tool itself remains unknown.
- The threat actor has directly extorted organizations, further pressuring them by publicly posting stolen data for sale on hacker forums.
Mitiga customers were already notified if they were affected. To learn more about how this threat can be detected and investigated in Snowflake environments, continue reading.
What is Snowflake?
Snowflake is a cloud-based data warehousing and analytics platform designed to handle large-scale data storage and processing. It offers a highly scalable architecture that enables organizations to manage and analyze massive amounts of data seamlessly. Snowflake is widely utilized across various industries for its ability to consolidate data from multiple sources, providing robust support for data-driven decision-making and advanced analytics.
Snowflake is highly popular, boasting 9,437 global customers and holding a significant 21.51% market share in the data warehousing market. Its widespread adoption across various industries underscores its robust capabilities and efficiency in handling large-scale data operations.
Conducting a Threat Hunt in Snowflake Environments
What is Threat Hunting?
Threat hunting is a proactive cybersecurity practice where analysts actively search through networks, systems, and databases to detect and isolate potential security threats that may have bypassed traditional security measures. Unlike reactive measures that respond to detected threats, threat hunting involves continuous monitoring and analysis to uncover hidden or emerging threats before they can cause significant damage.
How to Conduct a Threat Hunt in Snowflake
To investigate a potential breach in your Snowflake environment and detect any possible data exfiltration, organizations can leverage the forensic information available within Snowflake’s built-in database and schemas.
Forensic Information for Threat Hunting:
Every Snowflake environment includes a database named "SNOWFLAKE" that houses a schema called "ACCOUNT_USAGE." This schema holds metadata and historical usage data for the current Snowflake account, providing a comprehensive audit trail. Note that its views are populated with some latency rather than instantly, so very recent activity may not appear yet.
Key Views in the ACCOUNT_USAGE Schema:
- QUERY_HISTORY:
  - Logs all queries within the account, aiding in identifying suspicious or unauthorized queries potentially indicative of data exfiltration attempts.
  - Detailed column descriptions are available in the Snowflake documentation.
- LOGIN_HISTORY:
  - Tracks all login attempts, facilitating the detection of irregular login activities, like repeated failed attempts or logins from unfamiliar locations.
  - Detailed column descriptions are available in the Snowflake documentation.
- SESSIONS:
  - Captures details of all created sessions, allowing monitoring of session activities and detection of anomalies in session behavior.
  - Detailed column descriptions are available in the Snowflake documentation.
- ACCESS_HISTORY (Requires at least Enterprise Edition):
  - Records all user activity within the account, crucial for monitoring user actions and identifying any unauthorized access or data manipulation.
  - Detailed column descriptions are available in the Snowflake documentation.
In the following sections, we demonstrate how you can use the QUERY_HISTORY and LOGIN_HISTORY views to identify and investigate suspicious behavior within your Snowflake environment.
Analyzing Query History for Anomalies
In the "QUERY_HISTORY" view, we focus on spotting unusual user activities, which could suggest data exfiltration attempts. Here are examples of what to look out for:
- More Data Scanned than Average: Detect users scanning more data than usual, which might indicate unauthorized access.
- More Data Written to Results than Average: Identify users writing excessive data to results, possibly extracting data improperly.
- Accessing an Unusually High Number of Warehouses or Databases: Flag users accessing an unusually high number of warehouses or databases, which could signal unauthorized exploration.
- Anomaly Detection in New Daily Resource Access: Compare the number of resources (databases, warehouses) a user accesses for the first time each day against a baseline. Significant deviations surface users who accessed an anomalous number of new resources in a single day.
- Rare Client Applications Used by a User: Highlight uncommon client applications used by users, as they might indicate suspicious activities.
- Exfiltration Through an Inline URL to an External Cloud Storage Location: Identify instances where data might be copied from Snowflake tables to external cloud storage locations, such as Amazon S3, Google Cloud Storage, or Azure Blob Storage, by scanning for the COPY INTO command followed by a valid URL.
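-- Written in a Spark SQL-style dialect (REGEXP_EXTRACT); in Snowflake itself
-- a function such as REGEXP_SUBSTR may be needed instead.
-- The optional \' lets the pattern match quoted locations such as
-- COPY INTO 's3://bucket/path/' (an added assumption about QUERY_TEXT).
WITH filtered AS (
    SELECT *,
           LOWER(QUERY_TEXT) AS query_text_lower,
           CASE
               WHEN LOWER(QUERY_TEXT) RLIKE 'copy into\\s+\'?((s3|gcs|azure|https?)://[^\\s\']+)' THEN TRUE
               ELSE FALSE
           END AS url_found
    FROM query_history
    WHERE LOWER(QUERY_TEXT) LIKE '%copy into%'
),
final AS (
    SELECT *,
           -- Group 1 is the full URL, lowercased because it comes from query_text_lower.
           REGEXP_EXTRACT(query_text_lower, 'copy into\\s+\'?((s3|gcs|azure|https?)://[^\\s\']+)', 1) AS extracted_url
    FROM filtered
    WHERE url_found = TRUE
)
SELECT * FROM final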
This query extracts any external cloud storage locations found in QUERY_TEXT and flags those queries. If an extracted URL is unfamiliar or not one of your organization's regular data transfer locations, treat it as a red flag for potential malicious activity.
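The comparison against known transfer locations can be sketched in Python, assuming QUERY_HISTORY rows have been exported as dictionaries. This is a minimal illustration, not the query above: the `APPROVED_PREFIXES` allowlist, the bucket names, and the function name are hypothetical, and the pattern assumes the inline URL may be wrapped in single quotes.

```python
import re

# Matches COPY INTO followed by an optionally quoted external URL.
COPY_INTO_URL = re.compile(
    r"copy\s+into\s+'?((?:s3|gcs|azure|https?)://[^\s']+)",
    re.IGNORECASE,
)

# Hypothetical allowlist of URL prefixes your organization normally unloads to.
APPROVED_PREFIXES = ("s3://corp-analytics-exports/",)


def find_unapproved_unloads(rows, approved_prefixes=APPROVED_PREFIXES):
    """Return (user, url) pairs for COPY INTO targets outside the allowlist."""
    findings = []
    for row in rows:
        match = COPY_INTO_URL.search(row["QUERY_TEXT"] or "")
        if not match:
            continue
        url = match.group(1)
        # Case-sensitive prefix comparison; normalize first if your URLs vary in case.
        if not url.startswith(tuple(approved_prefixes)):
            findings.append((row["USER_NAME"], url))
    return findings


sample_rows = [
    {"USER_NAME": "ETL_SVC", "QUERY_TEXT": "COPY INTO 's3://corp-analytics-exports/daily/' FROM sales"},
    {"USER_NAME": "JDOE", "QUERY_TEXT": "copy into 's3://unknown-bucket/dump/' from customers"},
    {"USER_NAME": "JDOE", "QUERY_TEXT": "COPY INTO customers FROM @my_stage"},
]
unapproved = find_unapproved_unloads(sample_rows)
```

In this sample, only the unload to the unknown bucket is reported; the approved export and the load from a stage are ignored.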
By default, Snowflake allows the use of COPY INTO <location> to unload data to external URLs. To mitigate this risk and prevent unauthorized data exfiltration, you can configure Snowflake to block such actions by setting the account-level PREVENT_UNLOAD_TO_INLINE_URL parameter to TRUE, for example with ALTER ACCOUNT SET PREVENT_UNLOAD_TO_INLINE_URL = TRUE. With this setting in place, attempts to unload data to an inline URL are rejected.
Several detections on the QUERY_HISTORY view help identify outliers. During threat hunting, one that stands out is Anomaly Detection in New Daily Resource Access, which serves as an excellent lead generator for further investigation. This detection method combines two layers of anomaly detection:
- Identifying New Resource Access: Detecting new resources accessed by users for the first time in history offers valuable insights. However, to ensure reliable results, historical data of sufficient length is necessary. When reviewing the results, it is crucial to verify that a flagged day is not the first appearance of the user in history. Otherwise, all resources would be flagged as new.
- Calculating Daily Average and Standard Deviation: By calculating the average and standard deviation for the daily number of new resource accesses, we can identify days that stand out. Anomalies may indicate unusual patterns of resource access, warranting further investigation.
Below, we present our implementation for the described logic:
WITH ranked_accesses AS (
    SELECT *,
           -- Rank each user's queries per database by time; rank 1 is the first access.
           ROW_NUMBER() OVER (
               PARTITION BY USER_NAME, DATABASE_NAME
               ORDER BY CAST(START_TIME AS TIMESTAMP)
           ) AS access_rank,
           TO_DATE(CAST(START_TIME AS TIMESTAMP)) AS access_date
    FROM query_history
),
first_time_accesses AS (
    SELECT access_date, USER_NAME, DATABASE_ID, DATABASE_NAME
    FROM ranked_accesses
    WHERE access_rank = 1
),
daily_new_resource_count AS (
    -- Number of databases each user accessed for the first time on each day.
    SELECT access_date,
           USER_NAME,
           COUNT(*) AS new_resource_count,
           COLLECT_SET(DATABASE_ID) AS database_ids,
           COLLECT_SET(DATABASE_NAME) AS database_names
    FROM first_time_accesses
    GROUP BY access_date, USER_NAME
),
stats AS (
    SELECT AVG(new_resource_count) AS avg_count,
           STDDEV_POP(new_resource_count) AS stddev_count
    FROM daily_new_resource_count
),
outlier_detection AS (
    -- Bounds are three standard deviations around the mean.
    SELECT *,
           (SELECT avg_count FROM stats) + 3 * (SELECT stddev_count FROM stats) AS upper_bound,
           (SELECT avg_count FROM stats) - 3 * (SELECT stddev_count FROM stats) AS lower_bound
    FROM daily_new_resource_count
)
SELECT *
FROM outlier_detection
WHERE new_resource_count > upper_bound OR new_resource_count < lower_bound
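The same two layers can be sketched in Python over exported rows. Unlike the query, this sketch also skips each user's first day in the data, so that a new user's databases are not all counted as new. The function names and the `ACCESS_DATE` field (the date part of START_TIME as an ISO string) are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def daily_new_database_counts(rows):
    """Count databases each user touched for the first time, per day.

    Rows are dicts with USER_NAME, DATABASE_NAME and ACCESS_DATE (ISO date
    string). A user's first-ever day is skipped, since every database
    would look new on that day.
    """
    first_seen_day = {}   # user -> earliest date in the data
    first_access = {}     # (user, database) -> earliest date
    for row in rows:
        user, db, day = row["USER_NAME"], row["DATABASE_NAME"], row["ACCESS_DATE"]
        first_seen_day[user] = min(day, first_seen_day.get(user, day))
        if db is None:
            continue  # queries without a database context
        key = (user, db)
        first_access[key] = min(day, first_access.get(key, day))

    counts = defaultdict(int)  # (user, day) -> number of new databases
    for (user, _db), day in first_access.items():
        if day != first_seen_day[user]:
            counts[(user, day)] += 1
    return dict(counts)


def upper_outliers(counts, sigmas=3.0):
    """Return the (user, day) keys whose count exceeds mean + sigmas * stddev."""
    values = list(counts.values())
    if len(values) < 2:
        return []
    threshold = mean(values) + sigmas * pstdev(values)
    return sorted(key for key, value in counts.items() if value > threshold)
```

With few data points, a single extreme day cannot exceed three standard deviations, which is another reason a sufficiently long history is needed.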
A subsequent investigation may involve examining database names, the success or failure of attempts, and reasons for access denial. For instance, an attacker exploring the Snowflake environment may attempt to access databases they lack permission for, resulting in an excessive number of insufficient privilege errors.
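-- access_date is derived from START_TIME, as in the previous query
-- (the original referenced a "date" column that query_history does not define).
WITH error_stats AS (
    SELECT TO_DATE(CAST(START_TIME AS TIMESTAMP)) AS access_date,
           USER_NAME,
           COUNT(*) AS error_count
    FROM query_history
    -- Failed queries carry an error code. If your export stores missing
    -- codes as the string 'NULL', filter on ERROR_CODE != 'NULL' instead.
    WHERE ERROR_CODE IS NOT NULL
    GROUP BY TO_DATE(CAST(START_TIME AS TIMESTAMP)), USER_NAME
),
total_queries AS (
    SELECT TO_DATE(CAST(START_TIME AS TIMESTAMP)) AS access_date,
           USER_NAME,
           COUNT(*) AS total_queries
    FROM query_history
    GROUP BY TO_DATE(CAST(START_TIME AS TIMESTAMP)), USER_NAME
),
final_stats AS (
    SELECT tq.access_date,
           tq.USER_NAME,
           tq.total_queries,
           COALESCE(es.error_count, 0) AS error_count,
           COALESCE(es.error_count, 0) * 100.0 / tq.total_queries AS daily_error_percentage
    FROM total_queries tq
    LEFT JOIN error_stats es
        ON tq.access_date = es.access_date AND tq.USER_NAME = es.USER_NAME
)
SELECT * FROM final_stats;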
By keeping an eye on these anomalies in the "QUERY_HISTORY" view, organizations can better detect and respond to potential security threats in their Snowflake environment.
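As an illustration of the "more data scanned than average" check listed at the start of this section, the sketch below compares each user's most recent daily BYTES_SCANNED total with that user's own earlier days. It is one possible baseline, not a definitive detection: the three-sigma threshold, the seven-day minimum history, the function name, and the `ACCESS_DATE` field (the date part of START_TIME as an ISO string) are all assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev


def users_scanning_more_than_usual(rows, sigmas=3.0, min_history_days=7):
    """Flag users whose latest daily BYTES_SCANNED total exceeds their own baseline."""
    daily = defaultdict(lambda: defaultdict(int))  # user -> day -> total bytes
    for row in rows:
        daily[row["USER_NAME"]][row["ACCESS_DATE"]] += row["BYTES_SCANNED"] or 0

    flagged = []
    for user, per_day in daily.items():
        days = sorted(per_day)
        history = [per_day[d] for d in days[:-1]]
        if len(history) < min_history_days:
            continue  # not enough history for a meaningful baseline
        threshold = mean(history) + sigmas * pstdev(history)
        latest = per_day[days[-1]]
        if latest > threshold:
            flagged.append((user, days[-1], latest))
    return sorted(flagged)
```

The same structure applies to BYTES_WRITTEN_TO_RESULT for the "more data written to results" check.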
Analyzing Login History for Suspicious Activity
In the "LOGIN_HISTORY" view, our goal is to identify unusual IP addresses and detect suspicious login patterns, such as brute force attacks. Here's what to focus on:
- Rare IP Addresses: Detect IP addresses that rarely appear in login history logs. These uncommon IPs may signal potential security risks and merit further scrutiny.
- Brute-Force Detection: Implement a brute-force detection mechanism to identify patterns of multiple failed login attempts from the same IP address within a short timeframe.
- Threat Intelligence Integration: Use threat intelligence data to evaluate the risk associated with rare IP addresses. Look for traits such as anonymous VPN, TOR exit node, public proxy, hosting provider, and other indicators of suspicious activity.
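A minimal Python sketch of the rare IP check, assuming LOGIN_HISTORY rows exported as dictionaries with CLIENT_IP and USER_NAME fields. The function name and the `max_events` cutoff are hypothetical; the resulting IPs are candidates for threat intelligence enrichment, not verdicts.

```python
from collections import Counter


def rare_client_ips(rows, max_events=2):
    """Return IPs with at most max_events logins, with the users seen on each."""
    events_per_ip = Counter(row["CLIENT_IP"] for row in rows)
    users_per_ip = {}
    for row in rows:
        users_per_ip.setdefault(row["CLIENT_IP"], set()).add(row["USER_NAME"])
    return {
        ip: sorted(users_per_ip[ip])
        for ip, n in events_per_ip.items()
        if n <= max_events
    }
```

A fixed cutoff is the simplest choice; a share of total logins or a "first seen in the last N days" rule would work in the same way.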
By monitoring these anomalies in the "LOGIN_HISTORY" view, organizations can enhance their ability to detect and respond to potential security threats in their Snowflake environment.
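The brute-force check can be sketched as a sliding window over failed logins per IP address. This assumes rows with CLIENT_IP, an IS_SUCCESS value of "YES" or "NO", and EVENT_TIMESTAMP as a `datetime`; the function name and the thresholds (five failures in ten minutes) are illustrative assumptions to tune for your environment.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def brute_force_ips(rows, max_failures=5, window=timedelta(minutes=10)):
    """Return IPs with at least max_failures failed logins inside any window."""
    failures = defaultdict(list)  # ip -> timestamps of failed logins
    for row in rows:
        if row["IS_SUCCESS"] == "NO":
            failures[row["CLIENT_IP"]].append(row["EVENT_TIMESTAMP"])

    flagged = set()
    for ip, times in failures.items():
        recent = deque()  # failures that fall inside the current window
        for ts in sorted(times):
            recent.append(ts)
            while ts - recent[0] > window:
                recent.popleft()
            if len(recent) >= max_failures:
                flagged.add(ip)
                break
    return sorted(flagged)
```

Grouping by USER_NAME instead of CLIENT_IP gives a complementary view that catches attempts spread across many addresses.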
Please refer to our GitHub repository for ongoing updates on indicators of compromise (IOCs), anomalies, and behavioral patterns associated with this particular attack.
Next Steps
Once threat hunters identify suspicious behavior in the QUERY_HISTORY and LOGIN_HISTORY views, their next steps are crucial in uncovering the extent of potential threats. Upon detecting anomalies, such as unusual data scanning or rare IP addresses in login attempts, investigators need to dig deeper. This involves analyzing the associated user activities, cross-referencing contextual information, and correlating findings with other relevant logs or external threat intelligence sources.
A useful follow-up technique is to pivot on the SESSION_ID in the QUERY_HISTORY view. This makes it possible to track all activity associated with a potentially compromised user session, including executed queries. By examining related activities within the same session, analysts can assess the risk level, understand which data was accessed, and potentially identify any exfiltrated information.
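The session pivot can be sketched in Python over exported QUERY_HISTORY rows. The function names are hypothetical, and the sketch assumes each row carries SESSION_ID, START_TIME, DATABASE_NAME, and BYTES_WRITTEN_TO_RESULT.

```python
from collections import defaultdict


def session_timelines(rows, suspicious_session_ids):
    """Group queries from the given sessions into time-ordered timelines."""
    wanted = set(suspicious_session_ids)
    timelines = defaultdict(list)
    for row in rows:
        if row["SESSION_ID"] in wanted:
            timelines[row["SESSION_ID"]].append(row)
    for queries in timelines.values():
        queries.sort(key=lambda r: r["START_TIME"])
    return dict(timelines)


def summarize_session(queries):
    """Summarize what a single session touched and how much data it returned."""
    return {
        "query_count": len(queries),
        "databases": sorted({q["DATABASE_NAME"] for q in queries if q["DATABASE_NAME"]}),
        "bytes_written_to_result": sum(q["BYTES_WRITTEN_TO_RESULT"] or 0 for q in queries),
    }
```

Reading a timeline in order often shows the progression from discovery queries to bulk reads, which helps scope what may have been taken.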
Proactive measures you can take to verify your Snowflake environment is secure:
- SSO Enforcement: While Single Sign-On (SSO) might be in place, is it truly enforced? There's a possibility that users can still authenticate using username/password outside of SSO directly to the Snowflake database. Double-check to prevent unauthorized access.
- MFA Enforcement: Is multi-factor authentication (MFA) enforced across your organization? Ensure it is mandatory for all users rather than left to self-enrollment, so that every account has an extra layer of security.
- Network Exposure: Is your Snowflake database exposed to the internet? Consider using PrivateLink to limit exposure, or use a network policy to allow access only from authorized IP addresses.
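One way to check the SSO and MFA points against actual behavior is to look for successful logins that used only a password. The sketch below assumes LOGIN_HISTORY rows exported as dictionaries, that FIRST_AUTHENTICATION_FACTOR holds "PASSWORD" for password logins, and that SECOND_AUTHENTICATION_FACTOR is empty when no second factor was used; verify these values against the Snowflake documentation for your account. The function name is hypothetical.

```python
def password_only_users(rows):
    """Return users with a successful password login and no second factor."""
    users = set()
    for row in rows:
        if (
            row["IS_SUCCESS"] == "YES"
            and row["FIRST_AUTHENTICATION_FACTOR"] == "PASSWORD"
            and not row["SECOND_AUTHENTICATION_FACTOR"]
        ):
            users.add(row["USER_NAME"])
    return sorted(users)
```

Any user in the result was able to authenticate outside SSO without MFA, regardless of what the policy is meant to enforce.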
Mitiga's Response: Assisting Customers Amidst the Snowflake Campaign
To ensure our clients were well-prepared and protected, we reached out with a preliminary alert, informing them about the emerging threat posed by the threat actor. This alert included essential information on potential risks and initial steps to safeguard their Snowflake environments. Following this, we conducted a dedicated, event-driven threat hunt tailored to each client’s environment, meticulously tracking Indicators of Compromise (IOCs) and unusual activities within their systems.
Our threat hunt process involved analyzing key forensic information from Snowflake’s built-in database and schemas. We looked for anomalies such as unusual data scans or excessive data writing, which could indicate unauthorized access. We also scrutinized rare IP addresses, integrating threat intelligence to assess their risk levels.
Additionally, we developed custom Indicators of Attack (IOAs) to suit the unique nature of this threat, enhancing our ability to detect and mitigate potential breaches. Understanding the importance of proper security configurations, we collaborated closely with our clients to verify and optimize the security settings of their Snowflake instances. This included implementing strong authentication mechanisms, proper user permissions, and comprehensive logging and monitoring setups.
We invite organizations concerned about their Snowflake security to reach out to us. Our expert team is ready to assist in assessing your current security configurations and conducting thorough threat hunts to protect your valuable data.
Summary
In today's dynamic threat landscape, collecting forensic data from SaaS environments like Snowflake is paramount to safeguarding against potential security breaches. By proactively conducting continuous threat hunts and analyzing views such as QUERY_HISTORY and LOGIN_HISTORY, organizations can detect and respond to suspicious activities effectively. These measures not only aid in identifying anomalies like unusual data scanning or rare IP addresses but also enable deeper investigations into potential threats. As the Snowflake campaign demonstrates, staying vigilant and leveraging advanced analytics are essential for mitigating risks and protecting critical data assets. Moving forward, we commit to updating this blog with any developments related to this campaign and the threat group, ensuring our readers stay informed and empowered in their cybersecurity efforts.
Learn more about this ongoing threat plus how researchers are helping teams understand the impact by watching this webinar.