November 23, 2023
6m read
Cato Application Catalog – How we supercharged application categorization with AI/ML
New applications emerge at an almost impossible-to-keep-up-with pace, creating a constant challenge and blind spot for IT and security teams in the form of Shadow IT. Organizations must keep up by using tools that are automatically updated with the latest developments and changes in the application landscape to maintain proper security.
An integral part of any SASE product is its ability to accurately categorize and map user traffic to the actual application being used. To manage sanctioned/unsanctioned applications, apply security policies across the network based on the application or category of applications, and especially for granular application controls using CASB, a comprehensive application catalog must be maintained.
At Cato, keeping up required building a process that is both highly automated and, just as importantly, data-driven, so that we focus on the applications most used by our customers and can separate the wheat from the chaff. In this post we’ll detail how we supercharged our application catalog updates from a labor-intensive manual process to a fully automated, AI/ML-based, data-driven pipeline, growing our rate of adding new applications by an order of magnitude, from tens to hundreds of applications added every week.
What IS an application in the catalog?
Every application in our Application Catalog has several characteristics:
General – what the company does, employees, where it’s headquartered, etc.
Compliance – certifications the application holds and complies with.
Security – features supported by the application, such as TLS, two-factor authentication, SSO, etc.
Risk score – a critical field calculated by our algorithms based on multiple heuristics (detailed later in this post) that allows IT managers and CISOs to focus on actual possible threats to their network.
Down to business, how it actually gets done
We refer to the process of adding an application as “signing” it: starting from the automated processes up to human analysts going over the list of apps to be released in the weekly release cycle and giving it a final human verification. (Side note: this is also presently a bottleneck in the process, since we want the highest control and quality when publishing new content to our production environment, though we are working on ways to improve this part as well.)
As mentioned, the first order of business is picking the applications we want to add, and for that we use our massive data lake, in which we collect the metadata from all traffic that flows through our network. We identify candidates by looking at the most-used domains (FQDNs) across our entire network, repeating across multiple customer accounts, that are not yet signed in our catalog.
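The selection logic described above can be sketched roughly as follows; this is a minimal illustration only, assuming the data lake exposes per-flow records of (FQDN, customer account) pairs. The record shape, threshold and function name are hypothetical, not Cato's actual pipeline:

```python
from collections import defaultdict

def pick_candidate_apps(flow_records, signed_fqdns, min_accounts=3):
    """Rank unsigned FQDNs by usage across distinct customer accounts.

    flow_records: iterable of (fqdn, account_id) tuples observed on the network.
    signed_fqdns: set of FQDNs already present in the application catalog.
    min_accounts: only keep FQDNs seen in at least this many accounts,
                  so single-tenant, one-off domains are filtered out.
    """
    accounts_per_fqdn = defaultdict(set)
    hits_per_fqdn = defaultdict(int)
    for fqdn, account_id in flow_records:
        if fqdn in signed_fqdns:
            continue  # already in the catalog
        accounts_per_fqdn[fqdn].add(account_id)
        hits_per_fqdn[fqdn] += 1

    candidates = [
        fqdn for fqdn, accounts in accounts_per_fqdn.items()
        if len(accounts) >= min_accounts
    ]
    # Most-used first: sort by raw hit count, descending.
    return sorted(candidates, key=lambda f: hits_per_fqdn[f], reverse=True)
```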
[boxlink link="https://catonetworks.easywebinar.live/registration-everything-you-wanted-to-know-about-ai-security"] Everything You Wanted To Know About AI Security But Were Afraid To Ask | Watch the Webinar [/boxlink]
The automation is done end-to-end using “Shinnok”, our in-house tool developed and maintained by our Security Research team. Taking the narrowed-down list of unsigned apps, Shinnok compiles the four fields (description, compliance, security and risk score) for every app.
Description – This is the most straightforward part, based on info taken via API from Crunchbase.
Compliance – Using a combination of online lookups and additional heuristics for every compliance certification we target, we compile the list of certifications the app supports. For example, by using Google’s query API for a given application plus “SOC2”, and then filtering the results for false positives from unreliable sources, we can identify support for SOC2 compliance.
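The filtering step can be sketched as below, assuming the search results have already been fetched from a query API. The trusted-domain allowlist, result shape and function name are hypothetical; the real pipeline’s heuristics are more involved:

```python
from urllib.parse import urlparse

# Hypothetical allowlist of sources considered reliable for compliance claims.
TRUSTED_DOMAINS = {"trust.example-app.com", "soc2-registry.example.org"}

def supports_certification(search_results, certification):
    """Return True if any trusted search result mentions the certification.

    search_results: list of dicts with "url" and "snippet" keys, e.g. parsed
    output of a web-search query for '<app name> "SOC2"'.
    """
    for result in search_results:
        domain = urlparse(result["url"]).netloc
        if domain not in TRUSTED_DOMAINS:
            continue  # drop likely false positives from unreliable sources
        if certification.lower() in result["snippet"].lower():
            return True
    return False
```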
Security – Similar to compliance, with the addition of using our data lake to identify certain security features being used by the app that we observe over the network.
Risk Score – Being the most important field, we take a combination of multiple data points to calculate the risk score:
Popularity: This is based on multiple data points, including real-time traffic data from our network measuring occurrences of the application across our own network, correlated with additional online sources. Typically, a more popular and well-known app poses a lower risk than a new, obscure application.
CVE analysis: We collect and aggregate all known CVEs of the application; the more high-severity CVEs an application has, the more openings it offers attackers, increasing the risk to the organization.
Sentiment score: We collect news, mentions and any articles relating to the company/application and build a dataset of all mentions about the application. We then pass this dataset through our AI deep learning model, which outputs, for every mention, whether it is a positive or negative article/mention, generating a final sentiment score that is added as a data point for the overall algorithm.
Distilling all the different data points with our algorithms, we calculate the final risk score of the app.
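As an illustration only, a weighted combination of the three data points might look like the sketch below. The weights, scales and normalization are hypothetical; the post does not disclose the actual formula:

```python
def risk_score(popularity, high_sev_cves, sentiment):
    """Combine heuristics into a 0-10 risk score (higher = riskier).

    popularity: 0.0-1.0 share of the network/online sources where the app
                appears (popular, well-known apps assumed lower risk).
    high_sev_cves: count of known high-severity CVEs for the application.
    sentiment: -1.0 (all negative mentions) to 1.0 (all positive).
    """
    # Hypothetical weights; the real algorithm is not public.
    obscurity_risk = (1.0 - popularity) * 4.0          # up to 4 points
    cve_risk = min(high_sev_cves, 10) / 10 * 4.0       # up to 4 points, capped
    sentiment_risk = (1.0 - sentiment) / 2 * 2.0       # up to 2 points
    return round(obscurity_risk + cve_risk + sentiment_risk, 2)
```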
WIIFM?
The main advantage of this approach to application categorization is that it is PROACTIVE, meaning network administrators using Cato receive the latest updates for all the latest applications automatically. Based on the data we collect, we estimate that 80%-90% of all HTTP traffic in our network is covered by a known application categorization. Admins can be much more effective with their time by looking at data that is already summarized, giving them the top risks in their organization that require attention.
Use case example #1 – Threads by Meta
To demonstrate the proactive approach, we can look at a recent use case: the very public and explosive launch of the Threads platform by Meta, which, regardless of its present success, was recorded as the largest product launch in history, overtaking ChatGPT with over 100M user registrations in 5 days. In the diagram below we can see this from the perspective of our own network, checking all the boxes for a new application that qualifies to be added to our app catalog: from the numbers of unique connections and users to the number of different customer accounts using Threads.
Thanks to the automated process, Threads was automatically included in the upcoming batch of applications to sign. Two weeks after its release it was already part of the Cato App Catalog, without end users needing to perform any actions on their part.
Use case example #2 – Coverage by geographical region
As part of an analysis done by our Security Research team we identified a considerable gap in our application coverage for the Japanese market, which coincided with feedback from the Japan sales teams on lacking coverage. Using the same automated process, this time limiting the scope of the data fed from our data lake into Shinnok to Japanese users only, we began a focused project of augmenting the application catalog with applications specific to the Japanese market, adding more than 600 new applications over a period of 4 months.
Following this, we’ve measured a very substantial increase in coverage, going from under 50% to over 90% of all inspected HTTP traffic to Japanese destinations.
To summarize
We’ve reviewed how, by leveraging our huge network and data lake, we were able to build a highly automated process, using real-time online data sources coupled with AI/ML models, to categorize applications with very little human work involved. The main benefit is of course that Cato customers do not need to worry about keeping up to date on the latest applications their users are using; instead they know they will receive the updates automatically based on the top trends and usage on the internet.
Cisco IOS XE Privilege Escalation (CVE-2023-20198) – Cato’s analysis and mitigation
By Vadim Freger, Dolev Moshe Attiya, Shirley Baumgarten
All secured webservers are alike; each vulnerable webserver running on a network appliance is vulnerable in its own way. On October 16th 2023 Cisco published a security advisory detailing an actively exploited vulnerability (CVE-2023-20198) in its IOS XE operating system with a 10 CVSS score, allowing for unauthenticated privilege escalation and subsequent full administrative access (level 15 in Cisco terminology) to the vulnerable device. After gaining access, which in itself is already enough to do damage and allows full device control, an attacker can use an additional vulnerability (CVE-2023-20273) to elevate further to the “root” user and install a malicious implant on the disk of the device.
When the initial announcement was published, Cisco had no patched software update to provide; the suggested mitigations were to disable HTTP/S access to the IOS XE Web UI and/or limit access to it from trusted sources using ACLs. Approximately a week later, patches were published and the advisory updated. The zero-day vulnerability was being exploited before the advisory was published, and many current estimates and scanning analyses put the number of implanted devices in the tens of thousands.
[boxlink link="https://www.catonetworks.com/rapid-cve-mitigation/"] Rapid CVE Mitigation by Cato Security Research [/boxlink]
Details of the vulnerability
The authentication bypass is done on the webui_wsma_http or webui_wsma_https endpoints in the IOS XE webserver (which runs OpenResty, an Nginx variant that adds Lua scripting support). By using double-encoding (a simple yet clearly effective evasion technique) in the URL of the POST request, the attacker bypasses checks performed by the webserver, which passes the request on to the backend. The request body contains an XML payload which the backend executes, since requests arriving from the frontend are considered to have passed validation. In the request example below (credit: @SI_FalconTeam) we can see the POST request with the XML payload sent to /%2577ebui_wsma_http: %25 is the character “%” encoded, followed by 77, and combined “%77” is the character “w” encoded.
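The double-decoding can be reproduced with Python’s standard library, which shows why a single-pass check on the raw URL misses the real endpoint:

```python
from urllib.parse import unquote

raw_path = "/%2577ebui_wsma_http"

once = unquote(raw_path)   # first decode: "%25" -> "%"
twice = unquote(once)      # second decode: "%77" -> "w"

print(once)   # /%77ebui_wsma_http -- still looks harmless to a naive filter
print(twice)  # /webui_wsma_http   -- the actual endpoint the backend sees
```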
Cisco has also provided a command to check for the presence of an implant on the device, by running: curl -k -X POST "https[:]//DEVICEIP/webui/logoutconfirm.html?logon_hash=1", replacing DEVICEIP and checking the response; if a hexadecimal string is returned, an implant is present.
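That response check can be scripted around a small helper like the sketch below; the helper and its name are illustrative only, and the request itself would follow Cisco’s curl command above:

```python
import string

def looks_like_implant_response(body: str) -> bool:
    """Per Cisco's advisory, a hexadecimal string returned by the
    logoutconfirm.html POST indicates an implant is present."""
    body = body.strip()
    return bool(body) and all(c in string.hexdigits for c in body)

# Illustrative responses:
#   "0123456789abcdef" -> implant indicator
#   "<html>...</html>" -> normal page, no indicator
```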
Cato’s analysis and response to the CVE
From our data and analysis at Cato’s Research Labs we have seen multiple exploitation attempts of the CVE, along with an even more interesting case of Cisco’s own SIRT (Security Incident Response Team) scanning devices to detect whether they are vulnerable, quite likely to proactively contact customers running vulnerable systems. Below is an example of scanning activity from 144.254.12[.]175, an IP that is part of a /16 range registered to Cisco.
Cato deployed IPS signatures blocking any attempts to exploit the vulnerable endpoint, protecting all Cato-connected sites worldwide from November 1st 2023. Cato also recommends always avoiding placing critical networking infrastructure so that it is internet-facing. When this is a necessity, disabling HTTP access and implementing proper access controls using ACLs to limit the source IPs able to reach the devices is a must.
Networking devices are often not thought of as webservers and therefore do not always receive the same forms of protection, e.g., a WAF. Their Web UIs, however, are clearly a powerful administrative interface, and we see time and time again how they are exploited. Networking devices like Cisco’s are typically administered almost entirely via CLI, with the Web UI receiving less attention, somewhat underscoring the dichotomy between the importance of the device in the network and how rudimentary a webserver it may be running.
https://www.youtube.com/watch?v=6caLf-1KGFw&list=PLff-wxM3jL7twyfaaYB7jxy6WqDB_17V4
Cato’s Analysis and Protection for cURL SOCKS5 Heap Buffer Overflow (CVE-2023-38545)
TL;DR This vulnerability appears to be less severe than initially anticipated. Cato customers and infrastructure are secure.
Last week the original author and long-time lead developer of cURL, Daniel Stenberg, published a “teaser” for a HIGH-severity vulnerability in the ubiquitous libcurl development library and the curl command-line utility.
A week of anticipation, multiple heinous crimes against humanity and a declaration of war later, the vulnerability was disclosed publicly.
The initial announcement caused what in hindsight can be categorized as somewhat undue panic in the security and sysadmin worlds. But given how widespread the usage of libcurl and curl is around the world (at Cato we use them widely as well, more on that below), and to quote the libcurl website – “We estimate that every internet connected human on the globe uses (lib)curl, knowingly or not, every day” – the initial concern was more than understandable.
The libcurl library and the curl utility are used for interacting with URLs and for various multiprotocol file transfers, and they are bundled into all the major Linux/UNIX distributions. Likely for that reason the project maintainers opted to keep the vulnerability disclosure private and shared very few details to deter attackers, only letting OS distribution maintainers know in advance so that patched versions would be ready in the respective package management systems when it was disclosed.
[boxlink link="https://www.catonetworks.com/rapid-cve-mitigation/"] Rapid CVE Mitigation by Cato Security Research [/boxlink]
The vulnerability in detail
The code containing the buffer overflow vulnerability is part of curl’s support for the SOCKS5 proxy protocol. SOCKS5 is a simple and well-known (though not widely used nowadays) protocol for setting up an organizational proxy or, quite often, for anonymizing traffic, as in the Tor network.
The vulnerability is in libcurl’s hostname resolution, which is either delegated to the proxy server or done by libcurl itself. If a hostname longer than 255 bytes is given, libcurl is supposed to switch to local resolution and pass only the resolved address to the proxy. Due to the bug, in a slow enough handshake (“slow enough” being typical server latency, according to the post), the buffer overflow can be triggered and the entire too-long hostname is copied to the buffer instead of the resolved result.
There are multiple conditions that need to be met for the vulnerability to be exploited, specifically:
In applications that do not set “CURLOPT_BUFFERSIZE” or set it below 65541. Important to note that the curl utility itself sets it to 100kB, and so is not vulnerable unless this is changed specifically on the command line.
CURLOPT_PROXYTYPE set to type CURLPROXY_SOCKS5_HOSTNAME
CURLOPT_PROXY or CURLOPT_PRE_PROXY set to use the scheme socks5h://
A possible way to exploit the buffer overflow would likely require the attacker to control a server contacted by the libcurl client over SOCKS5; the attacker could make it return a crafted redirect (HTTP 30x response) containing a Location header with a hostname long enough to trigger the buffer overflow.
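The preconditions above can be expressed as a simple check; a sketch for auditing application configurations, where the option names follow libcurl but the function and record shape are hypothetical:

```python
VULN_BUFFER_LIMIT = 65541  # CURLOPT_BUFFERSIZE values below this are affected

def config_matches_cve_2023_38545(buffer_size, proxy_type, proxy_url):
    """Return True if a libcurl configuration matches the vulnerable
    combination described for CVE-2023-38545.

    buffer_size: CURLOPT_BUFFERSIZE value, or None if the app never sets it.
    proxy_type:  e.g. "CURLPROXY_SOCKS5_HOSTNAME".
    proxy_url:   CURLOPT_PROXY / CURLOPT_PRE_PROXY value.
    """
    small_buffer = buffer_size is None or buffer_size < VULN_BUFFER_LIMIT
    socks5_remote_resolve = (
        proxy_type == "CURLPROXY_SOCKS5_HOSTNAME"
        or proxy_url.startswith("socks5h://")
    )
    return small_buffer and socks5_remote_resolve

# The curl tool itself defaults to a 100kB buffer, so even with socks5h://
# it is outside the vulnerable range unless the buffer size is lowered:
assert not config_matches_cve_2023_38545(102400, "", "socks5h://proxy.example:1080")
```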
Cato’s usage of (lib)curl
At Cato we of course utilize both libcurl and curl itself for multiple purposes:
curl and libcurl based applications are used extensively in our global infrastructure in scripts and in-house applications.
Cato’s SDP Client also incorporates libcurl and uses it for multiple functions.
We do not use SOCKS5, and Cato’s code and infrastructure are not vulnerable to any form of this CVE.
Cato’s analysis and response to the CVE
Based on the CVE details and the public POC shared along with the disclosure, Cato’s Research Labs researchers believe that the chances of this being exploited successfully are medium-low.
Nevertheless we have of course added IPS signatures for this CVE, providing Cato-connected sites worldwide peace and quiet through virtual patching, blocking exploit attempts with a detect-to-protect time of 1 day and 3 hours for all users and sites connected to Cato worldwide, with Opt-In Protection already available after 14 hours. Cato’s recommendation is, as always, to patch impacted servers and applications, the affected versions being libcurl 7.69.0 up to and including 8.3.0. In addition, it is possible to mitigate by identifying usage, as already stated, of the parameters that can lead to the vulnerability being triggered - CURLOPT_PROXYTYPE, CURLOPT_PROXY, CURLOPT_PRE_PROXY.
For more insights on CVE-2023-38545 specifically and many other interesting and nerdy Cybersecurity stories, listen (and subscribe!) to Cato’s podcast - The Ring of Defense: A CyberSecurity Podcast (also available in audio form).
Cato Protects Against Atlassian Confluence Server Exploits (CVE-2023-22515)
A new critical vulnerability has been disclosed by Atlassian in a security advisory published on October 4th 2023 in its on-premise Confluence Data Center and Server product. It is a privilege escalation vulnerability through which attackers may exploit a vulnerable endpoint in internet-facing Confluence instances to create unauthorized Confluence administrator accounts and gain access to the Confluence instance.
At the time of writing a CVSS score has not been assigned to the vulnerability, but it can be expected to be very high (9-10), since it is remotely exploitable and allows full access to the server once exploited.
[boxlink link="https://www.catonetworks.com/rapid-cve-mitigation/"] Rapid CVE Mitigation by Cato Security Research [/boxlink]
Cato’s Response
There are no publicly known proofs-of-concept (POCs) of the exploit available, but Atlassian has confirmed being made aware of the exploit by a “handful of customers where external attackers may have exploited a previously unknown vulnerability”, so it can be assumed with high certainty that it is already being exploited.
Cato’s Research Labs identified possible exploitation attempts of the vulnerable endpoint (“/setup/”) in some of our customers immediately after the security advisory was released, which were successfully blocked without any user intervention needed. The attempts were blocked by our IPS signatures aimed at identifying and blocking URL scanners even before a signature specific to this CVE was available.
The speed with which the very little information available in the advisory was integrated into online scanners gives a strong indication of how high-value a target Confluence servers are, and is concerning given the large number of publicly facing Confluence servers that exist.
Following the disclosure, Cato deployed signatures blocking any attempts to interact with the vulnerable “/setup/” endpoint, with a detect-to-protect time of 1 day and 23 hours for all users and sites connected to Cato worldwide, and Opt-In Protection already available in under 24 hours.
Furthermore, Cato recommends restricting access to Confluence servers’ administration endpoints to authorized IPs only, preferably from within the network; when that is not possible, they should only be accessible from hosts protected by Cato, whether behind a Cato Socket or remote users running the Cato Client.
Cato’s Research Labs continues to monitor the CVE, and we will update our signatures as more information becomes available or a POC is published and exposes additional details. Follow our CVE Mitigation page and Release Notes for future updates.