Cato Develops Groundbreaking Method for Automatic Application Identification

April 30, 2020

New applications are identified faster and more efficiently using data science and Cato’s data warehouse

Identifying applications has become a crucial part of network operations. Quickly and reliably identifying unknown applications is essential to everything from enforcing QoS rules and setting application policies to preventing malicious communications.

However, legacy approaches to application classification have become too ineffective or too expensive. In the past, SD-WAN appliances and firewalls identified applications by largely relying on transport-layer information, such as the port number. This approach, though, is no longer sufficient as applications today employ multiple port numbers, run over their own protocols, or both.
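To make the limitation concrete, here is a minimal sketch of port-based identification. The table and the `classify_by_port` helper are hypothetical illustrations, not taken from any particular appliance; the port assignments themselves are the standard well-known ones.

```python
# Illustrative port-to-application table (standard well-known port assignments).
WELL_KNOWN_PORTS = {
    ("tcp", 22): "ssh",
    ("tcp", 25): "smtp",
    ("udp", 53): "dns",
    ("tcp", 80): "http",
    ("tcp", 443): "https",
}


def classify_by_port(protocol: str, dst_port: int) -> str:
    """Return an application label using transport-layer information only."""
    return WELL_KNOWN_PORTS.get((protocol.lower(), dst_port), "unknown")


# A conventional service on its usual port is labeled as expected.
print(classify_by_port("tcp", 22))
# Every application carried over port 443 gets the same generic label.
print(classify_by_port("tcp", 443))
# An application using its own protocol on an arbitrary port cannot be labeled.
print(classify_by_port("udp", 50123))
```

The last two calls show the problem described above: the lookup either lumps distinct applications together or fails to label them at all.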

As a result, accurately classifying applications has required reconstruction of application flows. Indeed, next-generation firewalls have become application-aware, identifying applications by their protocol structure or other application-layer headers in order to permit legitimate traffic and deny unwanted traffic.

Reconstructing application flows, though, is a processor-intensive process that does not scale. Many vendors have resorted to manual classification, a labor-intensive process involving the hiring of many engineers. It’s costly, lengthy, and limited in accuracy. Ultimately, that affects product costs, the customer experience, or both.

Cato Uses Data Science to Automatically Classify New Applications

Cato has developed a new approach for automatically identifying the toughest types of applications to classify – new apps running over their own protocols. We do this by running machine learning algorithms against our data warehouse of flows. It’s a repository built from the billions of traffic flows crossing the Cato private backbone every day.

We’re able to use that repository to classify and label applications based on thousands of data points derived from flow characteristics. Those application labels, or AppIDs, are fed into our management system. With them, customers can categorize what was once uncategorized traffic, gaining deeper visibility into their network usage and, in the process, creating more accurate network rules for managing traffic flows.
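As a rough illustration of labeling flows from their characteristics, the sketch below uses a nearest-centroid classifier. This is an assumption chosen for brevity, not Cato’s algorithm, which this post does not describe. The three features (assumed to be pre-scaled to the 0 to 1 range), the labels, and the distance threshold are all hypothetical.

```python
import math
from collections import defaultdict


def centroid(rows):
    """Average each feature column across a list of feature tuples."""
    n = len(rows)
    return tuple(sum(col) / n for col in zip(*rows))


def train(labeled_flows):
    """Build one centroid per application label from (features, label) pairs."""
    by_label = defaultdict(list)
    for features, label in labeled_flows:
        by_label[label].append(features)
    return {label: centroid(rows) for label, rows in by_label.items()}


def classify(model, features, max_distance):
    """Return the nearest label, or "uncategorized" if nothing is close enough."""
    label, dist = min(
        ((lbl, math.dist(features, c)) for lbl, c in model.items()),
        key=lambda pair: pair[1],
    )
    return label if dist <= max_distance else "uncategorized"


# Hypothetical features per flow, each scaled to 0..1:
# (mean packet size, share of small packets, upload/download balance)
labeled = [
    ((0.10, 0.90, 0.50), "voip-app"),
    ((0.12, 0.88, 0.52), "voip-app"),
    ((0.90, 0.10, 0.05), "backup-app"),
    ((0.88, 0.14, 0.07), "backup-app"),
]
model = train(labeled)

print(classify(model, (0.11, 0.91, 0.49), max_distance=0.2))
print(classify(model, (0.50, 0.50, 0.90), max_distance=0.2))
```

The distance threshold is the design choice worth noting: a flow that resembles no known application stays uncategorized instead of being forced into the closest label.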

To learn more about our approach and the data science behind it, click here to read the paper.

For insight into our security services, click here, or here to learn about the Cato Managed Threat Detection and Response service.

Avidan Avraham

Avidan is a Security Researcher at Cato Networks. He has a strong interest in cybersecurity, from OS internals and reverse engineering to network protocol analysis and malicious traffic detection. Avidan is also a big data and machine learning enthusiast who enjoys solving complex problems in these fields. Previously, he worked at IBM Trusteer, where he was responsible for a large part of the company's detection and prevention capabilities, including enterprise security threats and exploits, financial malware variants, and more. Today, at Cato Research Labs, he focuses on network-based security research and novel methods for finding threats in enterprise network environments.