This is a guest post authored by Asaf Fried, Daniel Pienica, Sergey Volkovich from Cato Networks.
 Cato Networks is a leading provider of secure access service edge (SASE), an enterprise networking and security unified cloud-centered service that converges SD-WAN, a cloud network, and security service edge (SSE) functions, including firewall as a service (FWaaS), a secure web gateway, zero trust network access, and more.
 On our SASE management console, the central events page provides a comprehensive view of the events occurring on a specific account. With potentially millions of events over a selected time range, the goal is to refine these events using various filters until a manageable number of relevant events are identified for analysis. Users can review different types of events such as security, connectivity, system, and management, each categorized by specific criteria like threat protection, LAN monitoring, and firmware updates. However, the process of adding filters to the search query is manual and can be time consuming, because it requires in-depth familiarity with the product glossary.
 To address this challenge, we recently enabled customers to perform free text searches on the event management page, allowing new users to run queries with minimal product knowledge. This was accomplished by using foundation models (FMs) to transform natural language into structured queries that are compatible with our products’ GraphQL API.
 In this post, we demonstrate how we used Amazon Bedrock, a fully managed service that makes FMs from leading AI startups and Amazon available through an API, so you can choose from a wide range of FMs to find the model that is best suited for your use case. With the Amazon Bedrock serverless experience, you can get started quickly, privately customize FMs with your own data, and quickly integrate and deploy them into your applications using AWS tools without having to manage the infrastructure. Amazon Bedrock enabled us to enrich FMs with product-specific knowledge and convert free text inputs from users into structured search queries for the product API that can greatly enhance user experience and efficiency in data management applications.
 Solution overview
 The Events page includes a filter bar with both event and time range filters. These filters need to be added and updated manually for each query. The following screenshot shows an example of the event filters (1) and time filters (2) as seen on the filter bar (source: Cato knowledge base).
The event filters are a conjunction of statements in the following form:
 Key – The field name
 Operator – The evaluation operator (for example, is, in, includes, greater than, etc.)
 Value – A single value or list of values
For example, the following screenshot shows a filter for action in [ Alert, Block ].
The time filter is a time range following ISO 8601 time intervals standard.
 For example, the following screenshot shows a time filter for UTC.2024-10-{01/00:00:00–02/00:00:00}.
Converting free text to a structured query of event and time filters is a complex natural language processing (NLP) task that can be accomplished using FMs. Customizing an FM that is specialized on a specific task is often done using one of the following approaches:
 Prompt engineering – Add instructions in the context/input window of the model to help it complete the task successfully.
 Retrieval Augmented Generation (RAG) – Retrieve relevant context from a knowledge base, based on the input query. This context is augmented to the original query. This approach is used for reducing the amount of context provided to the model to relevant data only.
 Fine-tuning – Train the FM on data relevant to the task. In this case, the relevant context will be embedded into the model weights, instead of being part of the input.
For our specific task, we’ve found prompt engineering sufficient to achieve the results we needed.
 Because the event filters on the Events page are specific to our product, we need to provide the FM with the exact instructions for how to generate them, based on free text queries. The main considerations when creating the prompt are:
Include the relevant context – This includes the following:
 The available keys, operators, and values the model can use.
 Specific instructions. For example, numeric operators can only be used with keys that have numeric values.
Make sure it’s simple to validate – Given the extensive number of instructions and limitations, we can’t trust the model output without checking the results for validity. For example, what if the model generates a filter with a key not supported by our API?
Instead of asking the FM to generate the GraphQL API request directly, we can use the following method:
 Instruct the model to return a response following a well-known JSON schema validation IETF standard.
 Validate the JSON schema on the response.
 Translate it to a GraphQL API request.
Request prompt
 Based on the preceding examples, the system prompt will be structured as follows:
 # Genral Instructions
Your task is to convert free text queries to a JSON format that will be used to query security and network events in a SASE management console of Cato Networks. You are only allowed to output text in JSON format. Your output will be validated against the following schema that is compatible with the IETF standard:
# Schema definition
 {
     “$schema”: “https://json-schema.org/draft/2020-12/schema”,
     “title”: “Query Schema”,
    “description”: “Query object to be executed in the ‘Events’ management console page. “,
     “type”: “object”,
     “properties”:
     {
         “filters”:
         {
             “type”: “array”,
            “description”: “List of filters to apply in the query, based on the free text query provided.”,
             “items”:
             {
                 “oneOf”:
                 [
                     {
                         “$ref”: “#/$defs/Action”
                     },
                     .
                     .
                     .
                 ]
             }
         },
         “time”:
         {
             “description”: “Start datetime and end datetime to be used in the query.”,
             “type”: “object”,
             “required”:
             [
                 “start”,
                 “end”
             ],
             “properties”:
             {
                 “start”:
                 {
                     “description”: “start datetime”,
                     “type”: “string”,
                     “format”: “date-time”
                 },
                 “end”:
                 {
                     “description”: “end datetime”,
                     “type”: “string”,
                     “format”: “date-time”
                 }
             }
         },
         “$defs”:
         {
             “Operator”:
             {
                 “description”: “The operator used in the filter.”,
                 “type”: “string”,
                 “enum”:
                 [
                     “is”,
                     “in”,
                     “not_in”,
                     .
                     .
                     .
                 ]
             },
             “Action”:
             {
                 “required”:
                 [
                     “id”,
                     “operator”,
                     “values”
                 ],
                 “description”: “The action taken in the event.”,
                 “properties”:
                 {
                     “id”:
                     {
                         “const”: “action”
                     },
                     “operator”:
                     {
                         “$ref”: “#/$defs/Operator”
                     },
                     “values”:
                     {
                         “type”: “array”,
                         “minItems”: 1,
                         “items”:
                         {
                             “type”: “string”,
                             “enum”:
                             [
                                 “Block”,
                                 “Allow”,
                                 “Monitor”,
                                 “Alert”,
                                 “Prompt”
                             ]
                         }
                     }
                 }
             },
             .
             .
             .
         }
     }
 }
Each user query (appended to the system prompt) will be structured as follows:
 # Free text query
 Query: {free_text_query}
# Add current timestamp for context (used for time filters) 
 Context: If you need a reference to the current datetime, it is {datetime}, and the current day of the week is {day_of_week}
The same JSON schema included in the prompt can also be used to validate the model’s response. This step is crucial, because model behavior is inherently non-deterministic, and responses that don’t comply with our API will break the product functionality.
 In addition to validating alignment, the JSON schema can also point out the exact schema violation. This allows us to create a policy based on different failure types. For example:
 If there are missing fields marked as required, output a translation failure to the user
 If the value given for an event filter doesn’t comply with the format, remove the filter and create an API request from other values, and output a translation warning to the user
After the FM successfully translates the free text into structured output, converting it into an API request—such as GraphQL—is a straightforward and deterministic process.
 To validate this approach, we’ve created a benchmark with hundreds of text queries and their corresponding expected JSON outputs. For example, let’s consider the following text query:
 Security events with high risk level from IPS and Anti Malware engines
 For this query, we expect the following response from the model, based on the JSON schema provided:
 {
     “filters”:
     [
         {
             “id”: “risk_level”,
             “operator”: “is”,
             “values”:
             [
                 “High”
             ]
         },
         {
             “id”: “event_type”,
             “operator”: “is”,
             “values”:
             [
                 “Security”
             ]
         },
         {
             “id”: “event_subtype “,
             “operator”: “in”,
             “values”:
             [
                 “IPS”,
                 “Anti Malware”
             ]
         }
     ]
 }
For each response of the FM, we define three different outcomes:
Success:
 Valid JSON
 Valid by schema
 Full match of filters
Partial:
 Valid JSON
 Valid by schema
 Partial match of filters
Error:
Invalid JSON or invalid by schema
Because translation failures lead to a poor user experience, releasing the feature was contingent on achieving an error rate below 0.05, and the selected FM was the one with the highest success rate (ratio of responses with full match of filters) passing this criterion.
 Working with Amazon Bedrock
 Amazon Bedrock is a fully managed service that simplifies access to a wide range of state-of-the-art FMs through a single, serverless API. It offers a production-ready service capable of efficiently handling large-scale requests, making it ideal for enterprise-level deployments.
 Amazon Bedrock enabled us to efficiently transition between different models, making it simple to benchmark and optimize for accuracy, latency, and cost, without the complexity of managing the underlying infrastructure. Additionally, some vendors within the Amazon Bedrock landscape, such as Cohere and Anthropic’s Claude, offer models with native understanding of JSON schemas and structured data, further enhancing their applicability to our specific task.
 Using our benchmark, we evaluated several FMs on Amazon Bedrock, taking into account accuracy, latency, and cost. Based on the results, we selected anthropic.claude-3-5-sonnet-20241022-v2:0, which met the error rate criterion and achieved the highest success rate while maintaining reasonable costs and latency. Following this, we proceeded to develop the complete solution, which includes the following components:
 Management console – Cato’s management application that the user interacts with to view their account’s network and security events.
 GraphQL server – A backend service that provides a GraphQL API for accessing data in a Cato account.
 Amazon Bedrock – The cloud service that handles hosting and serving requests to the FM.
 Natural language search (NLS) service – An Amazon Elastic Kubernetes Service (Amazon EKS) hosted service to bridge between Cato’s management console and Amazon Bedrock. This service is responsible for creating the complete prompt for the FM and validating the response using the JSON schema.
The following diagram illustrates the workflow from the user’s manual query to the extraction of relevant events.
With the new capability, users can also use free text query mode, which is processed as shown in the following diagram.
The following screenshot of the Events page displays free text query mode in action.
Business impact
 The recent feature update has received positive customer feedback. Users, especially those unfamiliar with Cato, have found the new search capability more intuitive, making it straightforward to navigate and engage with the system. Additionally, the inclusion of multi-language input, natively supported by the FM, has made the Events page more accessible for non-native English speakers to use, helping them interact and find insights in their own language.
 One of the standout impacts is the significant reduction in query time—cut down from minutes of manual filtering to near-instant results. Account admins using the new feature have reported near-zero time to value, experiencing immediate benefits with minimal learning curve.
 Conclusion
 Accurately converting free text inputs into structured data is crucial for applications that involve data management and user interaction. In this post, we introduced a real business use case from Cato Networks that significantly improved user experience.
 By using Amazon Bedrock, we gained access to state-of-the-art generative language models with built-in support for JSON schemas and structured data. This allowed us to optimize for cost, latency, and accuracy without the complexity of managing the underlying infrastructure.
 Although a prompt engineering solution met our needs, users handling complex JSON schemas might want to explore alternative approaches to reduce costs. Including the entire schema in the prompt can lead to a significantly high token count for a single query. In such cases, consider using Amazon Bedrock to fine-tune a model, to embed product knowledge more efficiently.
About the Authors
 Asaf Fried leads the Data Science team in Cato Research Labs at Cato Networks. Member of Cato Ctrl. Asaf has more than six years of both academic and industry experience in applying state-of-the-art and novel machine learning methods to the domain of networking and cybersecurity. His main research interests include asset discovery, risk assessment, and network-based attacks in enterprise environments.
 Daniel Pienica is a Data Scientist at Cato Networks with a strong passion for large language models (LLMs) and machine learning (ML). With six years of experience in ML and cybersecurity, he brings a wealth of knowledge to his work. Holding an MSc in Applied Statistics, Daniel applies his analytical skills to solve complex data problems. His enthusiasm for LLMs drives him to find innovative solutions in cybersecurity. Daniel’s dedication to his field is evident in his continuous exploration of new technologies and techniques.
 Sergey Volkovich is an experienced Data Scientist at Cato Networks, where he develops AI-based solutions in cybersecurity & computer networks. He completed an M.Sc. in physics at Bar-Ilan University, where he published a paper on theoretical quantum optics. Before joining Cato, he held multiple positions across diverse deep learning projects, ranging from publishing a paper on discovering new particles at the Weizmann Institute to advancing computer networks and algorithmic trading. Presently, his main area of focus is state-of-the-art natural language processing.
 Omer Haim is a Senior Solutions Architect at Amazon Web Services, with over 6 years of experience dedicated to solving complex customer challenges through innovative machine learning and AI solutions. He brings deep expertise in generative AI and container technologies, and is passionate about working backwards from customer needs to deliver scalable, efficient solutions that drive business value and technological transformation.