Using transcription confidence scores to improve slot filling in Amazon Lex

When building voice-enabled chatbots with Amazon Lex, one of the biggest challenges is accurately capturing user speech input for slot values. For example, when a user needs to provide their account number or confirmation code, speech recognition accuracy becomes crucial. This is where transcription confidence scores come in to help ensure reliable slot filling.
What Are Transcription Confidence Scores?
Transcription confidence scores indicate how confident Amazon Lex is in converting speech to text for slot values. These scores range from 0.0 to 1.0 and are separate from intent and entity recognition scores. For each spoken slot value, Lex provides a confidence score that you can use to:

Validate if a spoken slot value was correctly understood
Decide whether to ask for confirmation or re-prompt
Branch conversation flows based on recognition confidence

Here are some ways to leverage confidence scores for better slot handling:

Progressive Confirmation

High confidence (>0.9): Accept the slot value and continue
Medium confidence (0.6-0.9): Ask user to confirm (“Did you say 12345?”)
Low confidence (<0.6): Re-prompt for the slot value
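
This post configures the branching in the Amazon Lex visual conversation builder (shown later), but the same decision logic, expressed as a Lambda dialog code hook, might look like the following minimal Python sketch. It assumes the Lex V2 Lambda event format with its transcriptions field and a hypothetical slot named BookingRef; verify the field names against the current Lex V2 documentation.

```python
HIGH, LOW = 0.9, 0.6  # illustrative thresholds from the list above

def lambda_handler(event, context):
    intent = event["sessionState"]["intent"]
    # Lex V2 passes candidate transcriptions with confidence scores
    transcriptions = event.get("transcriptions", [])
    score = transcriptions[0]["transcriptionConfidence"] if transcriptions else 1.0

    if score > HIGH:
        dialog_action = {"type": "Delegate"}       # accept and continue
    elif score >= LOW:
        dialog_action = {"type": "ConfirmIntent"}  # "Did you say 12345?"
    else:
        # Re-prompt for the slot ("BookingRef" is a hypothetical slot name)
        dialog_action = {"type": "ElicitSlot", "slotToElicit": "BookingRef"}

    return {"sessionState": {"dialogAction": dialog_action, "intent": intent}}
```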

Adaptive re-prompting

Customize re-prompt messages based on confidence level
Provide more specific guidance for low confidence inputs
Offer alternative input methods when needed

Branching Logic

Route to human agent if multiple low confidence attempts
Skip confirmation for consistently high confidence inputs
Adjust validation rules based on confidence thresholds

This post includes an AWS CloudFormation template that demonstrates these patterns, which you can deploy in your AWS account.
Using confidence scores strategically helps create more robust slot filling experiences that:

Reduce errors in capturing critical information
Improve containment rates for self-service
Provide better handling of challenging audio conditions
Enable smarter conversation flows

By implementing these patterns, you can significantly improve the accuracy and reliability of slot filling in your Lex voice bots while maintaining a natural conversation flow.
Solution overview
The following diagram illustrates the overall architecture for the solution.

We use an Amazon Lex bot integrated with an Amazon Connect contact flow to deliver the conversational experience.
Amazon Lex asks the caller, “In a few words, how can I help you today?” and then reviews the transcription confidence scores of the slot values to influence the confirmation routine.
Interaction model
The following are different customer experiences a caller may have depending on what they say and the transcription confidence scores returned by Amazon Lex:

Conversation 1:

Virtual Agent: Thank you for calling Acme Travel.
Virtual Agent: …In a few words, what is the reason for your call today?
Customer: I’d like to check my booking.
Virtual Agent: That’s great, please say your 5 character booking reference, you will find it at the top of the information pack we sent.
Customer: 1 9 A K B (recognized with low transcription confidence: 0.4)
Virtual Agent: Ok, let’s try again. What is your booking reference?
Customer: 1 9 A A B (pronounced more clearly: recognized with a confidence of 0.8)
Virtual Agent: Your booking 1 9 A A B is currently being progressed.

Conversation 2:

Virtual Agent: Thank you for calling Acme Travel.
Virtual Agent: …In a few words, what is the reason for your call today?
Customer: I’d like to check my booking 1 9 A Z Y.
Virtual Agent: Your booking 1 9 A Z Y is currently being progressed.

Conversation 3:

Virtual Agent: Thank you for calling Acme Travel.
Virtual Agent: …In a few words, what is the reason for your call today?
Customer: I’d like to check my booking.
Virtual Agent: That’s great, please say your 5 character booking reference, you will find it at the top of the information pack we sent.
Customer: 1 9 A Z Y (recognized with a transcription confidence of 0.6)
Virtual Agent: Did you say 1 9 A Z Y? Please say yes or no.
Customer: Yes
Virtual Agent: Your booking 1 9 A Z Y is currently being progressed.

In the example conversations, the IVR requests the booking reference from the customer. Once received, the transcription confidence score is evaluated by enabling conditional branching in Amazon Lex based on speech confidence scores. These conditions check the value against specific thresholds. If the transcription confidence score exceeds the high threshold (for example, greater than 0.7), the conversation progresses to the next state. If the score falls in the medium confidence range (for example, between 0.4–0.7), the user is asked to confirm the interpreted input. Finally, if the score falls below a minimum threshold (for example, lower than 0.4), the user is prompted to retry and provide the information again. This approach optimizes the conversation flow based on the quality of the input captured and prevents erroneous or redundant slot capturing, leading to an improved user experience while increasing the self-service containment rates.
Prerequisites
You need to have an AWS account and an AWS Identity and Access Management (IAM) role and user with permissions to create and manage the necessary resources and components for this application. If you don’t have an AWS account, see How do I create and activate a new Amazon Web Services account?
Additionally, you need an Amazon Connect instance—you use the instance Amazon Resource Name (ARN) in a later step.
Deploy the Amazon Lex bot and Amazon Connect flow
To create the sample bot and configure the runtime phrase hints, perform the following steps. For this example, we create an Amazon Lex bot called disambiguation-bot, one intent (CheckBooking), and one slot type (BookingRef).

Sign in to your AWS account, then choose Launch Stack to deploy the CloudFormation template:

For Stack Name, enter a name, for example contact-center-transcription-confidence-scores.
Choose Next.
Provide the following parameters:

For BotName, enter disambiguation-bot.
For ConnectInstanceARN, enter the ARN of your Amazon Connect instance.
For ContactFlowName, enter a name for your Amazon Connect contact flow (for example, lex-check-booking-sample-flow).
For LogGroupName, enter the name of the Amazon CloudWatch log group where the conversation logs are stored.

Choose Next.

Leave all remaining settings as default and choose Next.
Select I acknowledge that AWS CloudFormation might create IAM resources.
Choose Submit.

Wait for the CloudFormation stack to successfully deploy.
On the Amazon Connect console, assign the contact flow to an Amazon Connect claimed number.

Configure the transcript confidence score logic
After you create your intent (CheckBooking), you can use the Visual conversation builder to configure your transcription confidence score logic.
The following figure is an example of how we add logic to the intent. Highlighted in red is the branch condition where we use the transcription confidence score to dynamically change the customer experience and improve accuracy.

If you choose the node, you’re presented with the following configuration options, which is where you can configure the branch condition.

Test the solution
To test the solution, we examine a conversation with words that might not be clearly understood.

Assign the Amazon Lex bot to an Amazon Connect contact flow.
Make a call.

Amazon Connect will ask, “Thank you for calling Acme Travel. In a few words, what is the reason for your call today?”

Respond “I want to check my booking.”
When asked for the booking reference, speak any two numbers followed by three letters (for example, “1 9 A Z Y”).

This test checks the confidence score and will either say “your booking 1 9 A Z Y is currently being progressed” or it will ask you to confirm “1 9 A Z Y”.
Limitations
Audio transcription confidence scores are available only in the English (GB) (en_GB) and English (US) (en_US) languages. Confidence scores are supported only for 8 kHz audio input. Transcription confidence scores aren’t provided for audio input from the test window on the Amazon Lex V2 console because it uses 16 kHz audio input.
Clean up
To remove the infrastructure created by the CloudFormation template, open the AWS CloudFormation console and delete the stack. This will remove the services and configuration installed as part of this deployment process.
Conclusion
Optimizing the user experience is at the forefront of any Amazon Lex conversational designer’s priority list, and so is capturing information accurately. This new feature empowers designers to have choices around confirmation routines that drive a more natural dialog between the customer and the bot. Although confirming each input can slow down the user experience and cause frustration, failing to confirm when transcription confidence is low can risk accuracy. These improvements enable you to create a more natural and performant experience.
For more information about how to build effective conversations on Amazon Lex with intent confidence scores, see Build more effective conversations on Amazon Lex with confidence scores and increased accuracy.

About the Authors
Alex Buckhurst is a Senior Amazon Connect consultant at Amazon Web Services with a focus on innovation and building customer-centric designs. In his downtime, Alex enjoys playing squash, perfecting his BBQ skills, and cherishing moments with his family.
Kai Loreck is a Senior professional services Amazon Connect consultant. He works on designing and implementing scalable customer experience solutions. In his spare time, he can be found playing sports, snowboarding, or hiking in the mountains.
Neel Kapadia is a Senior Software Engineer at AWS where he works on designing and building scalable AI/ML services using Large Language Models and Natural Language Processing. He has been with Amazon for over 5 years and has worked on Amazon Lex and Amazon Bedrock. In his spare time, he enjoys cooking, reading, and traveling.
Anand Jumnani is a DevOps Consultant at Amazon Web Services based in United Kingdom. Outside of work, he is passionate about club cricket and enjoys spending quality time with family and friends.

Improving Retrieval Augmented Generation accuracy with GraphRAG

Customers need better accuracy to take generative AI applications into production. In a world where decisions are increasingly data-driven, the integrity and reliability of information are paramount. To address this, customers often begin by enhancing generative AI accuracy through vector-based retrieval systems and the Retrieval Augmented Generation (RAG) architectural pattern, which integrates dense embeddings to ground AI outputs in relevant context. When even greater precision and contextual fidelity are required, the solution evolves to graph-enhanced RAG (GraphRAG), where graph structures provide enhanced reasoning and relationship modeling capabilities.
Lettria, an AWS Partner, demonstrated that integrating graph-based structures into RAG workflows improves answer precision by up to 35% compared to vector-only retrieval methods. This enhancement is achieved by using the graph’s ability to model complex relationships and dependencies between data points, providing a more nuanced and contextually accurate foundation for generative AI outputs.
In this post, we explore why GraphRAG is more comprehensive and explainable than vector RAG alone, and how you can use this approach using AWS services and Lettria.
How graphs make RAG more accurate
In this section, we discuss the ways in which graphs make RAG more accurate.
Capturing complex human queries with graphs
Human questions are inherently complex, often requiring the connection of multiple pieces of information. Traditional data representations struggle to accommodate this complexity without losing context. Graphs, however, are designed to mirror the way humans naturally think and ask questions. They represent data in a machine-readable format that preserves the rich relationships between entities.
By modeling data as a graph, you capture more of the context and intent. This means your RAG application can access and interpret data in a way that aligns closely with human thought processes. The result is a more accurate and relevant answer to complex queries.
Avoiding loss of context in data representation
When you rely solely on vector similarity for information retrieval, you miss out on the nuanced relationships that exist within the data. Translating natural language into vectors reduces the richness of the information, potentially leading to less accurate answers. Also, end-user queries are not always aligned semantically to useful information in provided documents, leading to vector search excluding key data points needed to build an accurate answer.
Graphs maintain the natural structure of the data, allowing for a more precise mapping between questions and answers. They enable the RAG system to understand and navigate the intricate connections within the data, leading to improved accuracy.
Lettria demonstrated improvement on correctness of answers from 50% with traditional RAG to more than 80% using GraphRAG within a hybrid approach. The testing covered datasets from finance (Amazon financial reports), healthcare (scientific studies on COVID-19 vaccines), industry (technical specifications for aeronautical construction materials), and law (European Union directives on environmental regulations).
Proving that graphs are more accurate
To substantiate the accuracy improvements of graph-enhanced RAG, Lettria conducted a series of benchmarks comparing their GraphRAG solution—a hybrid RAG using both vector and graph stores—with a baseline vector-only RAG reference.
Lettria’s hybrid methodology to RAG
Lettria’s hybrid approach to question answering combines the best of vector similarity and graph searches to optimize performance of RAG applications on complex documents. By integrating these two retrieval systems, Lettria uses both structured precision and semantic flexibility in handling intricate queries.
GraphRAG specializes in using fine-grained, contextual data, ideal for answering questions that require explicit connections between entities. In contrast, vector RAG excels at retrieving semantically relevant information, offering broader contextual insights. This dual system is further reinforced by a fallback mechanism: when one system struggles to provide relevant data, the other compensates. For example, GraphRAG pinpoints explicit relationships when available, whereas vector RAG fills in relational gaps or enhances context when structure is missing.
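
As a schematic illustration of this fallback mechanism (not Lettria's actual implementation), the routing logic might look like the following Python sketch, where graph_search and vector_search are placeholder callables for your graph-store and vector-store lookups:

```python
from typing import Callable

def hybrid_retrieve(
    query: str,
    graph_search: Callable[[str], list[str]],   # your graph-store lookup
    vector_search: Callable[[str], list[str]],  # your vector-store lookup
    min_results: int = 3,
) -> list[str]:
    """Prefer explicit graph relationships; fall back to (and blend in)
    semantic vector hits when the graph returns too little context."""
    graph_hits = graph_search(query)
    if len(graph_hits) < min_results:
        # Graph store lacks explicit relationships: lean on vector recall
        return vector_search(query)
    # Merge both views, keeping graph facts first so they anchor the prompt
    merged, seen = [], set()
    for passage in graph_hits + vector_search(query):
        if passage not in seen:
            seen.add(passage)
            merged.append(passage)
    return merged
```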
The benchmarking process
To demonstrate the value of this hybrid method, Lettria conducted extensive benchmarks across datasets from various industries. Using their solution, they compared GraphRAG’s hybrid pipeline against a leading open source RAG package, Verba by Weaviate, a baseline RAG reference reliant solely on vector stores. The datasets included Amazon financial reports, scientific texts on COVID-19 vaccines, technical specifications from aeronautics, and European environmental directives—providing a diverse and representative test bed.
The evaluation tackled real-world complexity by focusing on six distinct question types, including fact-based, multi-hop, numerical, tabular, temporal, and multi-constraint queries. The questions ranged from simple fact-finding, like identifying vaccine formulas, to multi-layered reasoning tasks, such as comparing revenue figures across different timeframes. An example multi-hop query in finance is “Compare the oldest booked Amazon revenue to the most recent.”
Lettria’s in-house team manually assessed the answers with a detailed evaluation grid, categorizing results as correct, partially correct (acceptable or not), or incorrect. This process measured how the hybrid GraphRAG approach outperformed the baseline, particularly in handling multi-dimensional queries that required combining structured relationships with semantic breadth. By using the strengths of both vector and graph-based retrieval, Lettria’s system demonstrated its ability to navigate the nuanced demands of diverse industries with precision and flexibility.
The benchmarking results
The results were significant and compelling. GraphRAG achieved 80% correct answers, compared to 50.83% with traditional RAG. When including acceptable answers, GraphRAG’s accuracy rose to nearly 90%, whereas the vector approach reached 67.5%.
The following graph shows the results for vector RAG and GraphRAG.

In the industry sector, dealing with complex technical specifications, GraphRAG provided 90.63% correct answers, almost doubling vector RAG’s 46.88%. These figures highlight how GraphRAG offers substantial advantages over the vector-only approach, particularly for clients focused on structuring complex data.
GraphRAG’s overall reliability and superior handling of intricate queries allow customers to make more informed decisions with confidence. By delivering up to 35% more accurate answers, it significantly boosts efficiency and reduces the time spent sifting through unstructured data. These compelling results demonstrate that incorporating graphs into the RAG workflow not only enhances accuracy, but is essential for tackling the complexity of real-world questions.
Using AWS and Lettria for enhanced RAG applications
In this section, we discuss how you can use AWS and Lettria for enhanced RAG applications.
AWS: A robust foundation for generative AI
AWS offers a comprehensive suite of tools and services to build and deploy generative AI applications. With AWS, you have access to scalable infrastructure and advanced services like Amazon Neptune, a fully managed graph database service. Neptune allows you to efficiently model and navigate complex relationships within your data, making it an ideal choice for implementing graph-based RAG systems.
Implementing GraphRAG from scratch usually requires a process similar to the following diagram.

The process can be broken down as follows:

Based on the domain definition, the large language model (LLM) identifies the entities and relationships contained in the unstructured data, which are then stored in a graph database such as Neptune.
At query time, user intent is turned into an efficient graph query based on the domain definition to retrieve the relevant entities and relationships.
Results are then used to augment the prompt and generate a more accurate response compared to standard vector-based RAG.
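
To make these steps concrete, here is a heavily condensed Python sketch of such a pipeline, using Amazon Bedrock for extraction and Neptune's openCypher HTTP endpoint for storage and lookup. The model ID, prompt, graph schema, and endpoint placeholder are illustrative assumptions; a production pipeline (or the GraphRAG Toolkit) adds batching, SigV4-signed requests, and error handling.

```python
import json

import boto3
import requests

bedrock = boto3.client("bedrock-runtime")
NEPTUNE = "https://<your-neptune-endpoint>:8182/openCypher"  # placeholder

def ask_llm(prompt: str) -> str:
    """Call a Claude model on Bedrock and return its text reply."""
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    })
    resp = bedrock.invoke_model(
        modelId="anthropic.claude-3-haiku-20240307-v1:0", body=body)
    return json.loads(resp["body"].read())["content"][0]["text"]

# Step 1: extraction -- ask the LLM for (subject, relation, object) triples
triples = json.loads(ask_llm(
    'Return only JSON like [["subject","relation","object"], ...] for: '
    "Acme Corp acquired Beta Labs in 2023."))

for s, r, o in triples:
    requests.post(NEPTUNE, data={"query": (
        f"MERGE (a:Entity {{name: '{s}'}}) "
        f"MERGE (b:Entity {{name: '{o}'}}) "
        f"MERGE (a)-[:RELATED {{type: '{r}'}}]->(b)")})

# Step 2: query time -- fetch the neighborhood of an entity in the question
rows = requests.post(NEPTUNE, data={"query": (
    "MATCH (a:Entity {name: 'Acme Corp'})-[r:RELATED]->(b) "
    "RETURN a.name AS s, r.type AS rel, b.name AS o")}).json()["results"]

# Step 3: augment the prompt with the retrieved facts
facts = "\n".join(f"{row['s']} {row['rel']} {row['o']}" for row in rows)
print(ask_llm(f"Using these facts:\n{facts}\nWho did Acme Corp acquire?"))
```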

Implementing such a process requires teams to develop specific skills in topics such as graph modeling, graph queries, prompt engineering, and LLM workflow maintenance. AWS released an open source GraphRAG Toolkit to make it simple for customers who want to build and customize their GraphRAG workflows. Expect to iterate on the extraction process and graph lookups to achieve accuracy improvements.
Managed GraphRAG implementations
There are two solutions for managed GraphRAG with AWS: Lettria’s solution, soon available on AWS Marketplace, and Amazon Bedrock integrated GraphRAG support with Neptune. Lettria provides an accessible way to integrate GraphRAG into your applications. By combining Lettria’s expertise in natural language processing (NLP) and graph technology with the scalable and managed AWS infrastructure, you can develop RAG solutions that deliver more accurate and reliable results.
The following are key benefits of Lettria on AWS:

Simple integration – Lettria’s solution simplifies the ingestion and processing of complex datasets
Improved accuracy – You can achieve up to 35% better performance in question-answering tasks
Scalability – You can use scalable AWS services to handle growing data volumes and user demands
Flexibility – The hybrid approach combines the strengths of vector and graph representations

In addition to Lettria’s solution, Amazon Bedrock introduced managed GraphRAG support on December 4, 2024, integrating directly with Neptune. GraphRAG with Neptune is built into Amazon Bedrock Knowledge Bases, offering an integrated experience with no additional setup or additional charges beyond the underlying services. GraphRAG is available in AWS Regions where Amazon Bedrock Knowledge Bases and Amazon Neptune Analytics are both available (see the current list of supported Regions). To learn more, see Retrieve data and generate AI responses with Amazon Bedrock Knowledge Bases.
Conclusion
Data accuracy is a critical concern for enterprises adopting generative AI applications. By incorporating graphs into your RAG workflow, you can significantly enhance the accuracy of your systems. Graphs provide a richer, more nuanced representation of data, capturing the complexity of human queries and preserving context.
GraphRAG is a key option to consider for organizations seeking to unlock the full potential of their data. With the combined power of AWS and Lettria, you can build advanced RAG applications that help meet the demanding needs of today’s data-driven enterprises and achieve up to 35% improvement in accuracy.
Explore how you can implement GraphRAG on AWS in your generative AI application:

Understanding GraphRAG with Lettria
Amazon Bedrock Knowledge Bases now supports GraphRAG (preview)
Using knowledge graphs to build GraphRAG applications with Amazon Bedrock and Amazon Neptune
GraphRAG Toolkit
Create a Neo4j GraphRAG Workflow Using LangChain and LangGraph

About the Authors
Denise Gosnell is a Principal Product Manager for Amazon Neptune, focusing on generative AI infrastructure and graph data applications that enable scalable, cutting-edge solutions across industry verticals.
Vivien de Saint Pern is a Startup Solutions Architect working with AI/ML startups in France, focusing on generative AI workloads.

Email Engagement Metrics: Going Beyond Open Rates

If you’re still judging your email campaigns by open rates, it’s time to upgrade the ole playbook. Sure, they were the gold standard once upon a time, but let’s face it – open rates just don’t cut it anymore.

Why? 

For starters, privacy updates like Apple’s Mail Privacy Protection have thrown a wrench into the works, making open rates unreliable for tracking engagement. Plus, an open doesn’t tell you much. Did they skim your email? Did they take action? Or did they just open it by mistake?

The reality is that click-through rates, post-click behavior, and time on page are where the real insights live. In fact, email campaigns that focus on advanced email engagement metrics see up to 3x higher conversion rates than those relying on opens alone.

This post is all about going deeper. 

We’re talking actionable email engagement data – what your recipients do after they open, how they interact with your content, and how to use these insights to refine your strategy. And with tools like Customers.ai, you’ll learn how to move past the basics and start driving real results. 

Let’s dive in.


What Are Email Engagement Metrics (and Why Do They Matter)?

Email engagement metrics are the key to understanding what your audience actually does with your emails. 

It’s everything that happens after the open – clicks, time spent on your linked pages, form submissions, downloads, etc. It’s the difference between knowing someone glanced at your email and knowing they clicked through, explored your site, and added something to their cart.

Why is this so valuable? 

Because these actions tell you what your audience is thinking! A click on your pricing link? That screams intent. Time spent reading a case study? They’re digging into your value. 

These are the behaviors that show real readiness to buy and they’re way more actionable than an open.

But email engagement goes even deeper than clicks:

Email Replies: Especially for B2B marketers, a reply to your email can be a direct line to a sales conversation.

On-Site Behavior: What do recipients do after clicking your link? Are they browsing specific products or spending time on your pricing page?

Dynamic Content Interactions: Think about email features like polls, interactive buttons, or product carousels. Did they engage? If so, you’ve got another signal to follow.

The key here is that email engagement metrics don’t just tell you if someone opened your email. They show you how they engaged and what they’re most interested in. 

That’s the kind of info that drives smarter campaigns and even bigger results.

The Most Common Email Engagement Metrics

We are all about going above and beyond the basics but when it comes to email marketing, it’s still important to know the tried and true. 

These foundational numbers give you a solid overview of how your campaigns are performing:

1. Open Rates

What It Is: The percentage of recipients who open your email.

Why It Matters: While less reliable due to privacy updates, open rates can still provide a general sense of subject line effectiveness and initial interest.

Pro Tip: Focus more on trends over time than individual email performance.

2. Click-Through Rates (CTR)

What It Is: The percentage of recipients who click on any link in your email.

Why It Matters: It’s a clear indicator of how well your content and CTAs resonate with your audience.

Pro Tip: Compare click-through rates across campaigns to identify what messaging and offers drive the most engagement.

3. Bounce Rates

What It Is: The percentage of emails that couldn’t be delivered, either due to invalid addresses (hard bounces) or temporary issues (soft bounces).

Why It Matters: High bounce rates can hurt your sender reputation and impact overall deliverability.

Pro Tip: Regularly clean your email list to remove inactive or invalid addresses.

4. Unsubscribe Rates

What It Is: The percentage of recipients who opt out of your email list after receiving a specific email.

Why It Matters: A high unsubscribe rate signals issues with your content, frequency, or audience targeting.

Pro Tip: Test different email cadences or content types to reduce opt-outs.

5. Spam Complaint Rates

What It Is: The percentage of recipients who mark your email as spam.

Why It Matters: Spam complaints directly impact your sender reputation and future deliverability.

Pro Tip: Avoid deceptive subject lines and always include a clear, easy-to-find unsubscribe option.

6. Forward/Share Rates

What It Is: The percentage of recipients who share your email with others or forward it to a friend.

Why It Matters: High forward/share rates indicate content that’s valuable and worth spreading.

Pro Tip: Include share buttons or a call-to-action encouraging recipients to pass it along.

These common email engagement metrics provide the baseline for understanding how your campaigns are performing. 

Advanced Email Engagement Metrics to Track (and Why They Matter)

When you move beyond the basic email engagement metrics of open rates and clicks, the good stuff starts to happen.  

Here’s a closer look at key email marketing metrics that reveal more than just surface-level insights:

1. Time on Page: Are They Really Engaged?

Tracking how long recipients spend on your linked pages tells you whether they’re genuinely interested or just browsing.

If they linger on a pricing page or case study, that’s a sign they’re digging deeper.

If they bounce after a few seconds, it might be time to rethink your content or its alignment with their expectations.

Actionable Tip: Follow up with tailored content for those who spend the most time on key pages.

2. Clicks to Conversions: Get that Bread

Tracking how many clicks actually turn into conversions tells you whether your emails are driving real results or just curiosity.

If clickers are completing purchases, sign-ups, or downloads, your message and landing page are working together.

If clicks are high but conversions are low, there’s likely a disconnect between your email’s promise and the page it leads to.

Actionable Tip: Test different CTAs and landing pages to optimize this number.

3. Heatmap Analytics: What’s Grabbing Their Attention?

Heatmaps reveal exactly where your recipients are clicking within your emails.

Are they engaging with your CTA, or are they distracted by a secondary link?

Use this data to refine your layout, placing key elements where they’re most likely to interact.

Actionable Tip: Position your most important CTAs where your audience is already clicking.

4. Reply Rates: Are They Talking Back?

For B2B marketers, reply rates are one of the clearest indicators of engagement. After all, a reply isn’t just interaction, it’s the beginning of a conversation.

Low reply rates? It might be time to adjust your tone or simplify your messaging.

Actionable Tip: Analyze your highest-performing emails and replicate their tone, structure, or offer in future campaigns.

5. Post-Click Behavior: What Happens After the Click?

Post-click behavior is where the real story unfolds. Do recipients stick around and explore your site or do they leave after one page?

A visitor who navigates multiple pages is showing clear interest, while one who exits immediately might indicate a disconnect between your email and your landing page.

Actionable Tip: Use this insight to create follow-up sequences that address where visitors dropped off or encourage them to revisit key pages.

These metrics give you a clearer picture of how your audience actually interacts with your emails and they’re the key to building campaigns that really work. 


How Customers.ai Enhances Your Email Engagement Metrics

Tracking email engagement metrics is one thing. Turning it into action? That’s where Customers.ai takes center stage. 

Here’s how it transforms your email marketing strategy:

1. Real-Time Visitor Identification

Ever wonder who’s clicking your links and landing on your site? Customers.ai connects the dots, identifying anonymous visitors as they arrive.

What It Does: Tracks email recipients and matches them to on-site activity in real time.

Why It’s Powerful: You can immediately see which emails are driving traffic and who’s showing interest, giving you the context to act fast.

2. Behavioral Insights That Go Beyond the Click

Clicks are great but what happens after? Customers.ai customer journey tool digs deeper, showing you exactly what recipients do once they land on your site.

Example Actions Tracked: Viewing your pricing page, browsing products, or adding items to their cart.

Why It’s Powerful: Knowing post-click behavior allows you to respond to what they’ve already shown interest in.

3. Advanced Segmentation and Custom Audiences

Customers.ai helps you segment your audience based on real-time engagement, making your campaigns more precise and impactful.

What It Does: Automatically groups recipients by behaviors like clicks, page views, and cart activity.

Why It’s Powerful: Create hyper-targeted campaigns. Retarget people who clicked a product link but didn’t buy or nurture those who engaged with educational content.

4. Recognizing and Acting on Return Visitors

Repeat visits signal deeper interest and Customers.ai ensures you don’t miss a beat.

What It Does: Identifies return visitors and connects their actions to past behaviors for a complete engagement profile.

Why It’s Powerful: Tailor your messaging to reflect their journey—for example, offering a discount to someone returning to a product page.

5. Cross-Channel Integration for a Full Picture

Customers.ai doesn’t stop at email. It connects engagement data across your entire marketing stack. We’re talking Facebook Ads, Google Ads, direct mail, and more.

What It Does: Combines email clicks with on-site actions, ad engagement, and even retargeting performance.

Why It’s Powerful: This unified view helps you understand your audience’s journey and craft campaigns that hit every touchpoint.

Example in Action: Let’s say a recipient clicks on a product link in your email, browses for five minutes, and then abandons their cart. Customers.ai tracks this journey, flags the visitor as a high-priority lead, and triggers a targeted cart recovery sequence. Later, if they return to your site, Customers.ai updates their profile and recommends additional actions like a personalized email offer.

With Customers.ai, engagement metrics are about more than tracking. Our goal is to help you truly understand your audience, segment them smartly, and craft personalized strategies that convert.

Strategies for Using Your Engagement Metrics to Optimize Campaigns

Now that we have a full understanding of the metrics themselves, we need to talk about how to actually use these metrics. After all, email engagement metrics are worthless on their own. You have to take action. 

Here’s how to use the data to make your email campaigns smarter, more targeted, and more effective.

1. Personalize Follow-Ups Based on Behavior

Not every recipient interacts with your email the same way, so your follow-ups shouldn’t be one-size-fits-all.

If They Clicked But Didn’t Convert: Don’t let them forget about you. Follow up with a reminder or sweeten the deal with a limited-time offer.

Example: “Still thinking about it? Here’s 10% off to make the decision easier.”

If They Clicked and Converted: Time to nurture that relationship. Add them to a loyalty sequence with exclusive perks or helpful content.

Example: “Thanks for your purchase! Here’s how to get the most out of [Product].”

2. Retarget with Precision

Post-click behavior tells you exactly what your recipients care about, so use it to make your retargeting ads laser-focused.

How It Works: If someone clicks a link to a product page but doesn’t buy, show them retargeting ads with the exact product they viewed.

Example: A carousel ad featuring the shoes they added to their cart but didn’t purchase.

Why It Works: Tailored retargeting feels relevant, not pushy, which increases the chance of engagement.

3. Segment by Engagement

Your audience isn’t a monolith. Use engagement metrics to divide and conquer.

High Engagers: These are your VIPs—the people clicking on every email, reading every blog, and exploring your site. Reward their loyalty with exclusive offers or early access to sales.

Example: “You’re one of our top customers—enjoy early access to our new collection!”

Low Engagers: They’re on the fence, and it’s your job to bring them back into the fold. Test fresh subject lines, tweak your timing, or add dynamic content to catch their attention.

Example: “We miss you! Here’s what’s new since your last visit.”

Pro Tip: Combine these strategies for maximum impact. Retarget low engagers with ads, follow up with clickers who didn’t convert, and nurture your top customers with personalized offers. 

Make Your Email Engagement Metrics Work for You

It sounds silly but email engagement metrics are more than just stats. They’re the key to creating smarter campaigns that actually connect with your audience. 

Whether it’s clicks, replies, or post-click behavior, these numbers tell you what’s working and where you need to level up.

But here’s the thing to remember – metrics don’t mean much if you’re not using them. You can have all the data in the world, but if you don’t use it, and use it right, you might as well not have any. 

To make the most of your email marketing data, personalize your follow-ups, retarget with intent, and segment your audience to give them exactly what they need, when they need it. 

That’s how you turn engagement into something that matters.

So, start looking beyond the basics, dig into the data, and make every email count!

Start your free trial of Customers.ai and get 500 emails free!


Important Next Steps

See what targeted outbound marketing is all about. Capture and engage your first 500 website visitor leads with Customers.ai X-Ray website visitor identification for free.

Talk and learn about sales outreach automation with other growth enthusiasts. Join Customers.ai Island, our Facebook group of 40K marketers and entrepreneurs who are ready to support you.

Advance your marketing performance with Sales Outreach School, a free tutorial and training area for sales pros and marketers.

Email Engagement Metric FAQs

What are the most common email marketing metrics?

Common email marketing metrics include open rates, click-through rates (CTR), conversion rates, bounce rates, and unsubscribe rates. These metrics give you a baseline understanding of how well your campaigns are performing. For example, click-through rates show how effectively your email content drives engagement, while unsubscribe rates can signal if your frequency or relevance is off. Monitoring these metrics regularly helps you stay on top of your email marketing performance.

What is the difference between open rates and click-through rates?

Open rates measure the percentage of recipients who opened your email, while click-through rates measure the percentage who clicked on a link within the email. Open rates indicate the effectiveness of your subject lines and sender name, while click-through rates show how engaging your content and CTAs are. Focusing on both metrics gives you a clearer picture of your campaign’s strengths and weaknesses.

How do you calculate conversion rates in email marketing?

Conversion rates measure the percentage of recipients who complete a desired action, such as making a purchase or signing up for a webinar, after clicking on your email. To calculate, divide the number of conversions by the number of delivered emails, then multiply by 100. For example, if 200 recipients converted out of 5,000 delivered emails, your conversion rate would be 4%. High conversion rates often indicate well-targeted and engaging campaigns.
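
As a quick sanity check, here is that arithmetic in Python:

```python
conversions, delivered = 200, 5000
conversion_rate = conversions / delivered * 100  # (conversions / delivered) x 100
print(f"{conversion_rate:.1f}%")  # 4.0%
```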

What is email deliverability, and why does it matter?

Email deliverability is the ability of your emails to reach your recipients’ inboxes instead of landing in spam folders. It’s a critical metric because even the most well-crafted campaigns are useless if they don’t reach their audience. Factors like sender reputation, email content, and list hygiene all impact deliverability. Monitoring and optimizing this metric ensures that your messages reach the people who need to see them.

How does segmentation improve email marketing metrics?

Segmentation involves dividing your email list into smaller groups based on characteristics like behavior, demographics, or purchase history. This strategy improves metrics like open rates, click-through rates, and conversions because your messages are more targeted and relevant. For example, sending a product recommendation email based on past purchases can lead to higher engagement and sales.

What role does A/B testing play in email marketing metrics?

A/B testing allows you to compare two versions of an email to see which performs better. You can test elements like subject lines, CTAs, or email design. For example, one subject line might yield a 20% open rate while another achieves 30%. By consistently A/B testing, you can optimize your campaigns and improve key metrics like open rates and conversions.

How do reply rates benefit email marketing campaigns?

Reply rates measure the percentage of recipients who respond to your email, making them particularly valuable in B2B or outreach campaigns. A high reply rate indicates strong engagement and opens the door for direct communication. To improve reply rates, personalize your messaging and ensure your emails feel conversational rather than promotional.

What is the impact of email frequency on metrics?

Email frequency directly affects metrics like open rates, unsubscribe rates, and engagement. Sending too frequently can lead to higher unsubscribe rates, while infrequent sends might cause your audience to forget about you. Striking the right balance is key—test different frequencies to find what resonates with your audience.

What are forward or share rates in email marketing?

Forward or share rates measure the percentage of recipients who forward your email to others or share it on social platforms. This metric indicates how valuable or engaging your content is to your audience. To encourage shares, include a clear CTA like “Share this with your team!” and ensure your email provides genuine value.

How can bounce rates be reduced in email campaigns?

Bounce rates measure the percentage of emails that couldn’t be delivered. To reduce bounce rates, maintain a clean email list by removing invalid or inactive addresses. Use double opt-ins to ensure valid entries and regularly check your sender reputation. A lower bounce rate means better deliverability and improved campaign performance.

Why are unsubscribe rates important to monitor?

Unsubscribe rates show how many recipients are opting out of your list. A high unsubscribe rate can signal issues with your frequency, content relevance, or targeting. Monitoring this metric helps you address potential problems early and refine your approach to keep your audience engaged.

What is email list hygiene, and how does it impact metrics?

Email list hygiene involves regularly cleaning your email list by removing invalid, inactive, or duplicate addresses. Poor list hygiene can hurt deliverability, increase bounce rates, and damage your sender reputation. A clean list ensures better engagement and helps maintain strong metrics across the board.

How can personalization improve email marketing metrics?

Personalized emails, such as those using a recipient’s name or past behavior, consistently perform better than generic ones. In fact, personalized subject lines can increase open rates by up to 50%. Tailoring content to specific audience segments boosts engagement metrics like click-through and conversion rates.

What is the impact of email timing on engagement metrics?

The timing of your emails can significantly impact metrics like open rates and click-through rates. For example, sending emails on Tuesday mornings might yield better engagement than late Friday afternoons. Use analytics to determine when your audience is most active and schedule sends accordingly.

How does mobile optimization affect email metrics?

With over 50% of emails opened on mobile devices, ensuring your emails are mobile-friendly is critical. Poor mobile design can lead to low engagement, higher bounce rates, and even unsubscribes. Optimize your emails with responsive design, short subject lines, and easily clickable links to improve engagement metrics.

What are click-to-open rates (CTOR) in email marketing?

CTOR measures the percentage of recipients who clicked a link after opening your email. It’s calculated by dividing the number of clicks by the number of opens and multiplying by 100. CTOR provides a clearer picture of how engaging your email content and CTAs are.
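
For example, with 150 clicks on 1,000 opens (illustrative numbers), the calculation looks like this:

```python
clicks, opens = 150, 1000
ctor = clicks / opens * 100  # clicks divided by opens, times 100
print(f"CTOR: {ctor:.1f}%")  # 15.0%
```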

What is the role of heatmaps in email engagement metrics?

Heatmaps visually display where recipients are clicking within your email, helping you identify which elements drive the most interaction. Use heatmaps to refine your email layout and position CTAs or links in high-engagement areas.

How do cart abandonment emails affect engagement metrics?

Cart abandonment emails are automated messages sent to customers who leave items in their cart without purchasing. These emails typically achieve a 45% open rate and a 21% conversion rate, making them highly effective for re-engaging potential buyers and boosting key metrics like CTR and conversions.

What are lifetime engagement metrics?

Lifetime engagement metrics track recipient behavior over time, including opens, clicks, and purchases across multiple campaigns. These metrics help you identify loyal customers, re-engage inactive ones, and fine-tune your long-term email marketing strategy.

How do triggered emails improve engagement metrics?

Triggered emails are automated messages sent based on specific actions, like signing up for a newsletter or completing a purchase. These emails often have 2-3x higher open and click rates than general campaigns. By delivering highly relevant content, they boost key metrics like engagement and conversions.

How can email re-engagement campaigns improve metrics?

Re-engagement campaigns target inactive subscribers to reignite interest. Offer incentives like discounts or exclusive content to bring them back into the fold. Successful re-engagement campaigns can reduce unsubscribe rates and improve overall list health.

What is the relationship between email content length and engagement metrics?

Shorter, concise emails often perform better, as they respect the recipient’s time and make CTAs easy to spot. However, longer emails can work if the content is valuable and engaging. Test different lengths to see what resonates with your audience and improves metrics like click-through rates.

What role do call-to-actions (CTAs) play in email metrics?

CTAs are the driving force behind clicks and conversions. Clear, action-oriented CTAs like “Shop Now” or “Get Started” encourage recipients to take the next step. Strong CTAs significantly improve metrics like click-through and conversion rates.

From Anonymous to Known: Behavioral Data’s Role in Website Visitor Identification

Your website traffic isn’t just numbers. It’s potential customers. Real people are clicking, scrolling, and exploring, yet most of them remain a mystery. 

How are you supposed to convert visitors if you don’t know who they are? And even if you do know who they are, you need more information than just their name and email. 

The answer is behavioral data. 

By tracking not just who your visitors are but what they do on your site (think which pages they visit, how long they stay, the actions they take, etc.), you can start piecing together a clearer picture of who’s behind the clicks and better understand how to reach them.

And that’s what we’re going to do. We’re diving into how behavioral data enhances visitor identification and gives you the information you need to turn traffic into results.


What Behavioral Data Can Tell You About Your Website Visitors

Behavioral data is more than just numbers on a dashboard. It’s the story of what your visitors are doing on your site and, more importantly, why they’re doing it. 

Every click, scroll, and interaction reveals something about their interests, intent, and engagement level.

When you track behavioral data, you’re getting to see what your visitors are actually doing and get a better sense of what they want.

What They Care About: By looking at the pages they visit and the content they engage with, you can start to understand their interests. Are they glued to your product pages? Binging your blog? Their activity tells you what’s catching their eye.

What They Want: High-value actions, like checking out your pricing page, signing up for a demo, or downloading a resource, scream intent. These behaviors are gold for understanding who’s ready to take the next step.

When They Engage: Timing is everything. Behavioral data tracks when visitors come back, how often, and how long they stay. Frequent return visits or extended time on key pages? That’s a signal they’re seriously interested.

And if you’re thinking, “Great, but does it really work?” here’s a stat to back it up: behavioral targeting can boost conversion rates by up to 70%.

Bottom line? Behavioral data is an active tool for understanding your visitors and figuring out how to turn their browsing into a purchase.

How Behavioral Data Supports Visitor Identification

A visitor repeatedly views your Buffalo Bills hats page. They’re not adding it to the cart or signing up for your email discounts but their behavior is screaming high intent. 

Without ever speaking to them, you already know they’re interested in what you’re offering. 

That’s the power of behavioral data. It gives you insights that forms and purchase data alone can’t. By analyzing visitor activity, you can uncover who they are, what they want, and how to engage them effectively. 

Here’s how:

1. Connecting the Dots Through Patterns

Behavioral data reveals patterns in visitor actions, even when they don’t fill out a form.

Example: A visitor who repeatedly views the same product page or spends significant time on your shipping page likely has high purchase intent.

What to Do: Use these patterns to prioritize outreach, retarget with tailored ads, or prompt lead capture CTAs.

2. Enrich Visitor Profiles with External Tools

Combine behavioral data with third-party resources to fill in the blanks.

How It Works: Tools like Customers.ai match profiles with company details or identify repeat visitors through persistent IDs.

What to Do: Add these insights to your CRM to create enriched profiles that help your sales and marketing teams tailor their approach.

3. Predict Visitor Demographics and Intent

AI and machine learning take behavioral data to the next level by predicting key visitor details.

What It Does: Analyzes actions like time spent, pages visited, and engagement levels to determine purchase likelihood or segment placement.

What to Do: Use predictive insights to send personalized follow-ups or prioritize high-intent leads in your sales funnel.

Pro Tip: Don’t just collect behavioral data…act on it. Whether it’s triggering a dynamic pop-up, automating a personalized email, or segmenting visitors in real-time, the value of this data is in how you use it.

By leveraging behavioral data strategically, you’re building a roadmap to convert those visitors into customers much faster.


Tools for Using Behavioral Data to Identify Website Visitors

The right tools make all the difference when it comes to turning behavioral insights into useful data. 

Here’s a breakdown of the top platforms that help you identify anonymous website visitors and understand their actions:

1. Visitor Identification Platforms (Start with Customers.ai)

Customers.ai is built to bridge the gap between anonymous visitors and actionable leads.

What It Does: Combines behavioral data (like page views and time spent) with external databases to de-anonymize visitors.

Why It’s Essential: Identify high-intent visitors in real-time and enrich their profiles with key details, such as company info and contact data.

Bonus: Customers.ai doesn’t just track visitors—it empowers you to take immediate action with tools for outreach and lead nurturing.

2. Behavioral Analytics Tools

Platforms like Google Analytics and Hotjar help you understand what’s happening on your site.

What They Do: Track visitor actions like clicks, navigation paths, and time spent on each page.

Why They Matter: Highlight behavioral trends that indicate intent or areas of friction in the user journey.

Pro Tip: Use heatmaps (via Hotjar) to visualize where visitors are clicking and how they interact with key elements.

3. Dynamic Personalization Engines

Personalization platforms like Optimizely and Dynamic Yield adapt your site content based on visitor behavior.

What They Do: Adjust banners, CTAs, or product recommendations in real-time to match visitor preferences.

Why They Work: Tailored experiences increase engagement, reduce bounce rates, and drive conversions.

Example: A visitor repeatedly browsing women’s shoes sees a homepage banner highlighting trending footwear styles in their size.

Pro Tip: Combine these tools for a complete picture. Use behavioral analytics to understand trends, Customers.ai to de-anonymize visitors, and personalization engines to engage them meaningfully. Together, they create a powerful system for turning visitors into customers.

Strategies for Leveraging Behavioral Data to Identify Visitors

Behavioral data is all about driving your visitors to purchase but it requires a strategic approach. Here are three methods to help you identify and engage visitors effectively:

1. Trigger-Based Identification

Use specific visitor actions as triggers to engage your visitors in real time and drive conversions.

What It Is: Identify key moments in a visitor’s journey, like viewing a pricing page multiple times, hovering over a contact button, or spending extra time on your demo page, and set triggers for engagement.

How to Do It: Tools like Customers.ai can work in coordination with Klaviyo to track these behaviors in real-time and prompt personalized actions, such as automated emails.

Example in Action: A visitor checks out pricing three times but hasn’t taken action. A dynamic pop-up appears with, “Still deciding? Book a quick demo today,” paired with a one-click scheduling link.

Why It Works: Trigger-based actions make your outreach timely and relevant, engaging visitors when their interest is at its peak.

Pro Tip: Experiment with different triggers for various intent levels. For instance, set softer triggers for blog readers and more direct ones for visitors on pricing or checkout pages.
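
As a minimal sketch, a trigger rule table could look like the following Python; the trigger names, messages, and the handle_event helper are all hypothetical, standing in for whatever your engagement tooling exposes:

```python
# Map high-intent actions to engagement responses; softer triggers for
# blog readers, more direct ones for pricing-page visitors.
TRIGGERS = {
    "viewed_pricing_3x": "show_popup:Still deciding? Book a quick demo today.",
    "hovered_contact": "send_email:Have questions? We're here to help.",
    "read_blog_post": "send_email:Here's a related guide you might like.",
}

def handle_event(visitor_id: str, action: str) -> None:
    """Fire the configured engagement response when a trigger matches."""
    response = TRIGGERS.get(action)
    if response:
        channel, message = response.split(":", 1)
        # In production this would call your email/pop-up tooling;
        # here we just log the decision.
        print(f"{visitor_id}: {channel} -> {message}")

handle_event("visitor-42", "viewed_pricing_3x")
```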

2. Behavioral Segmentation

Not all visitors are ready to convert right away. Behavioral segmentation lets you group them based on their actions and intent, so you can deliver tailored experiences.

High Intent: Visitors actively browsing pricing pages, booking demo links, or adding items to their cart. These are your warmest leads and should receive direct CTAs like “Get Started Now” or “Book Your Free Demo.”

Medium Intent: Blog readers, resource downloaders, or those who engage with your educational content. Nurture these leads with targeted follow-ups, like case studies or webinars.

Low Intent: Visitors casually exploring your homepage or clicking around with no clear pattern. These users are ideal for awareness-building campaigns or retargeting ads to keep your brand top of mind.

How to Use It: Segment visitors dynamically using behavioral data so they’re always grouped based on their most recent actions. For example, a casual browser who downloads a resource should immediately shift into your medium-intent funnel.

Pro Tip: Behavioral segmentation ensures your resources are focused where they matter most, prioritizing leads with the highest likelihood to convert.

3. Real-Time Visitor Profiling

Identifying visitors while they’re still active on your site allows you to take action before they leave.

What It Does: Real-time profiling combines on-site behaviors (like navigation patterns, time spent, and clicks) with external data sources to create enriched visitor profiles.

How It Works: Tools like Customers.ai use IP matching, cookies, and behavioral tracking to identify visitors and provide information like company details or purchase readiness.

Example in Action: A visitor spends 5 minutes on your pricing page, views your demo video, and downloads a resource. Based on these actions, your system flags them as a high-intent B2B lead and alerts your sales team for immediate outreach.

Why It Works: Engaging visitors in real time means you’re catching them while their interest is fresh, increasing the likelihood of conversion.

Pro Tip: Pair real-time profiling with live chat or proactive pop-ups to create personalized, on-the-spot engagement.

By layering trigger-based identification, behavioral segmentation, and real-time profiling, you’re actively guiding visitors through their journey and toward a sale. 

Advanced Use Cases for Behavioral Data in Visitor Identification

Taking behavioral data to the next level means using it not just to track, but to act. Here are three advanced ways businesses are leveraging it for visitor identification and engagement:

Abandoned Cart Recovery

A visitor spends 10 minutes browsing your site, adds three items to their cart, and then—poof—they’re gone. 

Behavioral data tracks that exit and tags them for re-engagement and abandoned cart recovery. The follow-up might look like this:

Email: “Forgot something? Complete your order now and enjoy free shipping!”

Retargeting Ad: Show them the exact products they abandoned, with an extra incentive to come back.

By tracking cart behavior, you’re giving those almost-customers a reason to come back and complete their purchase.

On-Site Lead Scoring

Not every visitor is ready to convert, but behavioral data helps you focus on the ones who are closest to taking action.

Here’s how:

Visitors earn points for high-intent actions, like requesting a demo (+10), downloading a resource (+5), or revisiting your pricing page (+8).

Once they hit a certain score, they’re flagged for immediate follow-up by your sales team or entered into a high-touch nurture flow. This approach, sketched below, ensures you’re focusing your efforts on leads with the highest potential, while automated systems handle the rest.
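
To make the mechanics concrete, here is a toy Python sketch of this kind of scoring loop. The point values mirror the examples above; the threshold and the alert hook are illustrative assumptions, not recommendations.

ACTION_POINTS = {"demo_request": 10, "pricing_revisit": 8, "resource_download": 5}
SALES_ALERT_THRESHOLD = 20  # illustrative cutoff for routing a lead to sales

def update_lead_score(score, action, alert=print):
    """alert: hook into your CRM or alerting tool (plain print here)."""
    score += ACTION_POINTS.get(action, 0)
    if score >= SALES_ALERT_THRESHOLD:
        alert(f"High-intent lead at score {score}; route to sales.")
    return score

score = 0
for action in ["resource_download", "pricing_revisit", "demo_request"]:
    score = update_lead_score(score, action)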

Cross-Channel Insights

Behavioral data works best when it’s tied together, giving you a full picture of how visitors interact across your marketing channels.

Imagine this flow:

A visitor clicks a Google Ad, visits your pricing page, and leaves without converting.

A week later, they open an email about your free trial and click through to your site again.

Your system identifies this cross-channel behavior, tagging them as a high-intent lead for immediate outreach. This unified view ensures your team has the full context for every interaction, making your follow-ups smarter and more impactful.

The real power of advanced behavioral data is in how it turns data into opportunities. Whether you’re recovering lost sales, targeting high-value leads, or tying together activity across channels, these strategies help you transform anonymous traffic into real results.

Real-World Wins: Behavioral Data in Action

Seeing how behavioral data works in practice can make all the difference. Here are three examples of businesses using it to drive real results:

1. B2B SaaS: Turning Browsers into Demo Requests

A SaaS platform wanted to connect with anonymous companies visiting their website but leaving without engaging. 

By using a visitor identification platform, they identified which businesses were browsing key pages like pricing and demo info.

The Result: Their sales team prioritized outreach to high-intent companies, increasing demo requests by 40%.

Why It Works: Behavioral data helped the team focus on visitors most likely to convert, saving time and boosting efficiency.

2. Ecommerce: Personalizing the Shopper Journey

An ecommerce retailer used behavioral data to track visitor actions, such as products viewed, items added to wish lists, and abandoned carts. With this data, they created tailored product recommendations and automated follow-up emails.

The Result: Cart completions increased by 25%, driven by personalized nudges like “Still thinking about this item? Here’s 10% off to make it yours.”

Why It Works: Personalization based on real-time behavior meets customers where they are, making the buying experience feel seamless and relevant.

3. Digital Agencies: Accelerating Lead Conversions

A digital marketing agency leveraged behavioral data to uncover high-value leads among their website traffic. 

By tracking visitor engagement, like time spent on case study pages or downloading guides, they flagged prospects showing serious interest. 

Automated alerts sent these leads directly to their sales team for immediate follow-up.

The Result: Conversion times shortened significantly, with a 30% increase in qualified leads moving through the funnel.

Why It Works: Behavioral data ensured their team acted quickly on visitors who were already engaged and ready to talk.

Whether you’re in SaaS, ecommerce, or marketing, these strategies show what’s possible when you truly understand your visitors.

See all of our case studies >>

Using Behavioral Data to Identify Website Visitors

Behavioral data isn’t just about tracking what your visitors do. It’s about understanding who they are and what they need. 

By analyzing their actions and using the right tools, you can turn anonymous clicks into real connections and opportunities.

If you’re ready to take your visitor identification game to the next level, tools like Customers.ai make it easy to combine behavioral insights with advanced technology. From tracking patterns to identifying high-value leads, it’s all about using data to act smarter and faster.

Your visitors are already giving you the clues. 

Their actions are telling you what they’re interested in, where they’re stuck, and when they’re ready to take the next step. 

It’s time to listen and turn that data into results.

Start by putting Customers.ai into action. Begin your free trial and we’ll give you 500 contacts at no cost!


Important Next Steps

See what targeted outbound marketing is all about. Capture and engage your first 500 website visitor leads with Customers.ai X-Ray website visitor identification for free.

Talk and learn about sales outreach automation with other growth enthusiasts. Join Customers.ai Island, our Facebook group of 40K marketers and entrepreneurs who are ready to support you.

Advance your marketing performance with Sales Outreach School, a free tutorial and training area for sales pros and marketers.


OpenAI Announces OpenAI o3: A Measured Advancement in AI Reasoning with 87.5% Score on Arc AGI Benchmarks

On December 20, OpenAI announced OpenAI o3, the latest model in its o-Model Reasoning Series. Building on its predecessors, o3 showcases advancements in mathematical and scientific reasoning, prompting discussions about its capabilities and constraints. This article takes a closer look at the insights and implications surrounding OpenAI o3, weaving in information from official announcements, expert analyses, and community reactions.

Progress in Reasoning Capabilities

OpenAI describes o3 as a model designed to refine reasoning in areas requiring structured thought, such as mathematics and science. The model was tested on the specialized reasoning benchmark ARC-AGI, where it reportedly scored 87.5%, up from the previous model’s 32%. This advancement demonstrates o3’s improved capacity to address complex logical and mathematical problems.

Source: https://arcprize.org/blog/oai-o3-pub-breakthrough

The model’s enhanced abilities stem from an architecture tailored for hierarchical reasoning tasks. While this marks a step toward broader reasoning abilities, OpenAI acknowledges that o3 is far from achieving Artificial General Intelligence (AGI).

Performance Overview

Source: https://x.com/OpenAI/status/1870186518230511844

Mathematics: Achieved a 96.7% success rate on advanced mathematical tests, a notable improvement over o1’s 56.7%.

Scientific Reasoning: Displayed a 10% increase in accuracy for solving PhD-level Science Questions.

Code Understanding: Demonstrated capability in comprehending and debugging code snippets, offering potential utility in software development.

Architectural Innovations

OpenAI o3 employs a hybrid reasoning framework, combining neural-symbolic learning with probabilistic logic. This architecture enables the model to:

Break Down Problems: Simplify complex queries into smaller, manageable components.

Leverage Context: Utilize extended memory to retain context over prolonged interactions.

Iterate Solutions: Refine answers through multiple reasoning cycles.

These features make o3 particularly adept at tackling multi-step reasoning challenges where traditional Transformer-based models often falter.

Real-World Applications

OpenAI o3 could benefit several fields:

Education: Assist students with complex mathematical and scientific problems.

Healthcare: Support diagnostic processes and optimize treatment plans through data analysis.

Software Development: Debug and generate code, providing practical support for developers.

OpenAI’s Broader Vision

OpenAI released a video that illustrates its vision for AI reasoning. The demonstrations include o3 addressing problems in physics, mathematics, and ethical dilemmas, underscoring its aspirations to develop models capable of reasoning across a wide range of scenarios.

As OpenAI posted on X on December 20, 2024: “Today, we shared evals for an early version of the next model in our o-model reasoning series: OpenAI o3.”



Mix-LN: A Hybrid Normalization Technique that Combines the Strengths of both Pre-Layer Normalization and Post-Layer Normalization

Large language models (LLMs) are highly promising for artificial intelligence. However, despite being trained on large datasets covering many languages and topics, their ability to understand and generate text can be overstated, and LLM applications across multiple domains have had less impact on human-computer interaction than hoped. One reason is that the deep layers of an LLM often contribute little: if removed, overall performance barely changes. This underutilization of deep layers reveals an inefficiency within these models.

Prior work showed that the deeper layers of LLMs contribute little to their performance. Although used to stabilize training, normalization techniques such as Pre-LN and Post-LN show significant limitations: Pre-LN reduces the magnitude of gradients in deeper layers, limiting their effectiveness, while Post-LN causes gradients to vanish in earlier layers. Despite efforts to address these issues through dynamic linear combinations and Adaptive Model Initialization, these techniques do not fully optimize LLM performance.

To address this issue, researchers from the Dalian University of Technology, the University of Surrey, the Eindhoven University of Technology, and the University of Oxford proposed Mix-LN, a normalization technique that combines the strengths of Pre-LN and Post-LN within the same model. Mix-LN applies Post-LN to the earlier layers and Pre-LN to the deeper layers to ensure more uniform gradients, allowing both shallow and deep layers to contribute effectively to training. The researchers evaluated the hypothesis that deeper layers in LLMs were inefficient due to Pre-LN. The main difference between Post-LN and Pre-LN architectures is the placement of layer normalization (LN): in Post-LN, LN is applied after the residual addition, while in Pre-LN it is applied before.
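
The following is a minimal PyTorch sketch of this idea, with simplified block internals; it is not the authors' code. Here alpha is the Post-LN ratio, with 0.25 matching the value reported later in this article.

import torch.nn as nn

class MixLNBlock(nn.Module):
    def __init__(self, d_model, n_heads, use_post_ln):
        super().__init__()
        self.use_post_ln = use_post_ln
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                nn.Linear(4 * d_model, d_model))
        self.ln1 = nn.LayerNorm(d_model)
        self.ln2 = nn.LayerNorm(d_model)

    def forward(self, x):
        if self.use_post_ln:  # Post-LN: normalize after each residual addition
            x = self.ln1(x + self.attn(x, x, x, need_weights=False)[0])
            x = self.ln2(x + self.ff(x))
        else:                 # Pre-LN: normalize before each sub-layer
            h = self.ln1(x)
            x = x + self.attn(h, h, h, need_weights=False)[0]
            x = x + self.ff(self.ln2(x))
        return x

def build_mix_ln_stack(n_layers, d_model=512, n_heads=8, alpha=0.25):
    # The first alpha * n_layers blocks use Post-LN; the rest use Pre-LN.
    n_post = int(alpha * n_layers)
    return nn.ModuleList(
        MixLNBlock(d_model, n_heads, use_post_ln=(i < n_post))
        for i in range(n_layers)
    )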

Researchers compared Pre-LN and Post-LN models in large-scale open-weight and small-scale in-house LLMs, using metrics such as angular distance and performance drop to assess layer effectiveness. In BERT-Large (Post-LN), early layers were less effective than deeper layers. In LLaMA2-7B (Pre-LN), deeper layers were less effective, and pruning them had minimal performance impact. Similar trends appeared in LLaMA-130M, where Pre-LN layers were less effective at deeper levels and Post-LN maintained better performance in deeper layers. These results suggested that Pre-LN caused the inefficiency of deeper layers.

The optimal Post-LN ratio α for Mix-LN was determined through experiments with LLaMA-1B on the C4 dataset. The best performance occurred at α = 0.25, where perplexity was lowest; other ratios performed worse but still exceeded pure Pre-LN. Mix-LN also supported a broader range of representations and maintained a healthier gradient norm, enabling deeper layers to contribute effectively. Overall, Mix-LN achieved significantly lower perplexity scores, outperforming the other normalization methods.

In conclusion, the researchers identified inefficiencies caused by Pre-LN in deep layers of large language models (LLMs) and proposed Mix-LN as a solution. Experiments showed that Mix-LN outperformed both Pre-LN and Post-LN, improving model performance during pre-training and fine-tuning without increasing model size. This approach can act as a baseline for future research, offering a foundation for further enhancements in training deep models and advancing model efficiency and capacity.

Check out the Paper. All credit for this research goes to the researchers of this project.


Researchers from ETH Zurich and UC Berkeley Introduce MaxInfoRL: A New Reinforcement Learning Framework for Balancing Intrinsic and Extrinsic Exploration

Reinforcement learning, despite its popularity in a variety of fields, faces some fundamental difficulties that keep users from exploiting its full potential. To begin with, widely used algorithms such as PPO suffer from sample inefficiency, needing many episodes to learn basic behaviors. Off-policy methods such as SAC and DrQ offer some immunity against this problem: they are applicable in the real world and compute-efficient, but they have their own drawbacks. Off-policy methods often require dense reward signals, so their performance degrades when rewards are sparse or when training gets stuck in local optima. This suboptimality can be attributed to naive exploration schemes such as ε-greedy and Boltzmann exploration, whose scalability and simplicity are appealing enough for users to accept the trade-off with optimality.

Intrinsic exploration has recently shown great potential in this regard, where reward signals such as information gain and curiosity improve the exploration of RL agents. Approaches to maximizing information gain show great theoretical potential and have even achieved empirical state-of-the-art (SOTA). While this approach appears promising in theory, a gap exists in balancing intrinsic and naive extrinsic exploration objectives. This article discusses the latest research that claims to find a balance between intrinsic and extrinsic exploration in practice.

Researchers from ETH Zurich and UC Berkeley have put forth MAXINFORL, which improves on naive exploration techniques and aligns them, theoretically and practically, with intrinsic rewards. MAXINFORL is a novel class of off-policy, model-free algorithms for continuous state-action spaces that augments existing RL methods with directed exploration. It takes the standard Boltzmann exploration technique and enhances it with an intrinsic reward. The authors propose a practical auto-tuning procedure that simplifies the trade-off between exploration and rewards. The resulting algorithms explore by visiting trajectories that achieve maximum information gain while still solving the task efficiently. The authors also show that the proposed algorithms retain the theoretical contraction and convergence properties that hold for other max-entropy RL algorithms, such as SAC.

Let us briefly review intrinsic rewards, specifically information gain, to get the fundamentals right. They enable RL agents to acquire information in a more principled manner by directing agents toward underexplored regions. In MAXINFORL, the authors use intrinsic rewards to guide exploration so that, instead of random sampling, the exploration is informed and covers the state-action space efficiently. For this, the authors modify ε-greedy selection to learn optimal Q-values for extrinsic and intrinsic rewards, which determine the action to be taken; ε–MAXINFORL thus augments the Boltzmann exploration strategy. The augmented policy presents a trade-off between value function maximization and the entropy of states, rewards, and actions. MAXINFORL introduces two exploration bonuses in this augmentation: policy entropy and information gain. Additionally, under this strategy, the Q-function and policy update rules converge to an optimal policy.
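
Here is a toy Python sketch of that action-selection idea, under our own assumptions rather than the paper's implementation: actions are sampled from a softmax over extrinsic Q-values plus a weighted intrinsic bonus.

import numpy as np

def boltzmann_with_intrinsic(q_extrinsic, q_intrinsic, temperature=1.0, beta=0.5):
    """q_extrinsic, q_intrinsic: per-action value estimates for the current state.
    beta trades off task reward against the exploration bonus (auto-tuned in the paper)."""
    logits = (q_extrinsic + beta * q_intrinsic) / temperature
    probs = np.exp(logits - logits.max())   # subtract max for numerical stability
    probs /= probs.sum()
    return np.random.choice(len(probs), p=probs)

# Example: the intrinsic bonus steers sampling toward underexplored action 3.
action = boltzmann_with_intrinsic(np.array([1.0, 0.5, 0.2, 0.1]),
                                  np.array([0.0, 0.1, 0.2, 1.5]))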

The research team evaluated MAXINFORL with Boltzmann exploration across several deep RL benchmarks on state-based and visual control tasks. SAC was used for state-based tasks; for visual control tasks, the authors combined the algorithm with DrQ. They compared MAXINFORL against various baselines across tasks of different dimensionality and observed that the SAC-based MAXINFORL variant performed consistently across all tasks, while the baselines struggled to maintain comparable performance. Even in environments requiring complex exploration, MAXINFORL achieved the best performance. The paper also compared SAC with and without MAXINFORL and found a stark improvement in learning speed. For visual tasks, MAXINFORL likewise achieved substantial gains in performance and sample efficiency.

Conclusion: The researchers presented MAXINFORL, a family of algorithms that augments naive extrinsic exploration techniques with intrinsic rewards by targeting high entropy across states, rewards, and actions. In a variety of benchmark tasks involving state-based and visual control, it outperformed off-policy baselines. However, because it requires training several models, it carries a computational overhead.

Check out the Paper. All credit for this research goes to the researchers of this project.


Apple Researchers Introduce ARMADA: An AI System for Augmenting Apple Vision Pro with Real-Time Virtual Robot Feedback

Imitation learning (IL) is a robotics method in which robots are trained to mimic human actions based on expert demonstrations. It relies on supervised machine learning and requires significant human-generated data to guide the robot’s behavior. Although effective for complex tasks, imitation learning is limited by the lack of large-scale datasets and by challenges in scaling data collection, unlike language and vision models. Learning from human video demonstrations is especially hard because robots cannot match the sensitivity and flexibility of human hands, and these differences make it difficult for imitation learning to work effectively or scale up to general robot tasks.

Traditional imitation learning relied on human-operated robots, which were effective but faced significant limitations. These systems are based on teleoperation via gloves, motion capture, and VR devices, and they depend on complex setups and a low-latency control loop. They also require physical robots and special-purpose hardware, which is difficult to scale. Although robots could perform tasks such as inserting batteries or tying shoelaces using expert data collected with these approaches, the need for special equipment made them impractical for large-scale or more general use.

To solve this, a group of researchers from Apple and the University of Colorado Boulder proposed the ARMADA system, which integrates the Apple Vision Pro headset with external robot control using a combination of ROS and WebSockets. This setup enabled communication between the devices, making the system plug-and-play and adaptable to many robot platforms, such as Franka and UR5, by replacing only the 3D model files and the data formatting for the headset. The ARMADA app handled robot visualization, data storage, and the user interface, receiving transformation frames for robot links, capturing image frames from cameras, and tracking human skeleton data for processing. The robot node managed control, data storage, and constraint calculation, transforming skeletal data into robot commands and detecting workspace violations, singularities, and speed issues for real-time feedback.

The robot’s movements were aligned with human wrist and finger positions, tracked through ARKit on visionOS 2.0, using inverse kinematics to calculate joint positions and controlling a gripper based on finger spacing. Constraints such as singularities, workspace limits, and speed violations were visualized through color changes, virtual boundaries, or on-screen text. Researchers used the ARMADA system to perform three tasks: picking a tissue from a box, placing a toy into a cardboard box, and wiping a table with both hands. Each task had five starting states, and success was based on specific criteria. Wearing an Apple Vision Pro running the ARMADA software on visionOS 2.0, participants provided 45 demonstrations under three feedback conditions: No Feedback, Feedback, and Post Feedback. Wrist and finger movements were tracked in real time using ARKit, and robot movements were controlled via inverse kinematics, with joint trajectories recorded for replay.
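
The core mapping can be sketched as follows. This is a hypothetical illustration, not the ARMADA code: solve_ik stands in for whatever inverse kinematics solver the robot stack provides, and the gripper width constant is an assumption.

import numpy as np

GRIPPER_MAX_WIDTH = 0.08  # meters; assumed limit for a Franka-style gripper

def hand_to_robot_command(wrist_pose, thumb_tip, index_tip, solve_ik):
    """wrist_pose: 4x4 transform of the tracked wrist.
    thumb_tip, index_tip: 3D fingertip positions from the headset's hand tracking.
    solve_ik: placeholder for the robot stack's inverse kinematics solver."""
    joint_targets = solve_ik(wrist_pose)           # joint angles that reach the wrist pose
    pinch = np.linalg.norm(thumb_tip - index_tip)  # finger spacing drives the gripper
    gripper_width = float(np.clip(pinch, 0.0, GRIPPER_MAX_WIDTH))
    return joint_targets, gripper_width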

Upon evaluation, the results showed that feedback visualization significantly improved replay success rates for tasks like Pick Tissue, Declutter, and Bimanual Wipe, with gains of up to 85% compared to no feedback. Post-feedback demonstrations also showed improvements but were less effective than real-time feedback. Participants found the feedback intuitive and useful for understanding robot motion, and the system worked well for users with varying experience levels. Common failure modes without feedback included imprecise robot poses and gripper issues. Participants adjusted their behavior during demonstrations, slowing down and changing hand positions, and could visualize feedback after removing it.

In summary, the proposed ARMADA system addressed the challenge of scalable data collection for robot imitation learning by using augmented reality for real-time feedback to improve data quality and compatibility with physical robots. The results showed the importance of feedback for aligning robot-free demonstrations with real robot kinematics. While the study focused on simpler tasks, future research can explore more complex ones and refine techniques. This system can serve as a baseline for future robotics research, particularly in training robot control policies through imitation learning with visual observations.

Check out the Paper. All credit for this research goes to the researchers of this project.


Meet FineFineWeb: An Open-Sourced Automatic Classification System for Fine-Grained Web Data

Multimodal Art Projection (M-A-P) researchers have introduced FineFineWeb, a large open-source automatic classification system for fine-grained web data. The project decomposes the deduplicated FineWeb into 67 unique categories with extensive seed data. Moreover, a comprehensive correlation analysis between vertical categories and common benchmarks, as well as a detailed URL and content distribution analysis, is conducted. The system provides specialized test sets for perplexity (PPL) evaluation, featuring both “small cup” validation and “medium cup” test options. Complete training materials for FastText and BERT implementations accompany the dataset, with upcoming suggestions for data proportioning based on the RegMix methodology.

The data construction process for FineFineWeb follows a systematic multi-step workflow. The initial deduplication of FineWeb employs exact deduplication and MinHash techniques. URL labeling utilizes GPT-4 to process the top million root URLs, categorizing them into Domain-of-Interest (DoI) and Domain-of-Non-Interest (DoNI) URLs. Further, the coarse recall phase involves domain-specific sampling based on the labeled root URLs, with Qwen2-7B-Instruct handling the labeling of 500K positive and negative data points. FastText models, trained on this labeled data, perform coarse recall operations across FineWeb to generate Coarse DoI Data.
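
As an illustration of the coarse-recall step (our own sketch, not the project's code), a fastText classifier trained on LLM-labeled seed data can score documents and keep likely Domain-of-Interest text. The file name, label names, and threshold here are assumptions.

import fasttext

# train.txt lines look like: "__label__doi <document text>" or "__label__non_doi ..."
model = fasttext.train_supervised(input="train.txt", epoch=5, wordNgrams=2)

def coarse_recall(documents, threshold=0.9):
    kept = []
    for doc in documents:
        labels, probs = model.predict(doc.replace("\n", " "))
        if labels[0] == "__label__doi" and probs[0] >= threshold:
            kept.append(doc)
    return kept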

The fine recall stage advances the data refinement process using Qwen2-72B-Instruct to label the Coarse DoI Data, creating 100K DoI positive and 100K DoI negative data points. After that, a BERT model, trained on this labeled data, performs fine recall to produce the final DoI subset of FineFineWeb. Moreover, the entire coarse-fine recall iteration undergoes three rounds with specific modifications:

FastText is re-trained using updated seed data, which combines BERT-recalled samples, BERT-dropped samples, and previously labeled seed data.

The BERT model remains frozen during subsequent iterations.

Steps for training FastText, coarse recall, and fine recall are repeated without re-labeling data with Qwen2-Instruct models.

The domain-domain similarity analysis employs a sophisticated analytical approach using proportional weighted sampling across domain subsets, processing one billion tokens from the domain subsets. The BGE-M3 model is then used to generate two types of embeddings: domain embeddings from domain subset samples and benchmark embeddings from benchmark samples. The analysis concludes by calculating MMD and Wasserstein distances between domain embeddings and benchmark embeddings to quantify domain relationships.
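
As a minimal stand-in for that last step (not the project's exact method), a linear-kernel MMD between two embedding sets reduces to the squared distance between their mean embeddings:

import numpy as np

def linear_mmd(domain_emb, benchmark_emb):
    """domain_emb: (n, d) array; benchmark_emb: (m, d) array of embeddings."""
    mean_gap = domain_emb.mean(axis=0) - benchmark_emb.mean(axis=0)
    return float(mean_gap @ mean_gap)  # squared norm of the mean difference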

The similarity analysis reveals several key patterns in domain-benchmark relationships. Code-related benchmarks (MBPP and HumanEval) show significant distance from most domains except mathematics, indicating limited code representation in the dataset. General knowledge benchmarks (Hellaswag, ARC, MMLU, BoolQ) demonstrate close relationships with multiple domains, suggesting broad knowledge distribution, while excluding gambling content. Moreover, GSM8K and TriviaQA exhibit notable domain-specific variations, particularly in mathematics and factual content. Lastly, the gambling domain stands distinctly separate, showing minimal overlap with other domains and benchmarks.

The domain-domain duplication analysis examines URL uniqueness across domains using TF-IDF values. High TF-IDF scores indicate domain-specific unique URLs, while low values suggest common URLs across domains. The analysis reveals minimal duplication across most domains, with exceptions in topicality, pet, and atmospheric science categories. The domain-benchmark correlation study, conducted across 28 models, compares domain-specific performance (BPC) rankings with benchmark performance rankings using Spearman correlation. STEM-related domains show stronger correlations with reasoning-focused benchmarks (ARC, MMLU, GSM8K, HumanEval, MBPP), while knowledge-intensive domains like literature and history correlate higher with fact-based benchmarks like TriviaQA.

Check out the Dataset and Tweet. All credit for this research goes to the researchers of this project.


This AI Paper from aiXplain Introduces Bel Esprit: A Multi-Agent Framework for Building Accurate and Adaptive AI Model Pipelines

Artificial intelligence has progressed from handling atomic tasks to addressing intricate, real-world problems requiring the integration of multiple specialized models. This approach, known as AI pipelines, allows for seamless task transitions by connecting different models to process diverse data inputs and outputs. These pipelines enable complex applications like multilingual video dubbing, multimodal content moderation, and advanced speech translation. The growing sophistication of AI pipelines reflects the increasing need for automated solutions that simplify and streamline challenging computational tasks in various domains.

Addressing complex computational challenges requires coordinating multiple models to handle different aspects of a problem. Current solutions often fall short when faced with ambiguous user requirements, poorly defined task parameters, and mismatched data modalities. For instance, computational tasks like multilingual dubbing demand careful alignment of inputs and outputs, such as matching audio transcription to translation models and text-to-speech synthesis. Such complexities make manual intervention necessary, slowing progress and leading to inefficiencies.

Existing methods for building AI pipelines often rely on static frameworks and predefined models tailored to specific tasks. While these approaches can handle isolated problems effectively, they lack adaptability. Manual adjustments are frequently required to address missing information, ensure semantic alignment, or resolve errors arising from mismatched modalities. Moreover, the rigidity of current systems limits their ability to cater to diverse user queries, leaving significant room for improvement in both flexibility and accuracy.

Researchers from aiXplain, Inc. and Los Gatos introduced a novel AI framework called Bel Esprit to overcome these challenges. This multi-agent system facilitates building customizable AI model pipelines tailored to user needs. Bel Esprit features specialized subagents, including Mentalist for clarifying user queries, Builder for pipeline assembly, and Inspector for error detection and correction. By employing a collaborative and iterative approach, the framework ensures pipelines are accurate and aligned with user intent. The system is designed to work dynamically, refining user inputs and optimizing the models chosen for specific tasks.

Bel Esprit is a graph-based framework with nodes representing AI functions and edges representing data flows. The Mentalist subagent begins by analyzing user queries to clarify ambiguous details, converting them into comprehensive task specifications. Builder then constructs an initial pipeline, breaking the task into manageable subgraphs; for example, distinct branches are created for each language in a multilingual dubbing task. Inspector reviews the pipeline for structural and semantic errors, ensuring alignment with the refined user requirements. This iterative process leverages techniques like chain-of-branches, where smaller subgraphs are built sequentially, facilitating model reuse and minimizing errors. Further, Bel Esprit integrates advanced large language models (LLMs) to automate reasoning and ensure seamless task execution.
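
Here is a toy sketch of that graph abstraction in Python. It is our own assumption of the data structure, not Bel Esprit's API; node names and functions are illustrative.

from dataclasses import dataclass, field

@dataclass
class Node:
    name: str                                    # e.g. "asr", "translate_fr"
    function: str                                # the AI capability this node performs
    inputs: list = field(default_factory=list)   # upstream node names (the graph edges)

def build_dubbing_pipeline(languages):
    """One ASR node is reused across per-language branches, as described above."""
    nodes = [Node("asr", "speech-to-text")]
    for lang in languages:
        nodes.append(Node(f"translate_{lang}", "translation", inputs=["asr"]))
        nodes.append(Node(f"tts_{lang}", "text-to-speech", inputs=[f"translate_{lang}"]))
    return nodes

pipeline = build_dubbing_pipeline(["fr", "de"])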

The performance of Bel Esprit demonstrates its significant potential for transforming pipeline construction, measured with exact match (EM) and graph edit distance (GED) metrics. The overall EM rate increased by 9.5%, indicating a higher rate of perfectly constructed pipelines, and GED errors decreased by 28.1%, showing reduced discrepancies between generated and reference pipelines. For instance, when applied to multilingual video dubbing, Bel Esprit optimized workflows by reusing AI nodes, such as automatic speech recognition (ASR) models, across branches for different languages, leading to a streamlined construction process with fewer errors. Bel Esprit also handled ambiguous user queries effectively, with performance gains more pronounced when user input lacked clarity. The system’s iterative process ensured alignment with user intent, even in highly complex scenarios.

Bel Esprit significantly advances AI pipeline construction, addressing key ambiguity issues and error-prone assembly processes. Its innovative multi-agent collaboration, iterative refinement, and state-of-the-art models make it a robust solution for complex computational tasks. Bel Esprit sets a new benchmark for adaptability and precision in the field by automating critical stages of pipeline building and ensuring semantic accuracy. The framework’s demonstrated ability to improve efficiency and handle complex queries underscores its potential as a transformative tool in AI applications.

Check out the Paper. All credit for this research goes to the researchers of this project.


Add a generative AI experience to your website or web application with Amazon Q embedded

Generative AI offers many benefits for both you, as a software provider, and your end-users. AI assistants can help users generate insights, get help, and find information that may be hard to surface using traditional means. In addition, they can help your employees reduce repetitive tasks and focus on high-value work. However, adding generative AI assistants to your website or web application requires significant domain knowledge and the technical expertise to build, deploy, and maintain the infrastructure and end-user experience. These challenges fall outside of some software providers’ core domain, creating barriers to offering AI assistants to users.
Amazon Q Business is a generative AI assistant that can answer questions, provide summaries, generate content, and securely complete tasks based on data and information in your enterprise systems. Amazon Q Business securely unites disparate data with over 40 built-in connectors to popular enterprise applications, document repositories, chat applications, and knowledge management systems. You can use natural language to request information or assistance to generate content. Amazon Q Business handles the complexity of deploying and maintaining the infrastructure required for generative AI assistants so you can focus on creating a delightful end-user experience.
Amazon Q embedded is a feature that lets you embed a hosted Amazon Q Business assistant on your website or application to create more personalized experiences that boost end-users’ productivity. You can configure the assistant with guardrails to define global and topic-level controls for your environment. With an embedded Amazon Q Business assistant, end-users can receive immediate, permission-aware responses from your data sources, with citations.
In this post, we demonstrate how to use the Amazon Q embedded feature to add an Amazon Q Business assistant to your website or web application using basic HTML or React. We also show you how to use the feature with content management systems like WordPress and Drupal. This post includes a sample webpage for Amazon Q Business that allows you to quickly test and demonstrate your AI assistant. This allows you to develop the changes on your website or application in parallel while refining your Amazon Q Business configurations.
Solution overview
Embedding Amazon Q Business gives your users access to a generative AI assistant without leaving your website or web application. Integrating the assistant involves creating an Amazon Q Business application, adding users or groups, connecting relevant data sources, allowlisting your domain, and finally adding an HTML inline frame (iframe) element to your website or web application.
Prerequisites
In this section, we walk through how to set up an Amazon Q Business application, permissions, and user access.
Amazon Q Business application
The Amazon Q embedded feature requires an Amazon Q Business application. If you don’t have an existing application, you can create an application integrated with AWS IAM Identity Center or AWS Identity and Access Management (IAM) identity federation. Refer to Configuring an Amazon Q Business application using AWS IAM Identity Center, or Creating an Amazon Q Business application using Identity Federation through IAM if you need to make a new application.
Permissions
Configuring the Amazon Q embedded feature requires IAM permissions that allow you to use and manage Amazon Q Business. Your permission policy must at least allow the Amazon Q Business CreateWebExperience and UpdateWebExperience actions:

"Action": "qbusiness:CreateWebExperience",
"Action": "qbusiness:UpdateWebExperience",

When creating the IAM permission policy, the IAM visual policy creator is a great way to see the options available. Following a least-privilege approach, you can restrict the resources to which the permission grants access: a specific AWS Region, account ID, application ID, and web experience ID.

"Resource": "arn:aws:qbusiness:us-east-1:123456789012:application/<replace-with-id>"
"Resource": "arn:aws:qbusiness:us-east-1:123456789012:application/<replace-with-id>/web-experience/<replace-with-id>"

You can find your application ID on the Amazon Q Business console under Application settings or from the list-applications command in the AWS Command Line Interface (AWS CLI). You can find your web experience ID with the list-web-experiences AWS CLI command. For example:

aws qbusiness list-applications
aws qbusiness list-web-experiences --application-id a1b2c3d4-5678-90ab-cdef-EXAMPLE11111

User access
Amazon Q Business requires authentication before users can engage with the assistant. If you use AWS IAM Identity Center, you can grant users access to the assistant by adding the users or groups to your Amazon Q Business application. If you use IAM identity federation, Amazon Q Business automatically subscribes users to the subscription type you select when you create the application. For more information on managing users, refer to Managing user subscriptions for IAM Identity Center-integrated applications, or see Updating and cancelling user subscriptions for applications using IAM Federation.
Allowlisting your website or web application
To embed Amazon Q Business on your website or web application, you must first allowlist your domain. This restricts your assistant to only sites you trust and stops others from embedding your assistant. You can add multiple domains for different services or development instances used for testing. Complete the following steps:

Open the Amazon Q Business console.
Next, select your Amazon Q Business application.
From the menu, choose Amazon Q embedded under the Enhancements section, then choose Add allowed website.
For Enter website URL, enter the base URL of the website or web application you want to allowlist for Amazon Q Business, for example https://www.example.com (trailing / not required), and choose Add.

Amazon Q Business hosts the web experience on an AWS domain. To find the URL, navigate to the main page of your Amazon Q Business application and copy the value for Deployed URL, for example https://1234abcdef5678.chat.qbusiness.example.on.aws/, in the Web experience settings section. Now you can embed this assistant into the website or web application hosted at the domain you allowlisted.
Customizing the user experience
You can customize the user experience look and feel for your organization. Customization options include the assistant title, subtitle, welcome message, font, color, and logo. You can also enable sample prompts. Refer to Customizing an Amazon Q Business web experience to see the available customization options.
The following screenshots show the default Amazon Q Business user experience (left) and an Amazon Q Business user experience with a custom title, subtitle, and welcome message (right).

Add Amazon Q Business to your website or web application
Before continuing, make sure you have allowlisted your domain as described earlier in this post.
You can choose from the following embedding options:

Using an HTML iframe element
Using a React component
Using a content management system

Embed Amazon Q Business using an HTML iframe element
You can embed Amazon Q Business on your website or web application using an iframe element, which is an HTML element that you can use to insert another HTML page into the current one. Other embedding options build upon this foundational HTML element. The following is a sample iframe element:

<iframe src="https://1234abcdef5678.chat.qbusiness.example.on.aws/"></iframe>

You can customize the iframe element with various attributes such as the width, height, and title. Setting the Amazon Q Business deployed URL as the value for the src attribute will display the Amazon Q Business web experience within the iframe. The following code shows an example iframe element with the id, title, width, height, and src attributes set to example values:

<iframe
id="inlineFrameExample"
title="Inline Frame Example"
width="600"
height="650"
src="https://1234abcdef5678.chat.qbusiness.example.on.aws/">
</iframe>

Refer to <iframe>: The Inline Frame element to learn more about the iframe element.
Embed Amazon Q Business using a React component
You can embed Amazon Q Business on your website or web application using a React component. React components offer more customizations and modularity than a standard iframe. In this post, we’ve included a sample React component that wraps an iframe element and adds abilities such as an expanding and collapsing chat interface and showing a loading spinner when the page first loads.
To use this React component, download the sample code from the Embed GenAI chat into React GitHub repo and add it to your React source code. Then you can import the component into your website or web application and add the Chat element with at least the embedUrl attribute set to the deployed URL of your Amazon Q Business application. The following example code shows the options of the sample React component:

import Chat from "../components/embed";

<Chat
embedUrl="https://1234abcdef5678.chat.qbusiness.example.on.aws/"
embedWidth={600} // Optional
embedHeight={650} // Optional
embedOffsetRightPc={5} // Optional
headerText="Chat" // Optional
headerInfo="Chat with us" // Optional
/>

Embed Amazon Q Business using a content management system
You can embed Amazon Q Business on a website published by a content management system that allows you to add HTML elements to the content. We’ve included examples for WordPress and Drupal, both of which you can deploy with Amazon Lightsail.
Embedding on a WordPress site
To embed Amazon Q Business on your WordPress site, first access the WordPress admin page. Optionally, add a block group wrapper to constrain iframe sizing with the values of your choosing. For example, you could set the layout content height to 650px, width to 620px, a width of 100% in the iframe to fill the container, and select a full-size block item. Finally, add a custom HTML block and insert the iframe code. The following code is a sample iframe element:

<iframe
id="inlineFrameExample"
title="Inline Frame Example"
width="100%"
height="650"
src="https://021345abcdef.chat.qbusiness.example.on.aws/">
</iframe>

The following screenshot shows an example of adding a block to a WordPress site.

The following screenshot shows an example of adding an iframe to the block.

The following screenshot shows an example of Amazon Q Business in a WordPress site.

Embedding on a Drupal site
To embed Amazon Q Business on your Drupal site, complete the following steps:

Open the Drupal admin page.
Choose Content, Blocks, and Add content block.
Give your content block a description and change the text format to HTML.
Choose the Source view so you can edit the block’s HTML directly.
Add your iframe to the Body section of the block, then choose Save and configure.
When configuring your content block, the visibility options are optional and can be left with the default values.
Choose a Region to display this block, such as Content Above or Sidebar, then choose Save block.

The following screenshot shows an example of Amazon Q Business embedded with the Content Above option.

The following screenshot shows an example of Amazon Q Business embedded with the Sidebar option.

Sample website
To help you get started embedding Amazon Q Business, we have included a sample website that you can deploy on AWS Amplify with an AWS CloudFormation stack. The sample website contains an HTML iframe element with your Amazon Q Business assistant. To use the website, complete the following steps:

First collect your Amazon Q Business application ID and make a note. You can find your application ID on the Amazon Q Business console as described earlier in this post.
Download our YAML sample CloudFormation template to your workstation.
Deploy the stack either using the AWS CloudFormation console or using the AWS CLI.
After uploading the sample CloudFormation template, enter a stack name, a web page name, and your Amazon Q Business application ID in the Application ID input field.
You can leave all other settings at their default values.
After the stack fully deploys, navigate to the Outputs tab on the AWS CloudFormation console and copy the Amplify URL.
Return to the Amazon Q Business console, select your Amazon Q Business application, and choose Amazon Q Embedded to add your Amplify URL to the Allowed websites list as described earlier in this post.
Navigate to your Amplify URL in your web browser to see your sample website with Amazon Q Business. You may need to Sign in to Q Business.

Clean up
To avoid future charges from Amplify, delete the resources you created in the preceding sample website walkthrough.

On the CloudFormation console, in the navigation pane, choose Stacks.
Select the stack you launched in the previous step, then choose Delete.

Conclusion
In this post, we showed you various methods of embedding Amazon Q Business, which enables users to have natural language conversations and get meaningful assistance directly on your website or web application. We discussed creating an Amazon Q Business application and how to allowlist your URL. We then walked through adding Amazon Q Business with a standard HTML iframe and a React component, and how to embed it on a WordPress or Drupal site.
To get started, refer to Getting started with Amazon Q Business to create an Amazon Q Business application. For more information on the Amazon Q embedded feature, see Amazon Q embedded. Refer to Enhancing an Amazon Q Business application environment for guidance on integrating your data sources, which can include your website content, to enrich the answers Amazon Q Business can provide your website or web application users.

About the authors
Bobby Williams is a Senior Solutions Architect at AWS. He has decades of experience designing, building, and supporting enterprise software solutions that scale globally. He works on solutions across industry verticals and horizontals and is driven to create a delightful experience for every customer.
David Girling is a Senior AI/ML Solutions Architect with over 20 years of experience in designing, leading, and developing enterprise systems. David is part of a specialist team that focuses on helping customers learn, innovate, and utilize these highly capable services with their data for their use cases.
Philip Whiteside is a Solutions Architect (SA) at Amazon Web Services. Philip is passionate about overcoming barriers by utilizing technology.

An introduction to preparing your own dataset for LLM training

Large language models (LLMs) have demonstrated remarkable capabilities in a wide range of linguistic tasks. However, the performance of these models is heavily influenced by the data used during the training process.
In this blog post, we provide an introduction to preparing your own dataset for LLM training. Whether your goal is to fine-tune a pre-trained model for a specific task or to continue pre-training for domain-specific applications, having a well-curated dataset is crucial for achieving optimal performance.
Data preprocessing
Text data can come from diverse sources and exist in a wide variety of formats such as PDF, HTML, JSON, and Microsoft Office documents such as Word, Excel, and PowerPoint. It’s rare to already have access to text data that can be readily processed and fed into an LLM for training. Thus, the first step in an LLM data preparation pipeline is to extract and collate data from these various sources and formats. During this step, you read data from multiple sources, extract the text using tools such as optical character recognition (OCR) for scanned PDFs, HTML parsers for web documents, and bespoke libraries for proprietary formats such as Microsoft Office files. Non-textual elements such as HTML tags and non-UTF-8 characters are typically removed or normalized.
The next step is to filter out low-quality or undesirable documents. Common patterns for filtering data include the following (see the sketch after this list):

Filtering on metadata such as the document name or URL.
Content-based filtering such as excluding any toxic or harmful content or personally identifiable information (PII).
Regex filters to identify specific character patterns present in the text.
Filtering documents with excessive repetitive sentences or n-grams.
Filters for specific languages such as English.
Other quality filters such as the number of words in the document, average word length, ratio of words comprised of alphabetic characters versus non-alphabetic characters, and others.
Model based quality filtering using lightweight text classifiers to identify low quality documents. For example, the FineWeb-Edu classifier is used to classify the education value of web pages.
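
To make a few of these heuristics concrete, here is a minimal Python sketch; the thresholds are illustrative assumptions, not recommended values.

import re

def passes_quality_filters(text, min_words=50, max_avg_word_len=15, min_alpha_ratio=0.8):
    words = text.split()
    if len(words) < min_words:
        return False                      # document is too short
    if sum(len(w) for w in words) / len(words) > max_avg_word_len:
        return False                      # implausibly long average word length
    alpha_words = sum(1 for w in words if re.fullmatch(r"[A-Za-z]+", w))
    if alpha_words / len(words) < min_alpha_ratio:
        return False                      # too many non-alphabetic tokens
    return True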

Extracting text from various file formats can be a non-trivial task. Fortunately, many high-level libraries exist that can significantly simplify this process. We will use a few examples to demonstrate extracting text and review how to scale this to large collections of documents further down.
HTML preprocessing
When processing HTML documents, remove non-text data such as the document mark-up tags, inline CSS styles, and inline JavaScript. Furthermore, translate structured objects such as lists, tables, and sample code blocks into markdown format. The trafilatura library provides a command-line interface (CLI) and Python SDK for translating HTML documents in this fashion. The following code snippet demonstrates the library’s usage by extracting and preprocessing the HTML data from the Fine-tune Meta Llama 3.1 models using torchtune on Amazon SageMaker blog post.

from trafilatura import fetch_url, extract, html2txt

url = "https://aws.amazon.com/blogs/machine-learning/fine-tune-meta-llama-3-1-models-using-torchtune-on-amazon-sagemaker/"

downloaded = fetch_url(url)
print("RAW HTML\n", downloaded[:250])

all_text = html2txt(downloaded)
print("\nALL TEXT\n", all_text[:250])

main_text = extract(downloaded)
print("\nMAIN TEXT\n", main_text[:250])

trafilatura provides numerous functions for dealing with HTML. In the preceding example, fetch_url fetches the raw HTML, html2txt extracts the text content which includes the navigation links, related content links, and other text content. Finally, the extract method extracts the content of the main body which is the blog post itself. The output of the preceding code should look like the following:

RAW HTML
<!doctype html> <html lang="en-US" class="no-js aws-lng-en_US" xmlns="http://www.w3.org/1999/xhtml" data-aws-assets="https://a0.awsstatic.com" data-js-version="1.0.681" data-css-version="1.0.538" data-static-assets="https://a0.awsstatic.com" prefix="

ALL TEXT
Skip to Main Content Click here to return to Amazon Web Services homepage About AWS Contact Us Support English My Account Sign In Create an AWS Account Products Solutions Pricing Documentation Learn Partner Network AWS Marketplace Customer Enablement

MAIN TEXT
AWS Machine Learning Blog Fine-tune Meta Llama 3.1 models using torchtune on Amazon SageMaker This post is co-written with Meta’s PyTorch team. In today’s rapidly evolving AI landscape, businesses are constantly seeking ways to use advanced large lan

PDF processing
PDF is a common format for storing and distributing documents within organizations. Extracting clean text from PDFs can be challenging for several reasons. PDFs may use complex layouts that include text columns, images, tables, and figures. They can also contain embedded fonts and graphics that cannot be parsed by standard libraries. Unlike HTML, there is no structural information to work with such as headings, paragraphs, lists, and others, which makes parsing PDF documents significantly more difficult. If possible, PDF parsing should be avoided if an alternative format for the document exists such an HTML, markdown, or even a DOCX file. In cases where an alternative format is not available, you can use libraries such as pdfplumber, pypdf, and pdfminer to help with the extraction of text and tabular data from the PDF. The following is an example of using pdfplumber to parse the first page of the 2023 Amazon annual report in PDF format.

import pdfplumber

pdf_file = "Amazon-com-Inc-2023-Annual-Report.pdf"

with pdfplumber.open(pdf_file) as pdf:
    page = pdf.pages[1]
    print(page.extract_text(x_tolerance=1)[:300])

pdfplumber provides bounding box information, which can be used to remove superfluous text such as page headers and footers. However, the library only works with PDFs that have text present, such as digitally authored PDFs. For PDF documents that require OCR, such as scanned documents, you can use services such as Amazon Textract.
Office document processing
Documents authored with Microsoft Office or other compatible productivity software are another common format within an organization. Such documents can include DOCX, PPTX, and XLSX files, and there are libraries available to work with these formats. The following code snippet uses the python-docx library to extract text from a Word document. The code iterates through the document paragraphs and concatenates them into a single string.

from docx import Document

doc_file = "SampleDoc.docx"

doc = Document(doc_file)

full_text = []
for paragraph in doc.paragraphs:
    full_text.append(paragraph.text)

document_text = '\n'.join(full_text)

Deduplication
After the preprocessing step, it is important to process the data further to remove duplicates (deduplication) and filter out low-quality content.
Deduplication is a critical aspect for preparing high-quality pretraining datasets. According to CCNet, duplicated training examples are pervasive in common natural language processing (NLP) datasets. This issue is not only a frequent source of bias in datasets originating from public domains such as the internet, but it can also be a potential problem when curating your own training dataset. When organizations attempt to create their own training dataset, they often use various data sources such as internal emails, memos, internal employee chat logs, support tickets, conversations, and internal wiki pages. The same chunk of text might appear across multiple sources or can repeat excessively in a single data source such as an email thread. Duplicated data extends the training time and potentially biases the model towards more frequently repeated examples.
A commonly used processing pipeline is the CCNet pipeline. The following section will describe deduplication and filtering employed in the CCNet pipeline.
Break documents into shards. In the CCNet paper, the authors divided 30 TB of data into 1,600 shards. In that example, the shards are groups of documents: each shard contains 5 GB of data and 1.6 million documents. Organizations can determine the number and size of shards based on their data size and compute environment. The main purpose of creating shards is to parallelize the deduplication process across a cluster of compute nodes.
Compute a hash code for each paragraph of the document. Each shard contains many documents, and each document contains multiple paragraphs. For each paragraph, we compute a hash code and save it to a binary file. The authors of the CCNet paper use the first 64 bits of the SHA-1 digest of the normalized paragraph as the key. Deduplication is done by comparing these keys. If the same key appears multiple times, the paragraphs that these keys link to are considered duplicates. If you compare the keys within one shard only, there might still be duplicated paragraphs across different shards. If you compare the keys across all shards, you can verify that no duplicated paragraph exists in your whole dataset. However, this can be computationally expensive.
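The following is a minimal sketch of this keying step. The normalization here (lowercasing, stripping punctuation, collapsing whitespace) is an assumption for illustration; the CCNet pipeline applies its own normalization rules.

import hashlib
import re

def dedup_key(paragraph):
    # Normalize: lowercase, drop punctuation, collapse whitespace
    normalized = re.sub(r"[^\w\s]", "", paragraph.lower())
    normalized = re.sub(r"\s+", " ", normalized).strip()
    # The first 64 bits (8 bytes) of the SHA-1 digest serve as the dedup key
    return hashlib.sha1(normalized.encode("utf8")).digest()[:8]

seen_keys = set()

def is_duplicate(paragraph):
    key = dedup_key(paragraph)
    if key in seen_keys:
        return True
    seen_keys.add(key)
    return False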
MinHash is another popular method for estimating the similarity between two paragraphs. This technique is particularly useful for large datasets because it provides an efficient approximation of the Jaccard similarity. Paragraphs are broken down into shingles, which are overlapping sequences of words or characters of a fixed length. Multiple hash functions are applied to each shingle. For each hash function, we find the minimum hash value across all the shingles; the collection of these minimums forms the paragraph's MinHash signature. Using the MinHash signatures, we can estimate the similarity of the paragraphs. The MinHash technique can also be applied to words, sentences, or entire documents, which makes it a powerful tool for a wide range of text similarity tasks. The following example shows a runnable Python version of this technique:

import hashlib

def create_shingles(text, shingle_length):
    # Break the text into overlapping word shingles of a fixed length
    words = text.lower().split()
    return {" ".join(words[i:i + shingle_length])
            for i in range(max(len(words) - shingle_length + 1, 1))}

def minhash_similarity(text1, text2, shingle_length=3, num_hash_functions=128):
    # Preprocess texts into shingle sets
    shingles1 = create_shingles(text1, shingle_length)
    shingles2 = create_shingles(text2, shingle_length)

    # Count how often the per-function minimum hash values agree
    common_minhashes = 0
    for seed in range(num_hash_functions):
        def hash_fn(s, seed=seed):
            # Salting SHA-1 with the seed simulates a family of hash functions
            return int(hashlib.sha1(f"{seed}:{s}".encode()).hexdigest(), 16)
        if min(map(hash_fn, shingles1)) == min(map(hash_fn, shingles2)):
            common_minhashes += 1

    # The fraction of agreeing minimums estimates the Jaccard similarity
    return common_minhashes / num_hash_functions

The complete steps of using MinHash for deduplication are:

Break down documents into paragraphs.
Apply the MinHash algorithm as shown in the preceding example and calculate the similarity scores between paragraphs.
Use the similarity between paragraphs to identify duplicate pairs.
Combine duplicate pairs into clusters. From each cluster, select one representative paragraph to minimize duplicates (see the union-find sketch after this list).
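The following is a minimal union-find sketch of the final clustering step, grouping duplicate pairs so that one representative per cluster can be kept; the integer paragraph IDs are illustrative.

from collections import defaultdict

def cluster_duplicates(pairs):
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x
    def union(a, b):
        parent[find(a)] = find(b)
    for a, b in pairs:
        union(a, b)
    clusters = defaultdict(list)
    for x in list(parent):
        clusters[find(x)].append(x)
    return list(clusters.values())

# Example: pairs of paragraph IDs flagged as duplicates
print(cluster_duplicates([(1, 2), (2, 3), (7, 8)]))  # [[1, 2, 3], [7, 8]]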

To enhance the efficiency of similarity searches, especially when dealing with large datasets, MinHash is often used in conjunction with additional techniques such as Locality Sensitive Hashing (LSH). LSH complements MinHash by providing a way to quickly identify potential matches through bucketing and hashing techniques without having to compare every pair of items in the dataset. This combination allows for efficient similarity searches even in massive collections of documents or data points, significantly reducing the computational overhead typically associated with such operations.
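To illustrate, the following sketch uses the open source datasketch library (one possible choice, not one prescribed by this post) to bucket paragraphs with MinHashLSH so that only likely matches are ever compared. Word-level tokens are used here for brevity; shingles work the same way.

from datasketch import MinHash, MinHashLSH

def minhash_of(text, num_perm=128):
    m = MinHash(num_perm=num_perm)
    for word in text.lower().split():
        m.update(word.encode("utf8"))
    return m

lsh = MinHashLSH(threshold=0.8, num_perm=128)
paragraphs = {
    "p1": "the cat sat on the mat",
    "p2": "the cat sat on a mat",
    "p3": "stock markets rallied today",
}
for key, text in paragraphs.items():
    lsh.insert(key, minhash_of(text))

# Returns candidate keys whose estimated Jaccard similarity clears the threshold
print(lsh.query(minhash_of("the cat sat on the mat")))  # likely ['p1', 'p2']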
It’s important to note that paragraph-level deduplication is not the only choice of granularity. As shown in Meta’s Llama 3 paper, you can also use sentence-level deduplication. The authors also applied document-level deduplication to remove near duplicate documents. The computation cost for sentence-level deduplication is even higher compared to paragraph-level deduplication. However, this approach offers more fine-grained control over duplicate content. At the same time, removing duplicated sentences might result in an incomplete paragraph, potentially affecting the coherence and context of the remaining text. Thus, the trade-off between granularity and context preservation needs to be carefully considered based on the nature of the dataset.
Creating a dataset for model fine-tuning
Fine-tuning a pre-trained LLM involves adapting it to a specific task or domain by training it on an annotated dataset in a supervised manner or through reinforcement learning techniques. The dataset considerations for fine-tuning are crucial because they directly impact the model’s performance, accuracy, and generalization capabilities. Top considerations include:

Relevance and domain-specificity: The dataset should closely match the task or domain the model is being fine-tuned for. Make sure that the dataset includes diverse examples and edge cases that the model is likely to encounter. This helps improve the robustness and generalizability of the model across a range of real-world scenarios. For example, when fine-tuning a model for financial sentiment analysis, the dataset should contain financial news articles, analyst reports, stock market commentary, and corporate earnings announcements.
Annotation quality: The dataset must be free of noise, errors, and irrelevant information. Annotated datasets must maintain consistency in labeling. The dataset should accurately reflect the correct answers, human preferences, or other target outcomes that the fine-tuning process aims to achieve.
Dataset size and distribution: Although fine-tuning generally requires fewer tokens than pretraining (thousands compared to millions), the dataset should still be large enough to cover the breadth of the task requirements. The dataset should include a diverse set of examples that reflect the variations in language, context, and style that the model is expected to handle.
Ethical considerations: Analyze and mitigate biases present in the dataset, such as gender, racial, or cultural biases. These biases can be amplified during fine-tuning, leading to unfair or discriminatory model outputs. Make sure that the dataset aligns with ethical standards and represents diverse groups and perspectives fairly.
Sensible data cutoffs: While preparing the dataset, one consideration is choosing a cutoff date for the data. Generally, depending on how quickly the information changes, you can choose an earlier or more recent cutoff. For example, when fine-tuning an LLM for brand adherence, you can use a distant cutoff date because the brand language remains consistent for many years. In contrast, a dataset for generating audit and compliance letters needs a more recent cutoff date because new compliance regulations are created and updated quite often.
Modalities: In the case of multi-modal models, the dataset must include the various supported data types. Each data type must follow the other considerations mentioned here around annotation quality, ethical considerations, relevance, domain specificity, and so on.
Synthetic data augmentation: Consider generating synthetic data to supplement real-world data, especially to fill gaps in the dataset and make sure that it's realistic and representative. Employing these techniques can help overcome the challenges of limited data availability, enhance model robustness, and provide better generalization across diverse tasks.

Dataset format for fine tuning
Instruction tuning is a process of further training a pretrained model on a diverse set of tasks framed as natural language instructions. This approach aims to enhance the model's ability to understand and follow explicit instructions, improving its performance on a wide range of downstream tasks without the need for task-specific fine-tuning. A technique called preference tuning is used to align large models with human preferences and values. This approach aims to improve the quality, safety, and desirability of AI-generated outputs by incorporating human feedback into the training process. The data requirements for these two techniques are very different, so we'll go into further detail in the following sections.
Instruction fine tuning dataset format
The fields in the examples that follow represent the key components of the instruction-tuning paradigm: Type categorizes the task or instruction type, Instruction contains the specific directive for the model, Input provides the context or data to work with, and Output shows the expected or desired response. This structure is crucial for training models to understand and execute a wide variety of tasks based on natural language instructions.

Type: antonym_relation
Instruction: What is the relationship between the given pairs?
Input: Night : Day :: Right : Left
Output: The relationship between the given pairs is that they are opposites.

Type: email_subject_generation
Instruction: Generate an appropriate subject title for the following email:
Input: Hi [person name],\n\nI'm writing to ask you if you are willing to be a panelist of our workshop on multi-modality at CVPR. The workshop will be held on June 20, 2023.\n\nBest,\n[my name]
Output: Invitation to be a panelist for CVPR 2023 workshop on multi-modality

Type: analogy_completion
Instruction: To make the pairs have the same analogy, write the fourth word.
Input: Instability : Turmoil :: Change :
Output: Instability : Turmoil :: Change : Revolution

Type: question_answering_generation_from_facts
Instruction: Using the facts given, write a question-answer pair.
Input: Name: Bill Gates\nBorn: October 28, 1955 (age 66)\nEducation: Harvard University (dropped out)\nOccupation: Software developer, investor, entrepreneur\nSource: Wikipedia\nTime: August 2022
Output: Question: What is Bill Gates's occupation?\nAnswer: He is a software developer, investor, and entrepreneur.
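On disk, each row of such a table is commonly stored as one JSON object per line (JSON Lines). The following Python snippet shows one possible rendering; the field names simply mirror the columns above and are not mandated by any particular framework.

import json

record = {
    "type": "antonym_relation",
    "instruction": "What is the relationship between the given pairs?",
    "input": "Night : Day :: Right : Left",
    "output": "The relationship between the given pairs is that they are opposites.",
}

# Append one JSON object per line (the JSON Lines convention)
with open("instructions.jsonl", "a") as f:
    f.write(json.dumps(record) + "\n")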

Preference-tuning dataset format
A preference-tuning dataset typically consists of entries, each containing an input or prompt, a chosen (preferred) response, and a rejected (less preferred) response. The following is an example of a Direct Preference Optimization (DPO) dataset.

Prompt: Artists Gustav Klimt and Carl Moll were born in which European country? What is the answer?

Chosen: [{'content': 'You are a helpful assistant who always provide explanation. Think like you are answering a five year old.', 'role': 'system'}, {'content': 'Artists Gustav Klimt and Carl Moll were born in which European country?\nWhat is the answer?', 'role': 'user'}, {'content': 'Gustav Klimt and Carl Moll were both born in a European country called Austria. Austria is a beautiful place with mountains, forests, and many people who speak German. These two artists created lots of pretty paintings while they lived in Austria.', 'role': 'assistant'}]

Rejected: [{'content': 'You are a helpful assistant, who always provides an explanation. Think like you are answering a five year old.', 'role': 'system'}, {'content': 'Artists Gustav Klimt and Carl Moll were born in which European country?\nWhat is the answer?', 'role': 'user'}, {'content': " Ooh ooh! *giggle* You wanna know where Gustav Klimt and Carl Moll were born? *bounce bounce* Well, they were born in a special place called Austria! *twirl* That's right, Austria! It's a beautiful country in Europe, and lots of talented artists like Gustav and Carl were born there! *smile* Does that make you wanna dance? *spin spin* Hehehe!", 'role': 'assistant'}]

The following is an example using the Ultrachat-feedback dataset format, which includes the following elements: prompt, chosen, rejected, messages, score_chosen, and score_rejected. This type of dataset is often used in DPO or reinforcement learning from human feedback (RLHF) to improve AI model outputs. By providing examples of your preferred and non-preferred responses along with their respective scores, the dataset can be used to train models to generate more desirable outputs.

prompt: Let's play a game. I say a sentence, then you make a sentence that follows up my sentence, then I give a continuation to yours and so on. You ready?

chosen: [{'content': "Let's play a game. I say a sentence, then you make a sentence that follows up my sentence, then I give a continuation to yours and so on. you ready?", 'role': 'user'}, {'content': "I'm ready! Let's begin. Please provide your first sentence.", 'role': 'assistant'}]

rejected: [{'content': "Let's play a game. I say a sentence, then you make a sentence that follows up my sentence, then I give a continuation to yours and so on. you ready?", 'role': 'user'}, {'content': 'Sure, I would love to play.', 'role': 'assistant'}]

messages: [{'content': "Let's play a game. I say a sentence, then you make a sentence that follows up my sentence, then I give a continuation to yours and so on. you ready?", 'role': 'user'}, {'content': "I'm ready! Let's begin. Please provide your first sentence.", 'role': 'assistant'}]

score_chosen: 7

score_rejected: 6

In the case of Meta Llama 3, instruction-tuned models go through an iterative process of DPO preference alignment, and the dataset typically consists of triplets—a user prompt and two model responses, with one response preferred over the other. In advanced implementations, this format can be extended to include a third, edited response that’s considered superior to both original responses. The preference between responses is quantified using a multi-level rating system, ranging from marginally better to significantly better. This granular approach to preference annotation allows for a more nuanced training of the model, enabling it to distinguish between slight improvements and significant enhancements in response quality.

prompt: Let's play a game. I say a sentence, then you make a sentence that follows up my sentence, then I give a continuation to yours and so on. You ready?

chosen: [{'content': "Let's play a game. I say a sentence, then you make a sentence that follows up my sentence, then I give a continuation to yours and so on. You ready?", 'role': 'user'}, {'content': "I'm ready! Let's begin. Please provide your first sentence.", 'role': 'assistant'}]

rejected: [{'content': "Let's play a game. I say a sentence, then you make a sentence that follows up my sentence, then I give a continuation to yours and so on. You ready?", 'role': 'user'}, {'content': 'Sure, I would love to play.', 'role': 'assistant'}]

edited: [{'content': "Let's play a game. I say a sentence, then you make a sentence that follows up my sentence, then I give a continuation to yours and so on. You ready?", 'role': 'user'}, {'content': "I'm ready! Let's begin. Please provide your first sentence.", 'role': 'assistant'}]

alignment rating: significantly better

Synthetic data creation approach for the instruction-tuning dataset format using the Self-Instruct technique
Synthetic data creation using the Self-Instruct technique is one of the most well-known approaches for generating instruction fine-tuning datasets. This method uses the capabilities of LLMs to bootstrap a diverse and extensive collection of instruction-tuning examples, significantly reducing the need for manual annotation. The following figure shows the process of the Self-Instruct technique, which is described in the following sections.

Seed data and tasks
The seed data process begins with a small set of human-written instruction-output pairs that serve as seed data. The seed dataset serves as the foundation for building a robust collection of tasks used in various domains, with a focus on promoting task diversity. In some cases, the input field provides context to support the instruction, especially in classification tasks where output labels are limited. For non-classification tasks, the instruction alone might be self-contained without needing additional input. This dataset encourages task variety through different data formats and solutions, making it a critical step in defining the final task pool, which supports the development of diverse AI applications.
The following is an example of a seed task that identifies financial entities (companies, government institutions, or assets) and assigns a part of speech tag or entity classification based on the given sentence.

{
    "id": "finance_task_001",
    "name": "financial_entity_classification",
    "instruction": "Identify the type of financial entity in the given sentence.",
    "instances": [
        {
            "input": "Entity: Federal Reserve\nSentence: The Federal Reserve raised interest rates by 0.25% to combat inflation.",
            "output": "Government Institution, ORG"
        }
    ],
    "is_classification": true
}

The following example requests an explanation of a financial concept, and because it isn’t a classification task, the output is more open-ended.

{
    "id": "finance_task_002",
    "name": "explain_financial_concept",
    "instruction": "Explain the concept of compound interest in two sentences.",
    "instances": [
        {
            "input": "",
            "output": "Compound interest is the interest on a loan or deposit calculated based on both the initial principal and the accumulated interest from previous periods. It allows investments to grow at a faster rate compared to simple interest, where interest is only calculated on the principal."
        }
    ],
    "is_classification": false
}

Instruction generation
Using the seed data as a foundation, an LLM is prompted to generate new instructions. The process uses existing human-written instructions as examples to help a model (such as Anthropic's Claude 3.5 or Meta Llama 405B) generate new instructions, which are then checked and filtered for quality before being added to the final output list.

Come up with a series of tasks:
1. Suggest a diversified investment portfolio for someone with a moderate risk tolerance.
2. What is the relation between the following financial ratios and company performance?
3. Generate a one-sentence description for each of the following economic terms.
4. Describe a situation in which market volatility can negatively impact retirement planning.

Instance generation
For each generated instruction, the model creates corresponding input-output pairs. This step produces concrete examples of how to follow the instructions. The Input-First Approach for non-classification tasks asks the model to first generate the input values, which will then be used to generate the corresponding output. This approach is especially useful for tasks such as financial calculations, where the output directly depends on specific inputs.

input_first_template = '''Come up with examples for the following tasks.
Try to generate multiple examples when possible.
If the task doesn't require additional input, you can generate the output directly.
Task: Calculate the compound interest for the given principal, rate, and time period.
Example 1
Principal: $10,000, Rate: 5%, Time: 2 years
Output: $1,025 (Compound interest using annual compounding)
Example 2
Principal: $5,000, Rate: 3%, Time: 5 years
Output: $796.37 (Compound interest using annual compounding)

Task: {instruction}'''

The Output-First Approach for classification tasks is designed to first define the output (class label), and then condition the input generation based on the output. This approach verifies that inputs are created in such a way that they correspond to the pre-defined class labels.

output_first_template = '''Given the classification task definition and the class labels,
generate an input that corresponds to each of the class labels.
If the task doesn't require input, just generate possible class labels.
Task: Identify whether the following financial transaction is categorized as "Income" or "Expense."
Class Label: Income
Transaction: Payment received from client for consulting services – $5,000.
Class Label: Expense
Transaction: Payment made for office rent – $1,200.

Task: {instruction}'''

Post-processing filters
The filtering and quality control step verifies the dataset quality by applying various mechanisms to remove low-quality or redundant examples. After generating tasks, instances are extracted and formatted, followed by filtering based on rules such as removing instances where the input and output are identical, the output is empty, or the instance is already in the task pool. Additional heuristic checks, such as incomplete generations or formatting issues, are also applied to maintain the integrity of the final dataset.
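The following is a minimal sketch of these rule-based filters. The field names follow the seed-task JSON shown earlier, and the check against the task pool is simplified here to exact matching on input/output pairs.

def filter_instances(instances, task_pool):
    seen = {(t["input"], t["output"]) for t in task_pool}
    kept = []
    for inst in instances:
        if not inst["output"].strip():
            continue  # drop empty outputs
        if inst["input"].strip() == inst["output"].strip():
            continue  # drop instances where input and output are identical
        if (inst["input"], inst["output"]) in seen:
            continue  # drop instances already in the task pool
        seen.add((inst["input"], inst["output"]))
        kept.append(inst)
    return kept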
For more details on self-instruct synthetic data creation, see Alpaca: A Strong, Replicable Instruction-Following Model for information about the data creation approach and instruction fine-tuning with the dataset. You can follow a similar approach for various fine-tuning tasks including instruction fine-tuning and direct preference optimization.
Data labeling for different downstream tasks (such as code generation, summarization, and so on)
When it comes to preparing data for training an LLM, data labeling plays a crucial role because it directly impacts the quality of the responses a model produces. There are a variety of approaches you can take, depending on the task at hand, because we expect the LLM to work across a variety of use cases. The reason base foundation models excel at a variety of instructions and tasks is that, during pretraining, they were given such instructions and examples so they could understand and perform the tasks, for example generating code or performing named entity extraction. Training the LLM for each type of task requires a task-specific labeled dataset. Let's explore some of the common data-labeling approaches:

Human labelers: The most common method for data labeling is to use human labelers. In this approach, a team of human labelers annotates data for various tasks, such as general question-answering, sentiment analysis, summarization, and comparing texts for similarities and differences. For each category of task, you prepare a dataset and ask the human labelers to provide answers. To mitigate individual bias, you can collect multiple responses for the same question from multiple human labelers and then consolidate them into an aggregate label (see the majority-vote sketch after this list). Human labeling is regarded as the gold standard for collecting high-quality data at scale. However, labeling by hand tends to be tedious, time-consuming, and expensive for tasks that involve millions of data points, which has motivated the study of AI-assisted data annotation tools, such as Snapper, that interactively reduce the burden of manual annotation.
LLM-assisted labeling: Another common approach to labeling is to use another LLM to label the data to speed up the labeling process. In this approach, you use another LLM to generate the responses for the various tasks such as sentiment analysis, summarization, coding, and so on. This can be achieved in different ways. In some cases, we can use N-shot learning approaches to improve the quality of the label. To mitigate bias, we use the human-in-the-loop (HITL) approach to review certain responses to verify that the labels are high quality. The benefit of this approach is that it’s faster than human labeling because you can scale the LLM endpoint and serve multiple requests in parallel. However, the downside is that you have to keep iterating and changing the acceptance threshold of confidence of the model’s response. For example, if you’re preparing the dataset for financial crime, you have to lower the tolerance for false negatives and accept slightly higher false positives.
Cohort-based labeling: Cohort-based labeling is an emerging approach where two or more LLMs are asked to generate the label for the same data. The models are then asked whether they agree with each other's responses, and the label is accepted if they do. In another variation of this approach, instead of asking the models to agree with each other's responses, you use a third LLM to rate the quality of the output of the other two models. This produces high-quality outputs, but the cost of labeling rises because you need to make at least three LLM invocation calls for each data point to produce the final label. This approach is under active research, and we expect more orchestration tools for it in the near future.
RLHF-based data labeling: This approach is inspired by the RLHF fine-tuning process. Based on the task at hand, you first take a sample of unlabeled data points and have them labeled by a human labeler. You then use the labeled dataset to fine-tune an LLM. The next step is to use the fine-tuned LLM to produce multiple outputs for another subset of unlabeled data points. A human labeler ranks the outputs from best to worst, and you use this data to train a reward model. You then send the rest of the unlabeled data points through a proximal policy optimization (PPO) policy initialized from the supervised model. The policy generates the label, and you ask the reward model to calculate a reward for the label. The reward is used to further update the PPO policy. For further reading on this topic, see Improving your LLMs with RLHF on Amazon SageMaker.
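The following is a minimal majority-vote sketch for consolidating multiple labeler responses into a single aggregate label, as referenced in the human labelers item above. Production labeling workflows typically add tie-breaking rules and per-labeler quality weighting.

from collections import Counter

def aggregate_label(labels):
    # Majority vote across labelers, plus the agreement ratio as a quality signal
    label, votes = Counter(labels).most_common(1)[0]
    return label, votes / len(labels)

label, agreement = aggregate_label(["positive", "positive", "negative"])
print(label, agreement)  # positive 0.666...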

Data processing architecture
The entire data processing pipeline can be implemented as a series of jobs, as illustrated in the following architecture diagram. Amazon SageMaker is used as a job facility to filter, deduplicate, and tokenize the data. The intermediate outputs of each job can be stored on Amazon Simple Storage Service (Amazon S3). Depending on the size of the final datasets, either Amazon S3 or Amazon FSx for Lustre can be used for storing the final dataset. For larger datasets, FSx can significantly improve training throughput by eliminating the need to copy data or stream it directly from Amazon S3. An example pipeline using the Hugging Face DataTrove library is provided in this repo.
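The following is a rough sketch of what such a pipeline can look like with DataTrove: reading raw JSONL from Amazon S3, applying a heuristic quality filter, and writing the results back. The component names, bucket paths, and parameters are assumptions based on the library's documented building blocks; check the DataTrove docs for the exact classes and signatures in your version.

from datatrove.executor.local import LocalPipelineExecutor
from datatrove.pipeline.readers import JsonlReader
from datatrove.pipeline.filters import GopherQualityFilter
from datatrove.pipeline.writers import JsonlWriter

executor = LocalPipelineExecutor(
    pipeline=[
        JsonlReader("s3://my-bucket/raw/"),        # hypothetical input location
        GopherQualityFilter(),                     # heuristic quality filtering
        JsonlWriter("s3://my-bucket/filtered/"),   # hypothetical output location
    ],
    tasks=4,  # parallel tasks, analogous to sharding the corpus
)
executor.run()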

Pipeline for fine-tuning
As previously discussed, fine-tuning data typically comprises an input instruction and the desired outputs. This data can be sourced using manual human annotation, synthetic generation, or a combination of the two. The following architecture diagram outlines an example pipeline where fine-tuning data is generated from an existing corpus of domain-specific documents. An example of a fine-tuning dataset would take a source document as input or context and generate task-specific responses such as a summary of the document, key information extracted from the document, or answers to questions about the document.

Models provided by Amazon Bedrock can be used to generate the synthetic data, which can then be validated and modified by a human reviewer using Amazon SageMaker Ground Truth. SageMaker Ground Truth can also be used to create human-labeled fine-tuning data from scratch. For synthetic data generation, be sure to review the model provider's acceptable use terms to verify compliance.
Pipeline for DPO
After a model is fine-tuned, it can be deployed on model hosting services such as Amazon SageMaker. The hosted model can then be used to generate candidate responses to various prompts. Through SageMaker Ground Truth, users can then provide feedback on which responses they prefer, resulting in a preference dataset. This flow is outlined in the following architecture diagram and can be repeated multiple times as the model is tuned using the latest preference data.

Conclusion
Preparing high-quality datasets for LLM training is a critical yet complex process that requires careful consideration of various factors. From extracting and cleaning data from diverse sources to deduplicating content and maintaining ethical standards, each step plays a crucial role in shaping the model’s performance. By following the guidelines outlined in this post, organizations can curate well-rounded datasets that capture the nuances of their domain, leading to more accurate and reliable LLMs.

About the Authors
Simon Zamarin is an AI/ML Solutions Architect whose main focus is helping customers extract value from their data assets. In his spare time, Simon enjoys spending time with family, reading sci-fi, and working on various DIY house projects.
Vikram Elango is an AI/ML Specialist Solutions Architect at Amazon Web Services, based in Virginia USA. Vikram helps financial and insurance industry customers with design, thought leadership to build and deploy machine learning applications at scale. He is currently focused on natural language processing, responsible AI, inference optimization and scaling ML across the enterprise. In his spare time, he enjoys traveling, hiking, cooking and camping with his family.
Qingwei Li is a Machine Learning Specialist at Amazon Web Services. He received his Ph.D. in Operations Research after he broke his advisor’s research grant account and failed to deliver the Nobel Prize he promised. Currently he helps customers in the financial service and insurance industry build machine learning solutions on AWS. In his spare time, he likes reading and teaching.
Vinayak Arannil is a Sr. Applied Scientist from the AWS Bedrock team. With several years of experience, he has worked on various domains of AI like computer vision, natural language processing etc. Vinayak led the data processing for the Amazon Titan model training. Currently, Vinayak helps build new features on the Bedrock platform enabling customers to build cutting-edge AI applications with ease and efficiency.
Vikesh Pandey is a Principal GenAI/ML Specialist Solutions Architect at AWS, helping customers from financial industries design, build and scale their GenAI/ML workloads on AWS. He carries an experience of more than a decade and a half working on entire ML and software engineering stack. Outside of work, Vikesh enjoys trying out different cuisines and playing outdoor sports.
David Ping is a Sr. Manager of AI/ML Solutions Architecture at Amazon Web Services. He helps enterprise customers build and operate machine learning solutions on AWS. David enjoys hiking and following the latest machine learning advancement.
Graham Horwood is Sr. Manager of Data Science from the AWS Bedrock team.

Relationship Marketing: Reactive vs. Proactive and Why You Need Both

You’ve heard it a million times – relationships are everything in business. But how you manage those relationships? 

That’s what separates brands with loyal customers from the ones stuck chasing their next quick sale.

Enter relationship marketing. 

Relationship marketing is the not-so-secret strategy behind turning one-time buyers into lifetime fans. And boy oh boy do we need those lifetime fans.

According to studies, a 5% increase in customer retention can boost profits by up to 95%. That’s the kind of ROI you can’t ignore.

But here’s where most brands get it wrong — they lean too hard on one side. Reactive marketing? It’s all about responding to customers. Proactive marketing? That’s about staying a step ahead.

The truth is, to win at relationship marketing, you need both working together.

So, how do you strike that balance? That’s what we are going to dig into. 


What Is Relationship Marketing? (And Why It Matters)

At its core, relationship marketing is all about playing the long game. 

It’s not about quick wins or short-term sales. Instead, it’s about building meaningful connections with your customers to increase loyalty, retention, and let’s be real, lifetime value.

Image: TechTarget

Why does it matter? 

Because loyal customers spend more, stick around longer, and recommend you to others. In fact, 65% of a company’s business comes from existing customers. 

That’s a huge wake-up call to focus less on chasing new buyers and more on strengthening the relationships you already have.

But where it gets interesting is that relationship marketing has two sides:

Reactive Relationship Marketing: Think of it as playing defense. You’re responding to customer actions (reviews, support tickets, complaints, or post-purchase follow-ups). It’s about keeping things running smoothly and fixing issues before they escalate.

Proactive Relationship Marketing: This is all offense. You’re anticipating customer needs, offering solutions before they ask, and staying on their radar with personalized outreach. It’s about creating delight and loyalty before problems arise.

Reactive keeps your current customers happy while proactive keeps them coming back for more. 

Both approaches have their strengths but the real benefits happen when you bring them together. 

Reactive Relationship Marketing: Nurturing What’s Already There

Reactive relationship marketing is all about responding to what your customers are already doing. 

Whether it’s addressing feedback, resolving issues, or following up on missed opportunities, this approach is your chance to show customers you’re listening and that you care.

What Reactive Relationship Marketing Looks Like in Action

Here are some examples of reactive strategies that work:

Responding to Negative Reviews: When a customer leaves a bad review, reach out with a solution. For example, “We’re sorry to hear about your experience. Here’s how we’d like to make it right.” A quick response can turn critics into advocates.

Following Up on Abandoned Carts: Send personalized emails reminding customers what they left behind. Add a sweetener like “Your favorites are still waiting. Checkout now and get 10% off!”

Offering Discounts or Credits to Resolve Complaints: If something went wrong, a small gesture like a refund or discount code can go a long way in rebuilding trust.

When to Use Reactive Relationship Marketing

Reactive marketing works best when you’re looking to retain existing customers and turn potential issues into opportunities. 

A quick, thoughtful response can stop churn in its tracks and even strengthen loyalty. I think it’s important to note that while customer expectations have certainly risen over the years, customers don’t expect perfection. They expect accountability. 

Reactive strategies are your chance to meet those expectations and show your brand’s human side.

Proactive Relationship Marketing: Staying Ahead of the Game

Proactive relationship marketing is about being one step ahead. 

Instead of waiting for customers to reach out, you anticipate their needs and deliver value before they even ask. 


It’s the difference between a brand that reacts to problems and one that creates memorable experiences.

What Proactive Relationship Marketing Looks Like in Action

Here are a few examples of proactive strategies that build loyalty and prevent issues:

Subscription Renewal Emails: “Your subscription is about to renew. Renew now and enjoy an exclusive bonus.” By reminding customers ahead of time, you avoid surprises and add extra value.

Tailored Product Recommendations: Based on past purchases, suggest complementary products or upgrades. For example, “We noticed you loved [Product A]—here’s [Product B] that pairs perfectly with it.”

Satisfaction Surveys: Reach out with quick surveys to gauge how customers feel about their experience. Use this feedback to improve and show that you care.

When to Use Proactive Relationship Marketing

Proactive strategies are ideal for creating delight and preventing churn. They show your customers that you’re paying attention, you understand their needs, and you’re committed to providing a great experience.

Why wait for a cancellation request when you can send a personalized retention offer? Why let a customer forget about their abandoned cart when you can remind them with a compelling discount? 

Proactive marketing is about making your customers feel valued and when combined with reactive efforts, it becomes a really powerful tool for building long-term customer relationships. 

Let’s look at how these two approaches work together to drive real results.


Why You Need Both Reactive and Proactive Relationship Marketing

Reactive and proactive relationship marketing aren’t either-or strategies. They’re a tag team. 

Focus too much on one and you’re either putting out fires all day or coming off as tone-deaf when real problems hit. 

To truly win, you need both working together.

Reactive Without Proactive = Stuck in Damage Control

If you’re only reacting, you’re in constant catch-up mode. Sure, you’re solving problems but that’s all you’re doing. 

You’re missing the chance to surprise and delight your customers before they even realize they need something. 

Example 1: A customer emails your support team to ask why their order hasn’t shipped after five days. Your reactive response fixes the problem, but proactive communication like an order delay notification or shipping update email could have prevented frustration in the first place.

Example 2: After a negative review, your team jumps in to offer a refund or replacement. Great reaction, but imagine proactively reaching out after a purchase with a “How’s everything with your order?” email. That extra step could’ve caught the issue before it went public.

Proactive Without Reactive = Out of Touch

Being proactive is smart, but it falls flat when paired with poor responsiveness.

Example 1: You send a “VIP early access” email to a loyal customer base but fail to answer questions about the offer in a timely manner. Proactive marketing sparked interest, but the lack of reactive follow-through left customers hanging.

Example 2: A customer experiences a tech issue with your product and submits a support ticket. Instead of resolving it, they receive a proactive survey email asking for feedback on their recent experience. To them, it feels like you’re ignoring the problem while asking for praise.

The Sweet Spot: Balance Both for Maximum Impact

Use reactive strategies to handle immediate needs and fix problems and proactive efforts to keep customers feeling appreciated and engaged. 

The combination makes your brand feel both attentive and thoughtful.

Actionable Examples:

Proactively Reward Loyalty: Send a surprise discount or exclusive offer to your top customers.

Reactively Solve Issues with a Bonus: If a customer has a complaint, resolve it quickly and throw in something extra. For instance, “We’ve fixed your billing issue and added a $10 credit to your account. Thanks for your patience!”

Pair Your Campaigns: If you’re running a proactive outreach campaign (e.g., a product recommendation email), be ready with a reactive follow-up for any questions or concerns that arise.

Balancing these two approaches creates seamless customer experiences and honestly, customers simply want to know you’ve got their back when things go wrong AND that you value them enough to reach out first. 

How to Implement Reactive and Proactive Relationship Marketing

Balancing reactive and proactive relationship marketing strategies isn’t as complicated as it sounds. Here’s how to get both approaches working together seamlessly.

Reactive Marketing Strategies: Handle Issues Before They Escalate

Monitor Reviews and Social Mentions: Set up alerts for brand mentions and reviews on platforms like Google, Yelp, or social media. Address both positive and negative feedback quickly. Example: A prompt reply to a frustrated tweet can turn a public complaint into a loyalty win.


Automate Post-Purchase Surveys: Send automated “How did we do?” emails after purchases to catch issues early and show customers you care about their experience.

Recover Lost Opportunities: Use workflows to address abandoned carts or incomplete actions. Example: “Hey [Name], you left something in your cart—complete your order today and enjoy free shipping!”

Proactive Marketing Strategies: Anticipate Needs and Build Loyalty

Use Data to Predict Customer Needs: Leverage enriched customer data to craft personalized outreach. Example: “Hi [Name], it’s been 6 months since your last order—ready to restock your favorites?”

Launch Pre-Emptive Campaigns: Anticipate lifecycle events like renewals, product upgrades, or subscription expirations. Example: “Your subscription renews next month! Here’s what’s new and what you’ll love.”

Show Appreciation Before It’s Asked For: Surprise loyal customers with rewards or exclusive perks. Example: “Thanks for being with us for 2 years—enjoy 20% off your next purchase!”

The Blend: Bring It All Together

Integrating both strategies is where the magic happens, and the right tools make it easier:

CRMs: Platforms like HubSpot or Salesforce centralize customer data, enabling seamless transitions between reactive support and proactive outreach.

Automation: Tools like Klaviyo or Customers.ai let you set up workflows to handle both abandoned carts (reactive) and post-purchase recommendations (proactive).

Feedback Loops: Use customer feedback from surveys or reviews to inform proactive strategies. If multiple customers mention the same issue, address it in your next campaign before it becomes widespread.

Pro Tip: Use a single source of truth for your customer data. By centralizing information, you’ll always know whether to send a proactive campaign or address an unresolved issue reactively.

When reactive and proactive strategies work together, your brand becomes a trusted partner.

How Customers.ai Helps Drive Relationship Marketing

Customers.ai isn’t just visitor identification software. It’s a powerful marketing tool that can help you with both reactive and proactive relationship marketing. 

By combining advanced visitor identification, data enrichment, and automation capabilities, it ensures you’re always one step ahead while keeping your customers at the center of everything you do. 

Here’s how:

Reactive Marketing Made Easy

Identify and Respond to Frustrated Visitors: Spot anonymous visitors who drop off during checkout or browse help pages without converting. Customers.ai gives you the data you need to reach out and resolve issues quickly.

Automate Follow-Ups: With automated workflows, you can send post-purchase surveys or abandoned cart reminders without lifting a finger. Example: “We noticed you left something behind—complete your checkout and enjoy 10% off.”

Monitor and Engage with Feedback: Use Customers.ai to track customer actions and sentiment, helping you respond faster to reviews, questions, or complaints.

Proactive Marketing That Anticipates Needs

Tailored Outreach with Enriched Data: Customers.ai enriches your contact profiles with real-time data, enabling hyper-personalized campaigns. Example: “Hi [Name], it’s been 3 months since your last purchase—here’s what’s trending in your favorite category.”

Lifecycle Campaigns Built to Delight: From renewal reminders to loyalty rewards, automate campaigns that anticipate what your customers want before they even ask. Example: “Your subscription is set to renew—here’s a sneak peek at new features coming your way!”

Engage High-Intent Visitors: Customers.ai identifies visitors showing purchase intent and helps you act fast. Example: “We noticed you’ve been checking out [product]. Let us know if you have any questions—we’re here to help!”

Why It Works

Customers.ai bridges the gap between reactive and proactive strategies, ensuring your brand is always aligned with your customer’s needs. Whether it’s solving a problem or surprising them with value, you’ll have the insights and tools to make every interaction count. Pretty good, right?

Pro Tip: Use Customers.ai’s visitor tracking to prioritize high-value leads while automating touchpoints for everyone else. It’s the perfect balance of personal and scalable relationship marketing.

The Power of Relationship Marketing for Brands

Relationship marketing isn’t about picking sides. After all, reactive and proactive strategies aren’t competing. They’re complementary. 

When you combine them, you’re solving problems, anticipating needs, and creating meaningful, lasting connections that keep customers coming back.

So, what’s your next move? 

Take a hard look at your current approach. Are you responding quickly when customers need help? Are you reaching out before they even realize they have a need? If not, it’s time to rethink your strategy.

Tools like Customers.ai can help you seamlessly blend reactive and proactive tactics, ensuring every interaction builds trust, loyalty, and ROI. 

It’s time to stop choosing between the two and start using both to elevate your customer relationships. 

Ready to get started? Get your free trial of Customers.ai today and see how easy it is to track and reach your website visitors.


Important Next Steps

See what targeted outbound marketing is all about. Capture and engage your first 500 website visitor leads with Customers.ai X-Ray website visitor identification for free.

Talk and learn about sales outreach automation with other growth enthusiasts. Join Customers.ai Island, our Facebook group of 40K marketers and entrepreneurs who are ready to support you.

Advance your marketing performance with Sales Outreach School, a free tutorial and training area for sales pros and marketers.

Relationship Marketing FAQs

What is relationship marketing?

Relationship marketing is a strategy focused on building and nurturing long-term relationships with customers rather than just aiming for one-time sales. It prioritizes customer loyalty, retention, and lifetime value by delivering personalized experiences and maintaining regular engagement. This approach ensures customers feel valued and appreciated, leading to higher trust and brand advocacy over time.

How does relationship marketing differ from traditional marketing?

Traditional marketing focuses on acquiring new customers through promotions, advertisements, and campaigns aimed at driving immediate sales. Relationship marketing, on the other hand, emphasizes long-term customer satisfaction and retention by creating ongoing, meaningful interactions. Instead of focusing on one-off transactions, relationship marketing seeks to build trust and loyalty that lead to repeat business and referrals.

Why is relationship marketing important?

Relationship marketing is crucial because loyal customers contribute significantly to a brand’s profitability. Studies show that retaining an existing customer is 5-25 times cheaper than acquiring a new one. Additionally, loyal customers spend more and are more likely to recommend your brand to others. This strategy drives long-term growth by fostering customer satisfaction and trust.

What are the key components of relationship marketing?

Personalization: Tailoring interactions to individual customer preferences.

Customer Engagement: Regular communication through emails, social media, and events.

Trust-Building: Delivering consistent value and keeping promises.

Feedback Loops: Actively seeking and acting on customer feedback to improve the experience.

By integrating these elements, businesses can create a strategy that prioritizes loyalty and customer retention.

What are the benefits of relationship marketing?

Relationship marketing helps businesses achieve higher customer retention rates, increased lifetime value, and improved brand reputation. Loyal customers are more likely to recommend your products, reducing acquisition costs through word-of-mouth referrals. Additionally, they are less price-sensitive, meaning they’re willing to pay a premium for trusted brands. This creates a stable revenue stream over time.

What industries benefit most from relationship marketing?

Industries that rely on repeat business and customer loyalty benefit the most from relationship marketing. Examples include ecommerce, SaaS, retail, hospitality, and financial services. In these sectors, retaining customers and building long-term trust can drive consistent revenue and reduce the need for aggressive customer acquisition campaigns.

How does personalization impact relationship marketing?

Personalization plays a significant role in relationship marketing by making customers feel valued and understood. According to a study by Epsilon, 80% of customers are more likely to purchase from a brand that offers personalized experiences. Tailored messages, product recommendations, and exclusive offers build trust and loyalty, resulting in higher engagement and conversions.

What role does technology play in relationship marketing?

Technology is essential for implementing effective relationship marketing strategies. Tools like CRMs (Customer Relationship Management) help businesses track customer interactions, preferences, and purchase history. Automation platforms streamline communication with personalized email campaigns, while data analytics tools provide insights into customer behavior, helping brands make informed decisions.

How can social media enhance relationship marketing?

Social media offers a direct channel for brands to engage with their audience, making it a valuable tool for relationship marketing. Businesses can use social platforms to share content, respond to feedback, and create a sense of community. Features like polls, Q&A sessions, and live videos encourage two-way interaction, building trust and fostering long-term relationships.

What is the difference between reactive and proactive relationship marketing?

Reactive marketing focuses on addressing customer needs or issues after they arise, such as resolving complaints or responding to inquiries. Proactive marketing anticipates customer needs and takes action beforehand, like sending renewal reminders or personalized recommendations. Combining both approaches ensures that customers feel supported while also being surprised and delighted by proactive efforts.

How can relationship marketing improve customer retention?

Relationship marketing focuses on understanding customer needs and delivering consistent value, which builds trust and loyalty. Engaged customers are more likely to stay with a brand they trust. Strategies like personalized follow-ups, exclusive offers, and responsive support ensure customers feel valued, leading to higher retention rates.

What metrics should you track for relationship marketing?

Customer Retention Rate: The percentage of customers who stay with your brand over time.

Customer Lifetime Value (CLV): The total revenue a customer generates during their relationship with your brand.

Net Promoter Score (NPS): A measure of customer satisfaction and likelihood to recommend your brand.

Engagement Metrics: Interactions on email, social media, and other channels that show ongoing interest.

Tracking these metrics helps refine your strategy and measure its success.

What are some examples of relationship marketing strategies?

Examples of relationship marketing strategies include personalized email campaigns, loyalty programs, and post-purchase follow-ups. For instance, sending a birthday discount to customers or a thank-you email after a purchase shows appreciation. Creating exclusive VIP groups or offering early access to sales are also effective ways to nurture relationships.

Can small businesses implement relationship marketing?

Yes, relationship marketing is especially beneficial for small businesses looking to build strong customer loyalty. By focusing on personalized interactions and excellent customer service, small businesses can differentiate themselves from larger competitors. Simple strategies like handwritten thank-you notes, loyalty rewards, or regular check-ins can make a big impact.

What are the challenges of relationship marketing?

Challenges include collecting and managing customer data, ensuring personalized communication at scale, and balancing short-term sales goals with long-term relationship-building. Additionally, failing to act on customer feedback or sending irrelevant messages can harm trust. Overcoming these challenges requires the right tools, clear goals, and consistent execution.

How does relationship marketing drive word-of-mouth referrals?

Happy customers are more likely to recommend your brand to friends and family. By delivering exceptional experiences and building trust, relationship marketing creates advocates who spread positive word-of-mouth. According to Nielsen, 92% of consumers trust recommendations from people they know, making this an invaluable benefit of relationship marketing.

How do loyalty programs support relationship marketing?

Loyalty programs incentivize repeat purchases and encourage long-term engagement. By rewarding customers with points, discounts, or exclusive perks, you create a sense of value and appreciation. For example, Starbucks’ Rewards Program offers free drinks and early access to new products, driving loyalty and consistent sales.

What is the role of feedback in relationship marketing?

Feedback is a critical component of relationship marketing, as it helps businesses understand customer needs and improve their offerings. Actively seeking feedback through surveys or reviews demonstrates that you value customer opinions. Acting on this feedback not only strengthens trust but also shows customers that their voices matter.

How does relationship marketing impact revenue?

Relationship marketing directly impacts revenue by increasing customer retention and lifetime value. According to Bain & Company, returning customers spend 67% more than new ones. Additionally, satisfied customers are more likely to recommend your brand, reducing acquisition costs and driving organic growth.

What are some tools for relationship marketing?

CRM Software: Platforms like Salesforce or HubSpot for tracking customer interactions.

Email Marketing Tools: Services like Mailchimp or Klaviyo for personalized communication.

Social Media Management: Tools like Hootsuite or Sprout Social for engagement tracking.

Survey Tools: Platforms like SurveyMonkey to gather and act on customer feedback.

These tools help streamline and enhance your relationship marketing efforts.

How can relationship marketing support ecommerce businesses?

For ecommerce businesses, relationship marketing drives repeat purchases, reduces cart abandonment, and builds brand loyalty. Strategies like personalized product recommendations, post-purchase emails, and exclusive discounts help nurture relationships. Loyal ecommerce customers often become brand advocates, further boosting sales.

How can data enrichment improve relationship marketing?

Data enrichment enhances your customer profiles by adding insights like purchase history, job roles, or preferences. This enables hyper-personalized communication, such as tailored product recommendations or targeted offers. With enriched data, you can craft campaigns that resonate deeply with individual customers.

What role does storytelling play in relationship marketing?

Storytelling helps brands connect emotionally with their audience, creating stronger bonds. Sharing stories about your company’s values, customer success, or community impact makes your brand relatable and memorable. For example, a sustainable fashion brand might share stories about how their materials are sourced ethically, resonating with environmentally conscious customers.

How can relationship marketing reduce churn?

Relationship marketing reduces churn by addressing customer pain points and maintaining regular, meaningful communication. Proactive strategies like renewal reminders, personalized offers, and satisfaction surveys show customers you care. Engaged and valued customers are less likely to leave, keeping churn rates low.

Design multi-agent orchestration with reasoning using Amazon Bedrock a …

As generative AI capabilities evolve, successful business adoption hinges on the development of robust problem-solving capabilities. At the forefront of this transformation are agentic systems, which harness the power of foundation models (FMs) to tackle complex, real-world challenges. By seamlessly integrating multiple agents, these solutions enable autonomous collaboration, decision-making, and efficient problem-solving in diverse environments. Empirical research conducted by Amazon Web Services (AWS) scientists in conjunction with academic researchers has demonstrated significant strides in enhancing reasoning capabilities through agent collaboration on competitive tasks.
This post provides step-by-step instructions for creating a collaborative multi-agent framework with reasoning capabilities that decouples business applications from FMs. It demonstrates how to combine Amazon Bedrock Agents with open source multi-agent frameworks, enabling collaboration and reasoning among agents to dynamically execute various tasks. The exercise will guide you through the process of building a reasoning orchestration system using Amazon Bedrock, Amazon Bedrock Knowledge Bases, Amazon Bedrock Agents, and FMs. We also explore the integration of Amazon Bedrock Agents with the open source orchestration frameworks LangGraph and CrewAI for dispatching and reasoning.
AWS has introduced a multi-agent collaboration capability for Amazon Bedrock, enabling developers to build, deploy, and manage multiple AI agents working together on complex tasks. This feature allows for the creation of specialized agents that handle different aspects of a process, coordinated by a supervisor agent that breaks down requests, delegates tasks, and consolidates outputs. This approach improves task success rates, accuracy, and productivity, especially for complex, multi-step tasks.
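As a rough sketch of this supervisor pattern, the following code uses the multi-agent collaboration options of the bedrock-agent Boto3 client. The parameter names (agentCollaboration, associate_agent_collaborator) and all identifiers below are assumptions for illustration; verify them against the current Boto3 documentation before relying on them.

import boto3

bedrock_agent = boto3.client('bedrock-agent')

# Hedged sketch: create a supervisor agent that delegates to specialist agents.
# All names, ARNs, and instructions here are hypothetical placeholders.
supervisor = bedrock_agent.create_agent(
    agentName='task-supervisor',                      # hypothetical name
    agentResourceRoleArn='<supervisor_role_arn>',     # IAM role for the agent
    foundationModel='anthropic.claude-3-sonnet-20240229-v1:0',
    instruction='Break down user requests and delegate subtasks to collaborator agents.',
    agentCollaboration='SUPERVISOR',                  # enable multi-agent collaboration
)

# Register an existing specialist agent (by alias ARN) as a collaborator
bedrock_agent.associate_agent_collaborator(
    agentId=supervisor['agent']['agentId'],
    agentVersion='DRAFT',
    agentDescriptor={'aliasArn': '<specialist_agent_alias_arn>'},  # hypothetical collaborator
    collaboratorName='booking-assistant',
    collaborationInstruction='Handle booking-related subtasks.',
)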
For the example code and demonstration discussed in this post, refer to the agentic-orchestration GitHub repository and this AWS Workshop. You can also refer to the GitHub repo for Amazon Bedrock multi-agent collaboration code samples.
Key characteristics of an agentic service
In the context of generative AI, an agent is an autonomous function that can interact with its environment, gather data, and make decisions to execute complex tasks toward predefined goals. Generative AI agents are autonomous, goal-oriented systems that use FMs, such as large language models (LLMs), to interact with and adapt to their environments. These agents excel in planning, problem-solving, and decision-making, using techniques such as chain-of-thought prompting to break down complex tasks. They can self-reflect, improve their processes, and expand their capabilities through tool use and collaboration with other AI models. Agents can operate independently or collaboratively, executing tasks across various domains while continuously adapting to new information and changing circumstances. They can increase creativity and produce content at scale, automating repetitive tasks so humans can focus on strategic work, which leads to cost savings. The following diagram shows the high-level architecture of the solution.

To implement an agent on AWS, you can use the Amazon Bedrock Agents Boto3 client as demonstrated in the following code example. After the required AWS Identity and Access Management (IAM) role is created for the agent, use the create_agent API. This API requires an agent name, an FM identifier, and an instruction string; optionally, you can also provide an agent description. The created agent is not yet prepared for use. We focus on preparing the agent and then using it to invoke actions and interact with other APIs. Use the following code example to obtain your agent ID; it is crucial for performing subsequent operations with the agent.
import boto3

# Use the Python boto3 SDK to interact with the Amazon Bedrock Agents service
bedrock_agent_client = boto3.client('bedrock-agent')

# Create a new Bedrock agent; values in <angle brackets> are placeholders you supply
response = bedrock_agent_client.create_agent(
    agentName=<agent_name>,                            # customized text string
    agentResourceRoleArn=<agent_role['Role']['Arn']>,  # IAM role assigned to the agent
    description=<agent_description>,                   # customized text string
    idleSessionTTLInSeconds=1800,
    foundationModel=<agent_foundation_model>,          # e.g. "anthropic.claude-3-sonnet-20240229-v1:0"
    instruction=<agent_instruction>,                   # agent instruction text string
)
agent_id = response['agent']['agentId']
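
The new agent starts in a NOT_PREPARED state. The following is a minimal sketch, not taken from the post's repository, of the typical follow-on steps: prepare the draft agent, create an alias, and invoke it through the bedrock-agent-runtime client. The alias name, test input, and the omission of status polling are illustrative assumptions.

import uuid

# Prepare the draft agent; in practice, poll get_agent() until
# agentStatus is PREPARED before creating the alias.
bedrock_agent_client.prepare_agent(agentId=agent_id)

# Create an alias to expose the prepared agent (alias name is hypothetical)
alias = bedrock_agent_client.create_agent_alias(
    agentId=agent_id,
    agentAliasName='demo-alias',
)
alias_id = alias['agentAlias']['agentAliasId']

# Invoke the agent through the runtime client; the response streams back in chunks
runtime = boto3.client('bedrock-agent-runtime')
response = runtime.invoke_agent(
    agentId=agent_id,
    agentAliasId=alias_id,
    sessionId=str(uuid.uuid4()),    # one session ID per conversation
    inputText='In a few words, what can you help me with?',
)
completion = ''
for event in response['completion']:
    if 'chunk' in event:
        completion += event['chunk']['bytes'].decode('utf-8')
print(completion)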
Multi-agent pipelines for inter-agent collaboration
Multi-agent pipelines are orchestrated processes within AI systems that involve multiple specialized agents working together to accomplish complex tasks. Within pipelines, agents are organized sequentially, with different agents handling specific subtasks or roles within the overall workflow. Agents interact with each other, often through a shared "scratchpad" or messaging system, allowing them to exchange information and build on each other's work. Each agent maintains its own state, which can be updated with new information as the flow progresses. Complex projects are broken down into manageable subtasks, which are then distributed among the specialized agents. The workflow includes clearly defined processes for how tasks should be orchestrated, facilitating efficient task distribution and alignment with objectives. These processes can govern both inter-agent interactions and intra-agent operations (such as how an agent interacts with tools or processes outputs). Agents can be assigned specific roles (for example, retriever or injector) to tackle different aspects of a problem.
As a practical example, consider a multi-agent pipeline for blog writing, implemented with the multi-agent framework CrewAI. To create a multi-agent pipeline with CrewAI, first define the individual agents that will participate in the pipeline; in the following example, these are a Planner Agent, a Writer Agent, and an Editor Agent. Next, arrange these agents into a pipeline, specifying the order of task execution and how data flows between them. CrewAI provides mechanisms for agents to pass information to each other and coordinate their actions. The modular and scalable design of CrewAI makes it well-suited for developing both simple and sophisticated multi-agent AI applications. The following diagram shows this multi-agent pipeline.

from crewai import Agent, Task, Crew, Process

# Create a blog-writing multi-agent pipeline comprised of a planner, a writer, and an editor agent.
# This code snippet shows only the planner agent, which calls web search tools
# and Amazon Bedrock for the LLM.
class blogAgents():
    def __init__(self, topic, model_id):
        self.topic = topic
        self.model_id = model_id

    def planner(self, topic, model_id):
        return Agent(
            role="Content Planner",
            goal=f"""Plan engaging and factually accurate content on {topic}.""",
            backstory=f"""You're working on planning a blog article about the topic: {topic}.\n
You collect information by searching the web for the latest developments that directly relate to the {topic}.\n
You help the audience learn something to make informed decisions regarding {topic}.\n
Your work is the basis for the Content Writer to write an article on this {topic}.""",
            allow_delegation=False,
            tools=<tools_to_use>,             # web search tools assigned to the agent
            llm=<Bedrock_foundation_model>,   # Amazon Bedrock FM identifier
            verbose=True
        )
    ……

# Create the associated blog agent tasks, comprised of planner, writer, and editor tasks.
# This code snippet shows only the planner task.
class blogTasks():
    def __init__(self, topic, model_id):
        self.topic = topic
        self.model_id = model_id

    def plan(self, planner, topic, model_id):
        return Task(
            description=(
                f"""1. Prioritize the latest trends, key players, and noteworthy news on {topic}.\n
2. Identify the target audience, considering their interests and pain points.\n
3. Develop a detailed content outline including an introduction, key points, and a call to action.\n
4. Include SEO keywords and relevant data or sources."""
            ),
            expected_output=f"""Convey the latest developments on the {topic} with sufficient depth as a domain expert.\n
Create a comprehensive content plan document with an outline, audience analysis,
SEO keywords, and resources.""",
            agent=planner
        )
    ……

# Define the planner agent and its planning task
planner_agent = agents.planner(self.topic, self.model_id)
plan_task = tasks.plan(planner_agent, self.topic, self.model_id)
……

# Define an agentic pipeline to chain the agents and associated tasks
# with service components, an embedding engine, and an execution process
crew = Crew(
    agents=[planner_agent, writer_agent, editor_agent],
    tasks=[plan_task, write_task, edit_task],
    verbose=True,
    memory=True,
    embedder={
        "provider": "huggingface",
        "config": {"model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"},
    },
    cache=True,
    process=Process.sequential  # tasks are executed one after the other
)
result = crew.kickoff()
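
The snippet references agents, tasks, and the writer and editor objects without showing their setup (the self.topic in the definition step suggests the original code lives inside a wrapper class). A minimal, assumed wiring could look like the following, with the topic and model ID values purely illustrative:

# Hypothetical glue code (assumption): instantiate the factory classes and
# build the writer/editor agents and tasks analogously to the planner above.
topic = 'multi-agent orchestration'                   # illustrative topic
model_id = 'anthropic.claude-3-sonnet-20240229-v1:0'  # illustrative Bedrock model
agents = blogAgents(topic, model_id)
tasks = blogTasks(topic, model_id)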

As demonstrated in this code example, multi-agent pipelines are generally simple linear structures that are easy to set up and understand. They have a clear sequential flow of tasks from one agent to the next and work well for straightforward workflows with a defined order of operations. However, the pipeline structure is less flexible for complex, nonlinear agent interactions, which makes it harder to handle branching logic or cycles, and it can be less efficient for problems that require back-and-forth between agents. The next section addresses a graph framework for multi-agent systems, which lends itself better to more complex scenarios.
Multi-agent graph framework for asynchronous orchestration and reasoning
A multi-agent framework offers significant potential for intelligent, dynamic problem-solving, enabling collaborative, specialized task execution. While these systems can enhance inference accuracy and response efficiency by dynamically activating and coordinating agents, they also present critical challenges, including potential bias, limited reasoning capabilities, and the need for robust oversight. Effective multi-agent frameworks require careful design considerations: clear leadership, dynamic team construction, effective information sharing, planning mechanisms such as chain-of-thought prompting, memory systems for contextual learning, and strategic orchestration of specialized language models. As the technology evolves, balancing agent autonomy with human oversight and ethical safeguards will be crucial to unlocking the full potential of these intelligent systems while mitigating potential risks.
A multi-agent graph framework is a system that models the interactions and relationships between multiple autonomous agents using a graph-based representation. In this type of framework, agents are represented as nodes in the graph, with each agent having its own set of capabilities, goals, and decision-making processes. The edges in the graph represent the interactions, communications, or dependencies between the agents. These can include things like information sharing, task delegation, negotiation, or coordination. The graph structure allows for the modeling of complex, dynamic relationships between agents, including cycles, feedback loops, and hierarchies. The following diagram shows this architecture.

The graph-based approach provides a flexible and scalable way to represent the structure of multi-agent systems, making it easier to analyze, simulate, and reason about the emergent behaviors that arise from agent interactions. The following code snippet illustrates the process of building a graph framework designed for multi-agent orchestration using LangGraph. This framework is essential for managing and coordinating the interactions between multiple agents within a system, promoting efficient and effective communication and collaboration. Notably, it emphasizes the plug-and-play feature, which allows for dynamic changes and the flexibility to accommodate third-party agents. Frameworks with this capability can seamlessly adapt to new requirements and integrate with external systems, enhancing their overall versatility and usability.
from langgraph.graph import StateGraph, END
……
# Create a graph to orchestrate multiple agents (i.e., nodes)
orch = StateGraph(MultiAgentState)
orch.add_node("rewrite_agent", rewrite_node)
orch.add_node("booking_assistant", bedrock_agent_node)
orch.add_node("blog_writer", blog_writer_node)
orch.add_node("router_agent", router_node)
orch.add_node("search_expert", search_expert_node)
……

# Create edges to connect agents to form a graph
orch.set_entry_point("rewrite_agent")
orch.add_edge("rewrite_agent", "router_agent")
orch.add_conditional_edges(
    "RAG_agent",
    decide_to_search,
    {
        "to_human": "human",
        "do_search": "search_expert",
    },
)
orch.add_edge("blog_writer", "text2image_generation")
……

# Compile the graph for agentic orchestration
graph = orch.compile(checkpointer=memory, interrupt_before=["human"])
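
The snippet assumes a few pieces it doesn't show: the shared state schema MultiAgentState, the node functions, and the memory checkpointer. The following is a minimal sketch of those pieces under common LangGraph conventions; the field names and the pass-through node body are illustrative assumptions.

from typing import Annotated, TypedDict

from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph.message import add_messages

# Illustrative state schema shared by all nodes in the graph
class MultiAgentState(TypedDict):
    messages: Annotated[list, add_messages]  # conversation history, appended by each node
    question_type: str                       # set by the router to choose the next agent

def rewrite_node(state: MultiAgentState) -> dict:
    # Each node receives the current state and returns a partial state update.
    # A real implementation would call an LLM here to rewrite the user query.
    last = state['messages'][-1].content        # most recent message text
    return {'messages': [('assistant', last)]}  # placeholder pass-through

memory = MemorySaver()  # in-memory checkpointer passed to orch.compile()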

The multi-agent graph approach is particularly useful for domains where complex, dynamic interactions between autonomous entities need to be modeled and analyzed, such as robotics, logistics, and social networks. The graph-based approach has several advantages and disadvantages relative to linear multi-agent pipelines, which are captured below.
Advantages and limitations
The emergence of agentic services represents a transformative approach to system design. Unlike conventional AI models that adhere to fixed, predetermined workflows, agentic systems are characterized by their capacity to collaborate, adapt, and make decisions in real time. This transition from passive to active AI opens up exciting opportunities and presents unique design challenges for developers and architects. Central to agentic services is the notion of agentic reasoning, which embodies a flexible, iterative problem-solving methodology that reflects human cognitive processes. By integrating design patterns such as reflection, self-improvement, and tool utilization, we can develop AI agents that are capable of ongoing enhancement and broader functionality across various domains.
Agentic services, although promising, face several limitations that must be addressed for their successful production implementation. The complexity of managing multiple autonomous agents, especially as their numbers and scope increase, poses a significant challenge in maintaining system coherence and stability. Additionally, the emergent behaviors of these systems can be difficult to predict and understand, hindering transparency and interpretability, which are crucial for building trust and accountability. Safety and robustness are paramount concerns because unintended behaviors or failures could have far-reaching consequences, necessitating robust safeguards and error-handling mechanisms. As agentic services scale up, maintaining efficient performance becomes increasingly challenging, requiring optimized resource utilization and load balancing. Finally, the lack of widely adopted standards and protocols for agent-based systems creates interoperability issues, making it difficult to integrate these services with existing infrastructure. Addressing these limitations is essential for the widespread adoption and success of agentic services in various domains.
Advantages:

More flexible representation of agent interactions using a graph structure
Better suited for complex workflows with nonlinear agent communication
Can more easily represent cycles and branching logic between agents
Potentially more scalable for large multi-agent systems
Clearer visualization of overall agent system structure

Disadvantages:

More complex initial setup compared to linear pipelines
Can require more upfront planning to design the graph structure
Can require extra resource usage and longer response times

Next steps
In the next phase of multi-agent orchestration, our focus will be on enhancing the reasoning, reflection, and self-correction capabilities of our agents. This involves developing advanced algorithms (such as tree-of-thoughts (ToT) prompting, Monte Carlo tree search (MCTS), and others) that allow agents to learn from their peer interactions, adapt to new situations, and correct their behaviors based on feedback. Additionally, we’re working on creating a production-ready framework that can accommodate a variety of agentic services. This framework will be designed to be flexible and scalable, enabling seamless integration of different types of agents and services. These efforts are currently underway, and we’ll provide a detailed update on our progress in the next blog post. Stay tuned for more insights into our innovative approach to multi-agent orchestration.
Conclusion
Multi-agent orchestration and reasoning represent a significant leap forward in generative AI production adoption, offering unprecedented potential for complex problem-solving and decision-making while decoupling your applications from individual FMs. It's also crucial to acknowledge and address the limitations, including scalability challenges, high latency, and potential incompatibility among different agents. As we look to the future, enhancing the reasoning, reflection, and self-correction capabilities of our agents, both within a single agent and across agents, will be paramount. This will involve developing more sophisticated algorithms for metacognition, improving inter-agent communication protocols, and implementing robust error detection and correction mechanisms.
For the example code and demonstration discussed in this post, refer to the agentic-orchestration GitHub repository and this AWS Workshop. You can also refer to the GitHub repo for Amazon Bedrock multi-agent collaboration code samples.
The authors wish to express their gratitude to Mark Roy, Maria Laderia Tanke, and Max Iguer for their insightful contributions, as well as to Nausheen Sayed for her relentless coordination.

About the authors
Alfred Shen is a Senior GenAI Specialist at AWS. He has been working in Silicon Valley, holding technical and managerial positions in diverse sectors including healthcare, finance, and high-tech. He is a dedicated applied AI/ML researcher, concentrating on agentic solutions and multimodality.
Anya Derbakova is a Senior Startup Solutions Architect at AWS, specializing in Healthcare and Life Science technologies. A University of North Carolina graduate, she previously worked as a Principal Developer at Blue Cross Blue Shield Association. Anya is recognized for her contributions to AWS professional development, having been featured on the AWS Developer Podcast and participating in multiple educational series. She co-hosted a six-part mini-series on AWS Certification Exam Prep, focusing on cost-optimized cloud architecture strategies. Additionally, she was instrumental in the “Get Schooled on…Architecting” podcast, which provided comprehensive preparation for the AWS Solutions Architect Exam.

Contact Enrichment Strategies: How to Personalize Outreach That Conver …

Nobody’s getting excited about another generic email that starts with “Hi [First Name].” 

In fact, 72% of consumers say they only engage with messages that are personalized to their interests. That’s why your outreach game needs an upgrade – and contact enrichment is the cheat code.

So, what’s contact enrichment? 

Basically, it’s like giving your lead data a glow-up. Instead of just a name and email, you’re layering in juicy details like job titles, company size, social media activity, and even what they’ve browsed on your site. 

With this extra intel, you can create messages that make people think, “Whoa, it’s like they read my mind.”

In this post, we’re gonna break down how to use contact enrichment to craft hyper-personalized campaigns that actually convert. From the tools you need to strategies that work, we’ve got you covered. 

Ready to turn cold leads into red-hot conversions? Let’s get into it.


Why Contact Enrichment Is Essential to Your Email Strategy

Basic data like a first name and email might have cut it back in the early 2000s, but today? It’s barely scratching the surface. 

Customers expect more. They want to feel like you get them and “Hi [First Name]” just isn’t doing that.

That’s where enriched data comes in. 

Instead of bland, one-size-fits-all outreach, imagine knowing your lead’s job title, company size, and even their favorite social platform. 

Suddenly, you’re speaking directly to them. 

And it works! Personalized emails deliver 6x higher transaction rates than generic ones. Six. Times. That's huge.

The thing is, when you have enriched contact data, everything gets better. 

Your targeting is sharper, your messages hit harder, and your campaigns feel more natural. 

The result? Higher engagement, meaningful interactions, and way more conversions.

The Most Effective Contact Enrichment Strategies 

Alright, so you get why enriched contact data is the backbone of killer personalization. Now, let’s dig into the how. 

Forget hoarding data just to say you have it. The real win is using the right contact enrichment strategies and tools to turn raw information into insights you can actually use. 

Let’s break it down:

Start With the Right Tools

If you’re serious about accurate contact enrichment, Customers.ai should be your first stop. 

Unlike basic tools that just pull surface-level info, Customers.ai gives you actionable, real-time behavioral insights. 

Want to know which pages your lead visited, what products caught their attention, or how long they stayed on your pricing page? We can tell you! 

Even better, we work with tools like Klaviyo to automate your workflows, resulting in more personalized and scalable outreach.

Here is how to identify and enrich your website visitor data with Customers.ai:

1. Sign up for a free account

If you don’t already have a Customers.ai account, sign up here (no credit card is required) and connect your business.

2. Install the x-ray pixel on your site

Installing the website identification x-ray pixel is easy and can be done through Tag Manager, Shopify, WordPress, and more.

3. Verify the x-ray pixel is firing

4. Start identifying your website visitors

That’s it! Once the pixel is installed and verified, you can start identifying your website visitors.

5. Get enriched contact data

Beyond identification, Customers.ai can also fill in gaps in your CRM with firmographic details like company size and industry. 

Need to target high-level decision-makers?

We can drill into professional details like LinkedIn profiles or business emails, ensuring your outreach lands in front of the right people. 

Leverage Multiple Data Sources

When you rely on only one data source, you increase your chances of being wrong. And no one wants to send an email with the wrong information. 

Instead, you want to piece together data from everywhere to create a full profile. 

Here’s what that could look like:

CRM Systems: Your key to purchase history, contact preferences, and interaction timelines.

Social Media Activity: Platforms like LinkedIn or Twitter can reveal professional interests and current challenges.

Behavioral Tracking: Tools like Customers.ai let you see exactly how leads interact with your site, from which pages they visited to how long they stayed.

Email Engagement Data: Open rates, link clicks, and replies can tell you what topics or offers resonate most.

By blending these sources, you get the full story behind your customer data.

Focus on High-Value Fields

Let’s get specific. Not every piece of data is worth chasing, so prioritize the fields that drive real results:

Industry and Job Title: Knowing their role gives you the context to speak their language. A VP of Marketing has different goals than an IT Manager.

Recent Online Activity: Did they download a guide? Watch a webinar? Visit your pricing page three times in a week? These are the breadcrumbs that tell you what they’re interested in.

Pain Points: Dig into challenges they’ve expressed directly or inferred based on their behavior and industry. Addressing these upfront makes your outreach impossible to ignore.

When you combine smart tools, multiple data streams, and a laser focus on what matters, your outreach can really start delivering results.


How to Use Enriched Contact Data to Personalize Outreach

To make enriched contact data work for you, it’s not enough to just collect it. You need to know how to apply it in a way that resonates with your audience. 

Let’s break down some contact enrichment strategies that will turn this data into outreach that connects and converts.

Segment Smarter

Email marketers know that advanced segmentation is where the magic begins. Use data like industry, company size, or even engagement patterns to create micro-segments that feel personal. 

For example, craft messaging specifically for fast-growing startups or larger enterprises. This level of precision is what separates great campaigns from forgettable ones.

Craft Personalized Messages

As we’ve already said, personalization goes far beyond dropping a name. Enriched data lets you create outreach that feels tailor-made. 

Some examples:

“Hi [Name], I saw your company just raised Series A funding. Here’s how we’ve helped other fast-growing startups like [Competitor].”

“Hi [Name], as a [Job Title] in [Industry], you’re probably juggling [specific challenge]. Here’s how we’ve solved this for companies like [Competitor].”

Use specifics to show you’ve done your homework. The kind of details that make recipients pause and say, “Wow, they really get me.”

Optimize Timing and Channels

It’s not just about what you say but about when and where you say it. 

Use insights like time zone, preferred communication channels, and activity patterns to optimize outreach timing. 

For B2B leads, LinkedIn might be your go-to for engagement. For ecommerce, email follow-ups timed to align with browsing behavior can drive conversions. The key is showing up where your leads are already paying attention, not making them come to you.

When you combine segmentation, hyper-personalization, and strategic timing, your outreach evolves from ordinary to extraordinary.

Go Bigger: Advanced Contact Enrichment Strategies for Better Outreach

Let’s go bigger! We want to move beyond surface-level personalization and focus on advanced contact enrichment strategies that deliver real results. Here’s what that looks like.

Behavior-Triggered Campaigns

Use your enriched contact data to create automated workflows based on your lead’s real-time actions.

Here’s how to do it:

Track key behaviors: Monitor when leads visit your pricing page, download a resource, or abandon their cart. For example, 67% of cart abandoners are likely to convert with a well-timed follow-up email.

Set up automated responses: Use tools like Customers.ai to send personalized emails or SMS messages within minutes of these behaviors. If a lead hovers over your pricing page but doesn’t convert, follow up with a tailored email offering a free consultation or demo.

Integrate Enriched Data into ABM

Account-based marketing (ABM) thrives on precision, and enriched contact data gives it the boost it needs.

Identify high-value accounts: Use enriched data to pinpoint decision-makers and understand their company structure.

Personalize campaigns for each account: If a target account recently expanded their team, craft messaging that highlights how your solution scales with growth. Customers.ai can automate these personalized campaigns while keeping the human touch.

Track engagement at the account level: Use insights like email opens, content downloads, or webinar attendance to refine your approach for each account.

Enrich Social Media Engagement

Your audience is already active on platforms like LinkedIn and Instagram. Use enriched contact data to connect these interactions with your overall strategy.

Identify leads from social activity: Match social media interactions, like LinkedIn comments or Instagram DMs, with CRM data to create a fuller profile.

Respond with context: If someone comments on a LinkedIn post about scaling a team, follow up with a message offering a relevant case study or resource.

Measure conversions: Track how social engagements lead to pipeline activity and sales to ensure your strategy is ROI-driven.

By combining behavior-triggered campaigns, ABM tactics, and social media engagement, enriched data transforms your outreach from standard to strategic.


Contact Enrichment No No’s: Strategies to Avoid

Even the best data can backfire if you’re not careful. Here are the top mistakes to avoid and specific examples to bring them to life:

Overloading Data

Too much data can overwhelm your team and derail your strategy. Focus on actionable fields instead of drowning in irrelevant details.

What NOT to do: “Here’s a lead’s favorite snack, their pet’s name, and the last three companies they followed on LinkedIn.” This isn’t helping you craft a better email.

What TO do: “This lead is the VP of Marketing at a SaaS company with 200 employees. They’ve been engaging with our content about demand generation.” Now your team knows exactly how to frame the outreach.

Ignoring Data Updates

Stale or outdated data can make your outreach irrelevant or worse, awkward. Regular updates keep your messaging sharp.

What NOT to do: Sending an email like, “Hi [Name], we see you’re at [Company]” when they left that role three months ago. This screams “we don’t care enough to stay current.”

What TO do: Use tools like Customers.ai to automatically refresh data so you can confidently say, “Hi [Name], congrats on your new role as VP of Growth at [Company]. Here’s how we’ve helped other teams scale during transitions.”

Forgetting the Human Element

Even the best data won’t save a message that feels robotic. Personalization needs to feel genuine, not like it was cobbled together by an algorithm.

What NOT to do: “Hello [Name], we noticed your company size is [Company Size]. Our product helps companies of your size achieve [Generic Outcome].” This feels like a form letter.

What TO do: “Hi [Name], I saw your team is growing fast at [Company]. Scaling can be a challenge, and I’d love to share how we helped [Similar Company] streamline operations during their growth phase.” This keeps it personal and relatable.

When you avoid these missteps, your contact enrichment efforts will stay focused, relevant, and human-centered.

Enrich Your Contact Data, Enrich Your Results

By now, it’s clear that enriched contact data is a must. Whether you’re segmenting smarter, crafting hyper-personalized messages, or optimizing your timing and channels, the right data makes all the difference.

But it’s important to remember that it’s not just about the data. It’s also about how you use it. 

Regular updates, good segmentation, and keeping the human element front and center are keys to a successful contact enrichment strategy.

Take a moment to audit your current contact enrichment process. Are you using tools like Customers.ai website visitor identification to its fullest potential? 

If not, now’s the time to streamline and enhance your strategy. Better data leads to better results, so don’t settle for average when you can deliver exceptional.

Ready to take your outreach to the next level? Start your free trial of Customers.ai today and get 500 contacts free.


Important Next Steps

See what targeted outbound marketing is all about. Capture and engage your first 500 website visitor leads with Customers.ai X-Ray website visitor identification for free.

Talk and learn about sales outreach automation with other growth enthusiasts. Join Customers.ai Island, our Facebook group of 40K marketers and entrepreneurs who are ready to support you.

Advance your marketing performance with Sales Outreach School, a free tutorial and training area for sales pros and marketers.
