Generate compliant content with Amazon Bedrock and ConstitutionalChain

Generative AI has emerged as a powerful tool for content creation, significantly improving the efficiency and effectiveness of content production tasks such as writing marketing materials, generating images, and moderating content. Constitutional AI and LangGraph's reflection mechanisms are two complementary approaches to making sure AI systems behave ethically: Anthropic embeds principles during model training, whereas LangGraph applies them at inference time through reflection and self-correction. By applying Constitutional AI principles within LangGraph, content creators can streamline their workflow while maintaining high standards of user-defined compliance and ethical integrity. This approach not only reduces the need for extensive human oversight but also improves the transparency and accountability of AI-generated content.
In this post, we explore practical strategies for using Constitutional AI to produce compliant content efficiently and effectively, using Amazon Bedrock and LangGraph to build a ConstitutionalChain for rapid content creation in highly regulated industries like finance and healthcare. Although AI offers significant productivity benefits, maintaining compliance with strict regulations is crucial, and manually validating AI-generated content for regulatory adherence can be time-consuming and challenging. We also provide an overview of how Insagic, a Publicis Groupe company, integrated this concept into their existing healthcare marketing workflow using Amazon Bedrock. Insagic is a next-generation insights and advisory business that combines data, design, and dialogues to deliver actionable insights and transformational intelligence for healthcare marketers. It uses expertise from data scientists, behavior scientists, and strategists to drive better outcomes in the healthcare industry.
Understanding Constitutional AI
Constitutional AI is designed to align large language models (LLMs) with human values and ethical considerations. It works by integrating a set of predefined rules, principles, and constraints into the LLM’s core architecture and training process. This approach makes sure that the LLM operates within specified ethical and legal parameters, much like how a constitution governs a nation’s laws and actions.
The key benefits of Constitutional AI for content creation include:

Ethical alignment – Content generated using Constitutional AI is inherently aligned with predefined ethical standards
Legal compliance – The LLM is designed to operate within legal frameworks, reducing the risk of producing non-compliant content
Transparency – The principles guiding the LLM’s decision-making process are clearly defined and can be inspected
Reduced human oversight – By embedding ethical guidelines into the LLM, the need for extensive human review is significantly reduced

Let’s explore how you can harness the power of Constitutional AI to generate compliant content for your organization.
Solution overview
For this solution, we use Amazon Bedrock Knowledge Bases to store a repository of healthcare documents. We employ a Retrieval Augmented Generation (RAG) approach, first retrieving relevant context and then synthesizing an answer based on the retrieved context, to generate articles based on the repository. We then use the open source orchestration framework LangGraph and ConstitutionalChain to generate, critique, and revise prompts in an Amazon SageMaker notebook and develop an agentic workflow to generate compliant content. The following diagram illustrates this architecture.

This implementation demonstrates a sophisticated agentic workflow that not only generates responses based on a knowledge base but also employs a reflection technique to examine its outputs through ethical principles, allowing it to refine and improve its outputs. We upload a sample set of mental health documents to Amazon Bedrock Knowledge Bases and use those documents to write an article on mental health using a RAG-based approach. Later, we define a constitutional principle with a custom Diversity, Equity, and Inclusion (DEI) principle, specifying how to critique and revise responses for inclusivity.
Prerequisites
To deploy the solution, you need the following prerequisites:

An AWS account
Appropriate AWS Identity and Access Management (IAM) permissions to access an Amazon Simple Storage Service (Amazon S3) bucket, create Amazon Bedrock knowledge bases, and create a SageMaker notebook instance

Create an Amazon Bedrock knowledge base
To demonstrate this capability, we download a mental health article from the following GitHub repo and store it in Amazon S3. We then use Amazon Bedrock Knowledge Bases to index the articles. By default, Amazon Bedrock uses Amazon OpenSearch Serverless as a vector database. For full instructions to create an Amazon Bedrock knowledge base with Amazon S3 as the data source, see Create a knowledge base in Amazon Bedrock Knowledge Bases.

On the Amazon Bedrock console, create a new knowledge base.
Provide a name for your knowledge base and create a new IAM service role.
Choose Amazon S3 as the data source and provide the S3 bucket storing the mental health article.
Choose Amazon Titan Text Embeddings v2 as the embeddings model and OpenSearch Serverless as the vector store.
Choose Create Knowledge Base.

Import statements and set up an Amazon Bedrock client
Follow the instructions provided in the README file in the GitHub repo. Clone the GitHub repo to make a local copy. We recommend running this code in a SageMaker JupyterLab environment. The following code imports the necessary libraries, including Boto3 for AWS services, LangChain components, and Streamlit. It sets up an Amazon Bedrock client and configures Anthropic’s Claude 3 Haiku model with specific parameters.

import boto3
from langchain_aws import ChatBedrock

bedrock_runtime = boto3.client(service_name="bedrock-runtime", region_name="us-east-1")
llm = ChatBedrock(client=bedrock_runtime, model_id="anthropic.claude-3-haiku-20240307-v1:0")
…..

Define Constitutional AI components
Next, we define a Critique class to structure the output of the critique process. Then we create prompt templates for critique and revision. Lastly, we set up chains using LangChain for generating responses, critiques, and revisions.

# LangChain Constitutional chain migration to LangGraph

class Critique(TypedDict):
    """Generate a critique, if needed."""

    critique_needed: Annotated[bool, ..., "Whether or not a critique is needed."]
    critique: Annotated[str, ..., "If needed, the critique."]

critique_prompt = ChatPromptTemplate.from_template(
    "Critique this response according to the critique request. "
)

revision_prompt = ChatPromptTemplate.from_template(
    "Revise this response according to the critique and revision request.\n\n"
    ....
)

chain = llm | StrOutputParser()
critique_chain = critique_prompt | llm.with_structured_output(Critique)
revision_chain = revision_prompt | llm | StrOutputParser()

Define a State class and refer to the Amazon Bedrock Knowledge Bases retriever
We define a LangGraph State class to manage the conversation state, including the query, principles, responses, and critiques:

# LangGraph State

class State(TypedDict):
    query: str
    constitutional_principles: List[ConstitutionalPrinciple]

Next, we set up an Amazon Bedrock Knowledge Bases retriever to extract the relevant information. We refer to the Amazon Bedrock knowledge base we created earlier to create an article based on mental health documents. Make sure to update the knowledge base ID in the following code with the knowledge base you created in previous steps:

#-----------------------------------------------------------------
# Amazon Bedrock KnowledgeBase

from langchain_aws.retrievers import AmazonKnowledgeBasesRetriever

retriever = AmazonKnowledgeBasesRetriever(
    knowledge_base_id="W3NMIJXLUE",  # Change it to your knowledge base ID
)

Create LangGraph nodes and a LangGraph graph along with constitutional principles
The next section of code integrates graph-based workflow orchestration, ethical principles, and a user-friendly interface to create a sophisticated Constitutional AI model. The following diagram illustrates the workflow.

It uses a StateGraph to manage the flow between RAG and critique/revision nodes, incorporating a custom DEI principle to guide the LLM’s responses. The system is presented through a Streamlit application, which provides an interactive chat interface where users can input queries and view the LLM’s initial responses, critiques, and revised answers. The application also features a sidebar displaying a graph visualization of the workflow and a description of the applied ethical principle. This comprehensive approach makes sure that the LLM’s outputs are not only knowledge-based but also ethically aligned by using customizable constitutional principles that guide a reflection flow (critique and revise), all while maintaining a user-friendly experience with features like chat history management and a clear chat option.
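To make the orchestration concrete, the following is a minimal sketch of how such a graph can be wired with LangGraph. The node names, the prompt variable names, and the assumption that the State also carries a response field are illustrative; this is not the exact code from the repo.

from langgraph.graph import StateGraph, END

def rag_node(state):
    # Retrieve context from the knowledge base and draft an initial response
    docs = retriever.invoke(state["query"])
    draft = chain.invoke(f"Use the following context to write an article.\n\nContext: {docs}\n\nTopic: {state['query']}")
    return {"response": draft}

def critique_and_revise_node(state):
    # For each constitutional principle, critique the draft and revise it if needed
    response = state["response"]
    for principle in state["constitutional_principles"]:
        critique = critique_chain.invoke({
            "response": response,
            "critique_request": principle.critique_request,
        })
        if critique["critique_needed"]:
            response = revision_chain.invoke({
                "response": response,
                "critique": critique["critique"],
                "revision_request": principle.revision_request,
            })
    return {"response": response}

workflow = StateGraph(State)
workflow.add_node("rag", rag_node)
workflow.add_node("critique_and_revise", critique_and_revise_node)
workflow.set_entry_point("rag")
workflow.add_edge("rag", "critique_and_revise")
workflow.add_edge("critique_and_revise", END)
app = workflow.compile()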
Streamlit application
The Streamlit application component of this code creates an interactive and user-friendly interface for the Constitutional AI model. It sets up a side pane that displays a visualization of the LLM’s workflow graph and provides a description of the DEI principle being applied. The main interface features a chat section where users can input their queries and view the LLM’s responses.

# ------------------------------------------------------------------------
# Streamlit App

# Clear Chat History function
def clear_screen():
    st.session_state.messages = [{"role": "assistant", "content": "How may I assist you today?"}]

with st.sidebar:
    st.subheader('Constitutional AI Demo')
    .....
    ConstitutionalPrinciple(
        name="DEI Principle",
        critique_request="Analyze the content for any lack of diversity, equity, or inclusion. Identify specific instances where the text could be more inclusive or representative of diverse perspectives.",
        revision_request="Rewrite the content by incorporating critiques to be more diverse, equitable, and inclusive. Ensure representation of various perspectives and use inclusive language throughout."
    )
    """)
    st.button('Clear Screen', on_click=clear_screen)

# Store LLM generated responses
if "messages" not in st.session_state.keys():
    st.session_state.messages = [{"role": "assistant", "content": "How may I assist you today?"}]

# Chat Input - User Prompt
if prompt := st.chat_input():
    ....

    with st.spinner("Generating..."):
        ....
        with st.chat_message("assistant"):
            st.markdown("**[initial response]**")
            ....
            st.session_state.messages.append({"role": "assistant", "content": "[revised response] " + generation['response']})

The application maintains a chat history, displaying both user inputs and LLM responses, including the initial response, any critiques generated, and the final revised response. Each step of the LLM’s process is clearly labeled and presented to the user. The interface also includes a Clear Screen button to reset the chat history. When processing a query, the application shows a loading spinner and displays the runtime, providing transparency into the LLM’s operation. This comprehensive UI design allows users to interact with the LLM while observing how constitutional principles are applied to refine the LLM’s outputs.
Test the solution using the Streamlit UI
In the Streamlit application, when a user inputs a query, the application initiates the process by creating and compiling the graph defined earlier. It then streams the execution of this graph, which includes the RAG and critique/revise steps. During this process, the application displays real-time updates for each node's execution, showing the user what's happening behind the scenes. The system measures the total runtime, providing transparency about the processing duration. When it's complete, the application presents the results in a structured manner within the chat interface. It displays the initial LLM-generated response, followed by any critiques made based on the constitutional principles, and finally shows the revised response that incorporates these ethical considerations. This step-by-step presentation allows users to see how the LLM's response evolves through the constitutional AI process, from initial generation to ethical refinement. As mentioned in the GitHub README file, to run the Streamlit application, use the following commands:

pip install -r requirements.txt
streamlit run main.py

For details on using a Jupyter proxy to access the Streamlit application, refer to Build Streamlit apps in Amazon SageMaker Studio.
Modify the Studio URL, replacing lab? with proxy/8501/.
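For example, a Studio URL of the following form (the domain and region shown here are placeholders):

https://d-xxxxxxxxxxxx.studio.us-east-1.sagemaker.aws/jupyter/default/lab?

becomes:

https://d-xxxxxxxxxxxx.studio.us-east-1.sagemaker.aws/jupyter/default/proxy/8501/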

How Insagic uses Constitutional AI to generate compliant content
Insagic uses real-world medical data to help brands understand people as patients and patients as people, enabling them to deliver actionable insights in the healthcare marketing space. Although generating deep insights in the health space can yield profound dividends, it must be done with consideration for compliance and the personal nature of health data. By defining federal guidelines as constitutional principles, Insagic makes sure that the content delivered by generative AI complies with federal guidelines for healthcare marketing.
Clean up
When you have finished experimenting with this solution, clean up your resources to prevent AWS charges from being incurred:

Empty the S3 buckets.
Delete the SageMaker notebook instance.
Delete the Amazon Bedrock knowledge base.

Conclusion
This post demonstrated how to implement a sophisticated generative AI solution using Amazon Bedrock and LangGraph to generate compliant content. You can also integrate this workflow to generate responses based on a knowledge base and apply ethical principles to critique and revise its outputs, all within an interactive web interface. Insagic is looking at more ways to incorporate this into existing workflows by defining custom principles to achieve compliance goals.
You can expand this concept further by incorporating Amazon Bedrock Guardrails. Amazon Bedrock Guardrails and LangGraph Constitutional AI can create a comprehensive safety system by operating at different levels. Amazon Bedrock provides API-level content filtering and safety boundaries, and LangGraph implements constitutional principles in reasoning workflows. Together, they enable multi-layered protection through I/O filtering, topic restrictions, ethical constraints, and logical validation steps in AI applications.
Try out the solution for your own use case, and leave your feedback in the comments.

About the authors
Sriharsh Adari is a Senior Solutions Architect at Amazon Web Services (AWS), where he helps customers work backwards from business outcomes to develop innovative solutions on AWS. Over the years, he has helped multiple customers on data platform transformations across industry verticals. His core areas of expertise include Technology Strategy, Data Analytics, and Data Science. In his spare time, he enjoys playing sports, binge-watching TV shows, and playing Tabla.
David Min is a Senior Partner Sales Solutions Architect at Amazon Web Services (AWS) specializing in Generative AI, where he helps customers transform their businesses through innovative AI solutions. Throughout his career, David has helped numerous organizations across industries bridge the gap between cutting-edge AI technology and practical business applications, focusing on executive engagement and successful solution adoption.
Stephen Garth is a Data Scientist at Insagic, where he develops advanced machine learning solutions, including LLM-powered automation tools and deep clustering models for actionable consumer insights. With a strong background spanning software engineering, healthcare data science, and computational research, he is passionate about applying his expertise in AI-driven analytics and large-scale data processing to drive solutions.
Chris Cocking specializes in scalable enterprise application design using multiple programming languages. With nearly 20 years of experience, he excels in LAMP and IIS environments, SEO strategies, and most recently designing agentic systems. Outside of work, Chris is an avid bassist and music lover, which helps fuel his creativity and problem-solving skills.

How to Use Git and Git Bash Locally: A Comprehensive Guide

Table of contents
Introduction
Installation (Windows, macOS, Linux, Verifying Installation)
Git Bash Basics (Navigation Commands, File Operations, Keyboard Shortcuts)
Git Configuration (Additional Configurations)
Basic Git Workflow (Initializing a Repository, Checking Status, Staging Files, Committing Changes)
Branching and Merging (Working with Branches, Merging Branches, Handling Merge Conflicts, Deleting Branches)
Remote Repositories (Adding a Remote Repository)
Advanced Git Commands (Stashing Changes, Reverting Changes, Interactive Rebase)
Troubleshooting (Common Issues and Solutions)
Git Best Practices (.gitignore Example)
Conclusion

Introduction

Git is a distributed version control system that helps you track changes in your code, collaborate with others, and maintain a history of your project. Git Bash is a terminal application for Windows that provides a Unix-like command-line experience for using Git.

This guide will walk you through setting up Git, using Git Bash, and mastering essential Git commands for local development.

Installation

Windows

Download Git for Windows from git-scm.com

Run the installer with default options (or customize as needed)

Git Bash will be installed automatically as part of the package

macOS

Install Git using Homebrew: brew install git

Alternatively, download from git-scm.com

Linux

For Debian/Ubuntu: sudo apt-get install git

For Fedora: sudo dnf install git

For other distributions, use the appropriate package manager

Verifying Installation

Open Git Bash (Windows) or Terminal (macOS/Linux) and type:
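git --version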

This should display the installed Git version.

Git Bash Basics

Git Bash provides a Unix-like shell experience on Windows. Here are some essential commands:

Navigation Commands

pwd – Print working directory

ls – List files and directories

cd [directory] – Change directory

mkdir [directory] – Create a new directory

rm [file] – Remove a file

rm -r [directory] – Remove a directory and its contents

File Operations

touch [filename] – Create an empty file

cat [filename] – Display file contents

nano [filename] or vim [filename] – Edit files in the terminal

Keyboard Shortcuts

Ctrl + C – Terminate the current command

Ctrl + L – Clear the screen

Tab – Auto-complete commands or filenames

Up/Down arrows – Navigate through command history

Git Configuration

Before using Git, configure your identity:
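git config --global user.name "Your Name"
git config --global user.email "you@example.com"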

Additional Configurations

Set your default editor:
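git config --global core.editor "nano"    # or "vim", "code --wait", etc.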

Enable colorful output:
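git config --global color.ui auto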

View all configurations:
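git config --list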

Basic Git Workflow

Initializing a Repository

Navigate to your project folder and initialize a Git repository:
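git init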

Checking Status

See which files are tracked, modified, or staged:
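git status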

Staging Files

Add files to the staging area:
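git add filename.txt    # stage a single file
git add .               # stage all changes in the current directory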

Committing Changes

Save staged changes to the repository:
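git commit -m "Describe your change"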

Or open an editor to write a more detailed commit message:
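git commit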

Viewing Commit History
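Review the commits in your repository:

git log
git log --oneline    # condensed view, one commit per line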

Branching and Merging

Working with Branches

Create a new branch:
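git branch feature-branch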

Switch to a branch:
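git checkout feature-branch    # or, on newer Git versions: git switch feature-branch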

Create and switch to a new branch in one command:
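git checkout -b feature-branch    # or: git switch -c feature-branch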

List all branches:
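git branch       # local branches
git branch -a    # include remote-tracking branches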

Merging Branches

Merge changes from another branch into your current branch:
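git merge feature-branch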

Handling Merge Conflicts

When Git can’t automatically merge changes, you’ll need to resolve conflicts:

Git will mark the conflicted files

Open the files and look for conflict markers (<<<<<<<, =======, >>>>>>>)

Edit the files to resolve conflicts

Add the resolved files: git add <filename>

Complete the merge: git commit

Deleting Branches

Delete a branch after merging:
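git branch -d feature-branch    # safe delete; refuses if the branch is unmerged
git branch -D feature-branch    # force delete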

Remote Repositories

Adding a Remote Repository
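Connect your local repository to a remote (the URL below is a placeholder):

git remote add origin https://github.com/username/repository.git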

Viewing Remote Repositories
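List the remotes configured for your repository:

git remote -v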

Pushing to a Remote Repository
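Upload your local commits to the remote (this assumes a branch named main):

git push -u origin main    # -u sets the upstream for future pushes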

Pulling from a Remote Repository
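Fetch and merge changes from the remote:

git pull origin main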

Cloning a Repository
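Copy an existing remote repository to your machine (placeholder URL):

git clone https://github.com/username/repository.git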

Advanced Git Commands

Stashing Changes

Temporarily store modified files to work on something else:
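git stash         # save uncommitted changes
git stash list    # list saved stashes
git stash pop     # reapply the most recent stash and remove it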

Reverting Changes

Undo commits:
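git revert <commit-hash>    # creates a new commit that undoes the given commit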

Reset to a previous state (use with caution):
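git reset --hard <commit-hash>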

Viewing and Comparing Changes
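Inspect what has changed:

git diff                 # unstaged changes
git diff --staged        # staged changes
git show <commit-hash>   # details of a specific commit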

Interactive Rebase

Rewrite, squash, or reorder commits:
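git rebase -i HEAD~3    # interactively edit the last three commits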

Troubleshooting

Common Issues and Solutions

Problem: “fatal: not a git repository”

Solution: Make sure you’re in the correct directory or initialize a repository with git init

Problem: Unable to push to remote repository

Solution:

Check if you have the correct permissions

Pull latest changes first: git pull origin main

Check if remote URL is correct: git remote -v

Problem: Merge conflicts

Solution: Resolve conflicts manually, then git add the resolved files and git commit

Problem: Accidental commit

Solution: Use git reset --soft HEAD~1 to undo the last commit while keeping changes

Git Best Practices

Commit frequently with clear, descriptive commit messages

Create branches for new features or bug fixes

Pull before pushing to minimize conflicts

Write meaningful commit messages that explain why changes were made

Use .gitignore to exclude unnecessary files (build artifacts, dependencies, etc.)

Review changes before committing with git diff and git status

Keep commits focused on a single logical change

Use tags for marking releases or important milestones

Back up your repositories regularly

Document your Git workflow for team collaboration

.gitignore Example

Create a .gitignore file in your repository root:
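A generic starting point (adjust the entries for your project's language and tooling):

# Dependencies
node_modules/

# Build output
dist/
build/

# Environment and secrets
.env

# Logs
*.log

# OS files
.DS_Store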

Customize this file according to your project’s specific needs.

Conclusion

Git and Git Bash provide powerful tools for version control and collaborative development. In this guide, we covered installation across platforms, essential Git Bash commands, repository initialization, the core add-commit workflow, branching strategies, remote repository management, and advanced operations like stashing and rebasing. We also addressed common troubleshooting scenarios and best practices to maintain a clean workflow. With these fundamentals, you’re now equipped to track changes, collaborate effectively, and maintain a structured history of your projects.
The post How to Use Git and Git Bash Locally: A Comprehensive Guide appeared first on MarkTechPost.

How to Build a Prototype X-ray Judgment Tool (Open Source Medical Infe …

In this tutorial, we demonstrate how to build a prototype X-ray judgment tool using open-source libraries in Google Colab. By leveraging the power of TorchXRayVision for loading pre-trained DenseNet models and Gradio for creating an interactive user interface, we show how to process and classify chest X-ray images with minimal setup. This notebook guides you through image preprocessing, model inference, and result interpretation, all designed to run seamlessly on Colab without requiring external API keys or logins. Please note that this demo is intended for educational purposes only and should not be used as a substitute for professional clinical diagnosis.

!pip install torchxrayvision gradio

First, we install the torchxrayvision library for X-ray analysis and Gradio to create an interactive interface.

import torch
import torchxrayvision as xrv
import torchvision.transforms as transforms
import gradio as gr

We import PyTorch for deep learning operations, TorchXRayVision for X‑ray analysis, torchvision’s transforms for image preprocessing, and Gradio for building an interactive UI.

model = xrv.models.DenseNet(weights="densenet121-res224-all")
model.eval()

Then, we load a pre-trained DenseNet model using the “densenet121-res224-all” weights and set it to evaluation mode for inference.

try:
    pathology_labels = model.meta["labels"]
    print("Retrieved pathology labels from model.meta.")
except Exception as e:
    print("Could not retrieve labels from model.meta. Using fallback labels.")
    pathology_labels = [
        "Atelectasis", "Cardiomegaly", "Consolidation", "Edema",
        "Emphysema", "Fibrosis", "Hernia", "Infiltration", "Mass",
        "Nodule", "Pleural Effusion", "Pneumonia", "Pneumothorax", "No Finding"
    ]

Now, we attempt to retrieve pathology labels from the model’s metadata and fall back to a predefined list if the retrieval fails.

def classify_xray(image):
    try:
        transform = transforms.Compose([
            transforms.Resize((224, 224)),
            transforms.Grayscale(num_output_channels=1),
            transforms.ToTensor()
        ])
        input_tensor = transform(image).unsqueeze(0)  # add batch dimension

        with torch.no_grad():
            preds = model(input_tensor)

        pathology_scores = preds[0].detach().numpy()
        results = {}
        for idx, label in enumerate(pathology_labels):
            results[label] = float(pathology_scores[idx])

        sorted_results = sorted(results.items(), key=lambda x: x[1], reverse=True)
        top_label, top_score = sorted_results[0]

        judgement = (
            f"Prediction: {top_label} (score: {top_score:.2f})\n\n"
            f"Full Scores:\n{results}"
        )
        return judgement
    except Exception as e:
        return f"Error during inference: {str(e)}"

Here, with this function, we preprocess an input X-ray image, run inference using the pre-trained model, extract pathology scores, and return a formatted summary of the top prediction and all scores while handling errors gracefully.

iface = gr.Interface(
    fn=classify_xray,
    inputs=gr.Image(type="pil"),
    outputs="text",
    title="X-ray Judgement Tool (Prototype)",
    description=(
        "Upload a chest X-ray image to receive a classification judgement. "
        "This demo is for educational purposes only and is not intended for clinical use."
    )
)

iface.launch()

Finally, we build and launch a Gradio interface that lets users upload a chest X-ray image. The classify_xray function processes the image to output a diagnostic judgment.

Gradio Interface for the tool

Through this tutorial, we’ve explored the development of an interactive X-ray judgment tool that integrates advanced deep learning techniques with a user-friendly interface. Despite the inherent limitations, such as the model not being fine-tuned for clinical diagnostics, this prototype serves as a valuable starting point for experimenting with medical imaging applications. We encourage you to build upon this foundation, considering the importance of rigorous validation and adherence to medical standards for real-world use.

Here is the Colab Notebook.

The post How to Build a Prototype X-ray Judgment Tool (Open Source Medical Inference System) Using TorchXRayVision, Gradio, and PyTorch appeared first on MarkTechPost.

This AI Paper Introduces Diversified DPO and ORPO: Post-Training Metho …

Creative writing is a domain that thrives on diversity and imagination. Unlike fact-based or task-specific writing, where a single correct output may exist, creative writing involves numerous valid responses to a prompt. Stories, poems, and narratives can branch in countless directions, each with its own stylistic flavor and meaning. This inherent open-endedness makes creative writing a prime challenge for AI systems, which must maintain narrative coherence while producing novel and distinct outputs.

The core issue lies in how large language models are refined after their initial training. Post-training methods often emphasize quality improvements by aligning responses with user preferences or maximizing reward scores. However, these adjustments inadvertently cause the models to produce responses that are too similar across prompts. In creative settings, this leads to a noticeable drop in output diversity. A lack of variation limits the expressive power of the model, resulting in uniform storylines or similar sentence constructions even when prompts are vastly different.

Earlier solutions attempted to address this by tweaking decoding methods or prompt strategies. Researchers used sampling temperature adjustment, top-k or top-p filtering, or iterative prompting to introduce randomness. Some explored methods such as beam search modifications or self-critiquing to encourage alternative responses. While these helped diversify outputs, they often came at a cost: sacrificing overall response quality, increasing generation time, or introducing inconsistencies in tone and grammar. More crucially, they did not adapt the model's core training process to learn from diverse samples.

Researchers from Midjourney and New York University proposed a novel adjustment during the post-training phase. They introduced “Diversified DPO” and “Diversified ORPO”—enhanced versions of two popular preference-based optimization techniques. Their innovation was incorporating a deviation score, quantifying how much a training example differs from others responding to the same prompt. Rare and diverse responses are given more importance during learning by using this score to weight training losses. The researchers specifically implemented these strategies on large models like Meta’s Llama-3.1-8B and Mistral-7B using parameter-efficient fine-tuning via LoRA.

In this approach, deviation acts as a learning signal. For every training pair of a better and worse response to a prompt, the deviation of the better response is computed using both semantic and stylistic embeddings. These embeddings measure not only content differences but also stylistic uniqueness between responses. The resulting score then influences how much that training pair contributes to the model’s weight updates. This method increases the likelihood that the model generates distinct yet high-quality outputs. The training used over 400,000 prompt-response pairs with Reddit upvotes as quality signals and introduced mixing methods to effectively balance semantic and style deviations.
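To illustrate the idea, here is a minimal, hypothetical sketch of how a precomputed deviation weight could modulate a standard DPO-style loss. This is not the paper's exact formulation; the multiplicative weighting scheme and function names are assumptions.

import torch
import torch.nn.functional as F

def diversified_dpo_loss(policy_chosen_logp, policy_rejected_logp,
                         ref_chosen_logp, ref_rejected_logp,
                         deviation, beta=0.1):
    # Standard DPO objective on the policy/reference log-probability margins
    margin = (policy_chosen_logp - ref_chosen_logp) - (policy_rejected_logp - ref_rejected_logp)
    per_pair_loss = -F.logsigmoid(beta * margin)
    # Up-weight pairs whose preferred response deviates more from other
    # responses to the same prompt (deviation assumed precomputed from
    # semantic and style embeddings)
    return (deviation * per_pair_loss).mean()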

Quantitative results demonstrated the success of the proposed method. The best-performing model, Llama-3.1-8B with Diversified DPO using semantic and style deviation (DDPO-both), achieved nearly the same reward score as GPT-4o while significantly outperforming it in diversity. Specifically, the model had semantic diversity approaching that of the human-crafted reference dataset and style diversity slightly below it. In head-to-head human evaluations, 68% of reviewers preferred DDPO-both’s outputs over GPT-4o’s for quality, and 100% chose them as more diverse. Compared to the baseline DPO, DDPO-both still came out ahead, selected 50% of the time for quality and 62% for diversity. When fewer responses per prompt were available during training, slight drops in reward scores were mitigated using a minimum deviation threshold or sampling higher-quality responses.

This research highlighted a compelling solution to the diversity-quality trade-off in AI-generated creative writing. By emphasizing deviation in training, the researchers enabled models to value uniqueness without compromising coherence. The outcome is a model that delivers richer and more varied storytelling, marking a meaningful step forward in creative AI development.

Check out the Paper. All credit for this research goes to the researchers of this project.
The post This AI Paper Introduces Diversified DPO and ORPO: Post-Training Methods to Boost Output Diversity in Creative Writing with LLMs appeared first on MarkTechPost.

Build agentic systems with CrewAI and Amazon Bedrock

This post is co-authored with Joao Moura and Tony Kipkemboi from CrewAI.
The enterprise AI landscape is undergoing a seismic shift as agentic systems transition from experimental tools to mission-critical business assets. In 2025, AI agents are expected to become integral to business operations, with Deloitte predicting that 25% of enterprises using generative AI will deploy AI agents, growing to 50% by 2027. The global AI agent space is projected to surge from $5.1 billion in 2024 to $47.1 billion by 2030, reflecting the transformative potential of these technologies.
In this post, we explore how CrewAI’s open source agentic framework, combined with Amazon Bedrock, enables the creation of sophisticated multi-agent systems that can transform how businesses operate. Through practical examples and implementation details, we demonstrate how to build, deploy, and orchestrate AI agents that can tackle complex tasks with minimal human oversight. Although “agents” is the buzzword of 2025, it’s important to understand what an AI agent is and where deploying an agentic system could yield benefits.
Agentic design
An AI agent is an autonomous, intelligent system that uses large language models (LLMs) and other AI capabilities to perform complex tasks with minimal human oversight. Unlike traditional software, which follows pre-defined rules, AI agents can operate independently, learn from their environment, adapt to changing conditions, and make contextual decisions. They are designed with modular components, such as reasoning engines, memory, cognitive skills, and tools, that enable them to execute sophisticated workflows. Traditional SaaS solutions are designed for horizontal scalability and general applicability, which makes them suitable for managing repetitive tasks across diverse sectors, but they often lack domain-specific intelligence and the flexibility to address unique challenges in dynamic environments. Agentic systems, on the other hand, are designed to bridge this gap by combining the flexibility of context-aware systems with domain knowledge. Consider a software development use case: AI agents can generate, evaluate, and improve code, shifting software engineers' focus from routine coding to more complex design challenges. For example, for the CrewAI git repository, pull requests are evaluated by a set of CrewAI agents who review code based on code documentation, consistency of implementation, and security considerations. Another use case can be seen in supply chain management, where traditional inventory systems might track stock levels but lack the capability to anticipate supply chain disruptions or optimize procurement based on industry insights. In contrast, an agentic system can use real-time data (such as weather or geopolitical risks) to proactively reroute supply chains and reallocate resources. The following illustration describes the components of an agentic AI system:

Overview of CrewAI
CrewAI is an enterprise suite that includes a Python-based open source framework. It simplifies the creation and management of AI automations using either AI flows, multi-agent systems, or a combination of both, enabling agents to work together seamlessly, tackling complex tasks through collaborative intelligence. The following figure illustrates the capability of CrewAI’s enterprise offering:

CrewAI's design centers on building AI automation through flows and crews of AI agents. It excels at the relationship between agents and tasks, where each agent has a defined role, goal, and backstory, and can access specific tools to accomplish its objectives. This framework allows for autonomous inter-agent delegation, where agents can delegate tasks and inquire among themselves, enhancing problem-solving efficiency. This approach addresses the growing demand for intelligent automation and personalized customer experiences across sectors like healthcare, finance, and retail.
CrewAI’s agents are not only automating routine tasks, but also creating new roles that require advanced skills. CrewAI’s emphasis on team collaboration, through its modular design and simplicity principles, aims to transcend traditional automation, achieving a higher level of decision simplification, creativity enhancement, and addressing complex challenges.
CrewAI key concepts
CrewAI’s architecture is built on a modular framework comprising several key components that facilitate collaboration, delegation, and adaptive decision-making in multi-agent environments. Let’s explore each component in detail to understand how they enable multi-agent interactions.
At a high level, CrewAI provides two main ways to build agentic automations: flows and crews.
Flows
CrewAI Flows provide a structured, event-driven framework to orchestrate complex, multi-step AI automations seamlessly. Flows empower users to define sophisticated workflows that combine regular code, single LLM calls, and potentially multiple crews, through conditional logic, loops, and real-time state management. This flexibility allows businesses to build dynamic, intelligent automation pipelines that adapt to changing conditions and evolving business needs. The following figure illustrates the difference between Crews and Flows:

When integrated with Amazon Bedrock, CrewAI Flows unlock even greater potential. Amazon Bedrock provides a robust foundation by enabling access to powerful foundation models (FMs).
For example, in a customer support scenario, a CrewAI Flow orchestrated through Amazon Bedrock could automatically route customer queries to specialized AI agent crews. These crews collaboratively diagnose customer issues, interact with backend systems for data retrieval, generate personalized responses, and dynamically escalate complex problems to human agents only when necessary.
Similarly, in financial services, a CrewAI Flow could monitor industry conditions, triggering agent-based analysis to proactively manage investment portfolios based on industry volatility and investor preferences.
Together, CrewAI Flows and Amazon Bedrock create a powerful synergy, enabling enterprises to implement adaptive, intelligent automation that addresses real-world complexities efficiently and at scale.
Crews
Crews in CrewAI are composed of several key components, which we discuss in this section.
Agents
Agents in CrewAI serve as autonomous entities designed to perform specific roles within a multi-agent system. These agents are equipped with various capabilities, including reasoning, memory, and the ability to interact dynamically with their environment. Each agent is defined by four main elements:

Role – Determines the agent’s function and responsibilities within the system
Backstory – Provides contextual information that guides the agent’s decision-making processes
Goals – Specifies the objectives the agent aims to accomplish
Tools – Extends the capabilities of agents to access more information and take actions

Agents in CrewAI are designed to work collaboratively, making autonomous decisions, delegating tasks, and using tools to execute complex workflows efficiently. They can communicate with each other, use external resources, and refine their strategies based on observed outcomes.
Tasks
Tasks in CrewAI are the fundamental building blocks that define specific actions an agent needs to perform to achieve its objectives. Tasks can be structured as standalone assignments or interdependent workflows that require multiple agents to collaborate. Each task includes key parameters, such as:

Description – Clearly defines what the task entails
Agent assignment – Specifies which agent is responsible for executing the task

Tools
Tools in CrewAI provide agents with extended capabilities, enabling them to perform actions beyond their intrinsic reasoning abilities. These tools allow agents to interact with APIs, access databases, execute scripts, analyze data, and even communicate with other external systems. CrewAI supports a modular tool integration system where tools can be defined and assigned to specific agents, providing efficient and context-aware decision-making.
Process
The process layer in CrewAI governs how agents interact, coordinate, and delegate tasks. It makes sure that multi-agent workflows operate seamlessly by managing task execution, communication, and synchronization among agents.
More details on CrewAI concepts can be found in the CrewAI documentation.
CrewAI enterprise suite
For businesses looking for tailored AI agent solutions, CrewAI provides an enterprise offering that includes dedicated support, advanced customization, and integration with enterprise-grade systems like Amazon Bedrock. This enables organizations to deploy AI agents at scale while maintaining security and compliance requirements.
Enterprise customers get access to comprehensive monitoring tools that provide deep visibility into agent operations. This includes detailed logging of agent interactions, performance metrics, and system health indicators. The monitoring dashboard enables teams to track agent behavior, identify bottlenecks, and optimize multi-agent workflows in real time.
Real-world enterprise impact
CrewAI customers are already seeing significant returns by adopting agentic workflows in production. In this section, we provide a few real customer examples.
Legacy code modernization
A large enterprise customer needed to modernize their legacy ABAP and APEX code base, a typically time-consuming process requiring extensive manual effort for code updates and testing.
Multiple CrewAI agents work in parallel to:

Analyze existing code base components
Generate modernized code in real time
Execute tests in production environment
Provide immediate feedback for iterations

The customer achieved approximately 70% improvement in code generation speed while maintaining quality through automated testing and feedback loops. The solution was containerized using Docker for consistent deployment and scalability. The following diagram illustrates the solution architecture.

Back office automation at global CPG company
A leading CPG company automated their back-office operations by connecting their existing applications and data stores to CrewAI agents that:

Research industry conditions
Analyze pricing data
Summarize findings
Execute decisions

The implementation resulted in a 75% reduction in processing time by automating the entire workflow from data analysis to action execution. The following diagram illustrates the solution architecture.

Get started with CrewAI and Amazon Bedrock
Amazon Bedrock integration with CrewAI enables the creation of production-grade AI agents powered by state-of-the-art language models.
The following is a code snippet on how to set up this integration:

from crewai import Agent, Crew, Process, Task, LLM
from crewai_tools import SerperDevTool, ScrapeWebsiteTool
import os

# Configure Bedrock LLM
llm = LLM(
    model="bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0",
    aws_access_key_id=os.getenv('AWS_ACCESS_KEY_ID'),
    aws_secret_access_key=os.getenv('AWS_SECRET_ACCESS_KEY'),
    aws_region_name=os.getenv('AWS_REGION_NAME')
)

# Create an agent with Bedrock as the LLM provider
security_analyst = Agent(
    config=agents_config['security_analyst'],
    tools=[SerperDevTool(), ScrapeWebsiteTool()],
    llm=llm
)

Check out the CrewAI LLM documentation for detailed instructions on how to configure LLMs with your AI agents.
Amazon Bedrock provides several key advantages for CrewAI applications:

Access to state-of-the-art language models such as Anthropic’s Claude and Amazon Nova – These models provide the cognitive capabilities that power agent decision-making. The models enable agents to understand complex instructions, generate human-like responses, and make nuanced decisions based on context.
Enterprise-grade security and compliance features – This is crucial for organizations that need to maintain strict control over their data and enforce compliance with various regulations.
Scalability and reliability backed by AWS infrastructure – This means your agent systems can handle increasing workloads while maintaining consistent performance.

Amazon Bedrock Agents and Amazon Bedrock Knowledge Bases as native CrewAI Tools
Amazon Bedrock Agents offers you the ability to build and configure autonomous agents in a fully managed and serverless manner on Amazon Bedrock. You don’t have to provision capacity, manage infrastructure, or write custom code. Amazon Bedrock manages prompt engineering, memory, monitoring, encryption, user permissions, and API invocation. BedrockInvokeAgentTool enables CrewAI agents to invoke Amazon Bedrock agents and use their capabilities within your workflows.
With Amazon Bedrock Knowledge Bases, you can securely connect FMs and agents to your company data to deliver more relevant, accurate, and customized responses. BedrockKBRetrieverTool enables CrewAI agents to retrieve information from Amazon Bedrock Knowledge Bases using natural language queries.
The following code shows an example for Amazon Bedrock Agents integration:

from crewai import Agent, Task, Crew
from crewai_tools.aws.bedrock.agents.invoke_agent_tool import BedrockInvokeAgentTool

# Initialize the Bedrock Agents Tool
agent_tool = BedrockInvokeAgentTool(
    agent_id="your-agent-id",
    agent_alias_id="your-agent-alias-id"
)

# Create a CrewAI agent that uses the Bedrock Agents Tool
aws_expert = Agent(
    role='AWS Service Expert',
    goal='Help users understand AWS services and quotas',
    backstory='I am an expert in AWS services and can provide detailed information about them.',
    tools=[agent_tool],
    verbose=True
)
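To run this agent end to end, you can wrap it in a task and a crew. The task description and expected output below are illustrative:

quota_task = Task(
    description="Summarize the key service quotas for Amazon Bedrock in us-east-1.",
    expected_output="A short, plain-language summary of the relevant quotas.",
    agent=aws_expert
)

crew = Crew(agents=[aws_expert], tasks=[quota_task], verbose=True)
result = crew.kickoff()
print(result)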

The following code shows an example for Amazon Bedrock Knowledge Bases integration:

# Create and configure the BedrockKB tool
# (import path may vary by crewai_tools version)
from crewai_tools.aws.bedrock.knowledge_base.retriever_tool import BedrockKBRetrieverTool

kb_tool = BedrockKBRetrieverTool(
    knowledge_base_id="your-kb-id",
    number_of_results=5
)

# Create a CrewAI agent that uses the Knowledge Bases retriever tool
researcher = Agent(
    role='Knowledge Base Researcher',
    goal='Find information about company policies',
    backstory='I am a researcher specialized in retrieving and analyzing company documentation.',
    tools=[kb_tool],
    verbose=True
)

Operational excellence through monitoring, tracing, and observability with CrewAI on AWS
As with any software application, achieving operational excellence is crucial when deploying agentic applications in production environments. These applications are complex systems comprising both deterministic and probabilistic components that interact either sequentially or in parallel. Therefore, comprehensive monitoring, traceability, and observability are essential factors for achieving operational excellence. This includes three key dimensions:

Application-level observability – Provides smooth operation of the entire system, including the agent orchestration framework CrewAI and potentially additional application components (such as a frontend)
Model-level observability – Provides reliable model performance (including metrics like accuracy, latency, throughput, and more)
Agent-level observability – Maintains efficient operations within single-agent or multi-agent systems

When running agent-based applications with CrewAI and Amazon Bedrock on AWS, you gain access to a comprehensive set of built-in capabilities across these dimensions:

Application-level logs – Amazon CloudWatch automatically collects application-level logs and metrics from your application code running on your chosen AWS compute platform, such as AWS Lambda, Amazon Elastic Container Service (Amazon ECS), or Amazon Elastic Compute Cloud (Amazon EC2). The CrewAI framework provides application-level logging, configured at a minimal level by default. For more detailed insights, verbose logging can be enabled at the agent or crew level by setting verbose=True during initialization.
Model-level invocation logs – Furthermore, CloudWatch automatically collects model-level invocation logs and metrics from Amazon Bedrock. This includes essential performance metrics.
Agent-level observability – CrewAI seamlessly integrates with popular third-party monitoring and observability frameworks such as AgentOps, Arize, MLFlow, LangFuse, and others. These frameworks enable comprehensive tracing, debugging, monitoring, and optimization of the agent system’s performance.

Solution overview
Each AWS service has its own configuration nuances, and missing just one detail can lead to serious vulnerabilities. Traditional security assessments often demand multiple experts, coordinated schedules, and countless manual checks. With CrewAI Agents, you can streamline the entire process, automatically mapping your resources, analyzing configurations, and generating clear, prioritized remediation steps.
The following diagram illustrates the solution architecture.

Our use case demo implements a specialized team of three agents, each with distinct responsibilities that mirror roles you might find in a professional security consulting firm:

Infrastructure mapper – Acts as our system architect, methodically documenting AWS resources and their configurations. Like an experienced cloud architect, it creates a detailed inventory that serves as the foundation for our security analysis.
Security analyst – Serves as our cybersecurity expert, examining the infrastructure map for potential vulnerabilities and researching current best practices. It brings deep knowledge of security threats and mitigation strategies.
Report writer – Functions as our technical documentation specialist, synthesizing complex findings into clear, actionable recommendations. It makes sure that technical insights are communicated effectively to both technical and non-technical stakeholders.

Implement the solution
In this section, we walk through the implementation of a security assessment multi-agent system. The code for this example is located on GitHub. Note that not all code artifacts of the solution are explicitly covered in this post.
Step 1: Configure the Amazon Bedrock LLM
We've saved our environment variables in a .env file in our root directory before we pass them to the LLM class:

from crewai import Agent, Crew, Process, Task, LLM
from crewai.project import CrewBase, agent, crew, task

from aws_infrastructure_security_audit_and_reporting.tools.aws_infrastructure_scanner_tool import AWSInfrastructureScannerTool
from crewai_tools import SerperDevTool, ScrapeWebsiteTool
import os

@CrewBase
class AwsInfrastructureSecurityAuditAndReportingCrew():
    """AwsInfrastructureSecurityAuditAndReporting crew"""

    def __init__(self) -> None:
        self.llm = LLM(
            model=os.getenv('MODEL'),
            aws_access_key_id=os.getenv('AWS_ACCESS_KEY_ID'),
            aws_secret_access_key=os.getenv('AWS_SECRET_ACCESS_KEY'),
            aws_region_name=os.getenv('AWS_REGION_NAME')
        )

Step 2: Define agents
These agents are already defined in the agents.yaml file, and we’re importing them into each agent function in the crew.py file:


# Configure AI Agents

@agent
def infrastructure_mapper(self) -> Agent:
    return Agent(
        config=self.agents_config['infrastructure_mapper'],
        tools=[AWSInfrastructureScannerTool()],
        llm=self.llm
    )

@agent
def security_analyst(self) -> Agent:
    return Agent(
        config=self.agents_config['security_analyst'],
        tools=[SerperDevTool(), ScrapeWebsiteTool()],
        llm=self.llm
    )

@agent
def report_writer(self) -> Agent:
    return Agent(
        config=self.agents_config['report_writer'],
        llm=self.llm
    )

Step 3: Define tasks for the agents
Similar to our agents in the preceding code, we import tasks.yaml into our crew.py file:


# Configure Tasks for the agents

@task
def map_aws_infrastructure_task(self) -> Task:
    return Task(
        config=self.tasks_config['map_aws_infrastructure_task']
    )

@task
def exploratory_security_analysis_task(self) -> Task:
    return Task(
        config=self.tasks_config['exploratory_security_analysis_task']
    )

@task
def generate_report_task(self) -> Task:
    return Task(
        config=self.tasks_config['generate_report_task']
    )

Step 4: Create the AWS infrastructure scanner tool
This tool enables our agents to interact with AWS services and retrieve information they need to perform their analysis:

class AWSInfrastructureScannerTool(BaseTool):
    name: str = "AWS Infrastructure Scanner"
    description: str = (
        "A tool for scanning and mapping AWS infrastructure components and their configurations. "
        "Can retrieve detailed information about EC2 instances, S3 buckets, IAM configurations, "
        "RDS instances, VPC settings, and security groups. Use this tool to gather information "
        "about specific AWS services or get a complete infrastructure overview."
    )
    args_schema: Type[BaseModel] = AWSInfrastructureScannerInput

    def _run(self, service: str, region: str) -> str:
        try:
            if service.lower() == 'all':
                return json.dumps(self._scan_all_services(region), indent=2, cls=DateTimeEncoder)
            return json.dumps(self._scan_service(service.lower(), region), indent=2, cls=DateTimeEncoder)
        except Exception as e:
            return f"Error scanning AWS infrastructure: {str(e)}"

    def _scan_all_services(self, region: str) -> Dict:
        return {
            'ec2': self._scan_service('ec2', region),
            's3': self._scan_service('s3', region),
            'iam': self._scan_service('iam', region),
            'rds': self._scan_service('rds', region),
            'vpc': self._scan_service('vpc', region)
        }

    # More services can be added here

Step 5: Assemble the security audit crew
Bring the components together in a coordinated crew to execute on the tasks:

@crew
def crew(self) -> Crew:
    """Creates the AwsInfrastructureSecurityAuditAndReporting crew"""
    return Crew(
        agents=self.agents,  # Automatically created by the @agent decorator
        tasks=self.tasks,  # Automatically created by the @task decorator
        process=Process.sequential,
        verbose=True,
    )

Step 6: Run the crew
In our main.py file, we import our crew and pass in inputs to the crew to run:

def run():
    """
    Run the crew.
    """
    inputs = {}
    AwsInfrastructureSecurityAuditAndReportingCrew().crew().kickoff(inputs=inputs)

The final report will look something like the following code:

```markdown
### Executive Summary

In response to an urgent need for robust security within AWS infrastructure, this assessment identified several critical areas requiring immediate attention across EC2 Instances, S3 Buckets, and IAM Configurations. Our analysis revealed two high-priority issues that pose significant risks to the organization's security posture.

### Risk Assessment Matrix

| Security Component | Risk Description | Impact | Likelihood | Priority |
|--------------------|------------------|--------|------------|----------|
| S3 Buckets | Unintended public access | High | High | Critical |
| EC2 Instances | SSRF through Metadata | High | Medium | High |
| IAM Configurations | Permission sprawl | Medium | High | Medium |

### Prioritized Remediation Roadmap

1. **Immediate (0-30 days):**
   - Enforce IMDSv2 on all EC2 instances
   - Conduct S3 bucket permission audit and rectify public access issues
   - Adjust security group rules to eliminate broad access

2. **Short Term (30-60 days):**
   - Conduct IAM policy audit to eliminate unused permissions
   - Restrict RDS access to known IP ranges
```

This implementation shows how CrewAI agents can work together to perform complex security assessments that would typically require multiple security professionals. The system is both scalable and customizable, allowing for adaptation to specific security requirements and compliance standards.
Conclusion
In this post, we demonstrated how to use CrewAI and Amazon Bedrock to build a sophisticated, automated security assessment system for AWS infrastructure. We explored how multiple AI agents can work together seamlessly to perform complex security audits, from infrastructure mapping to vulnerability analysis and report generation. Through our example implementation, we showcased how CrewAI’s framework enables the creation of specialized agents, each bringing unique capabilities to the security assessment process. By integrating with powerful language models using Amazon Bedrock, we created a system that can autonomously identify security risks, research solutions, and generate actionable recommendations.
The practical example we shared illustrates just one of many possible applications of CrewAI with Amazon Bedrock. The combination of CrewAI’s agent orchestration capabilities and advanced language models in Amazon Bedrock opens up numerous possibilities for building intelligent, autonomous systems that can tackle complex business challenges.
We encourage you to explore our code on GitHub and start building your own multi-agent systems using CrewAI and Amazon Bedrock. Whether you’re focused on security assessments, process automation, or other use cases, this powerful combination provides the tools you need to create sophisticated AI solutions that can scale with your needs.

About the Authors
Tony Kipkemboi is a Senior Developer Advocate and Partnerships Lead at CrewAI, where he empowers developers to build AI agents that drive business efficiency. A US Army veteran, Tony brings a diverse background in healthcare, data engineering, and AI. With a passion for innovation, he has spoken at events like PyCon US and contributes to the tech community through open source projects, tutorials, and thought leadership in AI agent development. Tony holds a Bachelor of Science in Health Sciences and is pursuing a Master's in Computer Information Technology at the University of Pennsylvania.
João (Joe) Moura is the Founder and CEO of CrewAI, the leading agent orchestration platform powering multi-agent automations at scale. With deep expertise in generative AI and enterprise solutions, João partners with global leaders like AWS, NVIDIA, IBM, and Meta AI to drive innovative AI strategies. Under his leadership, CrewAI has rapidly become essential infrastructure for top-tier companies and developers worldwide and used by most of the F500 in the US.
Karan Singh is a Generative AI Specialist at AWS, where he works with top-tier third-party foundation model and agentic framework providers to develop and execute joint go-to-market strategies, enabling customers to effectively deploy and scale solutions to solve enterprise generative AI challenges. Karan holds a Bachelor of Science in Electrical Engineering from Manipal University, a Master of Science in Electrical Engineering from Northwestern University, and an MBA from the Haas School of Business at University of California, Berkeley.
Aris Tsakpinis is a Specialist Solutions Architect for Generative AI focusing on open source models on Amazon Bedrock and the broader generative AI open source ecosystem. Alongside his professional role, he is pursuing a PhD in Machine Learning Engineering at the University of Regensburg, where his research focuses on applied natural language processing in scientific domains.