7 AI-Powered Email Deliverability Tips to Increase Sales

You’ve probably read dozens of contradictory pieces of ABSOLUTELY NECESSARY advice about email marketing, engagement, and deliverability over the years. Do this! Do that! No, not that! 

Pretty unhelpful stuff! (Not to mention quite technical to navigate.)

You may just want to do what’s most effective for results and performance and move on to your next exciting campaign idea. 

But there are risks associated with email marketing and cold email outreach that marketers want to avoid if at all possible. (If not, you may have months or even years of cleanup to do for your sender IP address, your company domain, and your standing with your ESP.)

At Customers.ai, we don’t think the secrets to email deliverability should be secret.

So we put together this email deliverability cheat sheet for you. Follow its seven tenets and you’ll be grand!

My personal favorite tip? Doubling down on engaged recipients … because I love unicorns. (That’s why I included it as number one!)

But in fact, every one of these AI-powered email deliverability tools and best practices has proven critical to the success of Customers.ai’s targeted outbound marketing strategy, which today accounts for over half of the leads our marketing team generates for our sales team.

The email deliverability cheat sheet I promised you is below and here’s a presentation that will walk you through it.

7 Email Deliverability Tips That Get Your Message in the Inbox & Keep Your Complaint Rate Low

Click the links below to jump to a tip, or read on for all the email deliverability tips that will help your cold, warm and even hot emails get seen by your intended recipient (minimizing risk to your IP and domain health, of course)!

Double down on engagers

Validate, validate, validate

Use multiple senders & send throttling

Include an unsubscribe link

Warm up email senders

Segment your audience list

Check for signs of life

But first, why are marketers so keen on email deliverability tips?

It comes down to the fact that email is pretty much the undisputed GOAT of marketing channels.

Email generates an average $36 for every $1 spent.

Recovered cart abandonment email revenue can account for a third of ecommerce sales.

Generative AI technology is changing email marketing fast, with 43% of marketers who use AI applying it to email marketing.

At the same time, there are Gmail promotion tabs and spam filters to account for when optimizing your email to show up in front of your desired audience.

At Customers.ai we combine our email deliverability tips with AI to:

Build a list of likely-to-buy email contacts

Land in inboxes and stay there

Send email outreach to prospects before they’ve created an account or added products to a cart

Use AI-powered analytics to improve performance

And more!

Email Deliverability Tip #1. Double down on engagers (and purge the rest)

Double down on engagers and purge non-engagers from your list — even though this seems to go against everything marketers are trained to do…

For decades, email marketers have been conditioned to hoard every contact, regardless of likelihood to engage. I get it: with the average website’s email capture and conversion rates between 2% and 4%, it used to require enormous effort to generate every email contact added to the list.

But this scarcity mindset is a thing of the past now that you can passively grow an email list from the high-intent visitors to your website.

Today, thanks to AI and the ability to capture email addresses from website visitors automatically, the email marketing best practice has changed: Remove non-engagers from your list. Continuing to email them only raises your risk of spam complaints, falling into spam traps, and more!

Market aggressively to the people who do engage. This keeps engagement rates high and email delivery rate climbing!

Customers.ai does this automatically with our AI-powered Signs of Life Detector (more on this below) so you don’t have to do anything manually. This leaves you more time to work on crafting offers and copy that convert.

Email Deliverability Tip #2. Validate email addresses before sending messages

Email validators weed out addresses that are spam traps or otherwise unsafe to email.

Using an email contact validator is essential to improving your deliverability.

Again, Customers.ai’s system executes this email deliverability tip by automatically validating contact emails before you send anything.

You don’t have to worry about bounced or undelivered emails, or risky spam traps mixed in with your prospecting contact list.

Email Deliverability Tip #3. Use multiple senders & send throttling to avoid spam filters and promotions tabs

Sending thousands of identical emails from the same address at the same time is a recipe for landing in the promotions or spam folder.

These are some of the signals that email platforms like Gmail, Yahoo Mail, and Outlook use to detect mass marketing emails, and they can easily land your email in the promotions tab or tucked away, never to be seen, in a spam folder.

Use multiple senders and limit the number of sends per day to increase deliverability.

Customers.ai makes it simple to configure multiple email senders, pace email sends in small batches, and generate unique, personalized email copy with AI, so you can set it and forget it.
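
To make the throttling idea concrete, here is a minimal Python sketch of batched sending with sender rotation. It is illustrative only: the sender addresses, batch size, and delay are assumptions, and send_message stands in for whatever SMTP or API call your sending tool actually exposes (Customers.ai handles this pacing for you automatically).

import time
from itertools import cycle

# Illustrative assumptions: three warmed-up sender addresses and a
# conservative pace of 25 emails per batch, 15 minutes apart.
SENDERS = ["alex@mail.example.com", "sam@mail.example.com", "kit@mail.example.com"]
BATCH_SIZE = 25
BATCH_DELAY_SECONDS = 15 * 60

def send_message(sender, recipient, subject, body):
    """Placeholder for your ESP or SMTP send call."""
    print(f"{sender} -> {recipient}: {subject}")

def throttled_send(recipients, subject, body):
    sender_pool = cycle(SENDERS)  # rotate senders round-robin
    for i in range(0, len(recipients), BATCH_SIZE):
        batch = recipients[i:i + BATCH_SIZE]
        for recipient in batch:
            send_message(next(sender_pool), recipient, subject, body)
        if i + BATCH_SIZE < len(recipients):
            time.sleep(BATCH_DELAY_SECONDS)  # pause between batches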

Email Deliverability Tip #4. Include an unsubscribe link separate from the built-in ones from ESPs and email inboxes

Any email sending service will automatically include an unsubscribe link at the bottom of the email.

This unsubscribe link gives users a chance to complain to your Email Service Provider (ESP), like Mailchimp, ActiveCampaign or Constant Contact, about your email campaign. If this happens even a few times, it can land you in hot water with your ESP and potentially cause suspension or deactivation of your account.

Here’s an email deliverability tip that will reduce the usage of the built-in unsubscribe link and keep your audience to only those who want to hear from you: Include an in-house unsubscribe link above the standard unsubscribe link.

Think about it this way: These unsubscribe requests will stay between you and your audience — exactly where they belong.

Process unsubscribes yourself and reduce your complaint rate at the same time, without setting off alarms at your email marketing platform.

Email Deliverability Tip #5. Warm up email senders before scaling outbound sending to protect your domain rating

Newly created email addresses have limitations on how many messages they can send while looking natural.

You have to send fewer emails at first so that email providers like Gmail learn to trust you.

This can be a frustrating process because you want to hit the ground running but slowly increasing capacity wins the race.

Customers.ai has built-in automated email warmup to protect your domain reputation and sender reputation, allowing you to scale to higher volumes of outbound emailing faster than you would without email warmup tools.

Email Deliverability Tip #6. Segment your audience list with as much targeting intent as needed for fine-tuned messaging relevance

As with all marketing, the more targeted your emails are, the better.

Better open rate, better click rate, better response rate, better conversion rate.

Segmenting your list into narrower, more targeted groups based on interests, intent, and what you know about the audience will increase your deliverability and engagement, helping you sell more.

With Customers.ai, you can target emails to customers based on what page they visited, demographic attributes (location, income, etc.), and more.

Customers.ai just released a new AI-generative email copywriting feature that will create email copy customized to the webpage that a visitor to your site viewed.

Try Customers.ai for free and get your first 50 website visitor leads and email sends on us.

Grade Your Website Lead Conversion Rate Instantly

Get an instant website audit and 50 leads for free


Email Deliverability Tip #7. Check for signs of life before sending follow-up outreach

Think about this: Email providers like Gmail, Outlook and Yahoo Mail check for engagement signals like opens, replies, clicks and low unsubscribe rates across your sends in order to decide whether any particular email will be welcomed in the inbox of your intended recipients.

So sending emails to those who engage with them is one sure-fire signal to the email providers that your emails should go where you send them (and not in a spam folder or promotions tab).

This is why it’s a key email deliverability tip to check for signs of life, and send more emails to those who engage with your messages.

Customers.ai’s Signs of Life Detector is designed to do exactly that.

And Signs of Life Detector (or S.o.L.D.) is 100% automated and powered by AI.

As new lead email contacts enter your Customers.ai targeted outreach automation sequences, the first message will check for signs of life (opens, clicks, replies) before sending any follow-up messages in the sequence.

Ready to put these email deliverability tips into practice for your business? Let Customers.ai’s email deliverability experts help you implement AI tools to increase sales with AI-powered email marketing automation.

Find Your Customers with Customers.ai

See the new Customers.ai sales outreach automation platform, with X-Ray Website Visitor Detection and automated email and SMS outreach to help mid-market companies find their perfect customers. Request a call to learn more about sales outreach automation today!

Request My Free Demo

Important Next Steps

See what targeted outbound marketing is all about. Capture and engage your first 50 website visitor leads with Customers.ai X-Ray website visitor identification for free.

Talk and learn about sales outreach automation with other growth enthusiasts. Join Customers.ai Island, our Facebook group of 40K marketers and entrepreneurs who are ready to support you.

Advance your marketing performance with Sales Outreach School, a free tutorial and training area for sales pros and marketers.


Meet Open Interpreter: An Open-Source Locally Running Implementation of OpenAI’s Code Interpreter

In the ever-evolving realm of programming and software development, the demand for efficient, flexible, and accessible tools has never been higher. Developers seek solutions that empower them to work seamlessly within their local development environments, unconstrained by limitations imposed by hosted services. This article explores a groundbreaking solution to these challenges – Open Interpreter.

While OpenAI’s Code Interpreter with GPT-4 is a game-changer, it comes with specific constraints. OpenAI’s solution is hosted, which operates in a controlled, remote environment, limiting developers’ control over their local setup. There are notable limitations, including no internet access, a limited set of pre-installed packages, a maximum upload size of 100 MB, and a 120-second runtime limit. When the environment dies, any generated files or links are lost, which can be inconvenient for longer tasks.

Open Interpreter empowers developers to tap into their computer’s full potential while bypassing the restrictions associated with hosted services.

An interactive demo is also available on Google Colab.

Open Interpreter equips developers with a broad array of capabilities. It enables effortless creation and editing of content in various formats, such as photos, videos, PDFs, and more. Developers can take control of a Chrome browser, facilitating efficient research and automation. Open Interpreter also seamlessly handles data-related tasks, allowing users to plot, clean, and analyze large datasets for informed decision-making.

Open Interpreter offers both interactive and programmatic chat capabilities. Users can initiate interactive sessions by running ‘interpreter’ in their terminal, starting a conversation with Open Interpreter. For more precise control, developers can script tasks and workflows efficiently to pass messages directly to the Interpreter. Open Interpreter provides flexibility by allowing users to inspect and configure the system message. This customization extends its functionality, modifies permissions, or adds context to align with individual preferences and project requirements.
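
As a rough illustration of the programmatic interface described above, the sketch below uses the interpreter Python package. The attribute names and prompt are assumptions based on the project’s documented usage and may differ between versions, so treat it as a starting point rather than a definitive reference.

# pip install open-interpreter  (assumed package name)
import interpreter

# Ask for confirmation before executing any generated code (attribute name may vary by version).
interpreter.auto_run = False

# Optionally adjust the system message to add context or constraints.
interpreter.system_message += "\nPrefer pandas for any data analysis tasks."

# Programmatic chat: pass a task directly instead of using the terminal UI.
interpreter.chat("Plot the closing prices in data.csv and save the chart as prices.png")

# Clear the conversation state between tasks.
interpreter.reset()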

Open Interpreter represents a significant leap in the domain of local development environments. Its exceptional capabilities empower developers to work efficiently and flexibly, free from the constraints of hosted services. With Open Interpreter as your trusted companion, embrace the future of software development and unlock new horizons in your local development journey.



Meet SMPLitex: A Generative AI Model and Dataset for 3D Human Texture Estimation from Single Image

In the ever-evolving field of computer vision and graphics, a significant challenge has been the creation of realistic 3D human representations from 2D images. This is not merely a technical hurdle but a gateway to numerous applications, from immersive virtual environments to advanced video editing. In response to this challenge, a research team has introduced a groundbreaking solution called “SMPLitex.” This research delves into the problem at hand, the proposed methodology, its intricacies, and the impressive performance of SMPLitex.

Creating 3D human representations from single images is a longstanding aspiration in computer graphics and vision. While we have made significant strides in capturing 3D shapes, textures, which give objects their realistic appearances, remain a formidable frontier. Imagine taking a single photograph of a person and being able to recreate their 3D shape and detailed skin texture, clothing, and even accessories. This is precisely the challenge the research team behind SMPLitex has set out to address.

Before delving into SMPLitex, it’s essential to understand the landscape of existing methods and their limitations. Traditional approaches have often relied on labor-intensive processes involving manual texture mapping or 3D scanning, which do not scale well for real-world applications. These methods also struggle when dealing with occlusions or incomplete views of the subject, limiting their practicality.

The research team has taken a bold step by introducing SMPLitex, a revolutionary method for estimating and manipulating the complete 3D appearance of humans captured from a single image. SMPLitex’s unique integration of generative models initially designed for 2D images into the 3D domain sets it apart. The key innovation lies in establishing pixel-to-surface correspondences based on the input image, which is then used to reconstruct the 3D texture.

The heart of this method is a generative model specifically designed for complete 3D human appearance. This model is trained extensively, learning how human textures appear in 3D space. But the real magic happens when this model is conditioned on the visible parts of the subject within the single input image.

Pixel-to-surface correspondences are computed with remarkable precision, mapping the 2D image to its 3D counterpart. By leveraging this correspondence, SMPLitex can generate a complete 3D texture map that faithfully represents the subject’s appearance. The generative model’s adaptability to the visible parts of the image ensures that even when dealing with partially occluded subjects, SMPLitex can produce realistic 3D textures.

SMPLitex doesn’t just promise a paradigm shift; it delivers. The research team conducted rigorous quantitative and qualitative evaluations across three publicly available datasets. The results were nothing short of astounding. SMPLitex outperformed existing methods significantly, demonstrating its prowess in human texture estimation.

One of the standout features of SMPLitex is its versatility. It excels in accurate texture estimation and opens doors to a wider array of tasks. From editing and synthesis to manipulation, SMPLitex can seamlessly integrate 3D textures into various applications, enriching the world of computer graphics and vision.

In conclusion, SMPLitex represents a monumental leap forward in unlocking realistic 3D human textures from single images. By bridging the gap between 2D images and lifelike 3D reconstructions, this method holds immense promise. Its potential applications span diverse domains, from entertainment and gaming to healthcare and fashion. SMPLitex offers a glimpse into a future where capturing 3D human appearances is as simple as taking a photograph. The research team’s innovation paves the way for more immersive experiences, enhanced content creation, and new horizons in computer vision and graphics.

As technology advances, we can only anticipate the incredible possibilities that methods like SMPLitex will unlock. Fusing generative models and precise pixel-to-surface correspondences can revolutionize industries and redefine our interaction with digital representations of the human form. The journey from 2D to 3D has just taken a significant step forward, thanks to SMPLitex and its visionary research team.



Top AI Image Upscalers and Enlargers in 2023

An excellent method to enhance and resize your images is with an AI picture enhancer and upscaler service. Researchers estimate that over a trillion photographs are taken every year. Digital photos enrich everyone’s lives, but those in specialized fields, such as journalism and social media, depend on picture upscaler programs.

Using high-quality images is a sure way to get people’s attention. However, most photos online are blurry and of poor quality. Artificial intelligence has greatly improved picture upscaling technology in the last several years, making it far easier to convert low-resolution photographs into high-quality professional images.

Some of the best AI-powered photo enhancers and upscalers are as follows:

AI Image Enlarger

With the help of AI Image Enlarger, a user may automatically expand and enhance low-resolution images without reducing image quality. The maximum file size is 5 MB, and the maximum image resolution is 1200×1200 in JPG or PNG format. The AI Image Enlarger toolbox uses fast and advanced AI technology to increase image resolution without decreasing quality. It uses a customized and enhanced upscaling algorithm to bulk upscale images rapidly.

VanceAI Image Upscaler

Another excellent AI image enhancer, VanceAI, comes with several features that can be quite helpful. You may use the 8x AI image upscaler to make low-resolution photographs look better, and the application has an AI anime upscaler that can enlarge your small anime images by up to 16x. VanceAI Image Upscaler also includes several basic safety features. For instance, after 24 hours, your processed images are permanently deleted. Like the other recommended programs, this one simplifies the process of enlarging images. A picture must be uploaded, the right AI model selected, and the desired upscaling factor chosen before the procedure can begin.

Remini

Remini is a web-based picture editing program driven by deep learning and AI. Using advanced algorithms that recognize faces, colors, and other aspects of your photographs, it makes the required adjustments to improve their look. Editing is a breeze using Remini, even for basic tasks like cropping, resizing, and adjusting brightness and contrast. However, if you want to keep the updated photos from Remini, you’ll need a subscription.

Let’s Enhance

Let’s Enhance is an online artificial intelligence photo upscaler that successfully brings low-resolution images up to 4K resolution. Colors are enhanced, missing textures and details are recovered, compression artifacts are removed, and more, thanks to the program’s AI techniques. With Let’s Enhance, your photo can be magnified up to sixteen times. This is yet another straightforward AI photo enhancer. With AI, Let’s Enhance can effectively remove noise and other image flaws, resulting in high-quality, detailed images. Simply dropping the image onto the webpage will initiate upscaling with the free version of the application.

Fotor

There is no shortage of AI-based picture enhancers. However, Fotor is the tool to upsize photographs efficiently, rapidly, and online (no download required). Fotor is an all-in-one picture editor that allows you to do anything from retouching portraits to creating custom graphics for your Instagram posts. The AI Image Upscaler is only one of many helpful features. Just drop in your photo, and it will be enhanced instantly.

Gigapixel AI

Topaz Gigapixel AI is a comprehensive set of tools for enhancing photos with AI. This software does more than upscale your photos; it also removes noise and camera shake. Image sharpening is another possible use. The program takes advantage of cutting-edge advances in AI photo processing, so the results are reliable. Gigapixel’s AI for image processing was trained with the help of millions of photographs. It is, therefore, equipped to understand concepts like noise and blur, and its considerable training means it can handle such flaws. Gigapixel can even make handheld photographs look as though they were shot on a tripod.

Waifu2x

If you’re looking for an anime-specific picture upscaler, Waifu2x is yet another fantastic option. Despite focusing on anime lovers, the program is flexible enough to handle other images. The software uses neural networks to improve the quality of the photos it processes. Waifu2x removes noise when magnifying photographs, and you may paste in the URL of the image you want to upscale, skipping the step of downloading the file and importing it manually. When Waifu2x sees the URL, it fetches the image and begins the process.

Upscale

Upscale uses AI technology to add detail to your images without losing any of the original image’s nuances. It is an AI-driven method that safely boosts image resolution, enlarging your photo to a maximum of four times its original size. One of Upscale’s most useful functions is the elimination of JPEG artifacts. Damage and pixelation can be introduced to a JPEG image by repeated editing and saving, and this artifacting can cause your photos to look hazy. Upscaling with the tool reduces this compression damage and distortion.

YouCam Enhance

If you’re seeking an artificial intelligence picture enhancer in 2023, YouCam Enhance is your best bet. Using image denoising, sharpening, and color improvement, YouCam Enhance can quickly improve the quality of any shot and eliminates the need to adjust for poor image quality manually. To make your photo look professional, all you have to do is upload it to the site. YouCam Enhance is an adaptable program for modifying photos on desktop computers and mobile devices. Unlike many other picture editing services, YouCam Enhance doesn’t make you sign up beforehand. This can help you save time and get better pictures. JPG, JPEG, PNG, TIFF, and GIF are some image formats that may be edited using YouCam Enhance.

Deep Image AI

Deep Image AI runs on either Windows or Mac. Using Deep Image AI, online picture editing is a breeze, and image editing on a mobile device is straightforward. Remember that the maximum resolution for upsized images on this platform is 5000 pixels by 5000 pixels, or 25 megapixels. This cap affects both registered and guest users. To start upscaling an image, all it takes is to upload it to Deep Image AI and click a few buttons. It employs complex AI algorithms to generate realistic photos while upscaling. The tool’s application programming interface (API) may appeal to programmers since it allows the scaling tool to be integrated into other programs.

Cre8tiveAI

It takes just a few seconds for photographs to be upsized 16 times using Cre8tiveAI. It’s a great choice for enhancing snapshots before they’re printed. The images are scaled up without any loss of quality thanks to the program’s use of deep learning. Compared to the traditional approach, which uses stretching and center interpolation, the picture upscaler tool is a substantial improvement. The AI learns to reconstruct high-resolution detail from low-resolution images, producing high-quality shots.

BigJPG

The problem of blurriness in anime drawings and images may be alleviated with the help of the great application BigJPG. It’s a new artificial intelligence upscaler that works well with anime photographs and doesn’t reduce image quality. The process enhances various attributes, from edges and details to colors. Remember that the maximum image size that may be upscaled using the tool is 3000 by 3000 pixels.

Pixbim Enlarge AI

Pixbim Enlarge AI, which uses machine learning techniques, may improve the resolution of your photos without compromising on quality. There is a wide range of devices on which this program will run. Pixbim can automatically fix missing pixels by analyzing the image and making an accurate prediction based on that analysis. When you increase the size of an image, the tool can automatically enhance some of its details. There is no subscription needed to use the tool; however, there is a one-time cost.

AI Sharpener Scaling

AI Sharpener Scaling is a free photo enlargement tool that utilizes cutting-edge scaling technology and ranks among the best AI image enhancer and upscaler programs available today. The software allows for a maximum quality-maintained image enlargement of 400%. The approach used by AI Sharpener Scaling also works well for correcting lens distortion, perspective shifts, and rotations. Keep in mind that AI Sharpener Scaling is only compatible with Microsoft Windows.

Upscaleics

The Upscaleics AI picture enhancer is another user-friendly option. After uploading your photo and making any required modifications, you can sit back and let AI do the rest. Once again, the simplicity of Upscaleics is its strongest selling point. If you only need to edit a few photos, it’s a quick and easy solution. You have more control over the final product than you would with most other applications. Your file’s resolution and compression rate are both within your control.

PicWish

PicWish is an AI-powered online photo editor that uses powerful algorithms to quickly and easily fix blurry images. It also has powerful noise reduction, color correction, and picture enhancement capabilities. The straightforward and natural UI makes it accessible to anybody.

BeFunky

BeFunky (AI) is a great alternative to Photoshop if you want to edit photos online with the help of artificial intelligence. It has an automatic enhancement feature that may quickly improve any image with a single click. In addition, it has special filters and frames that may give your photos a one-of-a-kind look.

StockPhotos Upscaler

This AI-powered photo upscaler is as simple to use as it is powerful, with the potential to improve image quality by a factor of 8, and you can upscale three images at no cost. With one simple click, you can give your photos stunning professional effects by enhancing their clarity, smoothness, and contrast. The original texture of the image is preserved even after being scaled up, and batch processing is available. Also, this program has built-in social media cropping functionality, which is great for resizing your photos to meet the specifications of various social media platforms.



Build a secure enterprise application with Generative AI and RAG using Amazon SageMaker JumpStart

Generative AI is a type of AI that can create new content and ideas, including conversations, stories, images, videos, and music. It’s powered by large language models (LLMs) that are pre-trained on vast amounts of data and commonly referred to as foundation models (FMs).
With the advent of these LLMs or FMs, customers can build Generative AI-based applications for advertising, knowledge management, and customer support. These applications can provide enhanced insights to customers and positively impact performance efficiency in the organization through easy information retrieval and the automation of certain time-consuming tasks.
With generative AI on AWS, you can reinvent your applications, create entirely new customer experiences, and improve overall productivity.
In this post, we build a secure enterprise application using AWS Amplify that invokes Amazon SageMaker JumpStart foundation models, Amazon SageMaker endpoints, and Amazon OpenSearch Service to demonstrate text-to-text and text-to-image generation as well as Retrieval Augmented Generation (RAG). You can use this post as a reference to build secure enterprise applications in the Generative AI domain using AWS services.
Solution overview
This solution uses SageMaker JumpStart models to deploy text-to-text, text-to-image, and text embeddings models as SageMaker endpoints. These SageMaker endpoints are consumed in the Amplify React application through Amazon API Gateway and AWS Lambda functions. To protect the application and APIs from inadvertent access, Amazon Cognito is integrated into Amplify React, API Gateway, and Lambda functions. SageMaker endpoints and Lambda are deployed in a private VPC, so the communication from API Gateway to Lambda functions is protected using API Gateway VPC links. The following workflow diagram illustrates this solution.

The workflow includes the following steps:

Initial setup: Three SageMaker endpoints are created from SageMaker JumpStart FMs. The text-to-image model is a Stability AI Stable Diffusion foundation model that will be used for generating images. The text-to-text model used for generating text and deployed in the solution is a Hugging Face Flan T5 XL model. The text-embeddings model, which will be used for generating embeddings to be indexed in Amazon OpenSearch Service or for searching the context for the incoming question, is a Hugging Face GPT-J 6B FP16 embeddings model. Alternative LLMs can be deployed based on the use case and model performance benchmarks. For more information about foundation models, see Getting started with Amazon SageMaker JumpStart.
You access the React application from your computer. The React app has three pages: a page that takes image prompts and displays the image generated; a page that takes text prompts and displays the generated text; and a page that takes a question, finds the context matching the question, and displays the answer generated by the text-to-text model.
The React app, built using Amplify libraries, is hosted on Amplify and served to the user at the Amplify host URL. Amplify provides the hosting environment for the React application. The Amplify CLI is used to bootstrap the hosting environment and deploy the code into it.
If you have not been authenticated, you will be authenticated against Amazon Cognito using the Amplify React UI library.
When you provide an input and submit the form, the request is processed via API Gateway.
Lambda functions sanitize the user input and invoke the respective SageMaker endpoints. Lambda functions also construct the prompts from the sanitized user input in the respective format expected by the LLM. These Lambda functions also reformat the output from the LLMs and send the response back to the user. A minimal sketch of this step appears after this list.
SageMaker endpoints are deployed for text-to-text (Flan T5 XXL), text-to-embeddings (GPTJ-6B), and text-to-image models (Stability AI). Three separate endpoints using the recommended default SageMaker instance types are deployed.
Embeddings for documents are generated using the text-to-embeddings model and these embeddings are indexed into OpenSearch Service. A k-Nearest Neighbor (k-NN) index is enabled to allow searching of embeddings from the OpenSearch Service.
An AWS Fargate job takes documents and segments them into smaller packages, invokes the text-to-embeddings LLM model, and indexes the returned embeddings into OpenSearch Service for searching context as described previously.
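
The following is a minimal sketch of the Lambda step referenced above (sanitizing input and invoking a SageMaker endpoint). The endpoint name, environment variable, payload keys, and response fields are assumptions that depend on how the JumpStart endpoints were deployed; the actual handlers live in the AWS CDK project.

import json
import os
import re

import boto3

runtime = boto3.client("sagemaker-runtime")

# Assumed environment variable set by the CDK stack; the real name may differ.
TEXT_ENDPOINT = os.environ.get("TEXT2TEXT_ENDPOINT_NAME", "jumpstart-flan-t5-endpoint")

def sanitize(prompt):
    """Basic input hygiene: collapse whitespace and cap the length."""
    return re.sub(r"\s+", " ", prompt).strip()[:2000]

def handler(event, context):
    prompt = sanitize(json.loads(event["body"])["prompt"])
    payload = {
        "text_inputs": prompt,   # payload shape assumed for a Flan-T5 JumpStart endpoint
        "max_length": 256,
        "temperature": 0.7,
    }
    response = runtime.invoke_endpoint(
        EndpointName=TEXT_ENDPOINT,
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    result = json.loads(response["Body"].read())
    generated = result.get("generated_texts", [""])[0]  # response key assumed
    return {"statusCode": 200, "body": json.dumps({"text": generated})}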

Dataset overview
The dataset used for this solution is pile-of-law within the Hugging Face repository. This dataset is a large corpus of legal and administrative data. For this example, we use train.cc_casebooks.jsonl.xz within this repository. This is a collection of educational casebooks curated in the JSONL format required by the LLMs.
Prerequisites
Before getting started, make sure you have the following prerequisites:

An AWS account.
An Amazon SageMaker Studio domain managed policy attached to the AWS Identity and Access Management (IAM) execution role. For instructions on assigning permissions to the role, refer to Amazon SageMaker API Permissions: Actions, Permissions, and Resources Reference. In this case, you need to assign permissions as allocated to Amazon Augmented AI (Amazon A2I). For more information, refer to Amazon SageMaker Identity-Based Policy Examples.
An Amazon Simple Storage Service (Amazon S3) bucket. For instructions, refer to Creating a bucket.
For this post, you use the AWS Cloud Development Kit (AWS CDK) using Python. Follow the instructions in Getting Started with the AWS CDK to set up your local environment and bootstrap your development account.
This AWS CDK project requires SageMaker instances (two ml.g5.2xlarge and one ml.p3.2xlarge). You may need to request a quota increase.

Implement the solution
An AWS CDK project that includes all the architectural components has been made available in this AWS Samples GitHub repository. To implement this solution, do the following:

Clone the GitHub repository to your computer.
Go to the root folder.
Initialize the Python virtual environment.
Install the required dependencies specified in the requirements.txt file.
Initialize AWS CDK in the project folder.
Bootstrap AWS CDK in the project folder.
Using the AWS CDK deploy command, deploy the stacks.
Go to the Amplify folder within the project folder.
Initialize Amplify and accept the defaults provided by the CLI.
Add Amplify hosting.
Publish the Amplify front end from within the Amplify folder and note the domain name provided at the end of the run.
On the Amazon Cognito console, add a user to the Amazon Cognito instance that was provisioned with the deployment.
Go to the domain name from step 11 and provide the Amazon Cognito login details to access the application.

Trigger an OpenSearch indexing job
The AWS CDK project deployed a Lambda function named GenAIServiceTxt2EmbeddingsOSIndexingLambda. Navigate to this function on the Lambda console.

Run a test with an empty payload, as shown in the following screenshot.

This Lambda function triggers a Fargate task on Amazon Elastic Container Service (Amazon ECS) running within the VPC. This Fargate task takes the included JSONL file, segments it, and creates an embeddings index. Each segment’s embedding is the result of invoking the text-to-embeddings LLM endpoint deployed as part of the AWS CDK project.
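Conceptually, the Fargate task does something like the sketch below: segment the text, call the text-embeddings endpoint, and index each vector into a k-NN-enabled OpenSearch index. The host, index name, endpoint name, payload and response keys, and chunk size are all illustrative assumptions; the actual implementation is the container code in the AWS CDK project.

import json

import boto3
from opensearchpy import OpenSearch  # pip install opensearch-py

runtime = boto3.client("sagemaker-runtime")

# Authentication is omitted for brevity; host and index name are assumptions.
client = OpenSearch(hosts=[{"host": "my-domain.us-east-1.es.amazonaws.com", "port": 443}], use_ssl=True)
INDEX = "casebook-embeddings"  # must be created with a knn_vector mapping beforehand

def embed(text):
    """Call the text-to-embeddings endpoint; payload and response keys are assumptions."""
    response = runtime.invoke_endpoint(
        EndpointName="jumpstart-gptj-6b-fp16-embeddings",  # assumed endpoint name
        ContentType="application/json",
        Body=json.dumps({"text_inputs": text}),
    )
    return json.loads(response["Body"].read())["embedding"][0]

def chunk(text, size=1000):
    """Naive fixed-size segmentation of a document into smaller packages."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def index_document(doc_id, text):
    for n, segment in enumerate(chunk(text)):
        client.index(index=INDEX, id=f"{doc_id}-{n}",
                     body={"text": segment, "vector": embed(segment)})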
Clean up
To avoid future charges, delete the SageMaker endpoint and stop all Lambda functions. Also, delete the output data in Amazon S3 you created while running the application workflow. You must delete the data in the S3 buckets before you can delete the buckets.
Conclusion
In this post, we demonstrated an end-to-end approach to create a secure enterprise application using Generative AI and RAG. This approach can be used in building secure and scalable Generative AI applications on AWS. We encourage you to deploy the AWS CDK app into your account and build the Generative AI solution.
Additional resources
For more information about Generative AI applications on AWS, refer to the following:

Accelerate your learning towards AWS Certification exams with automated quiz generation using Amazon SageMaker foundations models
Build a powerful question answering bot with Amazon SageMaker, Amazon OpenSearch Service, Streamlit, and LangChain
Deploy generative AI models from Amazon SageMaker JumpStart using the AWS CDK
Get started with generative AI on AWS using Amazon SageMaker JumpStart
Build a serverless meeting summarization backend with large language models on Amazon SageMaker JumpStart

About the Authors
Jay Pillai is a Principal Solutions Architect at Amazon Web Services. As an Information Technology Leader, Jay specializes in artificial intelligence, data integration, business intelligence, and user interface domains. He holds 23 years of extensive experience working with several clients across real estate, financial services, insurance, payments, and market research business domains.
Shikhar Kwatra is an AI/ML Specialist Solutions Architect at Amazon Web Services, working with a leading Global System Integrator. He has earned the title of one of the Youngest Indian Master Inventors with over 500 patents in the AI/ML and IoT domains. Shikhar aids in architecting, building, and maintaining cost-efficient, scalable cloud environments for the organization, and supports the GSI partner in building strategic industry solutions on AWS. Shikhar enjoys playing guitar, composing music, and practicing mindfulness in his spare time.
Karthik Sonti leads a global team of solution architects focused on conceptualizing, building and launching horizontal, functional and vertical solutions with Accenture to help our joint customers transform their business in a differentiated manner on AWS.

Intelligently search Adobe Experience Manager content using Amazon Kendra

Amazon Kendra is an intelligent search service powered by machine learning (ML). With Amazon Kendra, you can easily aggregate content from a variety of content repositories into an index that lets you quickly search all your enterprise data and find the most accurate answer. Adobe Experience Manager (AEM) is a content management system that’s used for creating website or mobile app content. Many organizations use Adobe Experience Manager (On-Premise) or Adobe Experience Manager (Cloud Service) as their content management platform. Enterprise users need to be able to search for accurate answers easily and securely across content from multiple data sources in the enterprise, including AEM, from content such as assets and pages.
Amazon Kendra customers can now use the Amazon Kendra AEM connector to index pages and assets from AEM. Amazon Kendra supports AEM as a Cloud Service author instances and AEM On-Premise author and publish instances. You can index AEM content and filter the types of content you want to index with the Amazon Kendra AEM On-Premise or Cloud Service connector, and search your data from AEM with Amazon Kendra intelligent search.
This post shows you how to configure the Amazon Kendra AEM connector to index your content and search your AEM assets and pages. The connector also ingests the access control list (ACL) information for each document. The ACL information is used to show search results filtered by what a user has access to.
Solution overview
In our solution, we configure AEM as a data source for an Amazon Kendra search index using the Amazon Kendra AEM connector. Based on the configuration, when the data source is synchronized, the connector crawls and indexes all the content from AEM that was created on or before a specific date. The connector also indexes the Access Control List (ACL) information for each document. When access control or user context filtering is enabled, the search results of a query made by a user include results only from those documents that the user is authorized to read.
The Amazon Kendra AEM connector can integrate with AWS IAM Identity Center (Successor to AWS Single Sign-On). You first must enable IAM Identity Center and create an organization to sync users and groups from your active directory. The connector will use the user name and group lookup for the user context of the search queries.
Prerequisites
To try out the Amazon Kendra connector for AEM using this post as a reference, you need the following:

An AWS account with privileges to create AWS Identity and Access Management (IAM) roles and policies. For more information, see Overview of access management: Permissions and policies.
Basic knowledge of AWS and working knowledge of AEM. For more information, refer to User administration and Security.
AEM On-Premise set up (Version 6.5 and above). We store the admin user name and password in AWS Secrets Manager.

Set up OAuth2.0
If you are using AEM On-Premise, set up OAuth 2.0 and generate an SSL certificate in order to complete the configuration of the Amazon Kendra AEM connector.
The Adobe Granite OAuth 2.0 server implementation (com.adobe.granite.oauth.server) provides the support for OAuth 2.0 server functionalities in AEM.
Enable the OAuth Server authentication handler
By default, AEM won’t enable the OAuth Server authentication handler. To enable it, complete the following steps:

Start the AEM local instance and go to http://localhost:<port>/system/console/configMgr/com.adobe.granite.oauth.server.auth.impl.OAuth2ServerAuthenticationHandler
Change the jaas.ranking.name value to 1100 in the Adobe Granite OAuth Server Authentication Handler section and save the configuration.

The OAuth Server authentication handler is now enabled.

Register the OAuth client
Every external application requires OAuth authentication to be registered as an OAuth client in AEM. To register the OAuth client, complete the following steps:

On the AEM start page, choose Security and OAuth client.
Enter a name and redirect URI.
Choose Save.

After a successful authorization of an application, the OAuth server will redirect you back to the application with an authorization code to the configured redirect URL.

Copy the client ID and client secret and keep them safe.

The Granite OAuth Server supports the following grant types:

Authorization code
Refresh token
JWT bearer token

For this post, we use OAuth2.0 with the JWT grant type.
The JWT bearer token is mainly used for server-to-server integration. This will help us enable the server-to-server integration without the resource owner interaction; for example, to retrieve or upload files without user interaction.

Generate the JWT token
Complete the following steps to generate the JWT token:

Navigate to localhost and the OAuth client.
Choose Download Private Key.
Choose Download.

Generate the public certificate
Now, generate the public certificate from the downloaded private key. Run the following commands and enter the private key password when prompted.
Extract the certificate:

openssl pkcs12 -in store.p12 -out store.crt.pem -clcerts -nokeys

Extract the private key:

openssl pkcs12 -in store.p12 -passin pass:notasecret -nocerts -nodes -out store.private.key.txt

Make sure to install openssl and add it to the environment path beforehand.
Before using the private key while configuring the Amazon Kendra data source, make sure not to use or copy “-----BEGIN PRIVATE KEY-----” and “-----END PRIVATE KEY-----” in the code. Additionally, remove any empty spaces from the private key.
Use the generated ClientId, ClientSecret, and private key to configure the Amazon Kendra AEM data source.
For OAuth client registration, navigate to http://localhost:<port>/libs/granite/oauth/content/clients.html.
Set up SSL
Complete the following steps to set up SSL:

Create the key:

openssl genrsa -aes256 -out <keyFileName>.key 4096

Create the certificate signing request (CSR):

openssl req -sha256 -new -key <keyFileName>.key -out <keyFileName>.csr -subj '/CN=<keyFileName>'

Sign the certificate:

openssl x509 -req -days 365 -in <keyFileName>.csr -signkey <keyFileName>.key -out <keyFileName>.crt

Encode the private key to DER format:

openssl pkcs8 -topk8 -inform PEM -outform DER -in <keyFileName>.key -out <keyFileName>.der -nocrypt

Four files will be generated with file names starting with <keyFileName>. We use <keyFileName>.crt and <keyFileName>.der in later steps.

Next, log in to AEM at http://localhost:<port>/aem/start.html.
Choose Tools, Security, and SSL Configuration.
In the Store Credentials section, enter the key store and trust store password.

In the Keys and Certificate section, specify the .der file for Private Key and the .crt file for Certificate.

In the next section, enter the domain (localhost), and leave the port as is.
Choose Done.

AEM will open in the specified new port. For example, https://localhost:8443.

Log in to AEM using HTTPS, download the certificate in the browser using the lock (padlock) icon, export the certificate, and name it privateKey.crt.

Now, let’s import the certificate into the keystore path using the key tool.

Open a terminal and go to the folder location where privateKey.crt is present and run the following command:

keytool -import -trustcacerts -keystore <JAVA_HOME>/lib/security/cacerts -storepass changeit -noprompt -alias yourAliasName -file privateKey.crt

Be sure to open ports 8443 and 80 in your firewall settings.

Add the certificate privateKey.crt to an Amazon Simple Storage Service (Amazon S3) bucket.

Configure the data source using the Amazon Kendra connector for AEM
You can use an existing index or create a new index to index documents from AEM using the AEM connector. Then complete the following steps. For more information, refer to the Amazon Kendra Developer Guide.

On the Amazon Kendra console, open your index and choose Data sources in the navigation pane.
Choose Add data source.
Under Adobe Experience Manager, choose Add connector.

In the Specify data source details section, enter a name and optionally a description, then choose Next.

In the Define access and security section, select either the AEM On-Premise or AEM as a Cloud Service source type and enter the AEM host URL. You can find the URL in your AEM settings.

If using AEM On-Premise, enter the host URL of the AEM On-Premise server. Then choose Browse S3 and choose the S3 bucket with the SSL certificate.

If using AEM as a Cloud Service, you can use the author URL https://author-xxxxxx-xxxxxxx.adobeaemcloud.com.

Under Authentication, you have two options, Basic authentication and OAuth 2.0 authentication.

If you select Basic authentication, for AWS Secrets Manager secret, choose Create and add a new secret. Then enter a name for the secret, the AEM site user name, and password. The user must have admin permission or be an admin user.

If you select OAuth 2.0 authentication, for AWS Secrets Manager secret, choose Create and add a new secret. Enter a name for the secret, client ID, client secret, and private key. If you use AEM as a Cloud Service, enter a name for the secret, client ID, client secret, private key, organization ID, technical account ID, and Adobe Identity Management System (IMS) host.

Choose Save or Add Secret.
In the Configure VPC and security group section, you can optionally choose to use a VPC. If so, you must add subnets and VPC security groups.
In the Identity crawler section, choose to crawl identity information on users and groups with access to certain documents and store this in the Amazon Kendra principal or identity store.

This is useful for filtering search results based on the user or their group access to documents.

In the IAM section, create a new IAM role or choose an existing IAM role to access repository credentials and index content.
Choose Next.

In the Configure sync settings section, provide information about your sync scope.

You can include the files to be crawled using inclusion patterns or exclude them using exclusion patterns. When you provide a pattern in the Include patterns section, only documents matching that pattern will be crawled. When you provide a pattern in the Exclude patterns section, documents matching that pattern will not be crawled.

If you use AEM On-Premise and the time zone of your server is different than the time zone of the Amazon Kendra AEM connector or index, you can specify the server time zone to align with the AEM connector or index in the Timezone ID section.

The default time zone for AEM On-Premise is the time zone of the Amazon Kendra AEM connector or index. The default time zone for AEM as a Cloud Service is Greenwich Mean Time.

Choose the Sync mode (for this post, select Full sync).

With the Full sync option, every time the sync runs, Amazon Kendra will crawl all documents and ingest each document even if ingested earlier. The full refresh enables you to reset your Amazon Kendra index without the need to delete and create a new data source. If you choose New or modified content sync or New, modified, or deleted content sync, every time the sync job runs, it will process only objects added, modified, or deleted since the last crawl. Incremental crawls can help reduce runtime and cost when used with datasets that append new objects to existing data sources on a regular basis.

For Sync run schedule, choose Run on demand.
Choose Next.

In the Set field mappings section, you can optionally select from the Amazon Kendra generated default data source fields you want to map to your index. To add custom data source fields, choose Add Field to create an index field name to map to and the field data type. Specify the AEM field name, index field name, and data type.

Choose Next.

Review your settings and choose Add data source.

After the data source is added, choose Data sources in the navigation pane, select the newly added data source, and choose Sync now to start data source synchronization with the Amazon Kendra index.

The time the sync process takes depends on the amount of data to be crawled.
Now let’s enable access control for the Amazon Kendra index.

In the navigation pane, choose your index.
On the User access control tab, choose Edit settings.

Change the settings to look like the following screenshot.
Choose Next.

Choose Update.

Wait a few minutes for the index to get updated by the changes. Now let’s see how you can perform intelligent search with Amazon Kendra.
Perform intelligent search with Amazon Kendra
Before you try searching on the Amazon Kendra console or using the API, make sure that the data source sync is complete. To check, view the data sources and verify if the last sync was successful.

Now we’re ready to search our index.

On the Amazon Kendra console, navigate to the index and choose Search indexed content in the navigation pane.
Let’s query the index using “What was the impact of Siberian heat wave?” without providing an access token.

Based on our access control settings in the index, a valid access token is needed to access content the user is allowed to see; therefore, when we use this search query without setting any user name or group, no results are returned.

Next, choose Apply Token and set the user name or user email ID (for example, user-dev@company.com) that has access to AEM content.

While crawling the AEM data source, the connector will set the user email ID as the principal. If the user’s email ID is not available, the user name will be set as the principal.
The following screenshot shows an example with the user email ID user-dev-2@amazon.com set as principal.

The following example uses user name user-dev-2 set as principal.

Now, let’s try to search the same content with the token of user user-dev@amazon.com, who is not authorized to view this specific document that appeared in the preceding query results.

This confirms that documents ingested by the Amazon Kendra connector for AEM honor the ACLs set within AEM, and that these same ACLs are enforced on the search results based on the applied token.
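If you prefer to verify the same access-controlled behavior through the API rather than the console, a minimal sketch with the boto3 Kendra client is shown below. The index ID and token are placeholders; the token must correspond to the user context settings you configured for the index.

import boto3

kendra = boto3.client("kendra")

response = kendra.query(
    IndexId="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",   # replace with your index ID
    QueryText="What was the impact of Siberian heat wave?",
    UserContext={"Token": "<token for user-dev-2@amazon.com>"},  # placeholder user token
)

for item in response["ResultItems"]:
    print(item["Type"], "-", item.get("DocumentTitle", {}).get("Text", ""))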
Clean up
To avoid incurring future costs, clean up the resources you created as part of this solution. If you created a new Amazon Kendra index while testing this solution, delete it. If you only added a new data source using the Amazon Kendra connector for AEM, delete that data source.
Conclusion
With the Amazon Kendra Adobe Experience Manager connector, your organization can search pages and assets securely using intelligent search powered by Amazon Kendra.
To learn more about the Amazon Kendra connector for AEM, refer to Adobe Experience Manager.
For more information on other Amazon Kendra built-in connectors to popular data sources, refer to Amazon Kendra native connectors.

About the Authors
Praveen Edem is a Senior Solutions Architect at Amazon Web Services. He works with major financial services customers, architecting and modernizing their critical large-scale applications while adopting AWS services. He specializes in serverless and container-based workloads. He has over 20 years of IT experience in application development and software architecture.
Manjula Nagineni is a Senior Solutions Architect with AWS based in New York. She works with major financial service institutions, architecting and modernizing their large-scale applications while adopting AWS Cloud services. She is passionate about designing big data workloads cloud-natively. She has over 20 years of IT experience in software development, analytics, and architecture across multiple domains such as finance, manufacturing, and telecom.
Omkar Phadtare is a Software Development Engineer at Amazon Web Services, with a deep-rooted passion for cloud computing. Leveraging his technical expertise and strong understanding of the domain, he designs, develops, and implements cutting-edge, highly scalable, and resilient cloud-based solutions for a diverse range of modern businesses and organizations.
Vijai Gandikota is a Senior Product Manager for Amazon Kendra at Amazon Web Services, responsible for launching Amazon Kendra connectors, Principal Store, Search Analytics Dashboard, and other features of Amazon Kendra. He has over 20 years of experience in designing, developing, and launching products in AI and analytics.

Fine-tune Llama 2 for text generation on Amazon SageMaker JumpStart

Today, we are excited to announce the capability to fine-tune Llama 2 models by Meta using Amazon SageMaker JumpStart. The Llama 2 family of large language models (LLMs) is a collection of pre-trained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. Fine-tuned LLMs, called Llama-2-chat, are optimized for dialogue use cases. You can easily try out these models and use them with SageMaker JumpStart, which is a machine learning (ML) hub that provides access to algorithms, models, and ML solutions so you can quickly get started with ML. Now you can also fine-tune 7 billion, 13 billion, and 70 billion parameter Llama 2 text generation models on SageMaker JumpStart using the Amazon SageMaker Studio UI with a few clicks or using the SageMaker Python SDK.
Generative AI foundation models have been the focus of most ML and artificial intelligence research and use cases for over a year now. These foundation models perform very well with generative tasks, such as text generation, summarization, question answering, and image and video generation, because of their large size and because they are trained on several large datasets and hundreds of tasks. Despite the great generalization capabilities of these models, there are often use cases with very specific domain data (such as healthcare or financial services) for which these models may not provide good results. This creates a need for further fine-tuning of these generative AI models on use case-specific and domain-specific data.
In this post, we walk through how to fine-tune Llama 2 pre-trained text generation models via SageMaker JumpStart.
What is Llama 2
Llama 2 is an auto-regressive language model that uses an optimized transformer architecture. Llama 2 is intended for commercial and research use in English. It comes in a range of parameter sizes—7 billion, 13 billion, and 70 billion—as well as pre-trained and fine-tuned variations. According to Meta, the tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align to human preferences for helpfulness and safety. Llama 2 was pre-trained on 2 trillion tokens of data from publicly available sources. The tuned models are intended for assistant-like chat, whereas pre-trained models can be adapted for a variety of natural language generation tasks. Regardless of which version of the model a developer uses, the responsible use guide from Meta can assist in guiding additional fine-tuning that may be necessary to customize and optimize the models with appropriate safety mitigations.
Currently, Llama 2 is available in the following regions:

Deploy the pre-trained model: us-west-2, us-east-1, us-east-2, eu-west-1, ap-southeast-1, ap-southeast-2
Fine-tune and deploy the fine-tuned model: us-east-1, us-west-2, eu-west-1

What is SageMaker JumpStart
With SageMaker JumpStart, ML practitioners can choose from a broad selection of publicly available foundation models. ML practitioners can deploy foundation models to dedicated Amazon SageMaker instances from a network isolated environment and customize models using SageMaker for model training and deployment. You can now discover and deploy Llama 2 with a few clicks in SageMaker Studio or programmatically through the SageMaker Python SDK, enabling you to derive model performance and MLOps controls with SageMaker features such as Amazon SageMaker Pipelines, Amazon SageMaker Debugger, or container logs. The model is deployed in an AWS secure environment and under your VPC controls, helping ensure data security. In addition, you can fine-tune Llama 2 7B, 13B, and 70B pre-trained text generation models via SageMaker JumpStart.
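As a quick illustration of the programmatic path, the following is a minimal sketch of deploying the pre-trained Llama 2 7B model and running a test prompt with the SageMaker Python SDK. Treat it as a sketch rather than a definitive recipe: the generation parameters are illustrative, and depending on your SDK version you may need to pass accept_eula=True to deploy() instead of (or in addition to) the custom_attributes flag shown here.

from sagemaker.jumpstart.model import JumpStartModel

# Deploy the pre-trained Llama 2 7B model to a real-time endpoint
model = JumpStartModel(model_id="meta-textgeneration-llama-2-7b")
predictor = model.deploy()

# Run a test generation; Llama 2 requires explicit EULA acceptance for inference
payload = {
    "inputs": "I believe the meaning of life is",
    "parameters": {"max_new_tokens": 64, "top_p": 0.9, "temperature": 0.6},
}
response = predictor.predict(payload, custom_attributes="accept_eula=true")
print(response)
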
Fine-tune Llama 2 models
You can fine-tune the models using either the SageMaker Studio UI or SageMaker Python SDK. We discuss both methods in this section.
No-code fine-tuning via the SageMaker Studio UI
In SageMaker Studio, you can access Llama 2 models via SageMaker JumpStart under Models, notebooks, and solutions, as shown in the following screenshot.
If you don’t see Llama 2 models, update your SageMaker Studio version by shutting down and restarting. For more information about version updates, refer to Shut down and Update Studio Apps.

You can also find the other four model variants by choosing Explore all Text Generation Models or by searching for llama in the search box.

On this page, you can point to the Amazon Simple Storage Service (Amazon S3) bucket containing the training and validation datasets for fine-tuning. In addition, you can configure the deployment configuration, hyperparameters, and security settings for fine-tuning. You can then choose Train to start the training job on a SageMaker ML instance. The preceding screenshot shows the fine-tuning page for the Llama-2 7B model; however, you can similarly fine-tune the 13B and 70B Llama 2 text generation models using their respective model pages. To use Llama 2 models, you need to accept the End User License Agreement (EULA). It will show up when you choose Train, as shown in the following screenshot. Choose I have read and accept EULA and AUP to start the fine-tuning job.

Deploy the model
After the model is fine-tuned, you can deploy it using the model page on SageMaker JumpStart. The option to deploy the fine-tuned model will appear when fine-tuning is finished, as shown in the following screenshot.

Fine-tune via the SageMaker Python SDK
You can also fine-tune Llama 2 models using the SageMaker Python SDK. The following is sample code to fine-tune the Llama 2 7B model on your dataset:

import os
import boto3
from sagemaker.session import Session
from sagemaker.jumpstart.estimator import JumpStartEstimator

# To fine-tune the 13B/70B model, change model_id to
# "meta-textgeneration-llama-2-13b" or "meta-textgeneration-llama-2-70b".
model_id = "meta-textgeneration-llama-2-7b"

estimator = JumpStartEstimator(
    model_id=model_id, environment={"accept_eula": "true"}
)

# By default, instruction tuning is disabled. To train on an instruction tuning
# dataset, set instruction_tuned to "True".
estimator.set_hyperparameters(instruction_tuned="True", epoch="5")

# train_data_location is the S3 path to your training data, for example an S3 URI
# such as s3://<bucket>/<prefix>/
estimator.fit({"training": train_data_location})

You can deploy the fine-tuned model directly from the estimator:

finetuned_predictor = estimator.deploy()
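
Once the endpoint is in service, you can invoke it like any other SageMaker endpoint. The following is a minimal, illustrative sketch of querying the fine-tuned model; the prompt follows the instruction template used in this post, while the generation parameters and the EULA flag are assumptions that you may need to adjust for your SDK version.

# Build a prompt in the same instruction format used for fine-tuning
prompt = (
    "Below is an instruction that describes a task, paired with an input that provides further context. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nWhen did Felix Luna die?\n\n"
    "### Input:\nFélix César Luna (30 September 1925 – 5 November 2009) was an Argentine writer, lyricist and historian.\n\n"
    "### Response:\n"
)

payload = {"inputs": prompt, "parameters": {"max_new_tokens": 100, "temperature": 0.2}}

# EULA acceptance is also required at inference time for Llama 2 models
response = finetuned_predictor.predict(payload, custom_attributes="accept_eula=true")
print(response)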

You can also find the code in Fine-tune LLaMA 2 models on SageMaker JumpStart. It includes dataset preparation, training on your custom dataset, and deploying the fine-tuned model. It demonstrates fine-tuning on a subset of the Dolly dataset with examples from the summarization task. The following is the example input with responses from the fine-tuned and non-fine-tuned models, along with the ground truth response:
Input to the model:

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n### Instruction:\nWhen did Felix Luna die?\n\n### Input:\nFélix César Luna (30 September 1925 – 5 November 2009) was an Argentine writer, lyricist and historian.\n\n\n\n### Response:\n

Ground Truth response:

Felix Luna died on November 5th, 2009

Response from the non fine-tuned model:

Félix César Luna (30 September 1925 – 5 November 2009) was an Argentine\n\nWhen did Luna die?\n\n\n### Explanation:\nWe answer the question with the input’s date of birth and the date of death.\n\n\n### Solution: 1102\n

Response from the fine-tuned model:

Félix Luna died on November 5th, 2009.\n\n

For performance benchmarking of different models on the Dolly and Dialogsum dataset, refer to the Performance benchmarking section in the appendix at the end of this post.
Fine-tuning technique
Language models such as Llama are more than 10 GB or even 100 GB in size. Fine-tuning such large models requires instances with significantly high CUDA memory. Furthermore, training these models can be very slow due to the size of the model. Therefore, for efficient fine-tuning, we use the following optimizations:

Low-Rank Adaptation (LoRA) – This is a type of parameter-efficient fine-tuning (PEFT) for efficient fine-tuning of large models. With this approach, we freeze the whole model and add only a small set of adjustable parameters or layers to it. For instance, instead of training all 7 billion parameters of Llama 2 7B, we can fine-tune less than 1% of the parameters. This significantly reduces the memory requirement because we only need to store gradients, optimizer states, and other training-related information for 1% of the parameters. It also reduces training time and cost. For more details on this method, refer to LoRA: Low-Rank Adaptation of Large Language Models.
Int8 quantization – Even with optimizations such as LoRA, models such as Llama 70B are still too big to train. To decrease the memory footprint during training, we can use Int8 quantization. Quantization typically reduces the precision of the floating point data types. Although this decreases the memory required to store model weights, it can degrade performance due to loss of information. Int8 quantization uses only a quarter of the precision but doesn’t incur degradation of performance because it doesn’t simply drop the bits; it rounds the data from one type to the other. To learn about Int8 quantization, refer to LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale.
Fully Sharded Data Parallel (FSDP) – This is a type of data-parallel training algorithm that shards the model’s parameters across data parallel workers and can optionally offload part of the training computation to the CPUs. Although the parameters are sharded across different GPUs, computation of each microbatch is local to the GPU worker. It shards parameters more uniformly and achieves optimized performance via communication and computation overlapping during training.

The following table compares different methods with the three Llama 2 models.

Model | Default Instance Type | Supported Instance Types with Default Configuration | Default Setting | LoRA + FSDP | LoRA + No FSDP | Int8 Quantization + LoRA + No FSDP
Llama 2 7B | ml.g5.12xlarge | ml.g5.12xlarge, ml.g5.24xlarge, ml.g5.48xlarge | LoRA + FSDP | Yes | Yes | Yes
Llama 2 13B | ml.g5.12xlarge | ml.g5.24xlarge, ml.g5.48xlarge | LoRA + FSDP | Yes | Yes | Yes
Llama 2 70B | ml.g5.48xlarge | ml.g5.48xlarge | Int8 + LoRA + No FSDP | No | No | Yes

Note that fine-tuning of Llama models is based on scripts provided by the following GitHub repo.
Training dataset format
SageMaker JumpStart currently supports datasets in both the domain adaptation format and the instruction tuning format. In this section, we specify an example dataset in both formats. For more details, refer to the Dataset formatting section in the appendix.
Domain adaptation format
The text generation Llama 2 model can be fine-tuned on any domain-specific dataset. After it’s fine-tuned on the domain-specific dataset, the model is expected to generate domain-specific text and solve various NLP tasks in that specific domain with few-shot prompting. With this dataset, input consists of a CSV, JSON, or TXT file. For instance, input data may be SEC filings of Amazon as a text file:

This report includes estimates, projections, statements relating to our
business plans, objectives, and expected operating results that are “forward-
looking statements” within the meaning of the Private Securities Litigation
Reform Act of 1995, Section 27A of the Securities Act of 1933, and Section 21E
of the Securities Exchange Act of 1934. Forward-looking statements may appear
throughout this report, including the following sections: “Business” (Part I,
Item 1 of this Form 10-K), “Risk Factors” (Part I, Item 1A of this Form 10-K),
and “Management’s Discussion and Analysis of Financial Condition and Results
of Operations” (Part II, Item 7 of this Form 10-K). These forward-looking
statements generally are identified by the words “believe,” “project,”
“expect,” “anticipate,” “estimate,” “intend,” “strategy,” “future,”
“opportunity,” “plan,” “may,” “should,” “will,” “would,” “will be,” “will
continue,” “will likely result,” and similar expressions.

Instruction tuning format
In instruction fine-tuning, the model is fine-tuned for a set of natural language processing (NLP) tasks described using instructions. This helps improve the model’s performance on unseen tasks with zero-shot prompts. In the instruction tuning dataset format, you specify a template.json file describing the input and output formats. For instance, each line in the file train.jsonl looks like the following:

{"instruction": "What is a dispersive prism?",
"context": "In optics, a dispersive prism is an optical prism that is used to disperse light, that is, to separate light into its spectral components (the colors of the rainbow). Different wavelengths (colors) of light will be deflected by the prism at different angles. This is a result of the prism material's index of refraction varying with wavelength (dispersion). Generally, longer wavelengths (red) undergo a smaller deviation than shorter wavelengths (blue). The dispersion of white light into colors by a prism led Sir Isaac Newton to conclude that white light consisted of a mixture of different colors.",
"response": "A dispersive prism is an optical prism that disperses the light's different wavelengths at different angles. When white light is shined through a dispersive prism it will separate into the different colors of the rainbow."}

The additional file template.json looks like the following:

{
    "prompt": "Below is an instruction that describes a task, paired with an input that provides further context. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{context}\n\n",
    "completion": " {response}",
}
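
To make the workflow concrete, the following is a hedged sketch of assembling these two files and uploading them to Amazon S3 before calling estimator.fit. The bucket name and prefix are placeholders, and the single Dolly-style record is only illustrative; in practice train.jsonl would contain one JSON object per line for each training example.

import json
import boto3

# Write one training example per line to train.jsonl (instruction tuning format)
train_example = {
    "instruction": "What is a dispersive prism?",
    "context": "In optics, a dispersive prism is an optical prism that is used to disperse light...",
    "response": "A dispersive prism is an optical prism that disperses the light's different wavelengths at different angles.",
}
with open("train.jsonl", "w") as f:
    f.write(json.dumps(train_example) + "\n")

# template.json describes how the fields are assembled into a prompt and completion
template = {
    "prompt": (
        "Below is an instruction that describes a task, paired with an input that provides further context. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Input:\n{context}\n\n"
    ),
    "completion": " {response}",
}
with open("template.json", "w") as f:
    json.dump(template, f)

# Upload both files to the same S3 prefix and pass that prefix to fit() as the training channel
bucket, prefix = "my-training-bucket", "llama2-finetune"  # placeholders
s3 = boto3.client("s3")
for filename in ["train.jsonl", "template.json"]:
    s3.upload_file(filename, bucket, f"{prefix}/{filename}")

train_data_location = f"s3://{bucket}/{prefix}/"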

Supported hyperparameters for training
Llama 2 fine-tuning supports a number of hyperparameters, each of which can impact the memory requirement, training speed, and performance of the fine-tuned model (a brief configuration sketch follows this list):

epoch – The number of passes that the fine-tuning algorithm takes through the training dataset. Must be an integer greater than 1. Default is 5.
learning_rate – The rate at which the model weights are updated after working through each batch of training examples. Must be a positive float greater than 0. Default is 1e-4.
instruction_tuned – Whether to instruction-train the model or not. Must be 'True' or 'False'. Default is 'False'.
per_device_train_batch_size – The batch size per GPU core/CPU for training. Must be a positive integer. Default is 4.
per_device_eval_batch_size – The batch size per GPU core/CPU for evaluation. Must be a positive integer. Default is 1.
max_train_samples – For debugging purposes or quicker training, truncate the number of training examples to this value. Value -1 means using all of the training samples. Must be a positive integer or -1. Default is -1.
max_val_samples – For debugging purposes or quicker training, truncate the number of validation examples to this value. Value -1 means using all of the validation samples. Must be a positive integer or -1. Default is -1.
max_input_length – Maximum total input sequence length after tokenization. Sequences longer than this will be truncated. If -1, max_input_length is set to the minimum of 1024 and the maximum model length defined by the tokenizer. If set to a positive value, max_input_length is set to the minimum of the provided value and the model_max_length defined by the tokenizer. Must be a positive integer or -1. Default is -1.
validation_split_ratio – If the validation channel is none, the ratio of the train-validation split from the train data. Must be between 0 and 1. Default is 0.2.
train_data_split_seed – If validation data is not present, this fixes the random splitting of the input training data to training and validation data used by the algorithm. Must be an integer. Default is 0.
preprocessing_num_workers – The number of processes to use for preprocessing. If None, the main process is used for preprocessing. Default is None.
lora_r – LoRA r. Must be a positive integer. Default is 8.
lora_alpha – LoRA alpha. Must be a positive integer. Default is 32.
lora_dropout – LoRA dropout. Must be a positive float between 0 and 1. Default is 0.05.
int8_quantization – If True, the model is loaded with 8-bit precision for training. Default for 7B and 13B is False. Default for 70B is True.
enable_fsdp – If True, training uses FSDP. Default for 7B and 13B is True. Default for 70B is False. Note that int8_quantization is not supported with FSDP.
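
As a rough configuration sketch (the values below are illustrative choices, not recommendations), these hyperparameters are passed to the JumpStart estimator as strings, matching the earlier example, before calling fit; train_data_location is assumed to be the S3 path to your training channel.

from sagemaker.jumpstart.estimator import JumpStartEstimator

estimator = JumpStartEstimator(
    model_id="meta-textgeneration-llama-2-13b", environment={"accept_eula": "true"}
)

# Illustrative hyperparameter choices; see the list above for valid ranges and defaults
estimator.set_hyperparameters(
    instruction_tuned="True",
    epoch="5",
    learning_rate="1e-4",
    max_input_length="1024",
    per_device_train_batch_size="4",
    lora_r="8",
    lora_alpha="32",
    lora_dropout="0.05",
    int8_quantization="False",
    enable_fsdp="True",
)

estimator.fit({"training": train_data_location})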

Instance types and compatible hyperparameters
The memory requirement during fine-tuning may vary based on several factors:

Model type – The 7B model has the least GPU memory requirement and 70B has the largest memory requirement
Max input length – A higher value of input length leads to processing more tokens at a time and as such requires more CUDA memory
Batch size – A larger batch size requires larger CUDA memory and therefore requires larger instance types
Int8 quantization – If using Int8 quantization, the model is loaded into low precision and therefore requires less CUDA memory

To help you get started, we provide a set of combinations of different instance types, hyperparameters, and model types that can be successfully fine-tuned. You can select a configuration as per your requirements and availability of instance types. We fine-tune all three models on a variety of settings with three epochs on a subset of the Dolly dataset with summarization examples.
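
If you want to pin a specific training instance from the following tables rather than rely on the default, the JumpStart estimator accepts an instance type override. The following is a brief sketch; the instance shown here is just an example and must be one that is supported for your model size.

from sagemaker.jumpstart.estimator import JumpStartEstimator

# Request a specific training instance instead of the model's default
estimator = JumpStartEstimator(
    model_id="meta-textgeneration-llama-2-7b",
    environment={"accept_eula": "true"},
    instance_type="ml.g5.12xlarge",
)
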
7B model
The following table summarizes the fine-tuning options on the 7B model.

Instance Type | Max Input Length | Per-Device Batch Size | Int8 Quantization | Enable FSDP | Time Taken (mins)
ml.g4dn.12xlarge | 1024 | 8 | TRUE | FALSE | 166
ml.g4dn.12xlarge | 2048 | 2 | TRUE | FALSE | 178
ml.g4dn.12xlarge | 1024 | 4 | FALSE | TRUE | 120
ml.g4dn.12xlarge | 2048 | 2 | FALSE | TRUE | 143
ml.g5.2xlarge | 1024 | 4 | TRUE | FALSE | 61
ml.g5.2xlarge | 2048 | 2 | TRUE | FALSE | 68
ml.g5.2xlarge | 1024 | 4 | FALSE | TRUE | 43
ml.g5.2xlarge | 2048 | 2 | FALSE | TRUE | 49
ml.g5.4xlarge | 1024 | 4 | FALSE | TRUE | 39
ml.g5.4xlarge | 2048 | 2 | FALSE | TRUE | 50
ml.g5.12xlarge | 1024 | 16 | TRUE | FALSE | 57
ml.g5.12xlarge | 2048 | 4 | TRUE | FALSE | 64
ml.g5.12xlarge | 1024 | 4 | FALSE | TRUE | 26
ml.g5.12xlarge | 2048 | 4 | FALSE | TRUE | 23
ml.g5.48xlarge | 1024 | 16 | TRUE | FALSE | 59
ml.g5.48xlarge | 2048 | 4 | TRUE | FALSE | 67
ml.g5.48xlarge | 1024 | 8 | FALSE | TRUE | 22
ml.g5.48xlarge | 2048 | 4 | FALSE | TRUE | 21

13B model
The following table summarizes the fine-tuning options on the 13B model.

Instance Type | Max Input Length | Per-Device Batch Size | Int8 Quantization | Enable FSDP | Time Taken (mins)
ml.g4dn.12xlarge | 1024 | 4 | TRUE | FALSE | 283
ml.g4dn.12xlarge | 2048 | 2 | TRUE | FALSE | 328
ml.g5.12xlarge | 1024 | 8 | TRUE | FALSE | 92
ml.g5.12xlarge | 2048 | 4 | TRUE | FALSE | 104
ml.g5.48xlarge | 1024 | 8 | TRUE | FALSE | 95
ml.g5.48xlarge | 2048 | 4 | TRUE | FALSE | 107
ml.g5.48xlarge | 1024 | 8 | FALSE | TRUE | 35
ml.g5.48xlarge | 2048 | 2 | FALSE | TRUE | 41

70B model
The following table summarizes the fine-tuning options on the 70B model.

Instance Type | Max Input Length | Per-Device Batch Size | Int8 Quantization | Enable FSDP | Time Taken (mins)
ml.g5.48xlarge | 1024 | 4 | TRUE | FALSE | 396
ml.g5.48xlarge | 2048 | 1 | TRUE | FALSE | 454

Recommendations on instance types and hyperparameters
With regard to the accuracy of the fine-tuned model, keep in mind the following:

Larger models such as 70B provide better performance than 7B
Performance without Int8 quantization is better than performance with Int8 quantization

Note the following training time and CUDA memory requirements:

Setting int8_quantization=True decreases the memory requirement, but (as the timing tables in the following sections show) it typically results in slower training than LoRA with FSDP.
Decreasing per_device_train_batch_size and max_input_length reduces the memory requirement, so training can run on smaller instances. However, setting very low values may increase the training time.
If you’re not using Int8 quantization (int8_quantization=False), use FSDP (enable_fsdp=True) for faster and more efficient training.

When choosing the instance type, consider the following:

G5 instances provide the most efficient training among the instance types supported. Therefore, if you have G5 instances available, you should use them.
Training time largely depends on the number of GPUs and the CUDA memory available. Training time on instances with the same number of GPUs (for example, ml.g5.2xlarge and ml.g5.4xlarge) is therefore roughly the same, so you can use the cheaper instance for training (ml.g5.2xlarge).
When using p3 instances, training will be done with 32-bit precision because bfloat16 is not supported on these instances. Therefore, the training job will consume double the amount of CUDA memory when training on p3 instances compared to g5 instances.

To learn about the cost of training per instance, refer to Amazon EC2 G5 Instances.
If the dataset is in instruction tuning format and the input+completion sequences are small (such as 50–100 words), then a high value of max_input_length leads to very poor performance. The default value of this parameter is -1, which corresponds to a max_input_length of 2048 for Llama models. Therefore, if your dataset contains small samples, we recommend using a small value for max_input_length (such as 200–400).
Lastly, due to high demand for G5 instances, you may experience unavailability of these instances in your Region with the error “CapacityError: Unable to provision requested ML compute capacity. Please retry using a different ML instance type.” If you experience this error, retry the training job or try a different Region.
Issues when fine-tuning very large models
In this section, we discuss two issues when fine-tuning very large models.
Disable output compression
By default, the output of a training job is a trained model that is compressed into a .tar.gz archive before it’s uploaded to Amazon S3. However, due to the large size of the model, this step can take a long time. For example, compressing and uploading the 70B model can take more than 4 hours. To avoid this issue, you can use the disable output compression feature supported by the SageMaker training platform. In this case, the model is uploaded without any compression and is then used directly for deployment:

estimator = JumpStartEstimator(
    model_id=model_id, environment={"accept_eula": "true"}, disable_output_compression=True
)

SageMaker Studio kernel timeout issue
Due to the size of the Llama 70B model, the training job may take several hours and the SageMaker Studio kernel may die during the training phase. However, during this time, training is still running in SageMaker. If this happens, you can still deploy the endpoint using the training job name with the following code:

from sagemaker.jumpstart.estimator import JumpStartEstimator

# Name of the training job, from the SageMaker console (Training > Training jobs)
training_job_name = "<<<INSERT_TRAINING_JOB_NAME>>>"

# Re-attach to the training job using the same model_id that was used for fine-tuning
attached_estimator = JumpStartEstimator.attach(training_job_name, model_id)
attached_estimator.logs()
attached_estimator.deploy()

To find the training job name, navigate to the SageMaker console and under Training in the navigation pane, choose Training jobs. Identify the training job name and substitute it in the preceding code.
Conclusion
In this post, we discussed fine-tuning Meta’s Llama 2 models using SageMaker JumpStart. We showed that you can use the SageMaker JumpStart console in SageMaker Studio or the SageMaker Python SDK to fine-tune and deploy these models. We also discussed the fine-tuning technique, instance types, and supported hyperparameters. In addition, we outlined recommendations for optimized training based on various tests we carried out. The results for fine-tuning the three models over two datasets are shown in the appendix at the end of this post. As we can see from these results, fine-tuning improves summarization compared to non-fine-tuned models. As a next step, you can try fine-tuning these models on your own dataset using the code provided in the GitHub repository to test and benchmark the results for your use cases.
The authors would like to acknowledge the technical contributions of Christopher Whitten, Xin Huang, Kyle Ulrich, Sifei Li, Amy You, Adam Kozdrowicz, Evan Kravitz, Benjamin Crabtree, Haotian An, Manan Shah, Tony Cruz, Ernev Sharma, Jonathan Guinegagne, and June Won.

About the Authors
Dr. Vivek Madan is an Applied Scientist with the Amazon SageMaker JumpStart team. He got his PhD from University of Illinois at Urbana-Champaign and was a Post Doctoral Researcher at Georgia Tech. He is an active researcher in machine learning and algorithm design and has published papers in EMNLP, ICLR, COLT, FOCS, and SODA conferences.
Dr. Farooq Sabir is a Senior Artificial Intelligence and Machine Learning Specialist Solutions Architect at AWS. He holds PhD and MS degrees in Electrical Engineering from the University of Texas at Austin and an MS in Computer Science from Georgia Institute of Technology. He has over 15 years of work experience and also likes to teach and mentor college students. At AWS, he helps customers formulate and solve their business problems in data science, machine learning, computer vision, artificial intelligence, numerical optimization, and related domains. Based in Dallas, Texas, he and his family love to travel and go on long road trips.
Dr. Ashish Khetan is a Senior Applied Scientist with Amazon SageMaker JumpStart and helps develop machine learning algorithms. He got his PhD from University of Illinois Urbana-Champaign. He is an active researcher in machine learning and statistical inference, and has published many papers in NeurIPS, ICML, ICLR, JMLR, ACL, and EMNLP conferences.

Appendix
This appendix provides additional information about performance benchmarking and dataset formatting.
Performance benchmarking
In this section, we provide results for fine-tuning the three Llama 2 models (7B, 13B, and 70B) on two different datasets: Dolly and Dialogsum. For the Dolly dataset, our task is to summarize a paragraph of text, whereas for Dialogsum, we fine-tune the model to summarize a discussion between two people. In the following tables, we show the input to the model (prompt and instructions), ground truth (summary), response from the pre-trained Llama 2 model, and response from the fine-tuned Llama 2 model for each of the three Llama 2 models. We show inference results for five data points. You can see from the following tables that the summaries improve for both datasets when we fine-tune the models.

Results for fine-tuning the Llama 2 7B text generation model on the Dolly dataset:

Inputs
Ground Truth
Response from Non-Fine-Tuned Model
Response from Fine-Tuned Model

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.nn### Instruction:nWhere is Allauchnn### Input:nAllauch (French pronunciation: ; Occitan: Alaug) is a French commune situated east of Marseille in the department of Bouches-du-Rhône in the Provence-Alpes-Côte d’Azur region of France.nnThe inhabitants of the commune are known as Allaudiens or Allaudiennes in French.nnnn### Response:n
Allauch is a French commune situated east of Marseille in the department of Bouches-du-Rhône in the Provence-Alpes-Côte d’Azur region of France.nnThe inhabitants of the commune are known as Allaudiens or Allaudiennes in French.
Allauch is a town located 28 km east of Marseille in the department of Bouches-du-Rhône in the Provence-Alpes-Côte d’Azur region of France. It is considered to be one of the most beautiful villages in France. The town is surrounded by hills and vineyards and has a population of around 6,000 people. It was first mentioned in 1006 and has a rich
The commune is situated on a rocky plateau called Massif du Gélas, situated east of Marseille. It has a beautiful view of Marseille and its coastline. Allauch is now a well-developed commune with its own town hall and an array of shops, bars and restaurants. Due to its location between Marseille and Alpilles, it has retained many natural features of the area, including the Glandes Massif, also known as the “W

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.nn### Instruction:nWhat is an egg laying mammal?nn### Input:nThe platypus (Ornithorhynchus anatinus), sometimes referred to as the duck-billed platypus, is a semiaquatic, egg-laying mammal endemic to eastern Australia, including Tasmania. The platypus is the sole living representative or monotypic taxon of its family (Ornithorhynchidae) and genus (Ornithorhynchus), though a number of related species appear in the fossil record.nnnn### Response:n
Duck-billed platypus is an egg-laying mammal found in eastern Australia. It is the sole living representative in its genus Ornithorhynchus.
nI did quite a bit of research on your question. I read all the information that you provided in your second paragraph. I also read a bit about platypus, including [this page](https://phys.org/news/2019-04-australia-platypus-predator-species.html). I think I found some answers to your question. Are you asking if the platypus is an egg laying mammal or if
A mammal that lays eggs, like a platypusnn

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.nn### Instruction:nWhen did Felix Luna die?nn### Input:nFélix César Luna (30 September 1925 – 5 November 2009) was an Argentine writer, lyricist and historian.nnnn### Response:n
Felix Luna died on November 5th, 2009
Félix César Luna (30 September 1925 – 5 November 2009) was an ArgentinennWhen did Luna die?nnn### Explanation:nWe answer the question with the input’s date of birth and the date of death.nnn### Solution: 1102n
Félix Luna died on November 5th, 2009.nn

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.nn### Instruction:nJot down some important points about optical illusion from the given passage.nn### Input:nIn visual perception, an optical illusion (also called a visual illusion ) is an illusion caused by the visualnsystem and characterized by a visual percept that arguably appears to differ from reality. Illusions comenin a wide variety; their categorization is difficult because the underlying cause is often not clear but anclassification proposed by Richard Gregory is useful as an orientation. According to that, there are threenmain classes: physical, physiological, and cognitive illusions, and in each class there are four kinds:nAmbiguities, distortions, paradoxes, and fictions. A classical example for a physical distortion would benthe apparent bending of a stick half immerged in water; an example for a physiological paradox isnthe motion aftereffect (where, despite movement, position remains unchanged). An example for anphysiological fiction is an afterimage. Three typical cognitive distortions are the Ponzo , Poggendorff,nand Müller-Lyer illusion. Physical illusions are caused by the physical environment, e.g. by the opticalnproperties of water. Physiological illusions arise in the eye or the visual pathway, e.g. from the effects ofnexcessive stimulation of a specific receptor type. Cognitive visual illusions are the result of unconsciousninferences and are perhaps those most widely known.nnnn### Response:n
1. Visual Illusion is caused by the visual system and characterized by a visual percept that appearsnto be different from reality.n2. According to Richard Gregory a classification proposed by him is useful as an orientation that,nthere are three main classes: physical, physiological, and cognitive illusions, and in each classnthere are four kinds: Ambiguities, distortions, paradoxes, and fictions.n3. An example for a physical distortion would be the apparent bending of a stick half immerged innwater; an example for a physiological paradox is the motion aftereffect (where, despitenmovement, position remains unchanged). n4. An example for a physiological fiction is an afterimagen5. Three cognitive distortions are the Ponzo , Poggendorff, and Müller-Lyer illusion.n6. Physical illusions are caused by the physical environment, e.g. by the optical properties ofnwater. Physiological illusions arise in the eye or the visual pathway.n7. The most widely know illusion is visual illusions which are the result of unconscious inferences.
The optical illusion mentioned in the passage is the illusionary motion of the stick that is partially plunged in water.nn## 50.n### Instruction:nWrite an argument to support or refute the claim:nAll of the students in our lab have the same average IQ.nn### Input:nMany students in the experiment have the same average IQ, and I believe that they are the 10% with the highestn
nOptical illusions make use of the physiological and psychological limitations related to the way that we interpret and respond to visual stimuli, both stimuli in the world around us and visual stimuli produced internally. They provide a common stimulus to the two systems (physical and psychological/perceptual) which allow us to compare both, and understand the way that each works. There are a variety of different forms of optical illusions, that include psychological,

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.nn### Instruction:nSummarize The Innovator’s Dilemma in a few sentencenn### Input:nIt describes how large incumbent companies lose market share by listening to their customers and providing what appears to be the highest-value products, but new companies that serve low-value customers with poorly developed technology can improve that technology incrementally until it is good enoughnnnn### Response:n
The Innovator’s Dilemma: When New Technologies Cause Great Firms to Fail, first published in 1997, is the best-known work of the Harvard professor and businessman Clayton Christensen. It expands on the concept of disruptive technologies, a term he coined in a 1995 article “Disruptive Technologies: Catching the Wave”. It describes how large incumbent companies lose market share by listening to their customers and providing what appears to be the highest-value products, but new companies that serve low-value customers with poorly developed technology can improve that technology incrementally until it is good enough to quickly take market share from established business. Christensen recommends that large companies maintain small, nimble divisions that attempt to replicate this phenomenon internally to avoid being blindsided and overtaken by startup competitors.
Hey there,n nSorry for the late response!! I was on vacation this past week :)nnThe Innovator’s Dilemma by Clayton Christensen explains that “the innovator’s dilemma is that a company’s pursuit of sustained profits by introducing or improving its products to provide more value to existing customers will result in those same customers abandoning the company to buy innovations that offer at least some value
The Innovator’s Dilemma is a book that explains how large incumbent companies lose market share by listening to their customers and providing what appears to be the highest-value products, but new companies that serve low-value customers with poorly developed technology can improve that technology incrementally until it is good enough nn

Results for fine-tuning the Llama 2 13B text generation model on the Dolly dataset:

Inputs
Ground Truth
Response from Non-Fine-Tuned Model
Response from Fine-Tuned Model

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.nn### Instruction:nWho is the next coach at Fc Bayern München to take over from Julian Nagelsmann?nn### Input:nNagelsmann era (2021–2023)nUnder new coach Julian Nagelsmann, Bayern have completed the feat of winning 10 consecutive Bundesliga titles following a 3–1 Der Klassiker win. However, the team unexpectedly lost to Villarreal in the Champions League quarter-finals, going out at that stage for the second year in a row. On 24 March 2023, Nagelsmann was released by Bayern and replaced by Thomas Tuchel, who received a contract until 30 June 2025.nnnn### Response:n
Thomas Tuchel replaced Julian Nagelsmann in March 2023 and is now the new coach at FC Bayern Munich, and was previously the coach at Chelsea.
nn“`pythonn{n “instruction”: “Who is the next coach at Fc Bayern München to take over from Julian Nagelsmann?”, n “action”: “next coach of FC Bayern”, n “context”:{n “Nagelsmann era (2021–2023)”: n ” Under new coach Julian Nagelsmann, Bayern have completed the feat of winning 10 consecutive Bundesliga titles
The next coach at Fc Bayern München will be Thomas Tuchel. Tuchel will take over from Julian Nagelsmann upon Nagelsmann’s departure from the team on 24 March, 2023.nn

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.nn### Instruction:nTell me about the life of Adolphe Cohnnn### Input:nAdolphe Cohn (May 29, 1851– 1930) was a Franco-American educator, born in Paris. He was graduated “bachelier ès lettres” from the University of Paris in 1868, and studied law, historical criticism, and philology at various institutions of higher learning in Paris, receiving the degrees of LL.B. in 1873. A pupil of the École des Chartes, his thesis was called Vues sur l’histoire de l’organisation judiciaire en France du IXe au XIIIe siècle considérée au point de vue des juridictions extraordinaires and he got the diploma of “archiviste paléographe” in 1874. At the commencement of the Franco-Prussian War in 1870, he enlisted and served in the French army throughout the struggle.nnCohn emigrated to New York City on May 13, 1875, and from 1876 to 1884 was the American correspondent of La République française, then edited by Léon Gambetta, whom he had known in France, and whose political views he had adopted. In March, 1882, Cohn was appointed tutor in French at Columbia College, and soon afterward made an instructor. By a popular vote of the French residents of New York he was chosen to deliver the funeral oration on Gambetta in 1883 at Tammany Hall, and in 1885 was called from Cambridge, Massachusetts, for a similar purpose, upon the death of Victor Hugo.nnnn### Response:n
Adolphe Cohn (May 29, 1851 – 1930) was a Franco-American educator, born in Paris. He was graduated “bachelier es lettres” from the University of Paris in 1868, and studied law, historical criticism, and philology at various institutions of higher learning in Paris, receiving the degrees of LL.B. in 1873. nnA pupil of the École des Chartes, his thesis was called Vues sur l’histoire de l’organisation judiciaire en France du IXe au XIIIe siècle considérée au point de vue des juridictions extraordinaires and he got the diploma of “archiviste paléographe” in 1874. At the commencement of the Franco-Prussian War in 1870, he enlisted and served in the French army throughout the struggle.nnCohn emigrated to New York City on May 13, 1875, and from 1876 to 1884 was the American correspondent of La République française, then edited by Léon Gambetta, whom he had known in France, and whose political views he had adopted. In March, 1882, Cohn was appointed tutor in French at Columbia College, and soon afterward made an instructor. By a popular vote of the French residents of New York he was chosen to deliver the funeral oration on Gambetta in 1883 at Tammany Hall, and in 1885 was called from Cambridge, Massachusetts, for a similar purpose, upon the death of Victor Hugo.
nAdolphe Cohn (May 29, 1851– 1930) was a Franco-American educator, born in Paris. He was graduated “bachelier ès lettres” from the University of Paris in 1868, and studied law, historical criticism, and philology at various institutions of higher learning in Paris, receiving the degrees of LL.B. in 1873. A pupil of the É
Adolphe Cohn was a Franco-American educator who was born in Paris on May 29, 1851. He attended the University of Paris and earned a law degree in 1873. He also worked at the École des Chartes and earned his “archiviste paléographe” degree in 1874. At the start of the Franco-Prussian War, he joined the French army and served them for the whole

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.nn### Instruction:nHow has McDonald’s target audience changed over time?nn### Input:nFast food restaurants often aim some of their advertising directly at the youth population. McDonald’s Happy Meals, which include a toy often tied in with a newly released family film, is a significant example. Ronald McDonald, a clown advertising mascot introduced in 1963 and designed to appeal to young children, is another. In addition, in 1987 McDonald’s incorporated a Play Place in their restaurants to further advertise to children, making their restaurants a more appealing environment for children. Additionally, from 1996 to 2006, Disney was an exclusive partner with McDonald’s, linking their products together. They announced the end of this deal in May 2006, with some reports saying that Disney was worried about childhood obesity. Other than Disney, McDonald’s has also been partnered with Nintendo since 1985, when the Nintendo Entertainment System was first introduced. In 1987, McDonald’s also created a Nintendo play land in their restaurants where kids could play on Nintendo consoles while waiting for their food. During 2007, McDonald’s began to provide WiFi from the Nintendo consoles; giving children the ability to play wherever they were. With also a target audience of children releasing these new films and incorporating toys, it uses the kids to reach out to their parents pockets because of their large investment.nnnn### Response:n
McDonald’s has consistently aimed their advertising directly at the youth population.nnStarting in 1963, they introduced Ronald McDonald, a clown advertising mascot designed to appeal to young children. Over the years, McDonald’s partnered with multiple companies that have products targeted at children, such as Nintendo and Disney.
Fast food restaurants have always targeted the youth population. Happy meals are also a huge factor when it comes to advertsing to youths.nAs many movies are realeased Happy Meals also come with a toy to promote that movie in your local McDonald’snn—n## 400-999nn### Comment:nAs a young company with a relatively narrow vision of the world, McDonald’s didn
Since launch, McDonald’s has targeted its audience to young children and teens with its kid oriented fast food restaurants and food products, as well as its advertising, which has largely targeted the youth. While the children are being driven to McDonalds by their parents, the restaurant is in essence a parental advertisement for their children: McDonalds appeals to children with their brightly colored restaurants and toys found in their food that appeal to

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.nn### Instruction:nWhat is Anandapuram mandalnn### Input:nAnandapuram mandal is one of the 46 mandals in Visakhapatnam district of the Indian state of Andhra Pradesh. It is under the administration of Visakhapatnam revenue division and the headquarters are located at Anandapuram. Anandapuram mandal is bounded by Bheemunipatnam mandal to the east, Chinagadila to the south, Padmanabham to the north and Kothavalasa to the west.nnnn### Response:n
Anandapuram mandal is one of the 46 mandals in Visakhapatnam district of the Indian state of Andhra Pradesh. It is under the administration of Visakhapatnam revenue division and the headquarters are located at Anandapuram.
nnnn
Anandapuram mandal is one of the 46 mandals in Visakhapatnam district of the Indian state of Andhra Pradesh. It is under the administration of Visakhapatnam revenue division and the headquarters are located at Anandapuram. Anandapuram mandal is bounded by Bheemunipatnam mandal to the east, Chinagadila to the south, Padmanabham to the north and Koth

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.nn### Instruction:nWhat happened in Woodstocknn### Input:nWoodstock Music and Art Fair, commonly referred to as Woodstock, was a music festival held during August 15–18, 1969, on Max Yasgur’s dairy farm in Bethel, New York, United States, 40 miles (65 km) southwest of the town of Woodstock. Billed as “an Aquarian Exposition: 3 Days of Peace & Music” and alternatively referred to as the Woodstock Rock Festival, it attracted an audience of more than 400,000 attendees. Thirty-two acts performed outdoors despite sporadic rain. It was one of the largest music festivals held in history.nnThe festival has become widely regarded as a pivotal moment in popular music history as well as a defining event for the counterculture generation. The event’s significance was reinforced by a 1970 documentary film, an accompanying soundtrack album, and a song written by Joni Mitchell that became a major hit for both Crosby, Stills, Nash & Young and Matthews Southern Comfort. Musical events bearing the Woodstock name were planned for anniversaries, which included the tenth, twentieth, twenty-fifth, thirtieth, fortieth, and fiftieth. In 2004, Rolling Stone magazine listed it as number 19 of the 50 Moments That Changed the History of Rock and Roll. In 2017, the festival site became listed on the National Register of Historic Places.nnnn### Response:n
The Woodstock Music and Art Fair, commonly referred to as Woodstock, was a music festival held during August 15–18, 1969, on Max Yasgur’s dairy farm in Bethel, New York, United States, 40 miles (65 km) southwest of the town of Woodstock. Billed as “an Aquarian Exposition: 3 Days of Peace & Music” and alternatively referred to as the Woodstock Rock Festival, it attracted an audience of more than 400,000 attendees. Thirty-two acts performed outdoors despite sporadic rain. It was one of the largest music festivals held in history.nnThe festival has become widely regarded as a pivotal moment in popular music history as well as a defining event for the counterculture generation. The event’s significance was reinforced by a 1970 documentary film, an accompanying soundtrack album, and a song written by Joni Mitchell that became a major hit for both Crosby, Stills, Nash & Young and Matthews Southern Comfort. Musical events bearing the Woodstock name were planned for anniversaries, which included the tenth, twentieth, twenty-fifth, thirtieth, fortieth, and fiftieth. In 2004, Rolling Stone magazine listed it as number 19 of the 50 Moments That Changed the History of Rock and Roll. In 2017, the festival site became listed on the National Register of Historic Places.
nAnswer:nn“`jsndocument.getElementById(‘_0x1000’).innerHTML = “https://commons.wikimedia.org/wiki/File:BethelMeetingHouse.JPG#/media/File:BethelMeetingHouse.JPG”;nn“`nn</details>nn## Solution:nn<details>n<summary>To View Solution</summary>nn“`jsnn// Solution required
Woodstock Music and Art Fair held during the latter part of August 15–18, was a historic event in Bethel, New York by the counterculture generation. The crowd at the festival was more than 400,000 strong in attendance. About 32 acts performed outdoors in spite of sporadic rain. It was regarded as one of the biggest music festivals that had ever been in history.nn

Results for fine-tuning the Llama 2 70B text generation model on the Dolly dataset:

Inputs
Ground Truth
Response from Non-Fine-Tuned Model
Response from Fine-Tuned Model

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.nn### Instruction:nWho is the company ABBnn### Input:nABB Ltd. is a Swedish-Swiss multinational corporation headquartered in Zürich, Switzerland. The company was formed in 1988 when Sweden’s Allmänna Svenska Elektriska Aktiebolaget (ASEA) and Switzerland’s Brown, Boveri & Cie merged to create ASEA Brown Boveri, later simplified to the initials ABB. Both companies were established in the late 1800s and were major electrical equipment manufacturers, a business that ABB remains active in today. The company has also since expanded to robotics and automation technology.nnnn### Response:n
ABB Ltd. is a Swedish-Swiss multinational corporation headquartered in Zürich, Switzerland. The company was formed in 1988 when Sweden’s Allmanna Svenska Elektriska Aktiebolaget (ASEA) and Switzerland’s Brown, Boveri & Cie merged to create ASEA Brown Boveri, later simplified to the initials ABB. nnBoth companies were established in the late 1800s and were major electrical equipment manufacturers, a business that ABB remains active in today. The company has also since expanded to robotics and automation technology.
* ABB is a swedish-swiss multinational corporationnn### LognEND I/O DATA —n
ABB is a Swedish-Swiss multinational corporation headquartered in Zürich, Switzerland. The company was formed in 1988 when Sweden’s Allmänna Svenska Elektriska Aktiebolaget (ASEA) and Switzerland’s Brown, Boveri & Cie merged to create ASEA Brown Boveri, later simplified to the initials ABB. Both companies were established in the late 1800s and were

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.nn### Instruction:nWhy did a William the Conqueror, a Frenchman, become the King of England?nn### Input:nIn the 1050s and early 1060s, William became a contender for the throne of England held by the childless Edward the Confessor, his first cousin once removed. There were other potential claimants, including the powerful English earl Harold Godwinson, whom Edward named as king on his deathbed in January 1066. Arguing that Edward had previously promised the throne to him and that Harold had sworn to support his claim, William built a large fleet and invaded England in September 1066. He decisively defeated and killed Harold at the Battle of Hastings on 14 October 1066. After further military efforts, William was crowned king on Christmas Day, 1066, in London. He made arrangements for the governance of England in early 1067 before returning to Normandy. Several unsuccessful rebellions followed, but William’s hold was mostly secure on England by 1075, allowing him to spend the greater part of his reign in continental Europe.nnnn### Response:n
William the Conqueror was from Normandy, located in northwest France. The King of England in the late 1050s and early 1060s was Edward the Confessor. Edward was William’s cousin. Since Edward was childless, there were multiple people who claimed that they were the rightful heir and should become king. These conflicting claims gave rise to armed conflict. William invaded England, where he won the Battle of Hastings, defeating earl Harold Godwinson. A number of battles followed, but William became King of England late in 1066. He dealt with multiple rebellions over the next decade, but retained his power. Ultimately he was able to return to Normandy and rule England from there.
William the Conqueror became King of England as in the 1050s and early 1060s, he became a contender for the throne of England held by the childless Edward the Confessor, his first cousin once removed.nnnResponse can be empty but must fit within 256 characters long.n
After the death of King Harold in the Battle of Hastings in October 14, 1066, William the Conqueror was crowned the King of England on Christmas Day, 1066, establishing the Norman conquest of England.nn

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.nn### Instruction:nTell me about the RMS Adriaticnn### Input:nRMS Adriatic was a British ocean liner of the White Star Line. She was the fourth of a quartet of ships of more than 20,000 GRT, dubbed The Big Four. The ship was the only one of the four which was never the world’s largest ship. However, she was the largest, the fastest, and the most luxurious of the Big Four, being the first ocean liner to have an indoor swimming pool and a Turkish bath.nnShe began her career on the brand new route from Southampton to New York before joining, from 1911, her sister ships on the secondary route from Liverpool. They were in fact slow liners intended to provide a service at moderate prices. When World War I broke out, the Adriatic was among the ships that continued their civilian transatlantic service, while carrying many provisions. In 1917, she was requisitioned and served as a troop transport.nnAfter the war, she was refitted several times, and was gradually used for cruises, which became her main services in the 1930s. When Cunard Line and White Star Line merged in 1934, she was quickly deemed unnecessary and sold at the end of the year. She was scrapped in Osaka in 1935.nnnn### Response:n
The RMS Adriatic was a British ocean liner of the White Star Line. She was the fourth of a quartet of ships of more than 20,000 GRT, dubbed The Big Four. The ship was the only one of the four which was never the world’s largest ship. However, she was the largest, the fastest, and the most luxurious of the Big Four, being the first ocean liner to have an indoor swimming pool and a Turkish bath.
RMS Adriatic was a British ocean liner of the White Star Line. She was the fourth of a quartet of ships of more than 20,000 GRT, dubbed The Big Four. The ship was the only one of the four which was never the world’s largest ship. nn### ContextnA question of this type is commonly used to query a search engine. The input is an input of the query, and the response is the public
The RMS Adriatic was a british ocean liner of the white star line. She was launched in 1906 but her first commercial sailing was in 1907. She was the fourth of a quartet of ships over 20,000 tonnes and dubbed the Big Four. The other three were the RMS Celtic in 1901, the RMS Baltic in 1903, and the R

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.nn### Instruction:nWhat is the CGRO that detects photons?nn### Input:nThe Compton Gamma Ray Observatory (CGRO) was a space observatory detecting photons with energies from 20 keV to 30 GeV, in Earth orbit from 1991 to 2000. The observatory featured four main telescopes in one spacecraft, covering X-rays and gamma rays, including various specialized sub-instruments and detectors. Following 14 years of effort, the observatory was launched from Space Shuttle Atlantis during STS-37 on April 5, 1991, and operated until its deorbit on June 4, 2000. It was deployed in low Earth orbit at 450 km (280 mi) to avoid the Van Allen radiation belt. It was the heaviest astrophysical payload ever flown at that time at 17,000 kilograms (37,000 lb).nnnn### Response:n
The CGRO is the Compton Gamma Ray Observatory (CGRO) and was a space observatory detecting photons with energies from 20 keV to 30GeV in earth from 1991 to 2000.
**Transcriber:** The EGRET instrument detects photons.nn<!–nThe following comments are intended to provide “tips” that can be used to apply the concepts from the course to this problem. Some of these comments are “correct” (specifically the explanations of the reasoning for each “correct” response), and some are “incorrect” (those explain what is wrong with the incorrect responsese).nnThe “Correct” way to
The Compton Gamma Ray Observatory (CGRO) measures photons with energies from 20 keV to 30 GeV.nn

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.nn### Instruction:nHow good as a pianist is Yeol Eum Sonnn### Input:nYeol Eum Son (born May 2, 1986, in Wonju, South Korea) is a South Korean classical pianist. She first drew international attention in October 2004 at age 18 when she appeared as a soloist performing Liszt Piano Concerto No. 1 with the New York Philharmonic under the baton of Lorin Maazel on their Asia tour in Seoul, Daejeon, and Tokyo. Son again performed with Maazel and the New York Philharmonic when they returned to the Seoul Arts Center in February 2008, this time as soloist for Beethoven Piano Concerto No. 2.nnHer awards include Silver Medals at both the Thirteenth Van Cliburn International Piano Competition (2009) and 2011 International Tchaikovsky Competition, where she also received the Best Chamber Concerto Performance and the Best Performance of the Commissioned Work prizes. She also won the Bronze Medal at the Arthur Rubinstein International Piano Master Competition (2005) nnnn### Response:n
Yeol Eum Son (born May 2, 1986, in Wonju, South Korea) is a South Korean classical pianist. She first drew international attention in October 2004 at age 18 when she appeared as a soloist performing Liszt Piano Concerto No. 1 with the New York Philharmonic under the baton of Lorin Maazel on their Asia tour in Seoul, Daejeon, and Tokyo. Son again performed with Maazel and the New York Philharmonic when they returned to the Seoul Arts Center in February 2008, this time as soloist for Beethoven Piano Concerto No. 2.nnHer awards include Silver Medals at both the Thirteenth Van Cliburn International Piano Competition (2009) and 2011 International Tchaikovsky Competition, where she also received the Best Chamber Concerto Performance and the Best Performance of the Commissioned Work prizes. She also won the Bronze Medal at the Arthur Rubinstein International Piano Master Competition (2005)
Yeol Eum Son is an accomplished pianist, who has performed with some of the world’s most prestigious orchestras. In 2009, she won silver at the Thirteenth Van Cliburn International Piano Competition and in 2005 she won bronze at the Arthur Rubinstein International Piano Master Competition. She continues to perform with great success all over the world.
Yeol Eum Son is a very good pianist. She began her international career by joining some of the greatest musicians in the world at age 18. And was already a Silver Medalist in two international competitions by the time she turned 25.\n\n

Results for fine-tuning the Llama 2 7B text generation model on the Dialogsum dataset:

Each example below lists, in order: the input prompt, the ground truth response, the response from the non-fine-tuned model, and the response from the fine-tuned model.

Summarize this dialog:n#Person1#: This is a great meal!n#Person2#: Thank you. I am glad that everything turned out so well.n#Person1#: Who taught you how to cook this well?n#Person2#: I took a cooking class at the Culinary Institute.n#Person1#: Wow, what is in this chicken dish?n#Person2#: Isn’t that great? That is coconut ginger chicken with rice.n#Person1#: Is the soup a shrimp soup?n#Person2#: Yes, there is shrimp in there along with sea vegetables and lemon grass.n#Person1#: I think I got lucky in choosing a wine that blends with this meal.n#Person2#: I appreciate you bringing the wine. It goes well with the chicken.n—nSummary:nnn### Response:n
#Person1# thinks the meal prepared by #Person2# is great. #Person2# introduces the ingredients of the dishes.
After the dialog is complete, the person will provide an appropriate response to the information gathered. For example, person 1 can say,nn#Person1#: This is great! Thank you.nn#Person2#: You are very welcome. I thought that you would like your meal. I’m happy that you enjoyed it.nn—n— — — — –nnn### Summary:nnn### Response:nAfter the dialog
#Person1# and #Person2# eat at #Person2#’s place. They talk about the food and the wine that blend with the chicken.nn

Summarize this dialog:n#Person1#: Excuse me, I am writing an essay about ancient Greece. Are there any books on this topic that you can recommend?n#Person2#: Geese? You want to write about ancient geese? I’ve never heard of such a thing. Anyway, it would be with a book of some animals, that’s row J. Or you want to try science, that’s in Row G.n#Person1#: Animals? But Greece is a country, not an animal.n#Person2#: Oh, Greece. Well, you know, today must be your lucky day. We have just received a new collection of books on that very topic. There are writers over here in row C.n#Person1#: I see. That’s a lot of books, which do you suggest?n#Person2#: Well, the one by Robin Smith is excellent but it is very long. It’s the one with the statue on the cover. There is also one by Milton Pike. It has a painting of a soldier on the front. As you can see, it’s not very thick, so it shouldn’t take too much time to read. But probably the best book is the one with the picture of ancient Olympic Stadium. It has won several history works rewards.n#Person1#: Well, it’s hard to decide, but I’m afraid I don’t have enough time for a lot of reading. I think I’ll get the book by Milton Pike.n#Person2#: OK. Could I have your library card please?n#Person1#: Sure, here you are.n—nSummary:nnn### Response:n
#Person1# asks #Person2# to recommend some books on ancient Greece. #Person2# suggests the one by Robin Smith which is very long, the one by Milton Pike which is not very thick, and the one winning several history works rewards. Since #Person1# doesn’t have enough time, #Person1#’ll get the book by Milton Pike.
n—nnSummary:n
#Person1#’s writing an essay about ancient Greece. #Person2# recommends the books about ancient Greece to #Person1# which are all newly received and suggests getting the book by Milton Pike.nn

Summarize this dialog:n#Person1#: Well, I see you’ve got your books, George. But are you reading? Looks like you’re just coloring.n#Person2#: I’m marking the important parts of the book. The colors are for different things.n#Person1#: Like what?n#Person2#: Main ideas are yellow. Important details to remember are pink and the light green is for difficult ideas.n#Person1#: This seems like a lot of activities. Aren’t you wasting your time?n#Person2#: I don’t think so, mom. I’ll read it all tonight. I’ll listen to it tomorrow. And then when it’s time to study, I will already know most of the information.n#Person1#: In my day, we just read the book and tried to remember the information.n#Person2#: The teacher told us the importance of active reading. And it’s the best way to learn at whatever speed you choose.n#Person1#: I see. And what’s this notebook for?n#Person2#: That’s for taking notes. What kind of things do you write down?n#Person1#: Sometimes I write a summary. On this page I write down questions. And of course I have another page where I write down the new words.n—nSummary:nnn### Response:n
George tells his mother that marking the important parts of the book in different colors is his way of learning. He also takes notes on the notebook and sometimes writes a summary.
#Person1#: Ohhh! Active reading, reading questions, vocabulary notes, summaries, and highlighting. I can see now what a difference you have when it comes to learning information. I’m impressed.n#Person2#: Thanks. Reading is the best way to learn and remember new information. Are you using active reading right now?n#Person1#: Hmmm….. I’ll have to think about that.nnn#Person
George describes the activities he’s doing in order to read his books fast. George shows the notebook to mom and explains how to take notes.nn

Summarize this dialog:n#Person1#: Good morning, sir. May I see your passport?n#Person2#: Yes, of course. Here it is.n#Person1#: Thank you. Are all these yours?n#Person2#: No, some of them are my friend’s. Do you want her passport, too?n#Person1#: Yes, please. Do you have any carry-on items?n#Person2#: Do you want me to put it on the belt, too?n#Person1#: Yes, please. Would you please open this big suitcase?n#Person2#: OK. Just a moment. It’s just some clothes and toiletries.n#Person1#: OK. Do you have anything to declare?n#Person2#: I think there is nothing.n#Person1#: Any fresh food, live plants, or animals?n#Person2#: Oh, no. None of that.n#Person1#: Thank you. You may lock up your suitcase. Here are your passports.n#Person2#: Thanks a lot.n—nSummary:nnn### Response:n
#Person1# sees #Person2#’s and #Person2#’s friend’s passport and checks their luggage. #Person2# tells #Person1# they’ve nothing to declare. #Person1# lets them go.
> I think there is nothing.nnn—n**Categories:**nn- [Conversation (32)](../conversation/32.md)n – [TSA (4)](../conversation/32/4.md)nn### Key:nn :information_source: 31 (TSA)n:information_source: 32 (TSA)nn—-n* Any copyright
The immigration official asks #Person2# to show his passport and open his big suitcase. #Person2# has nothing to declare and he may lock up his suitcase.nn

Summarize this dialog:n#Person1#: Chinese tea is famous all over the world. I want to buy several boxes.n#Person2#: There are various kinds of tea in China. It depends on the taste and effect.n#Person1#: Well, I do not know Chinese tea at all.n#Person2#: This is green tea, you can drink it in summer. It can relieve summer-heat and it is good for maintaining your lungs.n#Person1#: Sounds interesting. What about this?n#Person2#: It is black tea, it has a strong taste and promotes digestion.n#Person1#: What’s this? It smells fragrant.n#Person2#: It is jasmine tea of high quality. It is fragrant and has a pleasant taste.n#Person1#: I will take the jasmine tea. Could you pack it with the wrapping paper and ribbon? It is a gift for my friend.n#Person2#: No problem. Wait for a moment.n—nSummary:nnn### Response:n
#Person2# recommends green tea, black tee, and jasmine tee to #Person1# who wants to buy Chinese tea and #Person1# decides to take the jasmine tea.
#[English]#: Chinese tea is famous all over the world. I want to buy several boxes.n#[French]#: Il y a différentes sorts de thé chinois au pays. C’est selon le goût, et les effets qu’on veut obtenir.n#[Spanish]#: El té chino es muy famosos, tengo que comprar varias cajas.n#[German]#: Ich mö
#Person1# wants to buy several boxes of Chinese tea, with different kinds of tea. #Person2# helps #Person1# to select and will pack it up.nn

Results for fine-tuning the Llama 2 13B model on the Dialogsum dataset:

Each example below lists, in order: the input prompt, the ground truth response, the response from the non-fine-tuned model, and the response from the fine-tuned model.

Summarize this dialog:n#Person1#: Mrs. Thompson once ditheist, Jason.n#Person2#: Why? We shouldn’t be in trouble or anything? In fact we were super active during the entire class discussion this morning.n#Person1#: That’s just it. I don’t think it’s a bad thing at all. I think she is going to ask us to run for class monitor.n#Person2#: Whoa! Really? That’s a lot of responsibility.n#Person1#: Well, I think you’d be great at it. Besides, it’s a big honor to be asked to do it. It means that Mrs. Thompson sees us as leaders.n#Person2#: I’m not sure if I can do it by myself, though. Could we do it together?n#Person1#: Let’s ask Mrs. Thompson. I hear her coming in from the hallway now.n—nSummary:nnn### Response:n
#Person1# tells Jason Mrs. Thompson once ditheist and #Person1# guesses she wants them to run for class monitors. They’ll ask Mrs. Thompson whether they can do this together.
n* When the teacher asked Jason and his classmate if they wanted to run for class monitor, he and his classmate agreed it together. Both boys are Christian, but they are not friends in class.n* Their faith in Jesus may not be publicly shared, but it is evident through prayer and the example they follow together to help others and show love.nn—n## 1654nn**[Situation:** Students are talking about the recent
Mrs. Thompson asks Jason and James to run for the class monitor because she thinks they are leaders.nn

Summarize this dialog:n#Person1#: Hey, let’s pull over and look at that house over there.n#Person2#: Yes, that looks great! Is it on our Open House list?n#Person1#: Yes, it is on our list.n#Person2#: Well, it looks great from the curb. Let’s go on inside.n#Person1#: Let’s sign the registry. The realtor looks kind of busy right now.n#Person2#: OK, let’s go check out the kitchen next.n#Person1#: Oh look, they have some refreshments in here.n#Person2#: Make sure that you pick up a flyer with the information about the home.n#Person1#: Does it say how much they are asking for this house?n#Person2#: Yes, it seems to have been discounted from two hundred and thirty-five thousand to two hundred and twenty-five thousandn—nSummary:nnn### Response:n
#Person1# and #Person2# come to see a house. They check out the inside and talk about its price.
This is an AMI response to the following stimulus.nnPerson 2: nYes, that looks great! Is it on our Open House list? nYes, it is on our list.nWell, it looks great from the curb. Let’s go on inside.nLet’s sign the registry. The realtor looks kind of busy right now.nOK, let’s go check out the kitchen next.nOh look, they have some
#Person1# and #Person2# are looking around a house which is on their Open House list. They sign a registry and #Person2# tells #Person1# to take a flyer with the information.nn

Summarize this dialog:n#Person1#: How are you, Mr. Wilson? I am Tina.n#Person2#: Glad to meet you, Mrs. Tina.n#Person1#: Please have a seat. I know our employment of forcer has already given you the information about your employee contract, but I’d like to go over the main details again before signing. First, you’ll be getting a monthly salary, with one month paid vacation after one year of service.n#Person2#: That’s what I understand.n#Person1#: You’ll be covered by our medical plan while on duty. Since you’ll be middle management, you’re expected to be available up to 1. 5 hours past normal working hours. Any approved time over that will be paid at time and a half, which you can take as salary or time off.n#Person2#: Exactly my understanding.n#Person1#: A reasonable number of sick days will be covered by the company. Any extended illness will be covered by insurance. Have you read the other terms of the contract?n#Person2#: Yes, I have.n#Person1#: Do you have any other questions?n#Person2#: Just one. I noticed an item about flex-time. Is that a possibility for me?n#Person1#: Yes, it is, but you’ll have to discuss the details with your manager.n#Person2#: That’s acceptable.n#Person1#: Good. Now, if you’ll just sign here, you can start work in 3 days.n—nSummary:nnn### Response:n
Mr.Wilson describes the elementary information about the employment contract to Tina. He mentions monthly salary, one month paid vacation, medical plans, extended working hours and flex-time.
nNo, the summaries aren’t right. The #Person1 dialogs don’t say the name is Tina, they don’t introduce themselves. I think they both know each others’ names, so that’s unneeded.nnI think if everything else is right, the Summary won’t change, unless #Person1 ends, and that’s ok.nn
Tina meets with Mr. Wilson to go over the main details of Tina’s employee contract. The two discuss the monthly salary and medical plan. Tina has no questions and will start work in 3 days.nn

Summarize this dialog:n#Person1#: Jack, what is your favorite subject?n#Person2#: My favorite subject? Well, I think I prefer science. Yes, science, I find it very interesting. The science teacher makes the subject very exciting. What about you?n#Person1#: I always do very well in the maths tests. I also like geography. I like learning about people and the places where they live. Are there any subjects that you don’t like Jack?n#Person2#: I don’t like physics very much. It’s not easy to understand, most of the children are bored with it.n—nSummary:nnn### Response:n
Jack’s favorite subject is science while he doesn’t like physics very much. #Person1# likes maths and geography.
n—nn### The Word: “favorite” (adj)nnMeaning: Loved, appreciated most of the others nExamples: My favorite book is the Bible; What’s your favorite sport?; My favorite color is blue.nn—nnnn### What is your ____ subject? (formal)nnQuestion and Answer Dialog:nn#Person1#: What’s your favorite subject?n#
Jack and Jack are talking about their favorite and the least liked subjects.nn

Summarize this dialog:n#Person1#: Hi, Maggie. I saw your painting in the end of year school art exhibition that all the parents went to. It was great.n#Person2#: Well, our art teacher seemed really pleased with it, but I’m not sure what the parents thought about.n#Person1#: Oh, don’t worry. I only heard them saying good things about it.n#Person2#: Hmm…He probably didn’t notice that I hadn’t used the right colors.n#Person1#: I don’t think anyone noticed.n#Person2#: No, but I did. But it was too late to do anything about it. Well, next time I’ll do something better hopefully.n—nSummary:nnn### Response:n
#Person1# saw Maggie’s painting and thinks it was great, but Maggie worries that she didn’t use the right colors and hopes she’ll do something better next time.
This summary should highlight the key points and conclusions from the discussion.nn# Respond to Person1:n[Response summary]nn—n### Response:nThis summary should highlight the key points and conclusions from the discussion.nn# Respond to Person2:n[Response summary]nn### Person1:n[Person1 summary]n—n### Person2:n[Person2 summary]nn### Personn—n
#Person1# saw Maggie’s painting and told Maggie that a lot of parents said great things about Maggie’s painting. Maggie does not think so.nn

Results for fine-tuning the Llama 2 70B model on the Dialogsum dataset:

Each example below lists, in order: the input prompt, the ground truth response, the response from the non-fine-tuned model, and the response from the fine-tuned model.

Summarize this dialog:n#Person1#: Have you received any degrees?n#Person2#: Yes. In 1996 I received my Bachelor of Science degree from Hebes University, and in 2001 I received my MBA degree from Peking University.n#Person1#: How about your academic records at college?n#Person2#: In fact my records were excellent. My overall GPA was 9 on a 10 scale, which was the highest in my class.n#Person1#: That’s very impressive. Which course did you like best?n#Person2#: English. It was both interesting and useful, so I showed a great interest in it.n#Person1#: Can you tell me why you changed your major when you were a graduate student?n#Person2#: Because I am very interested in administration and I did take some courses in it. I also performed well in the subjects. Hence I believe that I can do a good job in this position.n#Person1#: Did you get any honors and awards at college?n#Person2#: Yes. I was awarded a scholarship from the university every year. In 1995 I participated in the National Contest of Maths Models and I won the prize.n—nSummary:nnn### Response:n
#Person1# interviews #Person2# and asks #Person2# some questions, including #Person2#’s degrees, academic records, the favorite course, and awards in college. #Person2# also tells #Person1# why #Person2# changed the major into administration.
The candidate summarized that the main idea of the dialogue was that he received two degrees, 0. And the majors did well. <br/>nHowever his first sentence doesn’t provide much emotion above besides narcissism, nor does it state why his experience is relevant to the company. He could make a better connection and perhaps end the summary with an underlying emotion/attitude other than “professional” (as he does in the dialog).<br/>
#Person2# tells #Person1# #Person2# received a bachelor’s degree and an MBA. #Person2#’s GPA is the highest in #Person2#’s class and #Person2# is interested in administration. #Person2# also got some honors and awards at college.nn

Summarize this dialog:n#Person1#: Excuse me, what material is it? n#Person2#: Rhaki. Do you want some? n#Person1#: No, where are the cotton cloths? n#Person2#: I will show you. n#Person1#: What is the shrink rate after washing? n#Person2#: Less than 5%. It’s quite durable. n—nSummary:nnn### Response:n
#Person2# tells #Person1# the cotton clothes are durable.
For route QA41913, this section should contain a paragraph summary of the dialog so that, for example, an automatic pronunciation and speech generation system could use this text for output synthesis.
#Person1# wants some cotton cloth. #Person2# shows some rhaki.nn

Summarize this dialog:n#Person1#: Sorry, I’m late, Tom.n#Person2#: It’s all right, Rita. Where have you been?n#Person1#: At the police station. I’ve lost my handbag. Or perhaps someone’s taken it. I don’t know.n#Person2#: Oh, no. What happened? Sit down. Was there anything important in it?n#Person1#: Important! My checkbook, all the papers I need for work.n#Person2#: Oh, that’s terrible. But how did you lose it?n#Person1#: Well, as you know, I was with a friend all morning and we had lunch together. After I had lunch, I went shopping. And when I wanted to buy something, I couldn’t find my checkbook. Then I remembered that it was in my handbag. And my handbag was in my car.n#Person2#: So you went back to your car.n#Person1#: But I didn’t find it there.n#Person2#: And you went to the police station?n#Person1#: Not immediately. Before I went to the police station I called my friend’s office. No luck.n#Person2#: You should go to the restaurant where you had lunch and look for it.n#Person1#: Oh, I should have done that.n#Person2#: Now you’d better telephone the manager right away.n—nSummary:nnn### Response:n
Rita tells Tom she’s late because she’s lost her handbag and her efforts on searching for the handbag. Tom suggests Rita go to the restaurant where she had lunch and look for it.
Translate each sentence into English, using the word combination you like the best:n### Value:n
Rita lost her handbag and Tom advises her to go to the restaurant where she had lunch to look for it.nn

Summarize this dialog:n#Person1#: Morning, Mum!n#Person2#: Morning, Meg. You look not well today? Are you ill?n#Person1#: No, I am not ill.n#Person2#: Then, What’s the matter with you my child?n#Person1#: Nothing.n#Person2#: Oh, come on, baby. Tell me what happened.n#Person1#: I. . . I failed to pass the examination. How I wish I had studied hard.n#Person2#: Oh. Take it easy. You can set your aim from today.n#Person1#: Ok, Mum. I will not fail the examination next time.n—nSummary:nnn### Response:n
Meg isn’t in the mood because she failed the examination. Her mom encourages her to set her aim from today.
Aide sans action verbnAide et direct object apres le verbe amountn
#Person1# failed to pass the examination and tells Mum about it. Mum encourages #Person1# to study harder.nn

Summarize this dialog:n#Person1#: Everyone wants to be financial lose kill. You must invest your savings if you planed to retire rich. I’m thinking about buying stocks. It can be a good investment if I can manage well. What do you say?n#Person2#: I partly agree with you. Buying stocks is not difficult, but making money consistently from buying stock is very difficult. Even professionals don’t find it easy, so be careful. Anyway, people like us need help with investing. Why not do some online research? There are data, reports and discussion forums to educate people on how to invest. In addition, you need to visit an established website that offers stock charts.n#Person1#: Wow, you sound very professional.n#Person2#: I just watched the investment TV program last weekend. It has become a very hot topic these days. Actually I am not even a beginner.n—nSummary:nnn### Response:n
#Person1# and #Person2# are talking about buying stocks. #Person2# thinks it is hard to make money consistently in this way and suggests doing online research to learn about investing.
nI agree with the investment strategy recommended earlier. It took me a great deal of time and energy to come up with such plan. It is great that trust has been put it my judgement to this extent. Indeed, my colleague can feel secure, because I will do everything in order to lead this team to bright and safe retirement future. I look forward to your role as an active member of the team.
#Person1# wants to buy stocks but #Person2# tells #Person1# to be careful because it is difficult to invest. #Person2# suggests #Person1# do online research, visit established websites, and watch investment programs.nn

Dataset formatting
We currently offer two types of fine-tuning: instruction fine-tuning and domain adaptation fine-tuning. You can easily switch between the two training methods by specifying the parameter instruction_tuned as 'True' or 'False'.
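For example, a minimal sketch of launching instruction fine-tuning through the SageMaker JumpStart SDK might look like the following (the model ID, epoch count, and S3 path are placeholders, not values prescribed by this post):

from sagemaker.jumpstart.estimator import JumpStartEstimator

# Illustrative only: fine-tune Llama 2 7B with instruction tuning enabled.
estimator = JumpStartEstimator(
    model_id="meta-textgeneration-llama-2-7b",
    environment={"accept_eula": "true"},  # Llama 2 requires accepting the EULA
)
estimator.set_hyperparameters(instruction_tuned="True", epoch="5")
estimator.fit({"training": "s3://your-bucket/path/to/instruction-data/"})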
Domain adaptation format
The text generation model can also be fine-tuned on any domain-specific dataset. After it’s fine-tuned on the domain-specific dataset, the model is expected to generate domain-specific text and solve various NLP tasks in that specific domain with few-shot prompting.
For input to the model, use a training directory and an optional validation directory. Each directory contains a CSV, JSON, or TXT file. For CSV and JSON files, the train or validation data is read from the column called text, or from the first column if no column called text is found. Each of the train and validation directories (if provided) should contain exactly one file.
The output is a trained model that can be deployed for inference.
The following is an example of a TXT file for fine-tuning the text generation model. The TXT file consists of Amazon's SEC filings from 2021–2022:

This report includes estimates, projections, statements relating to our
business plans, objectives, and expected operating results that are “forward-
looking statements” within the meaning of the Private Securities Litigation
Reform Act of 1995, Section 27A of the Securities Act of 1933, and Section 21E
of the Securities Exchange Act of 1934. Forward-looking statements may appear
throughout this report, including the following sections: “Business” (Part I,
Item 1 of this Form 10-K), “Risk Factors” (Part I, Item 1A of this Form 10-K),
and “Management’s Discussion and Analysis of Financial Condition and Results
of Operations” (Part II, Item 7 of this Form 10-K). These forward-looking
statements generally are identified by the words “believe,” “project,”
“expect,” “anticipate,” “estimate,” “intend,” “strategy,” “future,”
“opportunity,” “plan,” “may,” “should,” “will,” “would,” “will be,” “will
continue,” “will likely result,” and similar expressions. Forward-looking
statements are based on current expectations and assumptions that are subject
to risks and uncertainties that may cause actual results to differ materially.
We describe risks and uncertainties that could cause actual results and events
to differ materially in “Risk Factors,” “Management’s Discussion and Analysis
of Financial Condition and Results of Operations,” and “Quantitative and
Qualitative Disclosures about Market Risk” (Part II, Item 7A of this Form
10-K). Readers are cautioned not to place undue reliance on forward-looking
statements, which speak only as of the date they are made. We undertake no
obligation to update or revise publicly any forward-looking statements,
whether because of new information, future events, or otherwise.

GENERAL

Embracing Our Future …
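To fine-tune on a plain-text corpus like the preceding excerpt, the single TXT file is uploaded to an S3 prefix that serves as the training channel and instruction_tuned is set to 'False'. A minimal sketch, with placeholder bucket, file, and model names, might look like this:

import sagemaker
from sagemaker.jumpstart.estimator import JumpStartEstimator

sess = sagemaker.Session()
bucket = sess.default_bucket()

# Upload the single training TXT file (placeholder file name) to the training channel prefix
sess.upload_data("amazon_sec_filings.txt", bucket=bucket, key_prefix="sec-train")

estimator = JumpStartEstimator(
    model_id="meta-textgeneration-llama-2-7b",
    environment={"accept_eula": "true"},
)
estimator.set_hyperparameters(instruction_tuned="False", epoch="5")
estimator.fit({"training": f"s3://{bucket}/sec-train/"})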

Instruction fine-tuning
The text generation model can be instruction-tuned on any text data provided that the data is in the expected format. The instruction-tuned model can be further deployed for inference.
For input, use a training and optional validation directory. The train and validation directories should contain one or multiple JSON lines (.jsonl) formatted files. In particular, the train directory can also contain an optional *.json file describing the input and output formats.
The best model is selected according to the validation loss, calculated at the end of each epoch. If a validation set is not given, an (adjustable) percentage of the training data is automatically split and used for validation.
The training data must be formatted in a JSON lines (.jsonl) format, where each line is a dictionary representing a single data sample. All training data must be in a single folder; however, it can be saved in multiple .jsonl files. The .jsonl file extension is mandatory. The training folder can also contain a template.json file describing the input and output formats. If no template file is given, the following template will be used:

{
  "prompt": "Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Input:\n{context}",
  "completion": "{response}"
}

In this case, the data in the JSON lines entries must include prompt and completion fields. If a custom template is provided, it must also use prompt and completion keys to define the input and output templates. The following is a sample custom template:

{
  "prompt": "question: {question} context: {context}",
  "completion": "{answer}"
}

Here, the data in the JSON lines entries must include the question, context, and answer fields.
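For example, with the preceding custom template, a single line of a training .jsonl file could look like this (the values are purely illustrative):

{"question": "In which city is the company headquartered?", "context": "Acme Corp was founded in 1999 and is headquartered in Seattle.", "answer": "Seattle"}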
The output is a trained model that can be deployed for inference.
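Once training completes, the fine-tuned model can be deployed and invoked through the same SDK. The sketch below is illustrative: the prompt, generation parameters, and the custom_attributes EULA flag follow common Llama 2 on SageMaker usage and are assumptions rather than values prescribed by this post.

# Deploy the fine-tuned model to a real-time endpoint and run a test prediction.
predictor = estimator.deploy()

payload = {
    "inputs": "### Instruction:\nSummarize the key risks listed in a 10-K filing.\n\n### Response:\n",
    "parameters": {"max_new_tokens": 128, "temperature": 0.2},
}
# Llama 2 endpoints typically require the EULA acceptance attribute on invocation.
response = predictor.predict(payload, custom_attributes="accept_eula=true")
print(response)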
We provide a subset of Amazon's SEC filings data, downloaded from the publicly available EDGAR system. For instructions on accessing the data, refer to Accessing EDGAR Data.
License: Creative Commons Attribution-ShareAlike License (CC BY-SA 4.0)

Microsoft Files A Patent On Its New Artificial Intelligence-Powered Smart Backpack That Not Only Hears You But Also Sees What You See

In a world increasingly dominated by smart devices and artificial intelligence, a new patent filed by Microsoft in May 2023 is raising eyebrows and curiosity. The tech giant has set its sights on the future by developing an AI-supported Microsoft smart backpack. This unconventional invention promises to revolutionize everyday life by utilizing artificial intelligence to identify objects, perform contextual tasks, access cloud-based information, and seamlessly interact with other devices. 

While the concept of wearable technology is not entirely new, existing solutions have only scratched the surface of what is possible. Brands like Targus and Intel have introduced their own versions of smart backpacks, each offering unique features such as integrated rechargeable batteries and GPS units for the visually impaired. However, Microsoft’s innovation appears poised to take this concept to an entirely new level.

According to the patent, Microsoft’s smart backpack is equipped with a camera, microphone, speaker, network interface, processor, and storage. It serves as a hands-free digital assistant that is context-aware and capable of understanding and responding to user commands based on their surroundings. This groundbreaking technology operates through a six-step process, beginning with user input and culminating in a comprehensive AI-driven response.

The potential metrics for evaluating the success of Microsoft’s smart backpack are vast. From its ability to accurately identify objects and provide contextual information to its seamless integration with other devices and cloud-based services, the backpack’s performance will be measured by its convenience, efficiency, and ability to enhance everyday tasks. Metrics may also include user satisfaction and adoption rates and the device’s impact on productivity and convenience in various settings.

In conclusion, Microsoft’s foray into the world of AI-supported smart backpacks represents a significant leap forward in wearable technology. With the power to enhance everyday experiences by providing real-time information and assistance, this innovation has the potential to reshape the way we interact with our surroundings. While existing solutions have paved the way, Microsoft’s smart backpack seems poised to be a game-changer in the world of wearable AI, offering users a glimpse into a future that once seemed confined to the realm of science fiction.

The post Microsoft Files A Patent On Its New Artificial Intelligence-Powered Smart Backpack That Not Only Hears You But Also Sees What You See appeared first on MarkTechPost.

This Artificial Intelligence AI Research Proposes SAM-Med2D: The Most Comprehensive Studies on Applying SAM to Medical 2D Images

By recognizing and separating different tissues, organs, or regions of interest, medical image segmentation is essential to studying medical pictures. For more exact diagnosis and therapy, clinicians can use accurate segmentation to help them locate and accurately pinpoint disease regions. Additionally, thorough insights into the morphology, structure, and function of various tissues or organs are provided through quantitative and qualitative analysis of medical pictures, enabling the study of illness. Due to the peculiarities of medical imaging, such as its wide variety of modalities, complicated tissue and organ architecture, and absence of annotated data, most existing approaches are restricted to certain modalities, organs, or pathologies. 

Because of this restriction, algorithms are difficult to generalize and modify for use in various clinical contexts. The push towards large-scale models has recently generated excitement among the AI community. The development of general AI models like ChatGPT2, ERNIE Bot 3, DINO, SegGPT, and SAM makes employing a single model for various tasks possible. With SAM, the most recent large-scale vision model, users may create masks for certain regions of interest by interactively clicking, drawing bounding boxes, or using verbal cues. Significant attention has been paid to its zero-shot and few-shot capabilities on natural photos across various fields. 

Some efforts have also concentrated on the SAMs’ zero-shot capability in the context of medical imaging. However, SAM finds it difficult to generalize to multi-modal and multi-object medical datasets, leading to variable segmentation performance across datasets. This is because there is a considerable domain gap between natural and medical images. The cause can be linked to the methods used to gather the data: due to their specific clinical purpose, medical pictures are obtained using particular protocols and scanners and displayed as various modalities (electrons, lasers, X-rays, ultrasound, nuclear physics, and magnetic resonance). As a result, these images deviate substantially from real images since they depend on various physics-based features and energy sources. 

Natural and medical images differ significantly in pixel intensity, color, texture, and other distribution features, as seen in Figure 1. Because SAM is trained only on natural photos, it lacks specialized knowledge of medical imaging and cannot be applied directly to the medical domain. Providing SAM with medical knowledge is challenging due to the high cost and inconsistent quality of annotation. Medical data preparation requires subject-matter expertise, and data quality differs greatly between institutions and clinical trials. As a result of these difficulties, the volume of available medical images is far smaller than that of natural images.

The bar chart in Figure 1 compares the data volume of publicly available natural image datasets and medical image datasets. For instance, Totalsegmentor, the largest public segmentation dataset in the medical domain, also has a significant gap compared to Open Image v6   and SA-1B. In this study, their objective is to transfer SAM from natural images to medical images. This will provide benchmark models and evaluation frameworks for researchers in medical image analysis to explore and enhance. To achieve this goal, researchers from Sichuan University and Shanghai AI Laboratory proposed SAM-Med2D, the most comprehensive study on applying SAM to medical 2D images.

The post This Artificial Intelligence AI Research Proposes SAM-Med2D: The Most Comprehensive Studies on Applying SAM to Medical 2D Images appeared first on MarkTechPost.

Best AI Tools for Project Management (September 2023)

Notion AI

Thanks to its cutting-edge artificial intelligence technology, whether it’s blog entries, meeting agendas, or sales emails, Notion AI can help you create them fast and easily. Users may save time and write more quickly by tapping into AI’s boundless potential, allowing them to think more broadly and enhance their creative output. Notion AI handles the rough writing, giving users a head start on difficult paragraphs or whole pages. It’s useful for correcting spelling and grammar, translating text inline, and even creating poetry. Users may explore new avenues of thought, find new sources of inspiration, and witness the wonder of AI-powered content creation, all with the help of Notion AI. On November 16th, Notion AI will be released in alpha, and individuals on the waitlist will gain access in the order in which they signed up. In the early access period, Notion AI will not cost anything.

ClickUp 

ClickUp is an all-inclusive project management solution that improves communication within and across groups of any size or industry. Because of its versatility and ease of use, ClickUp has quickly become the go-to product management solution for product creation and planning. It’s an adaptable platform with many ready-to-use product team templates for things like product planning, roadmaps, and new product releases. ClickUp’s flexibility and helpful features allow each team to modify the platform to their requirements and the subtleties of their process.

Ayanza 

Ayanza is an artificial intelligence management platform concerned with the well-being and productivity of teams. It’s easy to navigate, and it comes with a wide variety of project management templates and a newsfeed board with chat functions so that you can stay in touch with your team through a social media-like stream of updates. Ayanza takes an innovative approach to artificial intelligence that can help you and your team achieve greater success. Ayanza’s AI integration improves team morale and open communication by facilitating ideation, speedy idea sharing, and deeper workflow comprehension. Ask the AI writer a question and receive an instant response with only a few words. Writing a bespoke report that is both informative and easy to complete has always been more complex. Reduce the stress in your life by doing any routine work swiftly.

Process.st

Businesses may use Process.st, or Process Street to manage processes and improve efficiency. The system is driven by ChatGPT, which automates work on projects and boosts productivity. Collaboration is at the heart of this cutting-edge method of managing processes and tasks. Its main functions include project management, deliverable tracking, automated check-ins, enhanced data analytics, and routine job management. Slack, Zapier, and Trello are just a few project management apps that may be integrated with Process Street.

Project Planner

Cost management, project scheduling, reporting on task performance, and real-time communication are all made easier with Project Planner, a cloud-based project management software. Project Planner has grown in capability with the introduction of PAI, an artificial intelligence-driven software solution for managing projects. By automating the generation of tasks and documents, PAI facilitates teamwork. The hub is built to inform everyone involved in a project at every stage of its development, from team members to stakeholders to managers.

Project Insight (PI)

Project Insight (PI) is software for managing and reporting on a company’s projects and portfolios. PI gives your team access to project monitoring, resource management, and budget tracking thanks to its AI-powered project management tools and capabilities. The automated repetitious chores and optimized resource allocation are made possible by the AI-powered project management tools. While connectivity is not PI’s top priority, the platform does offer an interface with widely used apps like Trello, Slack, Jira Software, and Zendesk to facilitate smoother teamwork and better management.

Wrike 

Wrike, one of the best business resource and project management applications, simplifies workflow planning, resource allocation, data visualization, and team communication. Wrike’s skills in project management, blog post creation, and more have been bolstered by its new AI integration. Resource planning, Gantt charts, time tracking, task management, and performance reporting are some of the available project management tools. In addition, it works with more than 400 other programs to simplify the work of every sort of project manager. Wrike’s task management and collaboration capabilities may be adjusted to meet the changing demands of each team or project. Because of its flexibility and scalability, it may be used by both new and existing businesses for project management.

Basecamp 

After over two decades in the market, the popular project management application Basecamp has begun integrating artificial intelligence (AI) into its platform. Basecamp offers several AI-powered capabilities to aid organizations of all sizes in project management, including task management tools, communication, reporting, and resource management. However, despite the addition of cutting-edge AI features, Basecamp's dated user interface still needs improvement, and it lacks some of the basic capabilities found in most current applications, reflecting its origins as a to-do list tool.

Kintone 

Kintone is a cloud-based platform that may be personalized to facilitate managing your team’s activities, communications, and data. In essence, it’s a low-code platform on which to create unique business apps and processes. It’s not designed to be an AI platform but includes several AI-related capabilities that can simplify and speed up common processes. It will serve as a visual application builder, allowing users to transform static spreadsheets into interactive, user-specific databases without the need to learn any code. Workflows, databases, and apps may all be automated with the help of Kintone AI. Project management, customer relationship management, inventory tracking, and other pre-built apps are also available.

Hive 

Hive is a cloud-based project management platform with AI-powered features that automate chores and optimize workflows to make project teams more productive. Hive’s AI technology, for instance, may aid with things like planning out projects, allocating resources, and prioritizing tasks. It can also aid in foreseeing problems and helping teams prepare for them. Adaptable project layouts, file sharing, and team chat rooms are just a few features that may streamline your operation. Additionally, many other applications may be integrated with Hive.

ProofHub 

ProofHub is a software for keeping track of several projects and tasks in one place. It will be helpful for groups working toward similar goals or companies looking for a centralized center for project management. The sophisticated AI in ProofHub helps businesses and organizations save time and money by assisting with resource allocation and work prioritization. The AI monitors these critical metrics and notifies the appropriate parties when action is necessary. By generating in-depth reports on your projects with the aid of AI, ProofHub AI might help you save time. Consider automating routine processes like approvals or reports if you need more time to focus on your major PM duties. 

Monday 

Monday is a unified workspace where you and your team can easily create and manage projects, tasks, and procedures. Features such as items, columns, groups, and boards make it easy to organize and delegate work in an agile manner, and the system’s customization options are practically limitless. Multiple AI-specific add-ons are available on Monday as well. However, it also has several automated and AI-enhanced capabilities. The product’s flagship AI feature, AI Sales, helps businesses increase revenue, locate new sales prospects, and optimize existing business processes based on user feedback. It allows companies to achieve their sales targets by providing them with actionable advice, automated processes, and accurate predictions based on data. 

BeforeSunset 

You can let the AI-powered project management tool handle everything, or you can do it yourself by using the available daily templates and calendar sync features. When compared to competing AI software, BeforeSunset stands out because it lets you tailor your company’s everyday operations to meet your specific requirements. The program does more than simply help you keep track of your time; it also collaborates with you to design the ideal timetable. You can use the AI assistant to convert your business intelligence into actionable tasks, break down a massive project into manageable chunks, find the best time for your team to meet, and keep your weekly schedule up-to-date, all with the help of BeforeSunset’s AI-enabled project management system. Personal analytics, private task selection lists, time tracking, note taking, a searchable historical project view, visual verification, etc., are some additional features.

Trello 

Trello is widely used because of its user-friendliness and its simplicity as a project management tool. While Trello may not have as many AI-powered features as some other apps, it does make use of AI to improve productivity and expedite the process of task management. Its user-friendly Kanban-style boards make monitoring the status of projects and individual tasks simple. The artificial intelligence (AI) behind Trello’s smart ideas analyzes past data to suggest changes to task allocation, due dates, and dependencies. Businesses of any size that need a straightforward and intuitive project management solution should look at Trello. It’s great for agile project management, keeping track of tasks, and keeping everyone on the same page.

Naav 

Naav is a solution for managing projects that boosts productivity in teams and businesses. You can visualize your workflow and keep tabs on tasks with Naav’s Kanban board, create a list view of your tasks with the backlog, plot out your tasks’ timelines and dependencies with the roadmap, and monitor your team’s progress in real-time with a custom dashboard. Naav also has a wiki feature for building and organizing a knowledge base, a release feature for managing the delivery of finished work, a sprint feature for planning reasonable work times, and a custom board tool for letting users design their own boards with specified columns and processes. In addition to its intelligent task organization and lightning-fast search capabilities, Naav’s AI integration and default permission settings make it simple to manage users’ access.

Dart 

Dart is an artificial intelligence (AI)-driven project management application developed to streamline and automate project management. Dart’s GPT-4 technology allows teams to spend less time on administrative activities and more on real project work, saving up to a third of the time normally spent on project management. The platform’s streamlined interface and rich keyboard shortcuts make it ideal for productivity. Dart paves the way for teams to organize tasks, subtasks, and projects and seamlessly roll over sprint cycles with total freedom of action. In addition, Dart may be adjusted to accommodate the unique routines of different groups, hence improving coordination across departments and personnel. Dart also provides many connectors with common applications like email, Slack, GitHub, and Notion, allowing teams to collaborate effectively across devices without moving between applications.

Kypso 

Kypso is an AI copilot that helps teams manage and extend their operations across common technologies, making it a project intelligence platform that aids in the delivery of superior products. The software is designed to help groups overcome obstacles and achieve sustainable project success. Kypso provides a number of tools designed to improve productivity in teams and the overall delivery of projects. By doing so, crucial conversations may be recorded and resolved without any delays or missed chances. Kypso helps reduce scope creep and finish projects on time by spotting ambiguities in the plan early. Teams may keep stakeholders in the loop by regularly reporting on their progress on a predetermined timeline. Kypso’s adaptability is a major selling point. Each team’s demands in terms of operations may be catered to with this tool, making it much simpler for them to construct robust systems. Multiple tools can be integrated with Kypso, and more are on the way. Kypso guarantees the safety and management of user data in terms of security and privacy.

Moncelo 

To that end, Moncelo has been developed as an AI-powered personal project management application. Users may measure their progress and release their imagination with the help of the tool’s AI-generated tasks. Moncelo’s project overview dashboard keeps customers apprised of the overall health of their projects and allows them to prioritize tasks accordingly. A lifetime membership to the tool costs $60 and grants access to infinite projects and upgrades, while a monthly subscription costs $5. All of the aforementioned features, including AI-generated tasks, an overview dashboard, unlimited projects, and a cancelation policy at any time, are standard across both plans. With artificial intelligence, the Moncelo platform seeks to make project management more efficient and effective.

Plexo 

Plexo is an open-source PMO that prioritizes cooperation and collaboration, making it ideal for today’s innovators. This system introduces a fresh method of collaboration by providing tools that improve the efficiency and effectiveness of project management. Since Plexo is open source, it may be modified to meet the requirements of different groups and companies. Users may easily create and manage projects, delegate responsibilities, establish due dates, and exchange files on this highly customizable platform. Plexo’s capacity to improve teamwork is one of its most valuable features. The platform’s built-in messaging functions facilitate team communication and ensure everyone is current on the project’s status. Plexo’s comprehensive analytics and reporting services are available with its collaborative features. Users may compile extensive reports on the quality of their projects and other important indicators, yielding useful information on the effectiveness of their teams.

SolidlyAI 

By summarizing and keeping track of Slack communications and giving users crucial context when needed, the AI-powered project management assistant SolidlyAI helps cut unnecessary meetings and boost output. The primary goal of this application is to let its users focus on the work at hand while still receiving important notifications and information. SolidlyAI notifies team members when changes are needed, distributes personalized briefings to partner teams, and condenses messaging for end users. It also monitors incoming text messages, allowing users to silence interruptions by disabling alerts. By employing AI to keep tabs on the particulars and lessen the time spent playing catch-up, SolidlyAI helps teams overcome the challenge of constant interruptions from pings, meetings, and other sources. The tool aims to reduce interruptions and free users’ attention to concentrate on the task.

The post Best AI Tools for Project Management (September 2023) appeared first on MarkTechPost.

Build a generative AI-based content moderation solution on Amazon SageMaker JumpStart

Content moderation plays a pivotal role in maintaining online safety and upholding the values and standards of websites and social media platforms. Its significance is underscored by the protection it provides users from exposure to inappropriate content, safeguarding their well-being in digital spaces. For example, in the advertising industry, content moderation serves to shield brands from unfavorable associations, thereby contributing to brand elevation and revenue growth. Advertisers prioritize their brand’s alignment with appropriate content to uphold their reputation and avert negative publicity. Content moderation also assumes critical importance in the finance and healthcare sectors, where it serves multiple functions. It plays an important role in identifying and safeguarding sensitive personally identifiable information (PII) and protected health information (PHI). By adhering to internal standards and practices and complying with external regulations, content moderation enhances digital security for users. This way, it prevents the inadvertent sharing of confidential data on public platforms, ensuring the preservation of user privacy and data security.
In this post, we introduce a novel method to perform content moderation on image data with multi-modal pre-training and a large language model (LLM). With multi-modal pre-training, we can directly query the image content based on a set of questions of interest and the model will be able to answer these questions. This enables users to chat with the image to confirm if it contains any inappropriate content that violates the organization’s policies. We use the powerful generating capability of LLMs to generate the final decision including safe/unsafe labels and category type. In addition, by designing a prompt, we can make an LLM generate the defined output format, such as JSON format. The designed prompt template allows the LLM to determine if the image violates the moderation policy, identify the category of violation, explain why, and provide the output in a structured JSON format.
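As one illustration of this prompt design (the wording, policy categories, and the {blip2_outputs} placeholder below are assumptions for the sketch, not the exact template used in the solution), the instruction passed to the LLM could look roughly like this:

# Hypothetical moderation prompt template; {blip2_outputs} would hold the image Q&A results.
moderation_prompt = """You are a content moderation assistant.
Based on the image description and question-answer pairs below, decide whether the
image violates the moderation policy (categories: violence, explicit content, hate
symbols, drugs, weapons).

{blip2_outputs}

Respond only with JSON in this exact format:
{{"label": "safe" or "unsafe", "category": "<violated category or null>", "reason": "<short explanation>"}}
"""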
We use BLIP-2 as the multi-modal pre-training method. BLIP-2 is one of the state-of-the-art models in multi-modal pre-training and outperforms most of the existing methods in visual question answering, image captioning, and image text retrieval. For our LLM, we use Llama 2, the next generation open-source LLM, which outperforms existing open-source language models on many benchmarks, including reasoning, coding, proficiency, and knowledge tests. The following diagram illustrates the solution components.

Challenges in content moderation
Traditional content moderation methods, such as human-based moderation, can’t keep up with the growing volume of user-generated content (UGC). As the volume of UGC increases, human moderators can become overwhelmed and struggle to moderate content effectively. This results in a poor user experience, high moderation costs, and brand risk. Human-based moderation is also prone to errors, which can result in inconsistent moderation and biased decisions. To address these challenges, content moderation powered by machine learning (ML) has emerged as a solution. ML algorithms can analyze large volumes of UGC and identify content that violates the organization’s policies. ML models can be trained to recognize patterns and identify problematic content, such as hate speech, spam, and inappropriate material. According to the study Protect your users, brand, and budget with AI-powered content moderation, ML-powered content moderation can help organizations reclaim up to 95% of the time their teams spend moderating content manually. This allows organizations to focus their resources on more strategic tasks, such as community building and content creation. ML-powered content moderation can also reduce moderation costs because it’s more efficient than human-based moderation.
Despite the advantages of ML-powered content moderation, it still has further improvement space. The effectiveness of ML algorithms heavily relies on the quality of the data they are trained on. When models are trained using biased or incomplete data, they can make erroneous moderation decisions, exposing organizations to brand risks and potential legal liabilities. The adoption of ML-based approaches for content moderation brings several challenges that necessitate careful consideration. These challenges include:

Acquiring labeled data – This can be a costly process, especially for complex content moderation tasks that require training labelers. This cost can make it challenging to gather large enough datasets to train a supervised ML model with ease. Additionally, the accuracy of the model heavily relies on the quality of the training data, and biased or incomplete data can result in inaccurate moderation decisions, leading to brand risk and legal liabilities.
Model generalization – This is critical to adopting ML-based approaches. A model trained on one dataset may not generalize well to another dataset, particularly if the datasets have different distributions. Therefore, it is essential to train the model on a diverse and representative dataset so that it generalizes well to new data.
Operational efficiency – This is another challenge when using conventional ML-based approaches for content moderation. Constantly adding new labels and retraining the model when new classes are added can be time-consuming and costly. Additionally, it is essential to ensure that the model is regularly updated to keep up with changes in the content being moderated.
Explainability – End users may perceive the platform as biased or unjust if content gets flagged or removed without justification, resulting in a poor user experience. Similarly, the absence of clear explanations can render the content moderation process inefficient, time-consuming, and costly for moderators.
Adversarial nature – The adversarial nature of image-based content moderation presents a unique challenge to conventional ML-based approaches. Bad actors can attempt to evade content moderation mechanisms by altering the content in various ways, such as using synonyms of images or embedding their actual content within a larger body of non-offending content. This requires constant monitoring and updating of the model to detect and respond to such adversarial tactics.

Multi-modal reasoning with BLIP-2
Multi-modality ML models refer to models that can handle and integrate data from multiple sources or modalities, such as images, text, audio, video, and other forms of structured or unstructured data. One of the popular multi-modality models is the visual-language models such as BLIP-2, which combines computer vision and natural language processing (NLP) to understand and generate both visual and textual information. These models enable computers to interpret the meaning of images and text in a way that mimics human understanding. Vision-language models can tackle a variety of tasks, including image captioning, image text retrieval, visual question answering, and more. For example, an image captioning model can generate a natural language description of an image, and an image text retrieval model can search for images based on a text query. Visual question answering models can respond to natural language questions about images, and multi-modal chatbots can use visual and textual inputs to generate responses. In terms of content moderation, you can use this capability to query against a list of questions.
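For illustration, the following sketch queries an image with a moderation-style question using the Hugging Face transformers implementation of BLIP-2 (the checkpoint, image URL, and question are examples; in this solution the model is hosted on a SageMaker endpoint rather than run locally):

import requests
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

# BLIP-2 pairs a frozen ViT image encoder with a frozen Flan-T5 language model via Q-Former.
processor = Blip2Processor.from_pretrained("Salesforce/blip2-flan-t5-xl")
model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-flan-t5-xl")

# Placeholder image URL for the example.
image = Image.open(requests.get("https://example.com/sample.jpg", stream=True).raw)

# Ask a question about the image in BLIP-2's "Question: ... Answer:" format.
prompt = "Question: does this image contain any violent content? Answer:"
inputs = processor(images=image, text=prompt, return_tensors="pt")
generated_ids = model.generate(**inputs, max_new_tokens=20)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip())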
BLIP-2 contains three parts. The first component is a frozen image encoder, ViT-L/14 from CLIP, which takes image data as input. The second component is a frozen LLM, FlanT5, which outputs text. The third component is a trainable module called Q-Former, a lightweight transformer that connects the frozen image encoder with the frozen LLM. Q-Former employs learnable query vectors to extract visual features from the frozen image encoder and feeds the most useful visual feature to the LLM to output the desired text.
The pre-training process involves two stages. In the first stage, vision-language representation learning is performed to teach Q-Former to learn the most relevant visual representation for the text. In the second stage, vision-to-language generative learning is performed by connecting the output of Q-Former to a frozen LLM and training Q-Former to output visual representations that can be interpreted by the LLM.
BLIP-2 achieves state-of-the-art performance on various vision-language tasks despite having significantly fewer trainable parameters than existing methods. The model also demonstrates emerging capabilities of zero-shot image-to-text generation that can follow natural language instructions. The following illustration is modified from the original research paper.

Solution overview
The following diagram illustrates the solution architecture.

In the following sections, we demonstrate how to deploy BLIP-2 to an Amazon SageMaker endpoint, and use BLIP-2 and an LLM for content moderation.
Prerequisites
You need an AWS account with an AWS Identity and Access Management (IAM) role with permissions to manage resources created as part of the solution. For details, refer to Create a standalone AWS account.
If this is your first time working with Amazon SageMaker Studio, you first need to create a SageMaker domain. Additionally, you may need to request a service quota increase for the corresponding SageMaker hosting instances. For the BLIP-2 model, we use an ml.g5.2xlarge SageMaker hosting instance. For the Llama 2 13B model, we use an ml.g5.12xlarge SageMaker hosting instance.
Deploy BLIP-2 to a SageMaker endpoint
You can host an LLM on SageMaker using the Large Model Inference (LMI) container that is optimized for hosting large models using DJLServing. DJLServing is a high-performance universal model serving solution powered by the Deep Java Library (DJL) that is programming language agnostic. To learn more about DJL and DJLServing, refer to Deploy large models on Amazon SageMaker using DJLServing and DeepSpeed model parallel inference. With the help of the SageMaker LMI container, the BLIP-2 model can be easily implemented with the Hugging Face library and hosted on SageMaker. You can run blip2-sagemaker.ipynb for this step.
To prepare the Docker image and model file, you need to retrieve the Docker image of DJLServing, package the inference script and configuration files as a model.tar.gz file, and upload it to an Amazon Simple Storage Service (Amazon S3) bucket. You can refer to the inference script and configuration file for more details.
from sagemaker import image_uris

# sess, bucket, and s3_code_prefix are defined earlier in the notebook
# Retrieve the DJLServing (LMI) container image for the current Region
inference_image_uri = image_uris.retrieve(
    framework="djl-deepspeed", region=sess.boto_session.region_name, version="0.22.1"
)

# Package the inference script and configuration files, and upload them to S3
! tar czvf model.tar.gz blip2/
s3_code_artifact = sess.upload_data("model.tar.gz", bucket, s3_code_prefix)
When the Docker image and inference-related files are ready, you create the model, the endpoint configuration, and the endpoint:
from sagemaker.model import Model
from sagemaker.utils import name_from_base

# role is the SageMaker execution role defined earlier in the notebook
blip_model_version = "blip2-flan-t5-xl"
model_name = name_from_base(blip_model_version)
model = Model(
    image_uri=inference_image_uri,
    model_data=s3_code_artifact,
    role=role,
    name=model_name,
)
model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    endpoint_name=model_name,
)
When the endpoint status becomes InService, you can invoke the endpoint for the image captioning and instructed zero-shot vision-to-language generation tasks. For the image captioning task, you only need to pass an image to the endpoint:
import base64
import json

import boto3
from PIL import Image

smr_client = boto3.client("sagemaker-runtime")
endpoint_name = model_name  # the BLIP-2 endpoint deployed in the previous step

def encode_image(img_file):
    """Read an image file and return it as a base64-encoded string."""
    with open(img_file, "rb") as image_file:
        img_str = base64.b64encode(image_file.read())
        base64_string = img_str.decode("latin1")
    return base64_string

def run_inference(endpoint_name, inputs):
    """Invoke the BLIP-2 endpoint and return the decoded response body."""
    response = smr_client.invoke_endpoint(
        EndpointName=endpoint_name, Body=json.dumps(inputs)
    )
    result = response["Body"].read().decode("utf-8")
    print(result)
    return result

test_image = "carcrash-ai.jpeg"
base64_string = encode_image(test_image)
inputs = {"image": base64_string}
run_inference(endpoint_name, inputs)
For the instructed zero-shot vision-to-language generation task, in addition to the input image, you need to define the question as a prompt:
base64_string = encode_image(test_image)
inputs = {"prompt": "Question: what happened in this photo? Answer:", "image": base64_string}
run_inference(endpoint_name, inputs)
Use BLIP-2 and LLM for content moderation
In this stage, you query the given image to retrieve the information relevant to the moderation policy, then use the LLM to organize that information and generate a result in JSON format. You can roughly split this task into the following two sub-tasks:

Extract information from the image with the BLIP-2 model.
Generate the final result and explanation with the LLM.

Extract information from the image with the BLIP-2 model
To retrieve enough useful information from the given image, you need to define the queries. Because each query invokes the endpoint once, more queries mean longer processing time. We therefore suggest keeping the queries high quality, covering all moderation policies without duplication. In our sample code, we define the queries as follows:
check_list = [
    "Does this photo contain complete naked person?",
    "Does this photo contain topless person?",
    "Does this photo contain weapon?",
    "Does this photo contain contact information?",
    "Does this photo contain a smoker?",
    "Does this photo contain blood?",
    "Are there persons fighting in this photo?",
    "Does this photo contain harassment words?",
]
With the preceding queries, invoke the endpoint of BLIP-2 to retrieve the information with the following code:
test_image = "./surf_swimwear.png"
raw_image = Image.open(test_image).convert("RGB")

base64_string = encode_image(test_image)
conversations = ""
for question in check_list:
    inputs = {"prompt": f"Question: {question} Answer:", "image": base64_string}
    response = run_inference(endpoint_name, inputs)
    conversations += f"""
Question: {question}
Answer: {response}.
"""
In addition to the information retrieved by queries, you can get information with the image captioning task by invoking the endpoint without the prompt field in the payload:
inputs = {"image": base64_string}
response = smr_client.invoke_endpoint(
    EndpointName=endpoint_name, Body=json.dumps(inputs)
)
image_caption = response["Body"].read().decode("utf-8")
You can combine the contents of the queries and answers with the image caption and use this retrieved information for the downstream task described in the next section.
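For example, a minimal way to do this is to append the caption to the conversations string built earlier, so the downstream prompt (which injects conversations) also sees the caption:

# Append the image caption to the question-answer transcript so that the
# downstream LLM prompt receives both the caption and the QA results
conversations += f"\nImage caption: {image_caption}\n"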
Generate the final result and explanation with the LLM
Large language models (LLMs) such as Llama 2 can generate high-quality results with the right prompt template. Using Amazon SageMaker JumpStart, ML practitioners can choose from a broad selection of publicly available foundation models. With just a few clicks in SageMaker Studio, you can now discover and deploy Llama 2.
The final results rely on an LLM with a specific prompt template. The prompt consists of the moderation policy (based on the categories of inappropriate or offensive content), the image information extracted with BLIP-2, the question to the LLM asking whether the photo contains unsafe content (and requesting its category and reason if it does), and instructions to output the results in JSON format. The designed prompt template allows the LLM to determine whether the image violates the moderation policy, identify the category of violation, explain why, and provide the output in a structured JSON format.
The core source code is as follows:
# moderation_policy and conversations are defined in the previous steps
prompt_template = f"""
The following is our company's content moderation policy. Based on the moderation policy, we gather image information from the user-uploaded image. Please answer the question in JSON format.

###### moderation policy ######
{moderation_policy}

###### Image information ######
{conversations}

###### Question ######
Based on the moderation policy and the QA above, tell me if the photo contains unsafe content, and also give its category and reason if it is unsafe. Please answer the question in the following format and only put the explanation into the reason field:
"""

prompt_template += """
{
"flag": "xxx",
"category": "xxx",
"reason": "the reason is ..."
}
"""

dialog = [
    {"role": "user", "content": prompt_template}
]
You can customize the prompt based on your own use case. Refer to the notebook for more details. When the prompt is ready, you can invoke the LLM endpoint to generate results:
endpoint_name = "jumpstart-dft-meta-textgeneration-llama-2-70b-f"

def query_endpoint(payload):
    """Invoke the Llama 2 endpoint deployed through SageMaker JumpStart."""
    client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(payload),
        CustomAttributes="accept_eula=true",
    )
    response = response["Body"].read().decode("utf8")
    response = json.loads(response)
    return response

payload = {
    "inputs": [dialog],
    "parameters": {"max_new_tokens": 256, "top_p": 0.9, "temperature": 0.5},
}
result = query_endpoint(payload)[0]
Part of the generated output is as follows:

> Assistant: {
"flag": "unsafe",
"category": "Suggestive",
"reason": "The photo contains a topless person, which is considered suggestive content."
}

Explanation:
The photo contains a topless person, which violates the moderation policy’s rule number 2, which states that suggestive content includes “Female Swimwear Or Underwear, Male Swimwear Or Underwear, Partial Nudity, Barechested Male, Revealing Clothes and Sexual Situations.” Therefore, the photo is considered unsafe and falls under the category of Suggestive.

Occasionally, Llama 2 attaches an additional explanation besides the answer from the assistant. You can use the following parsing code to extract the JSON data from the generated results:
answer = result['generation']['content'].split('}')[0] + '}'
json.loads(answer)
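If you want something slightly more robust than splitting on the closing brace, a small regex-based helper works as well. This is a sketch under the assumption that the JSON object is flat (no nested braces), as in the prompt template above:

import json
import re

def extract_json(text: str) -> dict:
    """Return the first JSON object found in the LLM output, ignoring any
    extra explanation Llama 2 appends after the closing brace."""
    match = re.search(r"\{.*?\}", text, re.DOTALL)
    if match is None:
        raise ValueError("No JSON object found in model output")
    return json.loads(match.group(0))

moderation_result = extract_json(result["generation"]["content"])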

Advantages of generative approaches
The preceding sections showed how to implement the core part of model inference. In this section, we compare the generative approach with the conventional classification approach across several dimensions.
The following table compares each approach.

Aspect | Generative Approach | Classification Approach
Acquiring labeled data | Pre-trained on a large number of images; zero-shot inference | Requires data from all types of categories
Model generalization | Pre-trained model with various types of images | Requires a large volume of content moderation related data to improve model generalization
Operational efficiency | Zero-shot capabilities | Requires training the model to recognize different patterns, and retraining when labels are added
Explainability | Reasoning as the text output, great user experience | Hard to achieve reasoning, hard to explain and interpret
Adversarial nature | Robust | Requires high-frequency retraining

Potential use cases of multi-modal reasoning beyond content moderation
BLIP-2 can be applied to multiple purposes, with or without fine-tuning, including the following:

Image captioning – This asks the model to generate a text description of the image’s visual content. As illustrated in the following example image (left), the model generates “a man is standing on the beach with a surfboard” as the image description.
Visual question answering – As the example image in the middle shows, we can ask “Is it commercial related content?” and get “yes” as the answer. BLIP-2 also supports multi-round conversation; when we follow up with “Why do you think so?”, it uses the visual cue and LLM capabilities to answer “it’s a sign for amazon.”
Image text retrieval – Given the prompt “Text on the image”, we can extract the image text “it’s monday but keep smiling”, as demonstrated in the image on the right.

The following images show examples to demonstrate the zero-shot image-to-text capability of visual knowledge reasoning.

As we can see from various examples above, multi-modality models open up new opportunities for solving complex problems that traditional single-modality models would struggle to address.
Clean up
To avoid incurring future charges, delete the resources created as part of this post. You can do this by following the instructions in the notebook cleanup section, or delete the created endpoints via the SageMaker console and resources stored in the S3 bucket.
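For reference, a minimal cleanup sketch with Boto3 is shown below. It assumes the endpoint configurations and models were created with the same names as the endpoints, which is the default for the deployment code in this post but may differ in your account:

import boto3

sm_client = boto3.client("sagemaker")

# Endpoint names from the earlier deployment steps (BLIP-2 and the Llama 2 JumpStart endpoint)
for endpoint in [model_name, "jumpstart-dft-meta-textgeneration-llama-2-70b-f"]:
    sm_client.delete_endpoint(EndpointName=endpoint)
    sm_client.delete_endpoint_config(EndpointConfigName=endpoint)
    sm_client.delete_model(ModelName=endpoint)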
Conclusion
In this post, we discussed the importance of content moderation in the digital world and highlighted its challenges. We proposed a new method to help improve content moderation with image data and perform question answering against the images to automatically extract useful information. We also provided further discussion on the advantages of using a generative AI-based approach compared to the traditional classification-based approach. Lastly, we illustrated the potential use cases of visual-language models beyond content moderation.
We encourage you to learn more by exploring SageMaker and building a solution using the multi-modality solution provided in this post and a dataset relevant to your business.

About the Authors
Gordon Wang is a Senior AI/ML Specialist TAM at AWS. He supports strategic customers with AI/ML best practices cross many industries. He is passionate about computer vision, NLP, generative AI, and MLOps. In his spare time, he loves running and hiking.
Yanwei Cui, PhD, is a Senior Machine Learning Specialist Solutions Architect at AWS. He started machine learning research at IRISA (Research Institute of Computer Science and Random Systems), and has several years of experience building AI-powered industrial applications in computer vision, natural language processing, and online user behavior prediction. At AWS, he shares his domain expertise and helps customers unlock business potentials and drive actionable outcomes with machine learning at scale. Outside of work, he enjoys reading and traveling.
Melanie Li, PhD, is a Senior AI/ML Specialist TAM at AWS based in Sydney, Australia. She helps enterprise customers build solutions using state-of-the-art AI/ML tools on AWS and provides guidance on architecting and implementing ML solutions with best practices. In her spare time, she loves to explore nature and spend time with family and friends.

How Carrier predicts HVAC faults using AWS Glue and Amazon SageMaker

In their own words, “In 1902, Willis Carrier solved one of mankind’s most elusive challenges of controlling the indoor environment through modern air conditioning. Today, Carrier products create comfortable environments, safeguard the global food supply, and enable safe transport of vital medical supplies under exacting conditions.”
At Carrier, the foundation of our success is making products our customers can trust to keep them comfortable and safe year-round. High reliability and low equipment downtime are increasingly important as extreme temperatures become more common due to climate change. We have historically relied on threshold-based systems that alert us to abnormal equipment behavior, using parameters defined by our engineering team. Although such systems are effective, they are intended to identify and diagnose equipment issues rather than predict them. Predicting faults before they occur allows our HVAC dealers to proactively address issues and improve the customer experience.
In order to improve our equipment reliability, we partnered with the Amazon Machine Learning Solutions Lab to develop a custom machine learning (ML) model capable of predicting equipment issues prior to failure. Our teams developed a framework for processing over 50 TB of historical sensor data and predicting faults with 91% precision. We can now notify dealers of impending equipment failure, so that they can schedule inspections and minimize unit downtime. The solution framework is scalable as more equipment is installed and can be reused for a variety of downstream modeling tasks.
In this post, we show how the Carrier and AWS teams applied ML to predict faults across large fleets of equipment using a single model. We first highlight how we use AWS Glue for highly parallel data processing. We then discuss how Amazon SageMaker helps us with feature engineering and building a scalable supervised deep learning model.
Overview of use case, goals, and risks
The main goal of this project is to reduce downtime by predicting impending equipment failures and notifying dealers. This allows dealers to schedule maintenance proactively and provide exceptional customer service. We faced three primary challenges when working on this solution:

Data scalability – Data processing and feature extraction need to scale across a large and growing body of historical sensor data
Model scalability – The modeling approach needs to be capable of scaling across over 10,000 units
Model precision – Low false positive rates are needed to avoid unnecessary maintenance inspections

Scalability, both from a data and modeling perspective, is a key requirement for this solution. We have over 50 TB of historical equipment data and expect this data to grow quickly as more HVAC units are connected to the cloud. Data processing and model inference need to scale as our data grows. In order for our modeling approach to scale across over 10,000 units, we need a model that can learn from a fleet of equipment rather than relying on anomalous readings for a single unit. This will allow for generalization across units and reduce the cost of inference by hosting a single model.
The other concern for this use case is triggering false alarms. This means that a dealer or technician will go on-site to inspect the customer’s equipment and find everything to be operating appropriately. The solution requires a high precision model to ensure that when a dealer is alerted, the equipment is likely to fail. This helps earn the trust of dealers, technicians, and homeowners alike, and reduces the costs associated with unnecessary on-site inspections.
We partnered with the AI/ML experts at the Amazon ML Solutions Lab for a 14-week development effort. In the end, our solution includes two primary components. The first is a data processing module built with AWS Glue that summarizes equipment behavior and reduces the size of our training data for efficient downstream processing. The second is a model training interface managed through SageMaker, which allows us to train, tune, and evaluate our model before it is deployed to a production endpoint.
Data processing
Each HVAC unit we install generates data from 90 different sensors with readings for RPMs, temperature, and pressures throughout the system. This amounts to roughly 8 million data points generated per unit per day, with tens of thousands of units installed. As more HVAC systems are connected to the cloud, we anticipate the volume of data to grow quickly, making it critical for us to manage its size and complexity for use in downstream tasks. The length of the sensor data history also presents a modeling challenge. A unit may start displaying signs of impending failure months before a fault is actually triggered. This creates a significant lag between the predictive signal and the actual failure. A method for compressing the length of the input data becomes critical for ML modeling.
To address the size and complexity of the sensor data, we compress it into cycle features as shown in Figure 1. This dramatically reduces the size of data while capturing features that characterize the equipment’s behavior.

Figure 1: Sample of HVAC sensor data

AWS Glue is a serverless data integration service for processing large quantities of data at scale. AWS Glue allowed us to easily run parallel data preprocessing and feature extraction. We used AWS Glue to detect cycles and summarize unit behavior using key features identified by our engineering team. This dramatically reduced the size of our dataset from over 8 million data points per day per unit down to roughly 1,200. Crucially, this approach preserves predictive information about unit behavior with a much smaller data footprint.
The output of the AWS Glue job is a summary of unit behavior for each cycle. We then use an Amazon SageMaker Processing job to calculate features across cycles and label our data. We formulate the ML problem as a binary classification task with a goal of predicting equipment faults in the next 60 days. This allows our dealer network to address potential equipment failures in a timely manner. It’s important to note that not all units fail within 60 days. A unit experiencing slow performance degradation could take more time to fail. We address this during the model evaluation step. We focused our modeling on summertime because those months are when most HVAC systems in the US are in consistent operation and under more extreme conditions.
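To make the labeling step concrete, the following is a minimal sketch of how cycle summaries could be labeled against a 60-day fault horizon. The DataFrames and column names (unit_id, cycle_end, fault_time) are hypothetical stand-ins for the outputs of the AWS Glue and SageMaker Processing jobs:

import pandas as pd

def label_cycles(cycles: pd.DataFrame, faults: pd.DataFrame, horizon_days: int = 60) -> pd.DataFrame:
    """Label a cycle 1 if its unit records a fault within `horizon_days` after
    the cycle ends, otherwise 0. Column names are illustrative."""
    cycles = cycles.copy()
    cycles["label"] = 0
    for unit_id, unit_faults in faults.groupby("unit_id"):
        fault_times = unit_faults["fault_time"].to_numpy()
        mask = cycles["unit_id"] == unit_id
        cycles.loc[mask, "label"] = [
            int(((fault_times > t) & (fault_times <= t + pd.Timedelta(days=horizon_days))).any())
            for t in cycles.loc[mask, "cycle_end"]
        ]
    return cycles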
Modeling
Transformer architectures have become the state-of-the-art approach for handling temporal data. They can use long sequences of historical data at each time step without suffering from vanishing gradients. The input to our model at a given point in time is composed of the features for the previous 128 equipment cycles, which is roughly one week of unit operation. This is processed by a three-layer encoder whose output is averaged and fed into a multi-layered perceptron (MLP) classifier. The MLP classifier is composed of three linear layers with ReLU activation functions and a final layer with LogSoftMax activation. We use weighted negative log-likelihood loss with a different weight on the positive class for our loss function. This biases our model towards high precision and avoids costly false alarms. It also incorporates our business objectives directly into the model training process. Figure 2 illustrates the transformer architecture.

Figure 2: Temporal transformer architecture
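To make the architecture description concrete, below is a minimal PyTorch sketch of a model with this shape. It is not Carrier's production code; the feature dimension, hidden sizes, and class weights are illustrative placeholders.

import torch
import torch.nn as nn

class CycleTransformerClassifier(nn.Module):
    """Sketch of the temporal model described above: a 3-layer transformer
    encoder over 128 cycle-feature vectors, mean-pooled and fed to a
    3-layer MLP with LogSoftmax output."""
    def __init__(self, n_features: int, d_model: int = 128, n_heads: int = 4, hidden: int = 64):
        super().__init__()
        self.proj = nn.Linear(n_features, d_model)  # embed cycle features
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=3)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2), nn.LogSoftmax(dim=-1),
        )

    def forward(self, x):                            # x: (batch, 128, n_features)
        h = self.encoder(self.proj(x))
        return self.mlp(h.mean(dim=1))               # average over the 128 cycles

# Weighted negative log-likelihood: down-weighting the positive class makes the
# model conservative about raising alerts, biasing it toward high precision.
loss_fn = nn.NLLLoss(weight=torch.tensor([1.0, 0.3]))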

Training
One challenge when training this temporal learning model is data imbalance. Some units have a longer operational history than others and therefore have more cycles in our dataset. Because they are overrepresented in the dataset, these units will have more influence on our model. We solve this by randomly sampling 100 cycles in a unit’s history where we assess the probability of a failure at that time. This ensures that each unit is equally represented during the training process. While removing the imbalanced data problem, this approach has the added benefit of replicating a batch processing approach that will be used in production. This sampling approach was applied to the training, validation, and test sets.
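A minimal sketch of this per-unit sampling with pandas, assuming a hypothetical cycles_df with one row per (unit, cycle):

import pandas as pd

# Sample up to 100 cycles per unit so long-running units don't dominate training
balanced_df = (
    cycles_df.groupby("unit_id", group_keys=False)
             .apply(lambda unit: unit.sample(n=min(len(unit), 100), random_state=42))
)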
Training was performed using a GPU-accelerated instance on SageMaker. Monitoring the loss shows that it achieves the best results after 180 training epochs, as shown in Figure 3. Figure 4 shows that the area under the ROC curve for the resulting temporal classification model is 81%.

Figure 3: Training loss over epochs

Figure 4: ROC-AUC for 60-day lockout

Evaluation
While our model is trained at the cycle level, evaluation needs to take place at the unit level. In this way, one unit with multiple true positive detections is still only counted as a single true positive at the unit level. To do this, we analyze the overlap between the predicted outcomes and the 60-day window preceding a fault. This is illustrated in the following figures, which show four cases of prediction outcomes (a code sketch of this unit-level aggregation follows the figures):

True negative – All the prediction results are negative (purple) (Figure 5.1)
False positive – The positive predictions are false alarms (Figure 5.2)
False negative – Although the predictions are all negative, the actual labels are positive (green) (Figure 5.3)
True positive – Some of the predictions may be negative (green), but at least one prediction is positive (yellow) (Figure 5.4)

Figure 5.1: True negative case

Figure 5.2: False positive case

Figure 5.3: False negative case

Figure 5.4: True positive case
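The sketch below illustrates this unit-level aggregation. The DataFrame and column names (prediction, timestamp) are hypothetical, and the handling of positives that fall outside the fault window is a simplification of the logic shown in the figures:

import pandas as pd

def unit_outcome(unit_preds: pd.DataFrame, fault_time, horizon_days: int = 60) -> str:
    """Classify a single unit as TP/FP/FN/TN by overlapping its positive
    predictions with the window preceding its fault (if any)."""
    positives = unit_preds.loc[unit_preds["prediction"] == 1, "timestamp"]
    if fault_time is None:                      # unit never faulted
        return "FP" if len(positives) else "TN"
    window_start = fault_time - pd.Timedelta(days=horizon_days)
    if ((positives >= window_start) & (positives <= fault_time)).any():
        return "TP"                             # at least one alert inside the window
    return "FP" if len(positives) else "FN"     # alerts outside the window vs. no alerts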

After training, we use the evaluation set to tune the threshold for sending an alert. Setting the model confidence threshold at 0.99 yields a precision of roughly 81%. This falls short of our initial 90% criterion for success. However, we found that a good portion of units failed just outside the 60-day evaluation window. This makes sense, because a unit may actively display faulty behavior but take longer than 60 days to fail. To handle this, we defined a metric called effective precision, which is a combination of the true positive precision (81%) with the added precision of lockouts that occurred in the 30 days beyond our target 60-day window.
For an HVAC dealer, what matters most is that an onsite inspection helps prevent future HVAC issues for the customer. Using this model, we estimate that 81.2% of the time the inspection will prevent a lockout from occurring in the next 60 days. Additionally, 10.4% of the time the lockout would have occurred within 90 days of inspection. The remaining 8.4% are false alarms. The effective precision of the trained model is 91.6%.
Conclusion
In this post, we showed how our team used AWS Glue and SageMaker to create a scalable supervised learning solution for predictive maintenance. Our model is capable of capturing trends across long-term histories of sensor data and accurately detecting hundreds of equipment failures weeks in advance. Predicting faults in advance will reduce curb-to-curb time, allowing our dealers to provide more timely technical assistance and improving the overall customer experience. The impacts of this approach will grow over time as more cloud-connected HVAC units are installed every year.
Our next step is to integrate these insights into the upcoming release of Carrier’s Connected Dealer Portal. The portal combines these predictive alerts with other insights we derive from our AWS-based data lake in order to give our dealers more clarity into equipment health across their entire client base. We will continue to improve our model by integrating data from additional sources and extracting more advanced features from our sensor data. The methods employed in this project provide a strong foundation for our team to start answering other key questions that can help us reduce warranty claims and improve equipment efficiency in the field.
If you’d like help accelerating the use of ML in your products and services, please contact the Amazon ML Solutions Lab. To learn more about the services used in this project, refer to the AWS Glue Developer Guide and the Amazon SageMaker Developer Guide.

About the Authors
Ravi Patankar is a technical leader for IoT related analytics at Carrier’s Residential HVAC Unit. He formulates analytics problems related to diagnostics and prognostics and provides direction for ML/deep learning-based analytics solutions and architecture.
Dan Volk is a Data Scientist at the AWS Generative AI Innovation Center. He has ten years of experience in machine learning, deep learning and time-series analysis and holds a Master’s in Data Science from UC Berkeley. He is passionate about transforming complex business challenges into opportunities by leveraging cutting-edge AI technologies.
Yingwei Yu is an Applied Scientist at AWS Generative AI Innovation Center. He has experience working with several organizations across industries on various proof-of-concepts in machine learning, including NLP, time-series analysis, and generative AI technologies. Yingwei received his PhD in computer science from Texas A&M University.
Yanxiang Yu is an Applied Scientist at Amazon Web Services, working on the Generative AI Innovation Center. With over 8 years of experience building AI and machine learning models for industrial applications, he specializes in generative AI, computer vision, and time series modeling. His work focuses on finding innovative ways to apply advanced generative techniques to real-world problems.
Diego Socolinsky is a Senior Applied Science Manager with the AWS Generative AI Innovation Center, where he leads the delivery team for the Eastern US and Latin America regions. He has over twenty years of experience in machine learning and computer vision, and holds a PhD degree in mathematics from The Johns Hopkins University.
Kexin Ding is a fifth-year Ph.D. candidate in computer science at UNC-Charlotte. Her research focuses on applying deep learning methods for analyzing multi-modal data, including medical image and genomics sequencing data.

Optimize deployment cost of Amazon SageMaker JumpStart foundation mode …

The success of generative AI applications across a wide range of industries has attracted the attention and interest of companies worldwide who are looking to reproduce and surpass the achievements of competitors or solve new and exciting use cases. These customers are looking into foundation models, such as TII Falcon, Stable Diffusion XL, or OpenAI’s GPT-3.5, as the engines that power the generative AI innovation.
Foundation models are a class of generative AI models that are capable of understanding and generating human-like content, thanks to the vast amounts of unstructured data they have been trained on. These models have revolutionized various computer vision (CV) and natural language processing (NLP) tasks, including image generation, translation, and question answering. They serve as the building blocks for many AI applications and have become a crucial component in the development of advanced intelligent systems.
However, the deployment of foundation models can come with significant challenges, particularly in terms of cost and resource requirements. These models are known for their size, often ranging from hundreds of millions to billions of parameters. Their large size demands extensive computational resources, including powerful hardware and significant memory capacity. In fact, deploying foundation models usually requires at least one GPU (often more) to handle the computational load efficiently. For example, the TII Falcon-40B Instruct model requires at least an ml.g5.12xlarge instance to be loaded into memory successfully, but performs best with bigger instances. As a result, the return on investment (ROI) of deploying and maintaining these models can be too low to prove business value, especially during development cycles or for spiky workloads. This is due to the running costs of having GPU-powered instances for long sessions, potentially 24/7.
Earlier this year, we announced Amazon Bedrock, a serverless API to access foundation models from Amazon and our generative AI partners. Although it’s currently in Private Preview, its serverless API allows you to use foundation models from Amazon, Anthropic, Stability AI, and AI21, without having to deploy any endpoints yourself. However, open-source models from communities such as Hugging Face have been growing a lot, and not every one of them has been made available through Amazon Bedrock.
In this post, we target these situations and address the risk of high costs by deploying large foundation models from Amazon SageMaker JumpStart to Amazon SageMaker asynchronous endpoints. This can help cut the cost of the architecture by running the endpoint only when requests are in the queue, with a short time-to-live, and scaling down to zero when no requests are waiting to be serviced. This sounds great for a lot of use cases; however, an endpoint that has scaled down to zero introduces a cold start time before it can serve inferences.
Solution overview
The following diagram illustrates our solution architecture.

The architecture we deploy is very straightforward:

The user interface is a notebook, which can be replaced by a web UI built on Streamlit or similar technology. In our case, the notebook is an Amazon SageMaker Studio notebook, running on an ml.m5.large instance with the PyTorch 2.0 Python 3.10 CPU kernel.
The notebook queries the endpoint in three ways: the SageMaker Python SDK, the AWS SDK for Python (Boto3), and LangChain.
The endpoint is running asynchronously on SageMaker, and on the endpoint, we deploy the Falcon-40B Instruct model. It’s currently the state of the art in terms of instruct models and available in SageMaker JumpStart. A single API call allows us to deploy the model on the endpoint.

What is SageMaker asynchronous inference
SageMaker asynchronous inference is one of the four deployment options in SageMaker, together with real-time endpoints, batch inference, and serverless inference. To learn more about the different deployment options, refer to Deploy models for Inference.
SageMaker asynchronous inference queues incoming requests and processes them asynchronously, making this option ideal for requests with large payload sizes up to 1 GB, long processing times, and near-real-time latency requirements. However, the main advantage that it provides when dealing with large foundation models, especially during a proof of concept (POC) or during development, is the capability to configure asynchronous inference to scale in to an instance count of zero when there are no requests to process, thereby saving costs. For more information about SageMaker asynchronous inference, refer to Asynchronous inference. The following diagram illustrates this architecture.

To deploy an asynchronous inference endpoint, you need to create an AsyncInferenceConfig object. If you create AsyncInferenceConfig without specifying its arguments, the default S3OutputPath will be s3://sagemaker-{REGION}-{ACCOUNTID}/async-endpoint-outputs/{UNIQUE-JOB-NAME} and S3FailurePath will be s3://sagemaker-{REGION}-{ACCOUNTID}/async-endpoint-failures/{UNIQUE-JOB-NAME}.
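For example, a sketch of an explicit configuration looks like the following; the bucket variable is a placeholder, and the failure_path argument requires a recent version of the SageMaker Python SDK:

from sagemaker.async_inference import AsyncInferenceConfig

async_config = AsyncInferenceConfig(
    output_path=f"s3://{bucket}/async-endpoint-outputs/",    # where successful results land
    failure_path=f"s3://{bucket}/async-endpoint-failures/",  # where failed invocations land
    max_concurrent_invocations_per_instance=4,               # per-instance concurrency cap
)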
What is SageMaker JumpStart
Our model comes from SageMaker JumpStart, a feature of SageMaker that accelerates the machine learning (ML) journey by offering pre-trained models, solution templates, and example notebooks. It provides access to a wide range of pre-trained models for different problem types, allowing you to start your ML tasks with a solid foundation. SageMaker JumpStart also offers solution templates for common use cases and example notebooks for learning. With SageMaker JumpStart, you can reduce the time and effort required to start your ML projects with one-click solution launches and comprehensive resources for practical ML experience.
The following screenshot shows an example of just some of the models available on the SageMaker JumpStart UI.

Deploy the model
Our first step is to deploy the model to SageMaker. To do that, we can use the UI for SageMaker JumpStart or the SageMaker Python SDK, which provides an API that we can use to deploy the model to the asynchronous endpoint:

%%time
from sagemaker.jumpstart.model import JumpStartModel
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer
from sagemaker.async_inference import AsyncInferenceConfig

model_id, model_version = "huggingface-llm-falcon-40b-instruct-bf16", "*"
my_model = JumpStartModel(model_id=model_id)
predictor = my_model.deploy(
    initial_instance_count=0,
    instance_type="ml.g5.12xlarge",
    async_inference_config=AsyncInferenceConfig(),
)

This call can take approximately 10 minutes to complete. During this time, the endpoint is spun up, the container and the model artifacts are downloaded to the endpoint, the model configuration is loaded from SageMaker JumpStart, and the asynchronous endpoint is exposed via a DNS endpoint. To make sure that our endpoint can scale down to zero, we need to configure auto scaling on the asynchronous endpoint using Application Auto Scaling. You need to first register your endpoint variant with Application Auto Scaling, define a scaling policy, and then apply the scaling policy. In this configuration, we use a custom metric using CustomizedMetricSpecification, called ApproximateBacklogSizePerInstance, as shown in the following code. For a detailed list of Amazon CloudWatch metrics available with your asynchronous inference endpoint, refer to Monitoring with CloudWatch.

import boto3

client = boto3.client("application-autoscaling")
resource_id = "endpoint/" + my_model.endpoint_name + "/variant/" + "AllTraffic"

# Configure auto scaling on the asynchronous endpoint, down to zero instances
response = client.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=0,  # Minimum number of instances - scale down to 0 to stop incurring costs
    MaxCapacity=1,  # Maximum number of instances - scaling up to 1 is enough for dev
)

response = client.put_scaling_policy(
    PolicyName="Invocations-ScalingPolicy",
    ServiceNamespace="sagemaker",  # The namespace of the AWS service that provides the resource
    ResourceId=resource_id,  # Endpoint name
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",  # SageMaker supports only Instance Count
    PolicyType="TargetTrackingScaling",  # 'StepScaling' | 'TargetTrackingScaling'
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 5.0,  # Target value for the ApproximateBacklogSizePerInstance metric
        "CustomizedMetricSpecification": {
            "MetricName": "ApproximateBacklogSizePerInstance",
            "Namespace": "AWS/SageMaker",
            "Dimensions": [{"Name": "EndpointName", "Value": my_model.endpoint_name}],
            "Statistic": "Average",
        },
        "ScaleInCooldown": 600,  # Seconds to wait after a scale-in activity before another scale-in can start
        "ScaleOutCooldown": 300,  # Seconds to wait after a scale-out activity before another scale-out can start
        # "DisableScaleIn": True|False - whether scale in by the target tracking policy is disabled;
        # if True, the policy won't remove capacity from the scalable resource
    },
)

You can verify that this policy has been set successfully by navigating to the SageMaker console, choosing Endpoints under Inference in the navigation pane, and looking for the endpoint we just deployed.

Invoke the asynchronous endpoint
To invoke the endpoint, you need to place the request payload in Amazon Simple Storage Service (Amazon S3) and provide a pointer to this payload as a part of the InvokeEndpointAsync request. Upon invocation, SageMaker queues the request for processing and returns an identifier and output location as a response. Upon processing, SageMaker places the result in the Amazon S3 location. You can optionally choose to receive success or error notifications with Amazon Simple Notification Service (Amazon SNS).
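As a minimal sketch of that first step, the following uploads a JSON payload to S3 with Boto3. The bucket and prefix variables are the same notebook variables used in the invocation code later in this section, and the prompt text is just an example:

import json
import boto3

payload = {
    "inputs": "What is the capital of Italy?",  # example prompt
    "parameters": {"max_new_tokens": 100, "do_sample": False},
}

# Put the request payload where the asynchronous endpoint expects to read it
boto3.client("s3").put_object(
    Bucket=bucket,
    Key=prefix,
    Body=json.dumps(payload).encode("utf-8"),
)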
SageMaker Python SDK
After deployment is complete, it will return an AsyncPredictor object. To perform asynchronous inference, you need to upload data to Amazon S3 and use the predict_async() method with the S3 URI as the input. It will return an AsyncInferenceResponse object, and you can check the result using the get_response() method.
Alternatively, if you would like to check for a result periodically and return it upon generation, use the predict() method. We use this second method in the following code:

import time

# Invoke the asynchronous endpoint with the SageMaker Python SDK
def query_endpoint(payload):
    """Query the endpoint and print the response"""
    response = predictor.predict_async(
        data=payload,
        input_path="s3://{}/{}".format(bucket, prefix),
    )
    while True:
        try:
            response = response.get_result()
            break
        except Exception:
            print("Inference is not ready ...")
            time.sleep(5)
    print(f"\033[1m Input:\033[0m {payload['inputs']}")
    print(f"\033[1m Output:\033[0m {response[0]['generated_text']}")

query_endpoint(payload)

Boto3
Let’s now explore the invoke_endpoint_async method from Boto3’s sagemaker-runtime client. It enables developers to asynchronously invoke a SageMaker endpoint, providing a token for progress tracking and retrieval of the response later. Boto3 doesn’t offer a way to wait for the asynchronous inference to be completed like the SageMaker Python SDK’s get_result() operation. Therefore, we take advantage of the fact that Boto3 will store the inference output in Amazon S3 in the response[“OutputLocation”]. We can use the following function to wait for the inference file to be written to Amazon S3:

import json
import time
import boto3
from botocore.exceptions import ClientError

s3_client = boto3.client("s3")

# Wait until the prediction is written to Amazon S3
def wait_inference_file(bucket, prefix):
    while True:
        try:
            response = s3_client.get_object(Bucket=bucket, Key=prefix)
            break
        except ClientError as ex:
            if ex.response["Error"]["Code"] == "NoSuchKey":
                print("Waiting for file to be generated...")
                time.sleep(5)
                continue
            else:
                raise
        except Exception as e:
            print(e.__dict__)
            raise
    return response

With this function, we can now query the endpoint:

# Invoke the asynchronous endpoint with the Boto3 SDK
import boto3

sagemaker_client = boto3.client("sagemaker-runtime")

# Query the endpoint function
def query_endpoint_boto3(payload):
    """Query the endpoint and print the response"""
    response = sagemaker_client.invoke_endpoint_async(
        EndpointName=my_model.endpoint_name,
        InputLocation="s3://{}/{}".format(bucket, prefix),
        ContentType="application/json",
        Accept="application/json",
    )
    output_url = response["OutputLocation"]
    output_prefix = "/".join(output_url.split("/")[3:])
    # Read the bytes of the output file from S3 with Boto3
    output = wait_inference_file(bucket, output_prefix)
    output = json.loads(output["Body"].read())[0]["generated_text"]
    # Emit output
    print(f"\033[1m Input:\033[0m {payload['inputs']}")
    print(f"\033[1m Output:\033[0m {output}")

query_endpoint_boto3(payload)

LangChain
LangChain is an open-source framework launched in October 2022 by Harrison Chase. It simplifies the development of applications using large language models (LLMs) by providing integrations with various systems and data sources. LangChain allows for document analysis, summarization, chatbot creation, code analysis, and more. It has gained popularity, with contributions from hundreds of developers and significant funding from venture firms. LangChain enables the connection of LLMs with external sources, making it possible to create dynamic, data-responsive applications. It offers libraries, APIs, and documentation to streamline the development process.
LangChain provides libraries and examples for using SageMaker endpoints with its framework, making it easier to use ML models hosted on SageMaker as the “brain” of the chain. To learn more about how LangChain integrates with SageMaker, refer to the SageMaker Endpoint in the LangChain documentation.
One of the limits of the current implementation of LangChain is that it doesn’t support asynchronous endpoints natively. To use an asynchronous endpoint with LangChain, we have to define a new class, SagemakerAsyncEndpoint, that extends the SagemakerEndpoint class already available in LangChain. Additionally, we provide the following information:

The S3 bucket and prefix where asynchronous inference will store the inputs (and outputs)
A maximum number of seconds to wait before timing out
An updated _call() function to query the endpoint with invoke_endpoint_async() instead of invoke_endpoint()
A way to wake up the asynchronous endpoint if it’s in cold start (scaled down to zero)

To review the newly created SagemakerAsyncEndpoint, you can check out the sagemaker_async_endpoint.py file available on GitHub.

import json
from typing import Dict

import sagemaker
from langchain import PromptTemplate
from langchain.llms.sagemaker_endpoint import LLMContentHandler
from langchain.chains import LLMChain
from sagemaker_async_endpoint import SagemakerAsyncEndpoint

class ContentHandler(LLMContentHandler):
    content_type: str = "application/json"
    accepts: str = "application/json"
    len_prompt: int = 0

    def transform_input(self, prompt: str, model_kwargs: Dict) -> bytes:
        self.len_prompt = len(prompt)
        input_str = json.dumps({"inputs": prompt, "parameters": {"max_new_tokens": 100, "do_sample": False, "repetition_penalty": 1.1}})
        return input_str.encode("utf-8")

    def transform_output(self, output: bytes) -> str:
        response_json = output.read()
        res = json.loads(response_json)
        ans = res[0]["generated_text"]
        return ans

chain = LLMChain(
    llm=SagemakerAsyncEndpoint(
        input_bucket=bucket,
        input_prefix=prefix,
        endpoint_name=my_model.endpoint_name,
        region_name=sagemaker.Session().boto_region_name,
        content_handler=ContentHandler(),
    ),
    prompt=PromptTemplate(
        input_variables=["query"],
        template="{query}",
    ),
)

print(chain.run(payload["inputs"]))

Clean up
When you’re done testing the generation of inferences from the endpoint, remember to delete the endpoint to avoid incurring extra charges:

predictor.delete_endpoint()

Conclusion
When deploying large foundation models like TII Falcon, optimizing cost is crucial. These models require powerful hardware and substantial memory capacity, leading to high infrastructure costs. SageMaker asynchronous inference, a deployment option that processes requests asynchronously, reduces expenses by scaling the instance count to zero when there are no pending requests. In this post, we demonstrated how to deploy large SageMaker JumpStart foundation models to SageMaker asynchronous endpoints. We provided code examples using the SageMaker Python SDK, Boto3, and LangChain to illustrate different methods for invoking asynchronous endpoints and retrieving results. These techniques enable developers and researchers to optimize costs while using the capabilities of foundation models for advanced language understanding systems.
To learn more about asynchronous inference and SageMaker JumpStart, check out the following posts:

Quickly build high-accuracy Generative AI applications on enterprise data using Amazon Kendra, LangChain, and large language models
Run computer vision inference on large videos with Amazon SageMaker asynchronous endpoints

About the author
Davide Gallitelli is a Specialist Solutions Architect for AI/ML in the EMEA region. He is based in Brussels and works closely with customers throughout Benelux. He has been a developer since he was very young, starting to code at the age of 7. He started learning AI/ML at university, and has fallen in love with it since then.

Meta AI Releases Nougat: A Visual Transformer Model that Performs OCR …

With the rapid advancement of Artificial Intelligence, its sub-fields, including Natural Language Processing, Natural Language Generation, and Computer Vision, have gained enormous popularity thanks to their extensive use cases. Optical Character Recognition (OCR) is a well-established and heavily investigated area of computer vision. It has many uses, such as document digitization, handwriting recognition, and scene text recognition. The recognition of mathematical expressions is one area of OCR that has received a lot of interest in academic research.

The Portable Document Format (PDF) is one of the most widely used formats for scientific knowledge, which is often preserved in books or published in scholarly journals. PDFs are the second most used data format on the internet, accounting for 2.4% of its content, and are frequently used for document delivery. Despite their widespread use, extracting information from PDF files can be difficult, particularly when dealing with highly specialized material like scientific research articles. In particular, when these papers are converted to PDF format, the semantic information of mathematical expressions is frequently lost.

To address these challenges, a team of researchers from Meta AI has introduced a solution called Nougat, which stands for “Neural Optical Understanding for Academic Documents.” Nougat is a Visual Transformer model that performs Optical Character Recognition (OCR) on scientific documents, transforming them into a markup language so they become more accessible and machine-readable.

To show the efficacy of the methodology, the team has also produced a fresh dataset of academic papers. This method offers a viable answer for enhancing scientific knowledge accessibility in the digital age. It fills the gap between written materials that are simple for people to read and text that computers can process and analyze. Researchers, educators, and anyone interested in scientific literature can access and deal with scientific papers more effectively using Nougat. Nougat is basically a transformer-based model designed to convert images of document pages, particularly those from PDFs, into formatted markup text.

The team summarizes its key contributions as follows:

Publication of a Pre-trained Model: The team has released a pre-trained model that can transform PDFs into a simple markup language. The model and the related code are publicly available on GitHub for the research community and anyone else to use.

Pipeline for Dataset Creation: A method for building datasets that pair PDF documents with their associated source code is described in the study. This dataset development method is crucial for testing and refining the Nougat model and may be useful for future document analysis research and applications.

Dependency on the Page Image Only: One of Nougat’s standout features is its capacity to operate on the page image alone. This makes it a flexible tool for extracting content from a variety of sources, even when the original documents are not available in digital text form, such as scanned papers and books.

Check out the Paper and GitHub. All credit for this research goes to the researchers on this project.


The post Meta AI Releases Nougat: A Visual Transformer Model that Performs OCR for Processing Scientific Documents into a Markup Language appeared first on MarkTechPost.

Researchers from South Korea Propose VITS2: A Breakthrough in Single-S …

The paper introduces VITS2, a single-stage text-to-speech model that synthesizes more natural speech by improving various aspects of previous models. The model addresses issues like intermittent unnaturalness, computational efficiency, and dependence on phoneme conversion. The proposed methods enhance naturalness, speech characteristic similarity in multi-speaker models, and training and inference efficiency.

The strong dependence on phoneme conversion in previous works is significantly reduced, allowing for a fully end-to-end single-stage approach.

Previous Methods:

Two-Stage Pipeline Systems: These systems divided the process of generating waveforms from input texts into two cascaded stages. The first stage produced intermediate speech representations like mel-spectrograms or linguistic features from the input texts. The second stage then generated raw waveforms based on those intermediate representations. These systems had limitations such as error propagation from the first stage to the second, reliance on human-defined features like mel-spectrogram, and the computation required to generate intermediate features.

Single-Stage Models: Recent studies have actively explored single-stage models that directly generate waveforms from input texts. These models have not only outperformed the two-stage systems but also demonstrated the ability to generate high-quality speech nearly indistinguishable from human speech.

Conditional variational autoencoder with adversarial learning for end-to-end text-to-speech by J. Kim, J. Kong, and J. Son was a significant prior work in the field of single-stage text-to-speech synthesis. This previous single-stage approach achieved great success but had several problems, including intermittent unnaturalness, low efficiency of the duration predictor, complex input format, insufficient speaker similarity in multi-speaker models, slow training, and strong dependence on phoneme conversion.

The current paper’s main contribution is to address the issues found in the previous single-stage model, particularly the one mentioned in the above successful model, and introduce improvements to achieve better quality and efficiency in text-to-speech synthesis.

Deep neural network-based text-to-speech has seen significant advancements. The challenge lies in converting discontinuous text into continuous waveforms, ensuring high-quality speech audio. Previous solutions divided the process into two stages: producing intermediate speech representations from texts and then generating raw waveforms based on those representations. Single-stage models have been actively studied and have outperformed two-stage systems. The paper aims to address issues found in previous single-stage models.

The paper describes improvements in four areas: duration prediction, augmented variational autoencoder with normalizing flows, alignment search, and speaker-conditioned text encoder. A stochastic duration predictor is proposed, trained through adversarial learning. The Monotonic Alignment Search (MAS) is used for alignment, with modifications for quality improvement. The model introduces a transformer block into the normalizing flows for capturing long-term dependencies. A speaker-conditioned text encoder is designed to better mimic the various speech characteristics of each speaker.

Experiments were conducted on the LJ Speech dataset and the VCTK dataset. The study used both phoneme sequences and normalized texts as model inputs. Networks were trained using the AdamW optimizer, and training was conducted on NVIDIA V100 GPUs. Crowdsourced mean opinion score (MOS) tests were conducted to evaluate the naturalness of the synthesized speech, and the proposed method showed significant improvement in quality compared to previous models. Ablation studies were conducted to verify the validity of the proposed methods. Finally, the authors demonstrated the validity of their methods through experiments, quality evaluation, and computation speed measurement, while noting that various problems still exist in the field of speech synthesis that must be addressed and expressing hope that their work can serve as a basis for future research.

Check out the Paper and GitHub. All credit for this research goes to the researchers on this project.


The post Researchers from South Korea Propose VITS2: A Breakthrough in Single-Stage Text-to-Speech Models for Enhanced Naturalness and Efficiency appeared first on MarkTechPost.