Detect Anything You Want With UniDetector

Deep learning and AI have made remarkable progress in recent years, especially in detection models. Despite these impressive advancements, the effectiveness of object detection models heavily relies on large-scale benchmark datasets. The challenge lies in the variation of object categories and scenes: real-world scenes can differ significantly from the images in existing datasets, and novel object classes may emerge, forcing datasets to be rebuilt for object detectors to remain effective. This severely limits their ability to generalize to open-world scenarios. In contrast, humans, even children, can quickly adapt and generalize well in new environments. This lack of universality remains a notable gap between AI systems and human intelligence.

The key to overcoming this limitation is developing a universal object detector that can detect all types of objects in any given scene. Such a model would be able to function effectively in unknown situations without requiring additional re-training, bringing object detection systems significantly closer to human-level intelligence.

A universal object detector must possess two critical abilities. Firstly, it should be trained using images from various sources and diverse label spaces. Collaborative training on a large scale for classification and localization is essential to ensure the detector gains sufficient information to generalize effectively. The ideal large-scale learning dataset should include many image types, encompassing as many categories as possible, with high-quality bounding box annotations and extensive category vocabularies. Unfortunately, achieving such diversity is challenging due to limitations posed by human annotators. In practice, while small vocabulary datasets offer cleaner annotations, larger ones are noisier and may suffer from inconsistencies. Additionally, specialized datasets focus on specific categories. To achieve universality, the detector must learn from multiple sources with varying label spaces to acquire comprehensive and complete knowledge.

Secondly, the detector should demonstrate robust generalization to the open world. It should be capable of accurately predicting category tags for novel classes not seen during training without any significant drop in performance. However, relying solely on visual information cannot achieve this purpose, as comprehensive visual learning necessitates human annotations for fully-supervised learning.

To overcome these limitations, a novel universal object detection model termed “UniDetector” has been proposed.

The architecture overview is reported in the illustration below.

Two corresponding challenges need to be tackled to achieve the two essential abilities of a universal object detector. The first challenge refers to training with multi-source images, where images come from different sources and are associated with diverse label spaces. Existing detectors are limited to predicting classes from only one label space, and the differences in dataset-specific taxonomy and annotation inconsistency among datasets make it difficult to unify multiple heterogeneous label spaces.

The second challenge involves novel category discrimination. Inspired by the success of image-text pre-training in recent research, the authors leverage pre-trained models with language embeddings to recognize unseen categories. However, fully-supervised training tends to bias the detector towards focusing on categories present during training. Consequently, the model might be skewed towards base classes at inference time and produce under-confident predictions for novel classes. Although language embeddings offer the potential to predict novel classes, their performance still lags significantly behind that of base categories.

UniDetector has been designed to tackle the abovementioned challenges. Utilizing the language space, the researchers explore various structures to train the detector effectively with heterogeneous label spaces. They discover that employing a partitioned structure facilitates feature sharing while avoiding label conflicts, which is beneficial for the detector’s performance.

To enhance the generalization ability of the region proposal stage towards novel classes, the authors decouple the proposal generation stage from the RoI (Region of Interest) classification stage, opting for separate training instead of joint training. This approach leverages the unique characteristics of each stage, contributing to the overall universality of the detector. Furthermore, they introduce a class-agnostic localization network (CLN) to achieve generalized region proposals.

Additionally, the authors propose a probability calibration technique to de-bias the predictions. They estimate the prior probability of all categories and then adjust the predicted category distribution based on this prior probability. This calibration significantly improves performance on novel classes. According to the authors, UniDetector can surpass DyHead, the state-of-the-art CNN detector, by 6.3% AP (Average Precision).
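
To make the calibration idea concrete, here is a minimal sketch of the general recipe: de-bias the predicted probabilities by an estimated category prior and renormalize. This illustrates the concept rather than the authors' exact formulation; the array names and the gamma exponent are assumptions.

import numpy as np

def calibrate_scores(probs, prior, gamma=0.6):
    # probs : (num_boxes, num_classes) predicted category probabilities
    # prior : (num_classes,) estimated prior probability of each category
    # gamma : strength of the calibration (illustrative value)
    calibrated = probs / np.power(prior, gamma)          # down-weight frequent base classes
    calibrated /= calibrated.sum(axis=1, keepdims=True)  # renormalize per box
    return calibrated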

This was the summary of UniDetector, a novel AI framework designed for universal object detection. If you are interested and want to learn more about this work, you can find further information by clicking on the links below.

Check out the Paper and GitHub. All credit for this research goes to the researchers on this project.


Exploring summarization options for Healthcare with Amazon SageMaker

In today’s rapidly evolving healthcare landscape, doctors are faced with vast amounts of clinical data from various sources, such as caregiver notes, electronic health records, and imaging reports. This wealth of information, while essential for patient care, can also be overwhelming and time-consuming for medical professionals to sift through and analyze. Efficiently summarizing and extracting insights from this data is crucial for better patient care and decision-making. Summarized patient information can be useful to a number of downstream processes like data aggregation, effectively coding patients, or grouping patients with similar diagnoses for review.
Artificial intelligence (AI) and machine learning (ML) models have shown great promise in addressing these challenges. Models can be trained to analyze and interpret large volumes of text data, effectively condensing information into concise summaries. By automating the summarization process, doctors can quickly gain access to relevant information, allowing them to focus on patient care and make more informed decisions. See the following case study to learn more about a real-world use case.
Amazon SageMaker, a fully managed ML service, provides an ideal platform for hosting and implementing various AI/ML-based summarization models and approaches. In this post, we explore different options for implementing summarization techniques on SageMaker, including using Amazon SageMaker JumpStart foundation models, fine-tuning pre-trained models from Hugging Face, and building custom summarization models. We also discuss the pros and cons of each approach, enabling healthcare professionals to choose the most suitable solution for generating concise and accurate summaries of complex clinical data.
Two important terms to know before we begin: pre-trained and fine-tuning. A pre-trained or foundation model is one that has been built and trained on a large corpus of data, typically for general language knowledge. Fine-tuning is the process by which a pre-trained model is given another more domain-specific dataset in order to enhance its performance on a specific task. In a healthcare setting, this would mean giving the model some data including phrases and terminology pertaining specifically to patient care.
Build custom summarization models on SageMaker
Though it is the most high-effort approach, some organizations might prefer to build custom summarization models on SageMaker from scratch. This approach requires more in-depth knowledge of AI/ML models and may involve creating a model architecture from scratch or adapting existing models to suit specific needs. Building custom models can offer greater flexibility and control over the summarization process, but also requires more time and resources compared to approaches that start from pre-trained models. It’s essential to weigh the benefits and drawbacks of this option carefully before proceeding, because it may not be suitable for all use cases.
SageMaker JumpStart foundation models
A great option for implementing summarization on SageMaker is using JumpStart foundation models. These models, developed by leading AI research organizations, offer a range of pre-trained language models optimized for various tasks, including text summarization. SageMaker JumpStart provides two types of foundation models: proprietary models and open-source models. SageMaker JumpStart also provides HIPAA eligibility, making it useful for healthcare workloads. It is ultimately up to the customer to ensure compliance, so be sure to take the appropriate steps. See Architecting for HIPAA Security and Compliance on Amazon Web Services for more details.
Proprietary foundation models
Proprietary models, such as Jurassic models from AI21 and the Cohere Generate model from Cohere, can be discovered through SageMaker JumpStart on the AWS Management Console and are currently under preview. Utilizing proprietary models for summarization is ideal when you don’t need to fine-tune your model on custom data. This offers an easy-to-use, out-of-the-box solution that can meet your summarization requirements with minimal configuration. By using the capabilities of these pre-trained models, you can save time and resources that would otherwise be spent on training and fine-tuning a custom model. Furthermore, proprietary models typically come with user-friendly APIs and SDKs, streamlining the integration process with your existing systems and applications. If your summarization needs can be met by pre-trained proprietary models without requiring specific customization or fine-tuning, they offer a convenient, cost-effective, and efficient solution for your text summarization tasks. Because these models are not trained specifically for healthcare use cases, quality can’t be guaranteed for medical language out of the box without fine-tuning.
Jurassic-2 Grande Instruct is a large language model (LLM) by AI21 Labs, optimized for natural language instructions and applicable to various language tasks. It offers an easy-to-use API and Python SDK, balancing quality and affordability. Popular uses include generating marketing copy, powering chatbots, and text summarization.
On the SageMaker console, navigate to SageMaker JumpStart, find the AI21 Jurassic-2 Grande Instruct model, and choose Try out model.

If you want to deploy the model to a SageMaker endpoint that you manage, you can follow the steps in this sample notebook, which shows you how to deploy Jurassic-2 Large using SageMaker.
Open-source foundation models
Open-source models include FLAN T5, Bloom, and GPT-2 models that can be discovered through SageMaker JumpStart in the Amazon SageMaker Studio UI, SageMaker JumpStart on the SageMaker console, and SageMaker JumpStart APIs. These models can be fine-tuned and deployed to endpoints under your AWS account, giving you full ownership of model weights and script codes.
Flan-T5 XL is a powerful and versatile model designed for a wide range of language tasks. By fine-tuning the model with your domain-specific data, you can optimize its performance for your particular use case, such as text summarization or any other NLP task. For details on how to fine-tune Flan-T5 XL using the SageMaker Studio UI, refer to Instruction fine-tuning for FLAN T5 XL with Amazon SageMaker Jumpstart.
Fine-tuning pre-trained models with Hugging Face on SageMaker
One of the most popular options for implementing summarization on SageMaker is fine-tuning pre-trained models using the Hugging Face Transformers library. Hugging Face provides a wide range of pre-trained transformer models specifically designed for various natural language processing (NLP) tasks, including text summarization. With the Hugging Face Transformers library, you can easily fine-tune these pre-trained models on your domain-specific data using SageMaker. This approach has several advantages, such as faster training times, better performance on specific domains, and easier model packaging and deployment using built-in SageMaker tools and services. If you’re unable to find a suitable model in SageMaker JumpStart, you can choose any model offered by Hugging Face and fine-tune it using SageMaker.
To start working with a model to learn about the capabilities of ML, all you need to do is open SageMaker Studio, find a pre-trained model you want to use in the Hugging Face Model Hub, and choose SageMaker as your deployment method. Hugging Face will give you the code to copy, paste, and run in your notebook. It’s as easy as that! No ML engineering experience required.
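
The snippet that Hugging Face generates typically follows the pattern below. This is a sketch only; the model ID, task, framework versions, and instance type are placeholders that depend on what you select in the Model Hub.

import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()

# Hub model configuration; the model ID and task are placeholders
hub = {
    "HF_MODEL_ID": "google/pegasus-xsum",
    "HF_TASK": "summarization",
}

huggingface_model = HuggingFaceModel(
    env=hub,
    role=role,
    transformers_version="4.26",
    pytorch_version="1.13",
    py_version="py39",
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
)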

The Hugging Face Transformers library enables builders to operate on the pre-trained models and do advanced tasks like fine-tuning, which we explore in the following sections.
Provision resources
Before we can begin, we need to provision a notebook. For instructions, refer to Steps 1 and 2 in Build and Train a Machine Learning Model Locally. For this example, we used the settings shown in the following screenshot.

We also need to create an Amazon Simple Storage Service (Amazon S3) bucket to store the training data and training artifacts. For instructions, refer to Creating a bucket.
Prepare the dataset
To fine-tune our model to have better domain knowledge, we need to get data suitable for the task. When training for an enterprise use case, you’ll need to go through a number of data engineering tasks to prepare your own data to be ready for training. Those tasks are outside the scope of this post. For this example, we’ve generated some synthetic data to emulate nursing notes and stored it in Amazon S3. Storing our data in Amazon S3 enables us to architect our workloads for HIPAA compliance. We start by getting those notes and loading them on the instance where our notebook is running:

from datasets import load_dataset
dataset = load_dataset("csv", data_files={
    "train": "s3://" + bucket_name + train_data_path,
    "validation": "s3://" + bucket_name + test_data_path
})

The notes are composed of a column containing the full entry, note, and a column containing a shortened version exemplifying what our desired output should be, summary. The purpose of using this dataset is twofold: to improve our model’s biological and medical vocabulary so that it’s more attuned to summarizing in a healthcare context (known as domain fine-tuning), and to show our model how to structure its summarized output. In some summarization cases, we may want to create an abstract out of an article or a one-line synopsis of a review, but in this case, we’re trying to get our model to output an abbreviated version of the symptoms and actions taken for a patient so far.
Load the model
The model we use as our foundation is a version of Google’s Pegasus, made available in the Hugging Face Hub, called pegasus-xsum. It’s already pre-trained for summarization, so our fine-tuning process can focus on extending its domain knowledge. Modifying the task our model runs is a different type of fine-tuning not covered in this post. The Transformer library supplies us with a class to load the model definition from our model_checkpoint: google/pegasus-xsum. This will load the model from the hub and instantiate it in our notebook so we can use it later on. Because pegasus-xsum is a sequence-to-sequence model, we want to use the Seq2Seq type of the AutoModel class:

from transformers import AutoModelForSeq2SeqLM
model = AutoModelForSeq2SeqLM.from_pretrained(model_checkpoint)

Now that we have our model, it’s time to put our attention to the other components that will enable us to run our training loop.
Create a tokenizer
The first of these components is the tokenizer. Tokenization is the process by which words from the input data are transformed into numerical representations that our model can understand. Again, the Transformer library provides a class for us to load a tokenizer definition from the same checkpoint we used to instantiate the model:

from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)

With this tokenizer object, we can create a preprocessing function and map it onto our dataset to give us tokens ready to be fed into the model. Finally, we format the tokenized output and remove the columns containing our original text, because the model will not be able to interpret them. Now we’re left with a tokenized input ready to be fed into the model. See the following code:

tokenized_datasets = dataset.map(preprocess_function, batched=True)

tokenized_datasets.set_format("torch")

tokenized_datasets = tokenized_datasets.remove_columns(
    dataset["train"].column_names
)
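
The preprocess_function used above isn’t shown in the original snippet; the following is a minimal sketch of what it could look like for our note and summary columns. The maximum lengths are illustrative choices, the column names assume the dataset described earlier, and a recent version of the Transformers library is assumed.

max_input_length = 512
max_target_length = 128

def preprocess_function(examples):
    # Tokenize the full nursing notes (model inputs)
    model_inputs = tokenizer(
        examples["note"],
        max_length=max_input_length,
        truncation=True,
    )
    # Tokenize the reference summaries (labels)
    labels = tokenizer(
        text_target=examples["summary"],
        max_length=max_target_length,
        truncation=True,
    )
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs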
Create a data collator and optimizer

With our data tokenized and our model instantiated, we’re almost ready to run a training loop. The next components we want to create are the data collator and the optimizer. The data collator is another class provided by Hugging Face through the Transformers library, which we use to create batches of our tokenized data for training. We can easily build this using the tokenizer and model objects we already have just by finding the corresponding class type we’ve used previously for our model (Seq2Seq) for the collator class. The optimizer’s function is to maintain the training state and update the parameters based on our training loss as we work through the loop. To create an optimizer, we can import the optim package from the torch module, where a number of optimization algorithms are available. Some common ones you may have encountered before are Stochastic Gradient Descent and Adam, the latter of which is applied in our example. Adam’s constructor takes in the model parameters and the parameterized learning rate for the given training run. See the following code:

from transformers import DataCollatorForSeq2Seq
from torch.optim import Adam

data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)
optimizer = Adam(model.parameters(), lr=learning_rate)

Build the accelerator and scheduler

The last steps before we can begin training are to build the accelerator and the learning rate scheduler. The accelerator comes from a different library (we’ve been primarily using Transformers) produced by Hugging Face, aptly named Accelerate, and will abstract away logic required to manage devices during training (using multiple GPUs for example). For the final component, we revisit the ever-useful Transformers library to implement our learning rate scheduler. By specifying the scheduler type, the total number of training steps in our loop, and the previously created optimizer, the get_scheduler function returns an object that enables us to adjust our initial learning rate throughout the training process:

from accelerate import Accelerator
from transformers import get_scheduler

accelerator = Accelerator()
model, optimizer = accelerator.prepare(
    model, optimizer
)

lr_scheduler = get_scheduler(
    "linear",
    optimizer=optimizer,
    num_warmup_steps=0,
    num_training_steps=num_training_steps,
)

Configure a training job

We’re now fully set up for training! Let’s set up a training job, starting by instantiating the training_args using the Transformers library and choosing parameter values. We can pass these, along with our other prepared components and dataset, directly to the trainer and start training, as shown in the following code. Depending on the size of your dataset and chosen parameters, this may take a significant amount of time.

from transformers import Seq2SeqTrainer
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="output/",
    save_total_limit=1,
    num_train_epochs=num_train_epochs,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    evaluation_strategy="epoch",
    logging_dir="output/",
    load_best_model_at_end=True,
    disable_tqdm=True,
    logging_first_step=True,
    logging_steps=1,
    save_strategy="epoch",
    predict_with_generate=True
)

trainer = Seq2SeqTrainer(
    model=model,
    tokenizer=tokenizer,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["validation"],
    data_collator=data_collator,
    optimizers=(optimizer, lr_scheduler)
)

trainer.train()

To operationalize this code, we can package it as an entry point file and call it through a SageMaker training job. This allows us to separate the logic we just built from the training call and lets SageMaker run training on a separate instance.
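
As a sketch of what that might look like, the SageMaker Hugging Face estimator can point at the packaged script and the data in Amazon S3. The entry point file name, source directory, instance type, and hyperparameters below are placeholders.

import sagemaker
from sagemaker.huggingface import HuggingFace

role = sagemaker.get_execution_role()

huggingface_estimator = HuggingFace(
    entry_point="train.py",        # packaged training logic (placeholder name)
    source_dir="./scripts",        # directory containing the entry point
    instance_type="ml.g4dn.xlarge",
    instance_count=1,
    role=role,
    transformers_version="4.26",
    pytorch_version="1.13",
    py_version="py39",
    hyperparameters={
        "epochs": 3,
        "train_batch_size": 4,
        "learning_rate": 2e-5,
    },
)

# Point the training job at the data we stored in Amazon S3
huggingface_estimator.fit({
    "train": "s3://" + bucket_name + train_data_path,
    "validation": "s3://" + bucket_name + test_data_path,
})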

Package the model for inference
After training has been run, the model object is ready to be used for inference. As a best practice, let’s save our work for future use. We need to create our model artifacts, zip them together, and upload our tarball to Amazon S3 for storage. To prepare our model for zipping, we need to unwrap the now fine-tuned model, then save the model binary and associated config files. We also need to save our tokenizer to the same directory that we saved our model artifacts to so it is available when we use the model for inference. Our model_dir folder should now look something like the following code:

config.json pytorch_model.bin tokenizer_config.json
generation_config.json special_tokens_map.json tokenizer.json

All that’s left is to run a tar command to zip up our directory and upload the tar.gz file to Amazon S3:

unwrapped_model = accelerator.unwrap_model(trainer.model)

unwrapped_model.save_pretrained("model_dir", save_function=accelerator.save)

tokenizer.save_pretrained("model_dir")

!cd model_dir/ && tar -czvf model.tar.gz *
!mv model_dir/model.tar.gz ./

with open("model.tar.gz", "rb") as f:
    s3.upload_fileobj(f, bucket_name, artifact_path + "model/model.tar.gz")

Our newly fine-tuned model is now ready and available to be used for inference.
Perform inference
To use this model artifact for inference, open a new file and use the following code, modifying the model_data parameter to fit your artifact save location in Amazon S3. The HuggingFaceModel constructor will rebuild our model from the checkpoint we saved to model.tar.gz, which we can then deploy for inference using the deploy method. Deploying the endpoint will take a few minutes.

from sagemaker.huggingface import HuggingFaceModel
from sagemaker import get_execution_role

role = get_execution_role()

huggingface_model = HuggingFaceModel(
    model_data=f"s3://{bucket_name}/{artifact_path}/model/model.tar.gz",
    role=role,
    transformers_version="4.26",
    pytorch_version="1.13",
    py_version="py39"
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge"
)

After the endpoint is deployed, we can use the predictor we’ve created to test it. Pass the predict method a data payload and run the cell, and you’ll get the response from your fine-tuned model:

data = {
    "inputs": "Text to summarize"
}
predictor.predict(data)

Results

To see the benefit of fine-tuning a model, let’s do a quick test. The following table includes a prompt and the results of passing that prompt to the model before and after fine-tuning.

Prompt: Summarize the symptoms that the patient is experiencing. Patient is a 45 year old male with complaints of substernal chest pain radiating to the left arm. Pain is sudden onset while he was doing yard work, associated with mild shortness of breath and diaphoresis. On arrival patient’s heart rate was 120, respiratory rate 24, blood pressure 170/95. 12 lead electrocardiogram done on arrival to the emergency department and three sublingual nitroglycerin administered without relief of chest pain. Electrocardiogram shows ST elevation in anterior leads demonstrating acute anterior myocardial infarction. We have contacted cardiac catheterization lab and prepping for cardiac catheterization by cardiologist.

Response with no fine-tuning: We present a case of acute myocardial infarction.

Response with fine-tuning: Chest pain, anterior MI, PCI.

As you can see, our fine-tuned model uses health terminology differently, and we’ve been able to change the structure of the response to fit our purposes. Note that results are dependent on your dataset and the design choices made during training. Your version of the model could offer very different results.
Clean up
When you’re finished with your SageMaker notebook, be sure to shut it down to avoid costs from long-running resources. Note that shutting down the instance will cause you to lose any data stored in the instance’s ephemeral memory, so you should save all your work to persistent storage before cleanup. You will also need to go to the Endpoints page on the SageMaker console and delete any endpoints deployed for inference. To remove all artifacts, you also need to go to the Amazon S3 console to delete files uploaded to your bucket.
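
If you prefer to remove the inference endpoint from code instead of the console, a minimal sketch using the predictor created earlier could look like the following:

# Delete the real-time endpoint (and its endpoint configuration), then the model object
predictor.delete_endpoint()
predictor.delete_model()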
Conclusion
In this post, we explored various options for implementing text summarization techniques on SageMaker to help healthcare professionals efficiently process and extract insights from vast amounts of clinical data. We discussed using SageMaker Jumpstart foundation models, fine-tuning pre-trained models from Hugging Face, and building custom summarization models. Each approach has its own advantages and drawbacks, catering to different needs and requirements.
Building custom summarization models on SageMaker allows for lots of flexibility and control but requires more time and resources than using pre-trained models. SageMaker Jumpstart foundation models provide an easy-to-use and cost-effective solution for organizations that don’t require specific customization or fine-tuning, as well as some options for simplified fine-tuning. Fine-tuning pre-trained models from Hugging Face offers faster training times, better domain-specific performance, and seamless integration with SageMaker tools and services across a broad catalog of models, but it requires some implementation effort. At the time of writing this post, Amazon has announced another option, Amazon Bedrock, which will offer summarization capabilities in an even more managed environment.
By understanding the pros and cons of each approach, healthcare professionals and organizations can make informed decisions on the most suitable solution for generating concise and accurate summaries of complex clinical data. Ultimately, using AI/ML-based summarization models on SageMaker can significantly enhance patient care and decision-making by enabling medical professionals to quickly access relevant information and focus on providing quality care.
Resources
For the full script discussed in this post and some sample data, refer to the GitHub repo. For more information on how to run ML workloads on AWS, see the following resources:

Hugging Face on Amazon SageMaker Workshop
Hugging Face Transformers Amazon SageMaker Examples
Technology Innovation Institute trains the state-of-the-art Falcon LLM 40B foundation model on Amazon SageMaker
Training large language models on Amazon SageMaker: Best practices
How Forethought saves over 66% in costs for generative AI models using Amazon SageMaker 

About the authors
Cody Collins is a New York based Solutions Architect at Amazon Web Services. He works with ISV customers to build industry leading solutions in the cloud. He has successfully delivered complex projects for diverse industries, optimizing efficiency and scalability. In his spare time, he enjoys reading, traveling, and training jiu jitsu.
Ameer Hakme is an AWS Solutions Architect residing in Pennsylvania. His professional focus involves collaborating with Independent software vendors throughout the Northeast, guiding them in designing and constructing scalable, state-of-the-art platforms on the AWS Cloud.

Unlocking creativity: How generative AI and Amazon SageMaker help busi …

Advertising agencies can use generative AI and text-to-image foundation models to create innovative ad creatives and content. In this post, we demonstrate how you can generate new images from existing base images using Amazon SageMaker, a fully managed service to build, train, and deploy ML models at scale. With this solution, businesses large and small can develop new ad creatives much faster and at lower cost than ever before, allowing you to create custom ad creative content for your business at a rapid pace.
Solution overview
Consider the following scenario: a global automotive company needs new marketing material generated for their new car design being released and hires a creative agency that is known for providing advertising solutions for clients with strong brand equity. The car manufacturer is looking for low-cost ad creatives that display the model in diverse locations, colors, views, and perspectives while maintaining the brand identity of the car manufacturer. With the power of state-of-the-art techniques, the creative agency can support their customer by using generative AI models within their secure AWS environment.
The solution is developed with generative AI and text-to-image models in Amazon SageMaker. SageMaker is a fully managed machine learning (ML) service that makes it straightforward to build, train, and deploy ML models for any use case with fully managed infrastructure, tools, and workflows. Stable Diffusion is a text-to-image foundation model from Stability AI that powers the image generation process. Diffusers are pre-trained models that use Stable Diffusion to generate new images from an existing image based on a prompt. Combining Stable Diffusion with Diffusers like ControlNet can take existing brand-specific content and develop stunning versions of it. Key benefits of developing the solution within AWS along with Amazon SageMaker are:

Privacy – Storing the data in Amazon Simple Storage Service (Amazon S3) and using SageMaker to host models allows you to adhere to security best practices within your AWS account while not exposing assets publicly.
Scalability – The Stable Diffusion model, when deployed as a SageMaker endpoint, brings scalability by allowing you to configure instance sizes and number of instances. SageMaker endpoints also have auto scaling features and are highly available.
Flexibility – When creating and deploying endpoints, SageMaker provides the flexibility to choose GPU instance types. Also, instances behind SageMaker endpoints can be changed with minimum effort as business needs change. AWS has also developed purpose-built hardware and chips, such as AWS Inferentia2, for high performance at the lowest cost for generative AI inference.
Rapid innovation – Generative AI is a rapidly evolving domain with new approaches, and models are being constantly developed and released. Amazon SageMaker JumpStart regularly onboards new models along with foundation models.
End-to-end integration – AWS allows you to integrate the creative process with any AWS service and develop an end-to-end process using fine-grained access control through AWS Identity and Access Management (IAM), notification through Amazon Simple Notification Service (Amazon SNS), and postprocessing with the event-driven compute service AWS Lambda.
Distribution – When the new creatives are generated, AWS allows distributing the content across global channels in multiple Regions using Amazon CloudFront.

For this post, we use the following GitHub sample, which uses Amazon SageMaker Studio with foundation models (Stable Diffusion), prompts, computer vision techniques, and a SageMaker endpoint to generate new images from existing images. The following diagram illustrates the solution architecture.

The workflow contains the following steps:

We store the existing content (images, brand styles, and so on) securely in S3 buckets.
Within SageMaker Studio notebooks, the original image data is transformed using computer vision techniques that preserve the shape of the product (the car model), remove color and background, and generate monotone intermediate images (see the sketch after this list).
The intermediate image acts as a control image for Stable Diffusion with ControlNet.
We deploy a SageMaker endpoint with the Stable Diffusion text-to-image foundation model from SageMaker Jumpstart and ControlNet on a preferred GPU-based instance size.
Prompts describing new backgrounds and car colors along with the intermediate monotone image are used to invoke the SageMaker endpoint, yielding new images.
New images are stored in S3 buckets as they’re generated.
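
As a rough illustration of step 2, the monotone control image can be derived with OpenCV's Canny edge detector. The thresholds, file paths, and the three-channel stacking below are assumptions for illustration; the sample repository may preprocess images differently.

import cv2
import numpy as np

# Load the original product image (path is a placeholder)
image = cv2.imread("sportscar.jpeg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Keep only the car's edges, discarding color and background detail
low_threshold, high_threshold = 100, 200
edges = cv2.Canny(gray, low_threshold, high_threshold)

# Stack the single-channel edge map into a 3-channel monotone control image
control_image = np.stack([edges, edges, edges], axis=-1)
cv2.imwrite("control_image.png", control_image)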

Deploy ControlNet on SageMaker endpoints
To deploy the model to SageMaker endpoints, we must create a compressed file for each individual technique model artifact along with the Stable Diffusion weights, inference script, and NVIDIA Triton config file.
In the following code, we download the model weights for the different ControlNet techniques and Stable Diffusion 1.5 to the local directory as tar.gz files:

if ids == "runwayml/stable-diffusion-v1-5":
    snapshot_download(ids, local_dir=str(model_tar_dir), local_dir_use_symlinks=False, ignore_patterns=unwanted_files_sd)

elif ids == "lllyasviel/sd-controlnet-canny":
    snapshot_download(ids, local_dir=str(model_tar_dir), local_dir_use_symlinks=False)

To create the model pipeline, we define an inference.py script that SageMaker real-time endpoints will use to load and host the Stable Diffusion and ControlNet tar.gz files. The following is a snippet from inference.py that shows how the models are loaded and how the Canny technique is called:

controlnet = ControlNetModel.from_pretrained(
    f"{model_dir}/{control_net}",
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    f"{model_dir}/sd-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32)

# Define technique function for Canny
image = cv2.Canny(image, low_threshold, high_threshold)

We deploy the SageMaker endpoint with the required instance size (GPU type) from the model URI:

huggingface_model = HuggingFaceModel(
    model_data=model_s3_uri,  # path to your trained sagemaker model
    role=role,                # iam role with permissions to create an Endpoint
    py_version="py39",        # python version of the DLC
    image_uri=image_uri,
)

# Deploy model as SageMaker Endpoint
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.p3.2xlarge",
)

Generate new images
Now that the endpoint is deployed on SageMaker endpoints, we can pass in our prompts and the original image we want to use as our baseline.
To define the prompt, we create a positive prompt, p_p, for what we’re looking for in the new image, and the negative prompt, n_p, for what is to be avoided:

p_p = "metal orange colored car, complete car, colour photo, outdoors in a pleasant landscape, realistic, high quality"

n_p = "cropped, out of frame, worst quality, low quality, jpeg artifacts, ugly, blurry, bad anatomy, bad proportions"

Finally, we invoke our endpoint with the prompt and source image to generate our new image:

request = {
    "prompt": p_p,
    "negative_prompt": n_p,
    "image_uri": "s3://<bucket>/sportscar.jpeg",  # existing content
    "scale": 0.5,
    "steps": 20,
    "low_threshold": 100,
    "high_threshold": 200,
    "seed": 123,
    "output": "output",
}
response = predictor.predict(request)

Different ControlNet techniques
In this section, we compare the different ControlNet techniques and their effect on the resulting image. We use the following original image to generate new content using Stable Diffusion with ControlNet in Amazon SageMaker.

The following table shows how the technique output dictates what, from the original image, to focus on.

Technique: canny
Technique output: A monochrome image with white edges on a black background.
Prompt: metal orange colored car, complete car, colour photo, outdoors in a pleasant landscape, realistic, high quality

Technique: depth
Technique output: A grayscale image with black representing deep areas and white representing shallow areas.
Prompt: metal red colored car, complete car, colour photo, outdoors in pleasant landscape on beach, realistic, high quality

Technique: hed
Technique output: A monochrome image with white soft edges on a black background.
Prompt: metal white colored car, complete car, colour photo, in a city, at night, realistic, high quality

Technique: scribble
Technique output: A hand-drawn monochrome image with white outlines on a black background.
Prompt: metal blue colored car, similar to original car, complete car, colour photo, outdoors, breath-taking view, realistic, high quality, different viewpoint

Clean up
After you generate new ad creatives with generative AI, clean up any resources that won’t be used. Delete the data in Amazon S3 and stop any SageMaker Studio notebook instances to not incur any further charges. If you used SageMaker JumpStart to deploy Stable Diffusion as a SageMaker real-time endpoint, delete the endpoint either through the SageMaker console or SageMaker Studio.
Conclusion
In this post, we used foundation models on SageMaker to create new content images from existing images stored in Amazon S3. With these techniques, marketing, advertisement, and other creative agencies can use generative AI tools to augment their ad creatives process. To dive deeper into the solution and code shown in this demo, check out the GitHub repo.
Also, refer to Amazon Bedrock for use cases on generative AI, foundation models, and text-to-image models.

About the Authors
Sovik Kumar Nath is an AI/ML solution architect with AWS. He has extensive experience designing end-to-end machine learning and business analytics solutions in finance, operations, marketing, healthcare, supply chain management, and IoT. Sovik has published articles and holds a patent in ML model monitoring. He has dual master’s degrees from the University of South Florida and the University of Fribourg, Switzerland, and a bachelor’s degree from the Indian Institute of Technology, Kharagpur. Outside of work, Sovik enjoys traveling, taking ferry rides, and watching movies.
Sandeep Verma is a Sr. Prototyping Architect with AWS. He enjoys diving deep into customer challenges and building prototypes for customers to accelerate innovation. He has a background in AI/ML, is the founder of New Knowledge, and is generally passionate about tech. In his free time, he loves traveling and skiing with his family.
Uchenna Egbe is an Associate Solutions Architect at AWS. He spends his free time researching about herbs, teas, superfoods, and how to incorporate them into his daily diet.
Mani Khanuja is an Artificial Intelligence and Machine Learning Specialist SA at Amazon Web Services (AWS). She helps customers use machine learning to solve their business challenges with AWS. She spends most of her time diving deep and teaching customers on AI/ML projects related to computer vision, natural language processing, forecasting, ML at the edge, and more. She is passionate about ML at the edge; therefore, she has created her own lab with a self-driving kit and a prototype manufacturing production line, where she spends a lot of her free time.

Top 18 AI-Based Website Builders in 2023

10Web

To assist website owners in more effectively creating and managing their websites, 10Web provides a WordPress platform driven by AI. The platform has technologies like AI Assistant, AI Builder, Automated WordPress Hosting, BuddyBoss Hosting, 1-click Migration, Real-Time Backup, Security, and PageSpeed Booster. The drag-and-drop Elementor-based editor of the AI Builder enables users to swiftly design or replicate any website using AI in minutes.

TeleportHQ

A website and UI builder powered by AI, TeleportHQ employs code produced by OpenAI. It gives web designers newfound speed and accuracy while building websites and components. Developers may swiftly go from a concept to a working prototype using TeleportHQ’s Vision API to transform hand-drawn wireframes into digital designs.

Users may create whole websites or specific components using AI starting from pre-made templates. In addition, TeleportHQ provides a low-code environment for editing and working together on code, a wireframing plugin for Figma, and tutorials to quickly learn how to utilize the system.

AiDA 

Bookmark’s Artificial Intelligence Design Assistant (AiDA) creates and optimizes websites to boost user interaction and sales. By employing patented machine learning algorithms to examine millions of data points and periodically offering unique optimization recommendations, it removes 90% of the pain points related to site design.

AiDA will also provide recommendations on improving the user’s website so that visitors enjoy the greatest possible experience. Users may also specify certain business objectives for AiDA to concentrate on, such as generating more appointments, boosting e-commerce page views, generating more email leads, generating more phone calls, and focusing on particular website areas.

Durable AI

Durable AI is a cutting-edge website builder that uses artificial intelligence (AI) to assist business owners in rapidly and simply creating expert websites. With AI-generated features such as a name generator, professional images, AI-written text, and custom domains, Durable allows consumers to construct their website in only 30 seconds.

The editor enables even greater website customization, including adding logos, images, unique objects, and more. Additional capabilities Durable offers include invoicing, client relationship management tools, creating promotional material, and more, all in one location.

Appy Pie

The no-code AI platform from Appy Pie enables coding-free application creation and process automation. Users may combine and streamline their data into a single source simply by using its drag-and-drop capabilities. Their platform offers seamless interfaces with different data sources and apps, shattering all hurdles and limitations regarding no-code. This platform appeals to those who value efficiency and price since it is affordable and brings items to market considerably quicker than rivals.

Anyone needing workflow or business process automation software may utilize Appy Pie’s no-code AI platform since it is well-structured, simple to use, and reasonably priced.

B12 

B12 is a platform and website builder designed with professional service providers in mind. Its features make it simple to draw in customers, close deals, satisfy clients, and simplify corporate processes. B12’s AI-powered platform automatically creates an industry-specific website draft, allocating a team of copywriting, design, and launch professionals to assist in customizing and publishing the site.

O’Reilly, Fast Company, TechCrunch, The Wall Street Journal, and VentureBeat have all written about B12. It is intended to make it simple for business service providers to build and manage a website that performs as well as they do in terms of online credibility.

Weaverse 

A website builder called Weaverse enables users to develop efficient eCommerce stores. It uses headless frameworks such as Shopify Hydrogen, Remix, and NextJS for quick page rendering. Users may create pages using Weaverse’s drag-and-drop visual page builder without worrying about breaking frontend code. Web professionals may also change the source code and have access to a whole new development experience thanks to its included script editor.

Weaverse also provides a starting plan that enables users to get going without spending any money and a section template library. Shopify merchants who want to create a high-converting shop without caring about CSS or web design can use Weaverse.

Chat2Build 

By enabling users to communicate with an AI to design and build a website without the need for coding or hosting, Chat2Build is an AI-powered web software that streamlines the website-building process. An all-in-one solution for quickly building and deploying a website, the AI considers user demands to create a personalized website matched to their company preferences.

The websites created using Chat2Build may be edited or updated by users via the web app, and they are mobile-responsive and adapt to multiple screen sizes. Customers of Chat2Build may get assistance from the company via the help center, email, and live chat. A free basic plan, a pro plan for small to medium-sized firms, and an enterprise package for big corporations and unique projects are available via Chat2Build.

Webullar 

Webullar is a platform for building websites that utilizes AI to build fully functional websites in about 30 seconds. It provides customers with a quick and uncomplicated method that enables them to establish a website with only one line explaining their company. Webullar handles labor-intensive tasks, including content creation, mobile optimization, and putting up security measures. Additionally, the tool provides a special price package that includes endless updates, a domain connection, site visits, and SEO tools.

Sitekick 

Without coding, design, or copywriting expertise, customers of Sitekick’s AI-powered landing page builder can create stunning, interesting, and responsive landing pages. Powered by Elastic Themes and Webflow, Sitekick uses an engine trained on more than 1,000 highly effective landing pages from various sectors. Users can quickly develop a landing page with only a brief company description, and Sitekick will handle the rest. Customers claim that Sitekick lets them concentrate on more vital duties while saving time and using their innovative ideas.

Additionally, Sitekick provides the option to edit and optimize current pages quickly. Sitekick is the ideal way for consumers to create landing pages that appear professional without a headache.

Essai 

Essai is a no-code platform powered by AI that enables users to quickly and easily construct websites by defining the layout or content they want. Based on the description, the tool generates a variety of design possibilities, and users may update the design aspects using a conversational UI feature. The platform offers AI-assisted content and design and claims it can quickly build complete website blocks. Users may introduce their products to a larger audience, get feedback, and expedite the creation of their landing sites. Essai strives to make website building easier by giving anybody, regardless of skill level, access to a user-friendly and accessible platform.

Landing-AI

A website-building application called Landing AI uses generative artificial intelligence to produce a unique landing page quickly. The program creates copywriting, color palettes, logos, and drawings that fit each of the 29 themes, which may represent a user’s branding. The user enters information about their project, including the product, market, and target audience, and the AI utilizes this information to generate the best sales presentation. After selecting a branding theme, the customer receives three landing page versions, each with its wording, logo, and images.

The user may export all the content, code, and pictures with only one click. Landing AI is best suited for people who want a unique and noteworthy website with compelling copywriting and who would rather expand their project than spend time on development.

Superflow Rewrite

Superflow Rewrite is an AI-powered application that assists web agencies and teams in quickly and easily writing catchy headlines and product descriptions. It connects with ChatGPT, a platform powered by AI that enables users to produce copy promptly. Users of Superflow Rewrite may collaborate on live products, assign assignments, autogenerate copy, and annotate live web pages. Additionally, it provides a task manager to keep everyone in the loop and private annotations for more intimate activities. It is simple to use and to keep track of tasks thanks to the integration with other applications like ClickUp, Webflow, Asana, and Slack.

Butternut AI

With the aid of Butternut AI, customers may create websites immediately without knowing any code. Within 20 seconds, the software develops completely operational webpages with content and images prepared for launch. Users may alter the information and pictures on their websites to reflect their brand voice. By providing complete SEO optimization, Butternut AI ensures that the website appears at the top of Google searches. Users of the software may communicate through text and utilize it as their personal website developer. To develop a website using Butternut AI, users must provide their company’s name and keywords that best describe it. It is simple for anybody to become a website developer because of the platform’s simplicity.

Debuild

Debuild is a low-code platform with AI that aids in the speedy development of web apps for consumers and developers. It offers a graphical user interface that lets users quickly deploy an interface after visually putting one together. It can automatically produce SQL code and React components, eliminating the need for human coding. Debuild is designed to be quick, enabling users to transform a concept into reality quickly. Users may create a free account on the website and utilize the platform for nothing. The business also offers several resources, such as an About page, terms of service, and privacy policy.

Aspen

Aspen is a low-code AI framework for creating generative web applications. It allows developers to build AI apps fast and simply without complex code. It offers several capabilities, including authentication, payments, templates, hosting, and deployment, to make the development process easier. It also gives users a simple playground to train their unique models.

R.O.B.

The AI-powered website copywriter R.O.B. (Robot Of Business) allows customers to develop customized 4-page websites rapidly. The program uses machine learning and natural language processing to produce website material that accurately represents the user’s wants and company. Users of ROB may also use it to compose emails, keeping them informed of the most recent events. The platform’s user-friendly tool design requires customers to provide their essential data before proceeding.

Within 15 minutes of receiving their data, the AI will produce a great first draft of the website text. The website may then be reviewed and edited, and necessary modifications made before publication. ROB gives customers an effective and simple approach to building websites without having to type a single word.

Studio 

Studio is a design tool with AI enhancements for building contemporary websites. Its distinctive qualities set it apart from other design tools and are intended to assist designers in creating with the power of AI. Designers may use Studio to indicate a problem area and get design recommendations. They may also use a voice assistant to communicate sophisticated style instructions. Additionally, Studio offers an auto-responsive capability that can automatically adjust layouts utilizing algorithms that reflow items without interfering with the layout.

Wix ADI

Creating websites using Wix is quite reliable. The platform gathers information about the user and uses it to create a unique website with business features, an enterprise-grade infrastructure, powerful SEO, and marketing tools. A website is automatically created using Wix ADI, also known as Wix Artificial Design Intelligence, a feature of the Wix website builder. Instead of being a special tool, their artificial intelligence is effectively integrated into the current platform. This validates their claim that they are one of the simplest website builders for marketers and non-technical folks.



A New AI Research Proposes The PanGu-Coder2 Model and The RRTF Framewo …

Large language models (LLMs) have gained a massive amount of attention in recent months. These models mimic humans by answering questions relevantly, generating precise content, translating languages, summarizing long textual passages, and completing code samples. LLMs have been developing quickly, with regular releases of powerful models showcasing excellent performance on code generation tasks. Researchers have looked into several techniques, including supervised fine-tuning, instruction tuning, reinforcement learning, and others, to improve the capacity of pre-trained code LLMs to generate code.

In a recent study, a team of researchers from Huawei Cloud Co., Ltd., Chinese Academy of Science, and Peking University introduced a unique framework called RRTF (Rank Responses to align Test&Teacher Feedback), which successfully and efficiently enhances pre-trained large language models for code production. The RRTF framework has been developed with the intention of improving Code LLMs’ performance in code generation activities. It uses natural language LLM alignment techniques and rates feedback rather than utilizing absolute reward values.

The approach draws inspiration from Reinforcement Learning from Human Feedback, which underlies models like InstructGPT and ChatGPT, but adopts a simpler and more effective training procedure that uses ranked responses as feedback instead of absolute reward values, applying natural language LLM alignment techniques to Code LLMs. By applying the RRTF framework, the team has also introduced the PanGu-Coder2 model, which achieves an outstanding 62.20% pass@1 rate on the OpenAI HumanEval benchmark.
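
To make the idea of learning from ranked responses concrete, the following is a small, illustrative sketch of a pairwise ranking loss of the kind used in rank-based alignment. It is not the exact objective from the paper; the score function is a stand-in for a model's sequence log-probability, and the ranks stand in for the test and teacher feedback.

import torch
import torch.nn.functional as F

def pairwise_ranking_loss(scores, ranks):
    # scores: (num_candidates,) model scores for candidate responses to one prompt,
    #         e.g. length-normalized sequence log-probabilities
    # ranks:  (num_candidates,) lower rank = better response (test/teacher feedback)
    loss = scores.new_zeros(())
    n = scores.shape[0]
    for i in range(n):
        for j in range(n):
            if ranks[i] < ranks[j]:
                # Penalize pairs where the preferred response is not scored higher
                loss = loss + F.relu(scores[j] - scores[i])
    return loss

# Example: three candidate completions where the second is ranked best
scores = torch.tensor([0.2, 0.9, -0.3], requires_grad=True)
ranks = torch.tensor([2, 1, 3])
print(pairwise_ranking_loss(scores, ranks))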

By using the approach on StarCoder 15B, the team has exceeded PanGu-Coder and achieved the best performance of all documented Code LLMs, proving the usefulness of RRTF. Comprehensive analyses of three benchmarks—HumanEval, CoderEval, and LeetCode—have indicated that Code LLMs may be able to outperform natural language models of the same or greater sizes in code creation tasks. The study also emphasizes the value of high-quality data in enhancing models’ ability to follow instructions and write code.

The team has summarized the contributions as follows –

The RRTF Optimisation Paradigm has been introduced, which has a number of benefits that make it a model-neutral, straightforward, and data-efficient approach.

The PanGu-Coder2 model has also been introduced. PanGu-Coder2 outperforms its base model by about 30%, a gain demonstrated on benchmarks such as HumanEval, CoderEval, and LeetCode.

PanGu-Coder2 outperforms all previously released Code LLMs on code generation, setting a new state of the art.

The team has discussed their ideas and practical knowledge on building good training data for code generation.

The PanGu-Coder2 model has been trained using the RRTF framework, and the team has offered helpful insights into this process.

In addition to improving the code generation efficiency, the team has suggested optimization methods used by PanGu-Coder2 to guarantee quick inference. This field’s findings help create realistic deployment scenarios because efficient inference is essential for real-world applications.

Check out the Paper. All credit for this research goes to the researchers on this project.


This AI Paper Deploys a Light-Weight Foundational Model in Outer Space …

Space technology is advancing rapidly, and several research groups are working to run machine learning and artificial intelligence models in outer space to support space research. The data collected by satellites provides information for aerial mapping, weather prediction, and deforestation monitoring. However, satellites typically collect data without being able to process it onboard, so they cannot react quickly to fast-moving events such as natural disasters.

To address these problems, researchers trained ML models directly in space to process this data. At an early stage, they trained a simple model that detects cloud cover onboard the satellite rather than on the ground. The training approach, a form of few-shot learning related to active learning, uses only the most informative features needed to train the model. Its main advantage over other models is that the collected data can be compressed into a lower-dimensional representation, making the model faster and more efficient. The model is a computer vision model: training consists of combining the important values into a feature vector, and the goal is to classify whether cloud cover is present, making this a binary classification task.

The pipeline has two parts. The first part collects the images and is trained on the ground, while the second part performs binary classification to determine cloud cover and is trained on the satellite itself. Training ordinarily requires many epochs, yet the team's tiny model completed training in about a second and a half. The researchers also report that the model adapts automatically to different forms of data, and they are still working on models that target other changes of interest.

The researchers are also working on a model that can handle more complex datasets, such as images from hyperspectral satellites. In this study, performance metrics such as recall, precision, and F1 score are all quite high. These results open up growing opportunities for space research both in Earth orbit and in deep space, where emerging AI technology can help drive exploration.



Build protein folding workflows to accelerate drug discovery on Amazon …

Drug development is a complex and long process that involves screening thousands of drug candidates and using computational or experimental methods to evaluate leads. According to McKinsey, a single drug can take 10 years and cost an average of $2.6 billion to go through disease target identification, drug screening, drug-target validation, and eventual commercial launch. Drug discovery is the research component of this pipeline that generates candidate drugs with the highest likelihood of being effective with the least harm to patients. Machine learning (ML) methods can help identify suitable compounds at each stage in the drug discovery process, resulting in more streamlined drug prioritization and testing, saving billions in drug development costs (for more information, refer to AI in biopharma research: A time to focus and scale).
Drug targets are typically biological entities called proteins, the building blocks of life. The 3D structure of a protein determines how it interacts with a drug compound; therefore, understanding the protein 3D structure can add significant improvements to the drug development process by screening for drug compounds that fit the target protein structure better. Another area where protein structure prediction can be useful is understanding the diversity of proteins, so that we only select for drugs that selectively target specific proteins without affecting other proteins in the body (for more information, refer to Improving target assessment in biomedical research: the GOT-IT recommendations). Precise 3D structures of target proteins can enable drug design with higher specificity and lower likelihood of cross-interactions with other proteins.
However, predicting how proteins fold into their 3D structure is a difficult problem, and traditional experimental methods such as X-ray crystallography and NMR spectroscopy can be time-consuming and expensive. Recent advances in deep learning methods for protein research have shown promise in using neural networks to predict protein folding with remarkable accuracy. Folding algorithms like AlphaFold2, ESMFold, OpenFold, and RoseTTAFold can be used to quickly build accurate models of protein structures. Unfortunately, these models are computationally expensive to run and the results can be cumbersome to compare at the scale of thousands of candidate protein structures. A scalable solution for using these various tools will allow researchers and commercial R&D teams to quickly incorporate the latest advances in protein structure prediction, manage their experimentation processes, and collaborate with research partners.
Amazon SageMaker is a fully managed service to prepare, build, train, and deploy high-quality ML models quickly by bringing together a broad set of capabilities purpose-built for ML. It offers a fully managed environment for ML, abstracting away the infrastructure, data management, and scalability requirements so you can focus on building, training, and testing your models.
In this post, we present a fully managed ML solution with SageMaker that simplifies the operation of protein folding structure prediction workflows. We first discuss the solution at the high level and its user experience. Next, we walk you through how to easily set up compute-optimized workflows of AlphaFold2 and OpenFold with SageMaker. Finally, we demonstrate how you can track and compare protein structure predictions as part of a typical analysis. The code for this solution is available in the following GitHub repository.
Solution overview
In this solution, scientists can interactively launch protein folding experiments, analyze the 3D structure, monitor the job progress, and track the experiments in Amazon SageMaker Studio.
The following screenshot shows a single run of a protein folding workflow with Amazon SageMaker Studio. It includes the visualization of the 3D structure in a notebook, run status of the SageMaker jobs in the workflow, and links to the input parameters and output data and logs.

The following diagram illustrates the high-level solution architecture.

To understand the architecture, we first define the key components of a protein folding experiment as follows:

FASTA target sequence file – The FASTA format is a text-based format for representing either nucleotide sequences or amino acid (protein) sequences, in which nucleotides or amino acids are represented using single-letter codes.
Genetic databases – A genetic database is one or more sets of genetic data stored together with software to enable users to retrieve genetic data. Several genetic databases are required to run AlphaFold and OpenFold algorithms, such as BFD, MGnify, PDB70, PDB, PDB seqres, UniRef30 (FKA UniClust30), UniProt, and UniRef90.
Multiple sequence alignment (MSA) – A sequence alignment is a way of arranging the primary sequences of a protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences. The input features for predictions include MSA data.
Protein structure prediction – The structure of input target sequences is predicted with folding algorithms like AlphaFold2 and OpenFold that use a multitrack transformer architecture trained on known protein templates.
Visualization and metrics – Visualize the 3D structure with the py3Dmol library as an interactive 3D visualization. You can use metrics to evaluate and compare structure predictions, most notably root-mean-square deviation (RMSD) and template modeling score (TM-score).

The workflow contains the following steps:

Scientists use the web-based SageMaker ML IDE to explore the code base, build protein sequence analysis workflows in SageMaker Studio notebooks, and run protein folding pipelines via the graphical user interface in SageMaker Studio or the SageMaker SDK.
Genetic and structure databases required by AlphaFold and OpenFold are downloaded prior to pipeline setup using Amazon SageMaker Processing, an ephemeral compute feature for ML data processing, to an Amazon Simple Storage Service (Amazon S3) bucket. With SageMaker Processing, you can run a long-running job with the right amount of compute, without setting up a compute cluster and storage yourself and without needing to shut anything down afterward. Data is automatically saved to a specified S3 bucket location.
An Amazon FSx for Lustre file system is set up, with the data repository being the S3 bucket location where the databases are saved. FSx for Lustre can scale to hundreds of GB/s of throughput and millions of IOPS with low-latency file retrieval. When starting an estimator job, SageMaker mounts the FSx for Lustre file system to the instance file system, then starts the script.
Amazon SageMaker Pipelines is used to orchestrate multiple runs of protein folding algorithms. SageMaker Pipelines offers a visual interface for interactive job submission, traceability of progress, and repeatability.
Within a pipeline, two computationally heavy protein folding algorithms—AlphaFold and OpenFold—are run with SageMaker estimators. This configuration supports mounting of an FSx for Lustre file system for high throughput database search in the algorithms. A single inference run is divided into two steps: an MSA construction step using an optimal CPU instance and a structure prediction step using a GPU instance. These substeps, like SageMaker Processing in Step 2, are ephemeral, on-demand, and fully managed. Job output such as MSA files, predicted pdb structure files, and other metadata files are saved in a specified S3 location. A pipeline can be designed to run one single protein folding algorithm or run both AlphaFold and OpenFold after a common MSA construction.
Runs of the protein folding prediction are automatically tracked by Amazon SageMaker Experiments for further analysis and comparison. The job logs are kept in Amazon CloudWatch for monitoring.

Prerequisites
To follow this post and run this solution, you need to have completed several prerequisites. Refer to the GitHub repository for a detailed explanation of each step.

A SageMaker domain and a user profile – If you don’t have a SageMaker Studio domain, refer to Onboard to Amazon SageMaker Domain Using Quick Setup.
IAM policies – Your user should have the AWS Identity and Access Management (IAM) AmazonSageMakerFullAccess policy attached, the ability to build and push Docker container images to Amazon Elastic Container Registry (Amazon ECR), and the ability to create FSx for Lustre file systems. See the readme for more details.
Network – A VPC with an Amazon S3 VPC endpoint. We use this VPC location to provision the FSx for Lustre file system and SageMaker jobs.
Docker resources – Run 00-prerequisite.ipynb from the repository to build the Docker images, download the genetic database to Amazon S3, and create an FSx for Lustre file system with a data repository association to the S3 bucket.

Run protein folding on SageMaker
We use the fully managed capabilities of SageMaker to run computationally heavy protein folding jobs without much infrastructure overhead. SageMaker uses container images to run custom scripts for generic data processing, training, and hosting. You can easily start an ephemeral job on-demand that runs a program with a container image with a couple of lines of the SageMaker SDK without self-managing any compute infrastructure. Specifically, the SageMaker estimator job provides flexibility when it comes to choice of container image, run script, and instance configuration, and supports a wide variety of storage options, including file systems such as FSx for Lustre. The following diagram illustrates this architecture.

Folding algorithms like AlphaFold and OpenFold use a multitrack transformer architecture trained on known protein templates to predict the structure of unknown peptide sequences. These predictions can be run on GPU instances to provide the best throughput and lowest latency. However, the input features for these predictions include MSA data, and MSA algorithms are CPU-dependent and can require several hours of processing time.
Running both the MSA and structure prediction steps in the same computing environment can be cost-inefficient because the expensive GPU resources remain idle while the MSA step runs. Therefore, we optimize the workflow into two steps. First, we run a SageMaker estimator job on a CPU instance specifically to compute MSA alignment given a particular FASTA input sequence and source genetic databases. Then we run a SageMaker estimator job on a GPU instance to predict the protein structure with a given input MSA alignment and a folding algorithm like AlphaFold or OpenFold.
Run MSA generation
For MSA computation, we include a custom script run_create_alignment.sh and a create_alignments.py script that is adapted from the existing AlphaFold prediction source run_alphafold.py. Note that this script may need to be updated if the source AlphaFold code is updated. The custom script is provided to the SageMaker estimator via script mode. The key components of the container image, the script mode implementation, and setting up a SageMaker estimator job are also part of the next step of running folding algorithms, and are described further in the following section.
Run AlphaFold
We get started by running an AlphaFold structure prediction with a single protein sequence using SageMaker. Running an AlphaFold job involves three simple steps, as can be seen in 01-run_stepbystep.ipynb. First, we build a Docker container image based on AlphaFold’s Dockerfile so that we can also run AlphaFold in SageMaker. Second, we construct the script run_alphafold.sh that instructs how AlphaFold should be run. Third, we construct and run a SageMaker estimator with the script, the container, instance type, data, and configuration for the job.
Container image
The runtime requirement for a container image to run AlphaFold (OpenFold as well) in SageMaker can be greatly simplified with AlphaFold’s Dockerfile. We only need to add a handful of simple layers on top to install a SageMaker-specific Python library so that a SageMaker job can communicate with the container image. See the following code:

# In Dockerfile.alphafold
## SageMaker specific
RUN pip3 install sagemaker-training --upgrade --no-cache-dir
ENV PATH="/opt/ml/code:${PATH}"
# this environment variable is used by the SageMaker Estimator to determine our user code directory
ENV SAGEMAKER_SUBMIT_DIRECTORY /opt/ml/code

Input script
We then provide the script run_alphafold.sh that runs run_alphafold.py from the AlphaFold repository that is currently placed in the container /app/alphafold/run_alphafold.py. When this script is run, the location of the genetic databases and the input FASTA sequence will be populated by SageMaker as environment variables (SM_CHANNEL_GENETIC and SM_CHANNEL_FASTA, respectively). For more information, refer to Input Data Configuration.
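To make the mechanics concrete, the following is a minimal, purely illustrative Python sketch (not the actual run_alphafold.sh from the repository, whose flags and paths may differ) of how an entry point can read these SageMaker-provided locations before invoking the AlphaFold script:

import os
import subprocess

# SageMaker exposes the local path of each input channel as SM_CHANNEL_<NAME>
genetic_dir = os.environ["SM_CHANNEL_GENETIC"]  # FSx for Lustre mount with the genetic databases
fasta_dir = os.environ["SM_CHANNEL_FASTA"]      # S3 channel containing the FASTA file
db_preset = os.environ.get("DB_PRESET", "reduced_dbs")  # passed via the estimator's environment argument

# Pick up the (single) FASTA file staged in the channel directory
fasta_path = os.path.join(fasta_dir, os.listdir(fasta_dir)[0])

# Simplified invocation; the real script passes many more database-specific flags
subprocess.run([
    "python", "/app/alphafold/run_alphafold.py",
    f"--fasta_paths={fasta_path}",
    f"--db_preset={db_preset}",
    "--output_dir=/opt/ml/model",  # SageMaker uploads /opt/ml/model to the job's S3 output path
], check=True)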
Estimator job
We next create a job using a SageMaker estimator with the following key input arguments, which instruct SageMaker to run a specific script using a specified container with the instance type and count, your networking option of choice, and other parameters for the job. vpc_subnet_ids and security_group_ids instruct the job to run inside the specific VPC where the FSx for Lustre file system lives so that we can mount and access the file system in the SageMaker job. The output path refers to an S3 bucket location where SageMaker automatically uploads the final AlphaFold output at the end of a successful job. Here we also set a parameter, DB_PRESET, to be passed in and accessed within run_alphafold.sh as an environment variable at runtime. See the following code:
from sagemaker.estimator import Estimator
alphafold_image_uri=f'{account}.dkr.ecr.{region}.amazonaws.com/sagemaker-studio-alphafold:v2.3.0'
instance_type='ml.g5.2xlarge'
instance_count=1
vpc_subnet_ids=['subnet-xxxxxxxxx'] # okay to use a default VPC
security_group_ids=['sg-xxxxxxxxx']
env={'DB_PRESET': db_preset} # <full_dbs|reduced_dbs>
output_path='s3://%s/%s/job-output/'%(default_bucket, prefix)

estimator_alphafold = Estimator(
    source_dir='src', # directory where run_alphafold.sh and other runtime files locate
    entry_point='run_alphafold.sh', # our script that runs /app/alphafold/run_alphafold.py
    image_uri=alphafold_image_uri, # container image to use
    instance_count=instance_count,
    instance_type=instance_type,
    subnets=vpc_subnet_ids,
    security_group_ids=security_group_ids,
    environment=env,
    output_path=output_path,
    ...)
Finally, we gather the data and let the job know where they are. The fasta data channel is defined as an S3 data input that will be downloaded from an S3 location into the compute instance at the beginning of the job. This allows great flexibility to manage and specify the input sequence. On the other hand, the genetic data channel is defined as a FileSystemInput that will be mounted onto the instance at the beginning of the job. The use of an FSx for Lustre file system as a way to bring in close to 3 TB of data avoids repeatedly downloading data from an S3 bucket to a compute instance. We call the .fit method to kick off an AlphaFold job:
from sagemaker.inputs import FileSystemInput
file_system_id='fs-xxxxxxxxx'
fsx_mount_id='xxxxxxxx'
file_system_directory_path=f'/{fsx_mount_id}/{prefix}/alphafold-genetic-db' # should be the full prefix from the S3 data repository

file_system_access_mode='ro' # Specify the access mode (read-only)
file_system_type='FSxLustre' # Specify your file system type

genetic_db = FileSystemInput(
    file_system_id=file_system_id,
    file_system_type=file_system_type,
    directory_path=file_system_directory_path,
    file_system_access_mode=file_system_access_mode)

s3_fasta=sess.upload_data(path='sequence_input/T1030.fasta', # FASTA location locally
    key_prefix='alphafoldv2/sequence_input') # S3 prefix. Bucket is sagemaker default bucket
fasta = sagemaker.inputs.TrainingInput(s3_fasta,
    distribution='FullyReplicated',
    s3_data_type='S3Prefix',
    input_mode='File')
data_channels_alphafold = {'genetic': genetic_db, 'fasta': fasta}

estimator_alphafold.fit(inputs=data_channels_alphafold,
    wait=False) # wait=False gets the cell back in the notebook; set to True to see the logs as the job progresses
That’s it. We just submitted a job to SageMaker to run AlphaFold. The logs and output including .pdb prediction files will be written to Amazon S3.
Run OpenFold
Running OpenFold in SageMaker follows a similar pattern, as shown in the second half of 01-run_stepbystep.ipynb. We first add a simple layer on top of OpenFold's Dockerfile to install the SageMaker-specific library and make the container image SageMaker compatible. Second, we construct run_openfold.sh as the entry point for the SageMaker job. In run_openfold.sh, we run run_pretrained_openfold.py from OpenFold, which is available in the container image, with the same genetic databases we downloaded for AlphaFold and with OpenFold's model weights (--openfold_checkpoint_path). In terms of input data locations, besides the genetic databases channel and the FASTA channel, we introduce a third channel, SM_CHANNEL_PARAM, so that we can flexibly pass in the model weights of choice from the estimator construct when we define and submit a job. With the SageMaker estimator, we can easily submit jobs with different entry_point, image_uri, environment, inputs, and other configurations for OpenFold with the same signature. For the data channels, we add a new channel, param, as an Amazon S3 input, along with the same genetic databases from the FSx for Lustre file system and the FASTA file from Amazon S3. This, again, allows us to easily specify the model weights to use from the job construct. See the following code:
s3_param=sess.upload_data(path='openfold_params/finetuning_ptm_2.pt',
    key_prefix=f'{prefix}/openfold_params')
param = sagemaker.inputs.TrainingInput(s3_param,
    distribution='FullyReplicated',
    s3_data_type='S3Prefix',
    input_mode='File')

data_channels_openfold = {'genetic': genetic_db, 'fasta': fasta, 'param': param}

estimator_openfold.fit(inputs=data_channels_openfold,
    wait=False)
To access the final output after the job completes, we run the following commands:
!aws s3 cp {estimator_openfold.model_data} openfold_output/model.tar.gz
!tar zxfv openfold_output/model.tar.gz -C openfold_output/
Runtime performance
The following table shows the cost savings of 57% and 51% for AlphaFold and OpenFold, respectively, by splitting the MSA alignment and folding algorithms in two jobs as compared to a single compute job. It allows us to right-size the compute for each job: ml.m5.4xlarge for MSA alignment and ml.g5.2xlarge for AlphaFold and OpenFold.

| Job Details | Instance Type | Input FASTA Sequence | Runtime | Cost |
|---|---|---|---|---|
| MSA alignment + OpenFold | ml.g5.4xlarge | T1030 | 50 mins | $1.69 |
| MSA alignment + AlphaFold | ml.g5.4xlarge | T1030 | 65 mins | $2.19 |
| MSA alignment | ml.m5.4xlarge | T1030 | 46 mins | $0.71 |
| OpenFold | ml.g5.2xlarge | T1030 | 6 mins | $0.15 |
| AlphaFold | ml.g5.2xlarge | T1030 | 21 mins | $0.53 |

Build a repeatable workflow using SageMaker Pipelines
With SageMaker Pipelines, we can create an ML workflow that takes care of managing data between steps, orchestrating their runs, and logging. SageMaker Pipelines also provides us a UI to visualize our pipeline and easily run our ML workflow.
A pipeline is created by combining a number of steps. In this pipeline, we combine three training steps, each of which requires a SageMaker estimator. The estimators defined in this notebook are very similar to those defined in 01-run_stepbystep.ipynb, with the exception that we use Amazon S3 locations to point to our inputs and outputs. The dynamic variables allow SageMaker Pipelines to run steps one after another and also permit the user to retry failed steps. The following screenshot shows a Directed Acyclic Graph (DAG), which provides information on the requirements for and relationships between each step of our pipeline.

Dynamic variables
SageMaker Pipelines is capable of taking user inputs at the start of every pipeline run. We define the following dynamic variables, which we would like to change during each experiment:

FastaInputS3URI – Amazon S3 URI of the FASTA file uploaded via SDK, Boto3, or manually.
FastaFileName – Name of the FASTA file.
db_preset – Selection between full_dbs or reduced_dbs.
MaxTemplateDate – AlphaFold’s MSA step will search for the available templates before the date specified by this parameter.
ModelPreset – Select between AlphaFold models including monomer, monomer_casp14, monomer_ptm, and multimer.
NumMultimerPredictionsPerModel – Number of seeds to run per model when using multimer system.
InferenceInstanceType – Instance type to use for inference steps (both AlphaFold and OpenFold). The default value is ml.g5.2xlarge.
MSAInstanceType – Instance type to use for MSA step. The default value is ml.m5.4xlarge.

See the following code:
fasta_file = ParameterString(name="FastaFileName")
fasta_input = ParameterString(name="FastaInputS3URI")
pipeline_db_preset = ParameterString(name="db_preset",
    default_value='full_dbs',
    enum_values=['full_dbs', 'reduced_dbs'])
max_template_date = ParameterString(name="MaxTemplateDate")
model_preset = ParameterString(name="ModelPreset")
num_multimer_predictions_per_model = ParameterString(name="NumMultimerPredictionsPerModel")
msa_instance_type = ParameterString(name="MSAInstanceType", default_value='ml.m5.4xlarge')
instance_type = ParameterString(name="InferenceInstanceType", default_value='ml.g5.2xlarge')
A SageMaker pipeline is constructed by defining a series of steps and then chaining them together in a specific order where the output of a previous step becomes the input to the next step. Steps can be run in parallel and defined to have a dependency on a previous step. In this pipeline, we define an MSA step, which is the dependency for an AlphaFold inference step and OpenFold inference step that run in parallel. See the following code:
step_msa = TrainingStep(
    name="RunMSA",
    step_args=pipeline_msa_args,
)

step_alphafold = TrainingStep(
    name="RunAlphaFold",
    step_args=pipeline_alphafold_default_args,
)
step_alphafold.add_depends_on([step_msa])

step_openfold = TrainingStep(
    name="RunOpenFold",
    step_args=pipeline_openfold_args,
)
step_openfold.add_depends_on([step_msa])
To put all the steps together, we call the Pipeline class and provide a pipeline name, pipeline input variables, and the individual steps:
pipeline_name = "ProteinFoldWorkflow"
pipeline = Pipeline(
    name=pipeline_name,
    parameters=[
        fasta_input,
        instance_type,
        msa_instance_type,
        pipeline_db_preset
    ],
    steps=[step_msa, step_alphafold, step_openfold],
)

pipeline.upsert(role_arn=role, # run this if it's the first time setting up the pipeline
    description='Protein_Workflow_MSA')
Run the pipeline
In the last cell of the notebook 02-define_pipeline.ipynb, we show how to run a pipeline using the SageMaker SDK. The dynamic variables we described earlier are provided as follows:
!mkdir ./sequence_input/
!curl 'https://www.predictioncenter.org/casp14/target.cgi?target=T1030&view=sequence' > ./sequence_input/T1030.fasta
fasta_file_name = 'T1030.fasta'

pathName = f'./sequence_input/{fasta_file_name}'
s3_fasta=sess.upload_data(path=pathName,
    key_prefix='alphafoldv2/sequence_input')

PipelineParameters={
    'FastaInputS3URI': s3_fasta,
    'db_preset': 'full_dbs',
    'FastaFileName': fasta_file_name,
    'MaxTemplateDate': '2020-05-14',
    'ModelPreset': 'monomer',
    'NumMultimerPredictionsPerModel': '5',
    'InferenceInstanceType': 'ml.g5.2xlarge',
    'MSAInstanceType': 'ml.m5.4xlarge'
}
execution = pipeline.start(execution_display_name='SDK-Executed',
    execution_description='This pipeline was executed via SageMaker SDK',
    parameters=PipelineParameters
)
Track experiments and compare protein structures
For our experiment, we use an example protein sequence from the CASP14 competition, which provides an independent mechanism for the assessment of methods of protein structure modeling. The target T1030 is derived from the PDB 6P00 protein, and has 237 amino acids in the primary sequence. We run the SageMaker pipeline to predict the protein structure of this input sequence with both OpenFold and AlphaFold algorithms.
When the pipeline is complete, we download the predicted .pdb files from each folding job and visualize the structure in the notebook using py3Dmol, as in the notebook 04-compare_alphafold_openfold.ipynb.
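As a reference for that visualization step, here is a minimal py3Dmol sketch, assuming the predicted structure has already been downloaded locally (the file path is hypothetical):

import py3Dmol

# Load a downloaded prediction and render it as an interactive cartoon in the notebook
with open("alphafold_output/ranked_0.pdb") as f:  # hypothetical local path
    pdb_text = f.read()

view = py3Dmol.view(width=600, height=450)
view.addModel(pdb_text, "pdb")                     # parse the PDB-format text
view.setStyle({"cartoon": {"color": "spectrum"}})  # color along the chain
view.zoomTo()
view.show()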
The following screenshot shows the prediction from the AlphaFold prediction job.

The predicted structure is compared against its known base reference structure with PDB code 6poo archived in RCSB. We analyze the prediction performance against the base PDB code 6poo with three metrics: RMSD, RMSD with superposition, and template modeling score, as described in Comparing structures.
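As a rough illustration, plain RMSD can be computed from matched atom coordinates with a few lines of NumPy; the sketch below assumes two equal-length arrays of corresponding atom coordinates have already been extracted from the two structures. RMSD with superposition first aligns the structures (for example, with the Kabsch algorithm) and then computes the same quantity on the aligned coordinates.

import numpy as np

def rmsd(coords_a: np.ndarray, coords_b: np.ndarray) -> float:
    # Root-mean-square deviation between two (N, 3) arrays of matched atom coordinates
    diff = coords_a - coords_b
    return float(np.sqrt((diff * diff).sum(axis=1).mean()))

# Toy example with three matched atoms
a = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [3.0, 0.0, 0.0]])
b = np.array([[0.1, 0.0, 0.0], [1.4, 0.2, 0.0], [3.1, -0.1, 0.0]])
print(rmsd(a, b))  # small value for nearly identical coordinates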

| Algorithm | Input Sequence | Comparison With | RMSD | RMSD with Superposition | Template Modeling Score |
|---|---|---|---|---|---|
| AlphaFold | T1030 | 6poo | 247.26 | 3.87 | 0.3515 |

The folding algorithms are now compared against each other for multiple FASTA sequences: T1030, T1090, and T1076. New target sequences may not have a base PDB structure in reference databases, so it's useful to compare the variability between folding algorithms.

| Algorithm | Input Sequence | Comparison With | RMSD | RMSD with Superposition | Template Modeling Score |
|---|---|---|---|---|---|
| AlphaFold | T1030 | OpenFold | 73.21 | 24.8 | 0.0018 |
| AlphaFold | T1076 | OpenFold | 38.71 | 28.87 | 0.0047 |
| AlphaFold | T1090 | OpenFold | 30.03 | 20.45 | 0.005 |

The following screenshot shows the runs of ProteinFoldWorkflow for the three FASTA input sequences with SageMaker Pipelines:

We also log the metrics with SageMaker Experiments as new runs of the same experiment created by the pipeline:
from sagemaker.experiments.run import Run, load_run
metric_type='compare:'
experiment_name = 'proteinfoldworkflow'
with Run(experiment_name=experiment_name, run_name=input_name_1, sagemaker_session=sess) as run:
    run.log_metric(name=metric_type + "rmsd_cur", value=rmsd_cur_one, step=1)
    run.log_metric(name=metric_type + "rmsd_fit", value=rmsd_fit_one, step=1)
    run.log_metric(name=metric_type + "tm_score", value=tmscore_one, step=1)
We then analyze and visualize these runs on the Experiments page in SageMaker Studio.

The following chart depicts the RMSD value between AlphaFold and OpenFold for the three sequences: T1030, T1076, and T1090.

Conclusion
In this post, we described how you can use SageMaker Pipelines to set up and run protein folding workflows with two popular structure prediction algorithms: AlphaFold2 and OpenFold. We demonstrated a price-performant solution architecture of multiple jobs that separates the compute requirements for MSA generation from structure prediction. We also highlighted how you can visualize, evaluate, and compare predicted 3D structures of proteins in SageMaker Studio.
To get started with protein folding workflows on SageMaker, refer to the sample code in the GitHub repo.

About the authors
Michael Hsieh is a Principal AI/ML Specialist Solutions Architect. He works with HCLS customers to advance their ML journey with AWS technologies and his expertise in medical imaging. As a Seattle transplant, he loves exploring the nature the city has to offer, such as the hiking trails, scenic kayaking in the SLU, and the sunset at Shilshole Bay.
Shivam Patel is a Solutions Architect at AWS. He comes from a background in R&D and combines this with his business knowledge to solve complex problems faced by his customers. Shivam is most passionate about workloads in machine learning, robotics, IoT, and high-performance computing.
Hasan Poonawala is a Senior AI/ML Specialist Solutions Architect at AWS. He helps customers design and deploy machine learning applications in production on AWS. He has over 12 years of work experience as a data scientist, machine learning practitioner, and software developer. In his spare time, Hasan loves to explore nature and spend time with friends and family.
Jasleen Grewal is a Senior Applied Scientist at Amazon Web Services, where she works with AWS customers to solve real world problems using machine learning, with special focus on precision medicine and genomics. She has a strong background in bioinformatics, oncology, and clinical genomics. She is passionate about using AI/ML and cloud services to improve patient care.

Is your model good? A deep dive into Amazon SageMaker Canvas advanced …

If you are a business analyst, understanding customer behavior is probably one of the most important things you care about. Understanding the reasons and mechanisms behind customer purchase decisions can facilitate revenue growth. However, the loss of customers (commonly referred to as customer churn) always poses a risk. Gaining insights into why customers leave can be just as crucial for sustaining profits and revenue.
Although machine learning (ML) can provide valuable insights, ML experts were needed to build customer churn prediction models until the introduction of Amazon SageMaker Canvas.
SageMaker Canvas is a low-code/no-code managed service that allows you to create ML models that can solve many business problems without writing a single line of code. It also enables you to evaluate the models using advanced metrics as if you were a data scientist.
In this post, we show how a business analyst can evaluate and understand a classification churn model created with SageMaker Canvas using the Advanced metrics tab. We explain the metrics and show techniques to deal with data to obtain better model performance.
Prerequisites
If you would like to implement all or some of the tasks described in this post, you need an AWS account with access to SageMaker Canvas. Refer to Predict customer churn with no-code machine learning using Amazon SageMaker Canvas to cover the basics around SageMaker Canvas, the churn model, and the dataset.
Introduction to model performance evaluation
As a general guideline, when you need to evaluate the performance of a model, you’re trying to measure how well the model will predict something when it sees new data. This prediction is called inference. You start by training the model using existing data, and then ask the model to predict the outcome on data that it has not already seen. How accurately the model predicts this outcome is what you look at to understand the model performance.
If the model hasn't seen the new data, how would anybody know if the prediction is good or bad? Well, the idea is to use historical data where the results are already known and compare these values to the model's predicted values. This is enabled by setting aside a portion of the historical training data so it can be compared with what the model predicts for those values.
In the example of customer churn (which is a categorical classification problem), you start with a historical dataset that describes customers with many attributes (one in each record). One of the attributes, called Churn, can be True or False, describing if the customer left the service or not. To evaluate model accuracy, we split this dataset and train the model using one part (the training dataset), and ask the model to predict the outcome (classify the customer as Churn or not) with the other part (the test dataset). We then compare the model’s prediction to the ground truth contained in the test dataset.
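As a minimal sketch of that split (the file name is hypothetical, and the Churn? column matches the example dataset; the preprocessing Canvas performs internally differs):

import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("churn.csv")  # hypothetical local copy of the churn dataset

X = df.drop(columns=["Churn?"])  # customer attributes
y = df["Churn?"]                 # known outcome: did the customer churn?

# Hold out 20% of the historical data as a test set with known outcomes
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)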
Interpreting advanced metrics
In this section, we discuss the advanced metrics in SageMaker Canvas that can help you understand model performance.
Confusion matrix
SageMaker Canvas uses confusion matrices to help you visualize when a model generates predictions correctly. In a confusion matrix, your results are arranged to compare the predicted values against the actual historical (known) values. The following example explains how a confusion matrix works for a two-category prediction model that predicts positive and negative labels:

True positive – The model correctly predicted positive when the true label was positive
True negative – The model correctly predicted negative when the true label was negative
False positive – The model incorrectly predicted positive when the true label was negative
False negative – The model incorrectly predicted negative when the true label was positive

The following image is an example of a confusion matrix for two categories. In our churn model, the actual values come from the test dataset, and the predicted values come from asking our model.
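If you want to reproduce this outside Canvas, a confusion matrix can be computed in one line with scikit-learn (a small sketch with made-up labels):

from sklearn.metrics import confusion_matrix

# y_test holds the known labels; y_pred holds the model's predictions on the test set
y_test = [True, False, True, True, False, False]
y_pred = [True, False, False, True, False, True]

# Rows are actual values, columns are predicted values; ravel() unpacks the four cells
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")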

Accuracy
Accuracy is the percentage of correct predictions out of all the rows or samples of the test set. It is the number of positive samples correctly predicted as positive, plus the number of negative samples correctly predicted as negative, divided by the total number of samples in the dataset.
It's one of the most important metrics to understand because it tells you what percentage of predictions the model got right, but it can be misleading in some cases. For example:

Class imbalance – When the classes in your dataset are not evenly distributed (you have a disproportionate number of samples from one class and very few from the others), accuracy can be misleading. In such cases, even a model that simply predicts the majority class for every instance can achieve a high accuracy.
Cost-sensitive classification – In some applications, the cost of misclassification for different classes can be different. For example, if we were predicting if a drug can aggravate a condition, a false negative (for example, predicting the drug might not aggravate when it actually does) can be more costly than a false positive (for example, predicting the drug might aggravate when it actually does not).

Precision, recall, and F1 score
Precision is the fraction of true positives (TP) out of all the predicted positives (TP + FP). It measures the proportion of positive predictions that are actually correct.
Recall is the fraction of true positives (TP) out of all the actual positives (TP + FN). It measures the proportion of positive instances that were correctly predicted as positive by the model.
The F1 score combines precision and recall to provide a single score that balances the trade-off between them. It is defined as the harmonic mean of precision and recall:
F1 score = 2 * (precision * recall) / (precision + recall)
The F1 score ranges from 0–1, with a higher score indicating better performance. A perfect F1 score of 1 indicates that the model has achieved both perfect precision and perfect recall, and a score of 0 indicates that the model’s predictions are completely wrong.
The F1 score provides a balanced evaluation of the model’s performance. It considers precision and recall, providing a more informative evaluation metric that reflects the model’s ability to correctly classify positive instances and avoid false positives and false negatives.
For example, in medical diagnosis, fraud detection, and sentiment analysis, F1 is especially relevant. In medical diagnosis, accurately identifying the presence of a specific disease or condition is crucial, and false negatives or false positives can have significant consequences. The F1 score takes into account both precision (the ability to correctly identify positive cases) and recall (the ability to find all positive cases), providing a balanced evaluation of the model’s performance in detecting the disease. Similarly, in fraud detection, where the number of actual fraud cases is relatively low compared to non-fraudulent cases (imbalanced classes), accuracy alone may be misleading due to a high number of true negatives. The F1 score provides a comprehensive measure of the model’s ability to detect both fraudulent and non-fraudulent cases, considering both precision and recall. And in sentiment analysis, if the dataset is imbalanced, accuracy may not accurately reflect the model’s performance in classifying instances of the positive sentiment class.
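The same metrics can be computed directly with scikit-learn, which is a handy way to check the definitions above against a concrete example (a small sketch with made-up labels):

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_test = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print("accuracy :", accuracy_score(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred))  # TP / (TP + FP)
print("recall   :", recall_score(y_test, y_pred))     # TP / (TP + FN)
print("f1       :", f1_score(y_test, y_pred))         # harmonic mean of precision and recall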
AUC (area under the curve)
The AUC metric evaluates the ability of a binary classification model to distinguish between positive and negative classes at all classification thresholds. A threshold is a value used by the model to make a decision between the two possible classes, converting the probability of a sample being part of a class into a binary decision. To calculate the AUC, the true positive rate (TPR) and false positive rate (FPR) are plotted across various threshold settings. The TPR measures the proportion of true positives out of all actual positives, while the FPR measures the proportion of false positives out of all actual negatives. The resulting curve, called the receiver operating characteristic (ROC) curve, provides a visual representation of the TPR and FPR at different threshold settings. The AUC value, which ranges from 0–1, represents the area under the ROC curve. Higher AUC values indicate better performance, with a perfect classifier achieving an AUC of 1.
The following plot shows the ROC curve, with TPR as the Y axis and FPR as the X axis. The closer the curve gets to the top left corner of the plot, the better the model does at classifying the data into categories.

To clarify, let’s go over an example. Let’s think about a fraud detection model. Usually, these models are trained from unbalanced datasets. This is due to the fact that, usually, almost all the transactions in the dataset are non-fraudulent with only a few labeled as frauds. In this case, the accuracy alone may not adequately capture the performance of the model because it is probably heavily influenced by the abundance of non-fraudulent cases, leading to misleadingly high accuracy scores.
In this case, the AUC would be a better metric to assess model performance because it provides a comprehensive assessment of a model’s ability to distinguish between fraudulent and non-fraudulent transactions. It offers a more nuanced evaluation, taking into account the trade-off between true positive rate and false positive rate at various classification thresholds.
Just like the F1 score, it is particularly useful when the dataset is imbalanced. It measures the trade-off between TPR and FPR and shows how well the model can differentiate between the two classes regardless of their distribution. This means that even if one class is significantly smaller than the other, the ROC curve assesses the model’s performance in a balanced manner by considering both classes equally.
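For reference, the ROC curve and AUC can be computed from predicted class probabilities with scikit-learn (a small sketch with made-up scores):

from sklearn.metrics import roc_curve, roc_auc_score

# Predicted probabilities of the positive class, e.g., from model.predict_proba(X_test)[:, 1]
y_test = [0, 0, 1, 1, 0, 1, 0, 1]
y_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7]

fpr, tpr, thresholds = roc_curve(y_test, y_scores)  # TPR and FPR at each threshold
print("AUC:", roc_auc_score(y_test, y_scores))      # area under the ROC curve, between 0 and 1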
Additional key topics
Advanced metrics are not the only important tools available to you for evaluating and improving ML model performance. Data preparation, feature engineering, and feature impact analysis are techniques that are essential to model building. These activities play a crucial role in extracting meaningful insights from raw data and improving model performance, leading to more robust and insightful results.
Data preparation and feature engineering
Feature engineering is the process of selecting, transforming, and creating new variables (features) from raw data, and plays a key role in improving the performance of an ML model. Selecting the most relevant variables or features from the available data involves removing irrelevant or redundant features that do not contribute to the model’s predictive power. Transforming data features into a suitable format includes scaling, normalization, and handling missing values. And finally, creating new features from the existing data is done through mathematical transformations, combining or interacting different features, or creating new features from domain-specific knowledge.
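Outside Canvas, these steps are typically a few lines of pandas and scikit-learn. The following sketch assumes a dataset with an identifier-like column and some numeric features (column and file names are hypothetical):

import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("churn.csv")    # hypothetical dataset
df = df.drop(columns=["Phone"])  # remove an identifier-like column with no predictive power

# Impute missing numeric values and scale them to comparable ranges
num_cols = df.select_dtypes(include="number").columns
df[num_cols] = SimpleImputer(strategy="median").fit_transform(df[num_cols])
df[num_cols] = StandardScaler().fit_transform(df[num_cols])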
Feature importance analysis
SageMaker Canvas generates a feature importance analysis that explains the impact that each column in your dataset has on the model. When you generate predictions, you can see the column impact that identifies which columns have the most impact on each prediction. This will give you insights on which features deserve to be part of your final model and which ones should be discarded. Column impact is a percentage score that indicates how much weight a column has in making predictions in relation to the other columns. For a column impact of 25%, Canvas weighs the prediction as 25% for the column and 75% for the other columns.
Approaches to improve model accuracy
Although there are multiple methods to improve model accuracy, data scientists and ML practitioners usually follow one of the two approaches discussed in this section, using the tools and metrics described earlier.
Model-centric approach
In this approach, the data always remains the same and is used to iteratively improve the model to meet desired results. Tools used with this approach include:

Trying multiple relevant ML algorithms
Algorithm and hyperparameter tuning and optimization
Different model ensemble methods
Using pre-trained models (SageMaker provides various built-in or pre-trained models to help ML practitioners)
AutoML, which is what SageMaker Canvas does behind the scenes using Amazon SageMaker Autopilot, and which encompasses all of the above

Data-centric approach
In this approach, the focus is on data preparation, improving data quality, and iteratively modifying the data to improve performance:

Exploring statistics of the dataset used to train the model, also known as exploratory data analysis (EDA)
Improving data quality (data cleaning, missing values imputation, outlier detection and management)
Feature selection
Feature engineering
Data augmentation

Improving model performance with Canvas
We begin with the data-centric approach, using the model preview functionality to perform an initial EDA. This gives us a baseline against which we can perform data augmentation and generate a new baseline, before finally getting the best model with a model-centric approach using the standard build functionality.
We use the synthetic dataset from a telecommunications mobile phone carrier. This sample dataset contains 5,000 records, where each record uses 21 attributes to describe the customer profile. Refer to Predict customer churn with no-code machine learning using Amazon SageMaker Canvas for a full description.
Model preview in a data-centric approach
As a first step, we open the dataset, select the column to predict as Churn?, and generate a preview model by choosing Preview model.

The Preview model pane will show the progress until the preview model is ready.

When the model is ready, SageMaker Canvas generates a feature importance analysis.

Finally, when it's complete, the pane will show a list of columns with their impact on the model. These are useful for understanding how relevant the features are to our predictions. Column impact is a percentage score that indicates how much weight a column has in making predictions in relation to the other columns. In the following example, for the Night Calls column, SageMaker Canvas weights the prediction as 4.04% for the column and 95.9% for the other columns. The higher the value, the higher the impact.
As we can see, the preview model has a 95.6% accuracy. Let’s try to improve the model performance using a data-centric approach. We perform data preparation and use feature engineering techniques to improve performance.
As shown in the following screenshot, we can observe that the Phone and State columns have much less impact on our prediction. Therefore, we will use this information as input for our next phase, data preparation.

SageMaker Canvas provides ML data transforms with which you can clean, transform, and prepare your data for model building. You can use these transforms on your datasets without any code, and they will be added to the model recipe, which is a record of the data preparation performed on your data before building the model.
Note that any data transforms you use only modify the input data when building a model and do not modify your dataset or original data source.
The following transforms are available in SageMaker Canvas for you to prepare your data for building:

Datetime extraction
Drop columns
Filter rows
Functions and operators
Manage rows
Rename columns
Remove rows
Replace values
Resample time series data

Let’s start by dropping the columns we have found that have little impact on our prediction.
For example, in this dataset, the phone number is just the equivalent of an account number—it’s useless or even detrimental in predicting other accounts’ likelihood of churn. Likewise, the customer’s state doesn’t impact our model much. Let’s remove the Phone and State columns by unselecting those features under Column name.

Now, let’s perform some additional data transformation and feature engineering.
For example, we noticed in our previous analysis that the amount charged to customers has a direct impact on churn. Let's therefore create a new column that computes the total charges to our customers by combining Charge, Mins, and Calls for Day, Eve, Night, and Intl. To do so, we use the custom formulas in SageMaker Canvas.

Let’s start by choosing Functions, then we add to the formula textbox the following text:
(Day Calls*Day Charge*Day Mins)+(Eve Calls*Eve Charge*Eve Mins)+(Night Calls*Night Charge*Night Mins)+(Intl Calls*Intl Charge*Intl Mins)
Give the new column a name (for example, Total Charges), and choose Add after the preview has been generated. The model recipe should now look as shown in the following screenshot.
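For reference, the same derived column could be computed outside Canvas with pandas (a sketch assuming the dataset is loaded in a DataFrame with these column names):

df["Total Charges"] = (
    (df["Day Calls"] * df["Day Charge"] * df["Day Mins"])
    + (df["Eve Calls"] * df["Eve Charge"] * df["Eve Mins"])
    + (df["Night Calls"] * df["Night Charge"] * df["Night Mins"])
    + (df["Intl Calls"] * df["Intl Charge"] * df["Intl Mins"])
)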

When this data preparation is complete, we train a new preview model to see if the model improved. Choose Preview model again, and the lower right pane will show the progress.

When training is finished, it will proceed to recompute the predicted accuracy, and will also create a new column impact analysis.

And finally, when the whole process is complete, we can see the same pane we saw earlier but with the new preview model accuracy. You can notice model accuracy increased by 0.4% (from 95.6% to 96%).

The numbers in the preceding images may differ from yours because ML introduces some stochasticity in the process of training models, which can lead to different results in different builds.
Model-centric approach to create the model
Canvas offers two options to build your models:

Standard build – Builds the best model from an optimized process where speed is exchanged for better accuracy. It uses Auto-ML, which automates various tasks of ML, including model selection, trying various algorithms relevant to your ML use case, hyperparameter tuning, and creating model explainability reports.
Quick build – Builds a simple model in a fraction of the time compared to a standard build, but accuracy is exchanged for speed. A quick build is useful when iterating to more quickly understand the impact of data changes on your model accuracy.

Let’s continue using a standard build approach.
Standard build
As we saw before, the standard build builds the best model from an optimized process to maximize accuracy.

The build process for our churn model takes around 45 minutes. During this time, Canvas tests hundreds of candidate pipelines, selecting the best model. In the following screenshot, we can see the expected build time and progress.
With the standard build process, model accuracy improved to 96.903%, which is a significant improvement.

Explore advanced metrics
Let’s explore the model using the Advanced metrics tab. On the Scoring tab, choose Advanced metrics.

This page shows the following confusion matrix together with the advanced metrics: F1 score, accuracy, precision, recall, and AUC.

Generate predictions
Now that the metrics look good, we can perform an interactive prediction on the Predict tab, either in a batch or single (real-time) prediction.
We have two options:

Use the model to run batch or single predictions
Send the model to Amazon SageMaker Studio to share with data scientists

Clean up
To avoid incurring future session charges, log out of SageMaker Canvas.

Conclusion
SageMaker Canvas provides powerful tools that enable you to build and assess the accuracy of models, enhancing their performance without the need for coding or specialized data science and ML expertise. As we have seen in the example through the creation of a customer churn model, by combining these tools with both a data-centric and a model-centric approach using advanced metrics, business analysts can create and evaluate prediction models. With a visual interface, you’re also empowered to generate accurate ML predictions on your own. We encourage you to go through the references and see how many of these concepts might apply in other types of ML problems.
References

Predict customer churn with no-code machine learning using Amazon SageMaker Canvas
Build, Share, Deploy: how business analysts and data scientists achieve faster time-to-market using no-code ML and Amazon SageMaker Canvas
Customizing and reusing models generated by Amazon SageMaker Autopilot
Amazon SageMaker Canvas Immersion Day Workshop
Manage AutoML workflows with AWS Step Functions and AutoGluon on Amazon SageMaker

About the Authors
Marcos is an AWS Sr. Machine Learning Solutions Architect based in Florida, US. In that role, he is responsible for guiding and assisting US startup organizations in their strategy towards the cloud, providing guidance on how to address high-risk issues and optimize their machine learning workloads. He has more than 25 years of experience with technology, including cloud solution development, machine learning, software development, and data center infrastructure.
Indrajit is an AWS Enterprise Sr. Solutions Architect. In his role, he helps customers achieve their business outcomes through cloud adoption. He designs modern application architectures based on microservices, serverless, APIs, and event-driven patterns. He works with customers to realize their data analytics and machine learning goals through adoption of DataOps and MLOps practices and solutions. Indrajit speaks regularly at AWS public events like summits and ASEAN workshops, has published several AWS blog posts, and developed customer-facing technical workshops focused on data and machine learning on AWS.