From text to dream job: Building an NLP-based job recommender at Talent.com

This post is co-authored by Anatoly Khomenko, Machine Learning Engineer, and Abdenour Bezzouh, Chief Technology Officer at Talent.com.
Founded in 2011, Talent.com is one of the world’s largest sources of employment. The company combines paid job listings from their clients with public job listings into a single searchable platform. With over 30 million jobs listed in more than 75 countries, Talent.com serves jobs across many languages, industries, and distribution channels. The result is a platform that matches millions of job seekers with available jobs.
Talent.com’s mission is to centralize all jobs available on the web to help job seekers find their best match while providing them with the best search experience. Its focus is on relevance, because the order of the recommended jobs is vitally important for surfacing the jobs most pertinent to users’ interests. The performance of Talent.com’s matching algorithm is paramount to the success of the business and a key contributor to the user experience. It’s challenging to predict which jobs are pertinent to a job seeker based on the limited amount of information provided, which is usually just a few keywords and a location.
Given this mission, Talent.com and AWS joined forces to create a job recommendation engine using state-of-the-art natural language processing (NLP) and deep learning model training techniques with Amazon SageMaker to provide an unrivaled experience for job seekers. This post shows our joint approach to designing a job recommendation system, including feature engineering, deep learning model architecture design, hyperparameter optimization, and model evaluation that ensures the reliability and effectiveness of our solution for both job seekers and employers. The system is developed by a team of dedicated applied machine learning (ML) scientists, ML engineers, and subject matter experts in collaboration between AWS and Talent.com.
The recommendation system has driven an 8.6% increase in clickthrough rate (CTR) in online A/B testing against a previous XGBoost-based solution, helping connect millions of Talent.com’s users to better jobs.
Overview of solution
An overview of the system is illustrated in the following figure. The system takes a user’s search query as input and outputs a ranked list of jobs in order of pertinence. Job pertinence is measured by the click probability (the probability of a job seeker clicking on a job for more information).

The system includes four main components:

Model architecture – The core of this job recommendation engine is a deep learning-based Triple Tower Pointwise model, which includes a query encoder that encodes user search queries, a document encoder that encodes the job descriptions, and an interaction encoder that processes the past user-job interaction features. The outputs of the three towers are concatenated and passed through a classification head to predict the job’s click probabilities. By training this model on search queries, job specifics, and historical user interaction data from Talent.com, this system provides personalized and highly relevant job recommendations to job seekers.
Feature engineering – We perform two sets of feature engineering to extract valuable information from the input data and feed it into the corresponding towers in the model. The two sets are standard feature engineering and fine-tuned Sentence-BERT (SBERT) embeddings. We use the standard engineered features as input to the interaction encoder and feed the SBERT-derived embeddings into the query encoder and document encoder.
Model optimization and tuning – We utilize advanced training methodologies to train, test, and deploy the system with SageMaker. This includes SageMaker Distributed Data Parallel (DDP) training, SageMaker Automatic Model Tuning (AMT), learning rate scheduling, and early stopping to improve model performance and training speed. Using the DDP training framework made our model training approximately eight times faster.
Model evaluation – We conduct both offline and online evaluation. We evaluate the model performance with Area Under the Curve (AUC) and Mean Average Precision at K (mAP@K) in offline evaluation. During online A/B testing, we evaluate the CTR improvements.

In the following sections, we present the details of these four components.
Deep learning model architecture design
We design a Triple Tower Deep Pointwise (TTDP) model using a triple-tower deep learning architecture and the pointwise pair modeling approach. The triple-tower architecture provides three parallel deep neural networks, with each tower processing a set of features independently. This design pattern allows the model to learn distinct representations from different sources of information. After the representations from all three towers are obtained, they are concatenated and passed through a classification head to make the final prediction (0–1) on the click probability (a pointwise modeling setup).
The three towers are named based on the information they process: the query encoder processes the user search query, the document encoder processes the candidate job’s textual content including the job title and company name, and the interaction encoder uses relevant features extracted from past user interactions and history (discussed more in the next section).
Each of these towers plays a crucial role in learning how to recommend jobs:

Query encoder – The query encoder takes in the SBERT embeddings derived from the user’s job search query. We enhance the embeddings through an SBERT model we fine-tuned. This encoder processes and understands the user’s job search intent, including details and nuances captured by our domain-specific embeddings.
Document encoder – The document encoder processes the information of each job listing. Specifically, it takes the SBERT embeddings of the concatenated text from the job title and company. The intuition is that users will be more interested in candidate jobs that are more relevant to the search query. By mapping the jobs and the search queries to the same vector space (defined by SBERT), the model can learn to predict the probability of the potential jobs a job seeker will click.
Interaction encoder – The interaction encoder deals with the user’s past interactions with job listings. The features are produced via a standard feature engineering step, which includes calculating popularity metrics for job roles and companies, establishing context similarity scores, and extracting interaction parameters from previous user engagements. It also processes the named entities identified in the job title and search queries with a pre-trained named entity recognition (NER) model.

Each tower generates an independent output in parallel, all of which are then concatenated together. This combined feature vector is then passed to predict the click probability of a job listing for a user query. The triple-tower architecture provides flexibility in capturing complex relationships between different inputs or features, allowing the model to take advantage of the strengths of each tower while learning more expressive representations for the given task.
Candidate jobs’ predicted click probabilities are ranked from high to low, generating personalized job recommendations. Through this process, we ensure that each piece of information—whether it’s the user’s search intent, job listing details, or past interactions—is fully captured by a specific tower dedicated to it. The complex relationships between them are also captured through the combination of the tower outputs.
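To make the design concrete, the following is a minimal PyTorch sketch of a triple-tower pointwise model; the layer sizes, tower depths, and activation choices are illustrative assumptions rather than the production configuration.

import torch
import torch.nn as nn

class TripleTowerPointwise(nn.Module):
    """Illustrative triple-tower pointwise click model (dimensions are assumptions)."""

    def __init__(self, query_dim=768, doc_dim=768, interaction_dim=64, hidden_dim=256):
        super().__init__()
        # Query tower: encodes the SBERT embedding of the search query
        self.query_tower = nn.Sequential(nn.Linear(query_dim, hidden_dim), nn.ReLU())
        # Document tower: encodes the SBERT embedding of the job title + company text
        self.doc_tower = nn.Sequential(nn.Linear(doc_dim, hidden_dim), nn.ReLU())
        # Interaction tower: encodes engineered user-job interaction features
        self.interaction_tower = nn.Sequential(nn.Linear(interaction_dim, hidden_dim), nn.ReLU())
        # Classification head over the concatenated tower outputs
        self.head = nn.Sequential(
            nn.Linear(3 * hidden_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 1)
        )

    def forward(self, query_emb, doc_emb, interaction_feats):
        z = torch.cat(
            [
                self.query_tower(query_emb),
                self.doc_tower(doc_emb),
                self.interaction_tower(interaction_feats),
            ],
            dim=-1,
        )
        return torch.sigmoid(self.head(z))  # pointwise click probability in (0, 1)

# Example: score a batch of 4 query-job pairs with random inputs
model = TripleTowerPointwise()
p_click = model(torch.randn(4, 768), torch.randn(4, 768), torch.randn(4, 64))
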
Feature engineering
We perform two sets of feature engineering processes to extract valuable information from the raw data and feed it into the corresponding towers in the model: standard feature engineering and fine-tuned SBERT embeddings.
Standard feature engineering
Our data preparation process begins with standard feature engineering. Overall, we define four types of features:

Popularity – We calculate popularity scores at the individual job level, occupation level, and company level. This provides a metric of how attractive a particular job or company might be.
Textual similarity – To understand the contextual relationship between different textual elements, we compute similarity scores, including string similarity between the search query and the job title. This helps us gauge the relevance of a job opening to a job seeker’s search or application history.
Interaction – In addition, we extract interaction features from past user engagements with job listings. A prime example of this is the embedding similarity between past clicked job titles and candidate job titles. This measure helps us understand the similarity between previous jobs a user has shown interest in vs. upcoming job opportunities. This enhances the precision of our job recommendation engine.
Profile – Lastly, we extract user-defined job interest information from the user profile and compare it with new job candidates. This helps us understand if a job candidate matches a user’s interest.

A crucial step in our data preparation is the application of a pre-trained NER model. By implementing an NER model, we can identify and label named entities within job titles and search queries. Consequently, this allows us to compute similarity scores between these identified entities, providing a more focused and context-aware measure of relatedness. This methodology reduces the noise in our data and gives us a more nuanced, context-sensitive method of comparing jobs.
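As a simple illustration of the kinds of features described above, the snippet below computes a string-similarity score between a search query and a job title, and an embedding-similarity score between a past clicked job title and a candidate job title; the helper names and the use of difflib are illustrative assumptions rather than the production feature pipeline.

from difflib import SequenceMatcher
import numpy as np

def string_similarity(query: str, job_title: str) -> float:
    # Character-level similarity between the search query and a job title
    return SequenceMatcher(None, query.lower(), job_title.lower()).ratio()

def embedding_similarity(past_title_emb: np.ndarray, candidate_title_emb: np.ndarray) -> float:
    # Cosine similarity between a past clicked job title and a candidate job title
    num = float(np.dot(past_title_emb, candidate_title_emb))
    den = float(np.linalg.norm(past_title_emb) * np.linalg.norm(candidate_title_emb)) + 1e-8
    return num / den

print(string_similarity("machine learning engineer", "Senior Machine Learning Engineer"))
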
Fine-tuned SBERT embeddings
To enhance the relevance and accuracy of our job recommendation system, we use SBERT, a powerful transformer-based model known for its proficiency in capturing semantic meanings and contexts from text. However, generic SBERT embeddings, although effective, may not fully capture the unique nuances and terminologies inherent in a specific domain such as ours, which centers around employment and job searches. To overcome this, we fine-tune the SBERT embeddings using our domain-specific data. This fine-tuning process optimizes the model to better understand and process industry-specific language, jargon, and context, making the embeddings more reflective of our specific domain. As a result, the refined embeddings offer improved performance in capturing both semantic and contextual information within our sphere, leading to more accurate and meaningful job recommendations for our users.
The following figure illustrates the SBERT fine-tuning step.

We fine-tune SBERT embeddings using TripletLoss with a cosine distance metric, which learns text embeddings such that anchor and positive texts have a higher cosine similarity than anchor and negative texts. We use users’ search queries as anchor texts. We combine job titles and employer names as inputs to the positive and negative texts. The positive texts are sampled from job postings that the corresponding user clicked on, whereas the negative texts are sampled from job postings that the user did not click on. The following is a sample implementation of the fine-tuning procedure:

import math
from datetime import datetime

from torch.utils.data import DataLoader
from sentence_transformers import (SentenceTransformer, SentencesDataset,
                                   LoggingHandler, losses)
from sentence_transformers.readers import InputExample

model_name = 'all-mpnet-base-v2'
train_batch_size = 16
num_epochs = 1
model_save_path = (f'output/{model_name}_' +
                   datetime.now().strftime("%Y-%m-%d_%H-%M-%S"))

### load pre-trained SBERT model
model = SentenceTransformer(model_name, device="cuda")

### construct training dataset of triplet texts,
### stored in three lists (anchors, positives, negatives)
train_examples = []
for anchor, positive, negative in zip(anchors, positives, negatives):
    train_examples.append(InputExample(texts=(anchor, positive, negative)))

train_dataset = SentencesDataset(train_examples, model)
train_dataloader = DataLoader(train_dataset, shuffle=True,
                              batch_size=train_batch_size)

### use TripletLoss with cosine distance metric and margin=0.5
distance_metric = losses.TripletDistanceMetric.COSINE
train_loss = losses.TripletLoss(model=model,
                                distance_metric=distance_metric,
                                triplet_margin=0.5)

### 10% of train data for warm-up
warmup_steps = math.ceil(len(train_dataloader) * num_epochs * 0.1)

# Train the model
model.fit(train_objectives=[(train_dataloader, train_loss)],
          epochs=num_epochs,
          warmup_steps=warmup_steps,
          output_path=model_save_path)
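
After fine-tuning completes, the saved model can be loaded to produce the query and document embeddings that feed the query and document encoders. The following short sketch reuses the variables defined above; the example strings are placeholders.

### load the fine-tuned model and encode example inputs
fine_tuned_model = SentenceTransformer(model_save_path, device="cuda")

query_embeddings = fine_tuned_model.encode(["machine learning engineer remote"])   # query encoder input
doc_embeddings = fine_tuned_model.encode(["Senior ML Engineer at ExampleCorp"])    # document encoder input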

Model training with SageMaker Distributed Data Parallel
We use SageMaker Distributed Data Parallel (SMDDP), a feature of the SageMaker ML platform that is built on top of PyTorch DDP. It provides an optimized environment for running PyTorch DDP training jobs on the SageMaker platform and is designed to significantly speed up deep learning model training. It accomplishes this by splitting a large dataset into smaller chunks and distributing them across multiple GPUs. The model is replicated on every GPU. Each GPU processes its assigned data independently, and the results are collated and synchronized across all GPUs. DDP handles gradient communication to keep model replicas synchronized and overlaps it with gradient computation to speed up training. SMDDP utilizes an optimized AllReduce algorithm to minimize communication between GPUs, reducing synchronization time and improving overall training speed. The algorithm adapts to different network conditions, making it highly efficient for both on-premises and cloud-based environments. In the SMDDP architecture (as shown in the following figure), distributed training is also scaled using a cluster of many nodes. This means not just multiple GPUs in a computing instance, but many instances with multiple GPUs, which further speeds up training.

For more information about this architecture, refer to Introduction to SageMaker’s Distributed Data Parallel Library.
With SMDDP, we have been able to substantially reduce the training time for our TTDP model, making it eight times faster. Faster training times mean we can iterate and improve our models more quickly, leading to better job recommendations for our users in a shorter amount of time. This efficiency gain is instrumental in maintaining the competitiveness of our job recommendation engine in a fast-evolving job market.
You can adapt your training script to use SMDDP with only three lines of code, as shown in the following code block. Using PyTorch as an example, the only thing you need to do is import the SMDDP library's PyTorch client (smdistributed.dataparallel.torch.torch_smddp). The client registers smddp as a backend for PyTorch.

import smdistributed.dataparallel.torch.torch_smddp
import torch.distributed as dist

dist.init_process_group(backend='smddp')

After you have a working PyTorch script that is adapted to use the distributed data parallel library, you can launch a distributed training job using the SageMaker Python SDK.
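For example, a hedged sketch of launching such a job with the SageMaker PyTorch estimator follows; the entry point, role, instance type and count, framework versions, hyperparameters, and S3 paths are placeholders to replace with your own values.

from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train_ttdp.py",             # your DDP-adapted training script (placeholder name)
    source_dir="src",
    role="<your-sagemaker-execution-role>",
    framework_version="1.12.1",
    py_version="py38",
    instance_count=2,                         # multiple multi-GPU instances
    instance_type="ml.p4d.24xlarge",
    distribution={"smdistributed": {"dataparallel": {"enabled": True}}},
    hyperparameters={"epochs": 10, "batch_size": 256},
)

estimator.fit({"train": "s3://<your-bucket>/train", "validation": "s3://<your-bucket>/val"})
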
Evaluating model performance
When evaluating the performance of a recommendation system, it’s crucial to choose metrics that align closely with business goals and provide a clear understanding of the model’s effectiveness. In our case, we use the AUC to evaluate our TTDP model’s job click prediction performance and the mAP@K to assess the quality of the final ranked jobs list.
The AUC refers to the area under the receiver operating characteristic (ROC) curve. It represents the probability that a randomly chosen positive example will be ranked higher than a randomly chosen negative example. It ranges from 0–1, where 1 indicates an ideal classifier and 0.5 represents a random guess. mAP@K is a metric commonly used to assess the quality of information retrieval systems, such as our job recommender engine. It measures the average precision of retrieving the top K relevant items for a given query or user. It ranges from 0–1, with 1 indicating optimal ranking and 0 indicating the lowest possible precision at the given K value. We evaluate the AUC, mAP@1, and mAP@3. Collectively, these metrics allow us to gauge the model’s ability to distinguish between positive and negative classes (AUC) and its success at ranking the most relevant items at the top (mAP@K).
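The following is an illustrative sketch (not our evaluation code) of how these offline metrics can be computed from model scores, using scikit-learn for AUC and a small helper for mAP@K.

import numpy as np
from sklearn.metrics import roc_auc_score

def average_precision_at_k(ranked_relevance, k):
    # AP@K for one query: ranked_relevance is a 0/1 list in ranked order
    ranked_relevance = ranked_relevance[:k]
    hits, precisions = 0, []
    for i, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / i)
    return float(np.mean(precisions)) if precisions else 0.0

def mean_average_precision_at_k(relevance_per_query, k):
    return float(np.mean([average_precision_at_k(r, k) for r in relevance_per_query]))

# y_true: clicked (1) or not (0); y_score: predicted click probabilities
y_true = [1, 0, 0, 1, 0, 1]
y_score = [0.9, 0.2, 0.4, 0.7, 0.1, 0.5]
print("AUC:", roc_auc_score(y_true, y_score))
print("mAP@3:", mean_average_precision_at_k([[1, 0, 1], [0, 1, 0]], k=3))
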
Based on our offline evaluation, the TTDP model outperformed the baseline model—the existing XGBoost-based production model—by 16.65% for AUC, 20% for mAP@1, and 11.82% for mAP@3.

Furthermore, we designed an online A/B test to evaluate the proposed system and ran the test on a percentage of the US email population for 6 weeks. In total, approximately 22 million emails were sent using the jobs recommended by the new system. The resulting uplift in clicks compared to the previous production model was 8.6%. Talent.com is gradually increasing the percentage to roll out the new system to its complete population and channels.
Conclusion
Creating a job recommendation system is a complex endeavor. Each job seeker has unique needs, preferences, and professional experiences that can’t be inferred from a short search query. In this post, Talent.com collaborated with AWS to develop an end-to-end deep learning-based job recommender solution that ranks lists of jobs to recommend to users. The Talent.com team truly enjoyed collaborating with the AWS team throughout the process of solving this problem. This marks a significant milestone in Talent.com’s transformative journey, as the team takes advantage of the power of deep learning to empower its business.
This project fine-tuned SBERT to generate text embeddings. At the time of writing, AWS introduced Amazon Titan Embeddings as part of the foundation models (FMs) offered through Amazon Bedrock, a fully managed service providing a selection of high-performing foundation models from leading AI companies. We encourage readers to explore the machine learning techniques presented in this post and take advantage of the capabilities provided by AWS, such as SMDDP, while making use of Amazon Bedrock's foundation models to create their own search functionalities.
References

SBERT Training Overview
PyTorch Distributed Overview
The SageMaker Distributed Data Parallel Library Overview
Introduction to SageMaker’s Distributed Data Parallel Library

About the authors
Yi Xiang is an Applied Scientist II at the Amazon Machine Learning Solutions Lab, where she helps AWS customers across different industries accelerate their AI and cloud adoption.
Tong Wang is a Senior Applied Scientist at the Amazon Machine Learning Solutions Lab, where he helps AWS customers across different industries accelerate their AI and cloud adoption.
Dmitriy Bespalov is a Senior Applied Scientist at the Amazon Machine Learning Solutions Lab, where he helps AWS customers across different industries accelerate their AI and cloud adoption.
Anatoly Khomenko is a Senior Machine Learning Engineer at Talent.com with a passion for natural language processing and matching good people to good jobs.
Abdenour Bezzouh is an executive with more than 25 years of experience building and delivering technology solutions that scale to millions of customers. Abdenour held the position of Chief Technology Officer (CTO) at Talent.com when the AWS team designed and executed this particular solution for Talent.com.
Dale Jacques is a Senior AI Strategist within the Generative AI Innovation Center where he helps AWS customers translate business problems into AI solutions.
Yanjun Qi is a Senior Applied Science Manager at the Amazon Machine Learning Solutions Lab. She innovates and applies machine learning to help AWS customers speed up their AI and cloud adoption.

Meet OmniControl: An Artificial Intelligence Approach for Incorporating Flexible Spatial Control Signals into a Text-Conditioned Human Motion Generation Model Based on the Diffusion Process

Researchers address the issue of combining spatial control signals over every joint at any given time into text-conditioned human motion production. Modern diffusion-based techniques may produce varied and lifelike human motion, but they find it difficult to incorporate variable spatial control signals, which are essential for many applications. For instance, a model must regulate the hand position to contact the cup at a particular place and time and understand “pick up” semantics to synthesize the action for picking up a cup. Similarly, when moving through a room with low ceilings, a model must carefully regulate the height of the head for a certain amount of time to avoid accidents. 

Since they are difficult to describe in a textual prompt, these control signals are often supplied as global positions of the joints of interest in keyframes. However, previous inpainting-based approaches cannot incorporate flexible control signals because of the relative human pose representations they use, in which joint positions are expressed relative to the pelvis and to the previous frame. The global pelvis position supplied in a control signal must therefore be converted to a location relative to the previous frame before it can be used as a keyframe input, and a similar conversion is needed before the global positions of other joints can be used.

In both cases, however, the pelvis's relative locations are missing or inaccurate during the diffusion generation process. Moreover, to integrate spatial control signals on joints other than the pelvis, one must first handle the sparse constraints on the pelvis itself. Other work presents a two-stage model, but it still has trouble controlling other joints because of the limited control signals over the pelvis. In this study, researchers from Northeastern University and Google Research propose OmniControl, a new diffusion-based human motion generation model that can incorporate flexible spatial control signals over any joint at any given moment. Building on OmniControl, realism guidance is added to regulate the generation of human movements.

Figure 1: Given a written prompt and adaptable spatial control signals, OmniControl can produce convincing human gestures. Later frames in the series are indicated by darker colours. The input control signals are shown by the green line or points.

For the model to work well, they use the same relative human posture representations for input and output. However, they suggest, in contrast to current approaches, converting the produced motion to global coordinates for direct comparison with the input control signals in the spatial guidance module, where the gradients of the error are employed to improve the motion. It resolves the shortcomings of the earlier inpainting-based methods by removing the uncertainty regarding the relative locations of the pelvis. Additionally, compared to previous approaches, it enables dynamic iterative refining of the produced motion, improving control precision. 

Although it successfully enforces spatial constraints, spatial guidance alone frequently results in drifting issues and abnormal human movements. Drawing inspiration from controllable image generation, they present realism guidance, which outputs residuals with respect to the features in each attention layer of the motion diffusion model, to solve these problems. These residuals can explicitly and densely alter whole-body motion. To produce realistic, coherent, and consistent movements under spatial constraints, both the spatial and the realism guidance are crucial, and they are complementary in balancing control precision and motion realism.

Studies using HumanML3D and KIT-ML demonstrate that OmniControl performs significantly better than the most advanced text-based motion generation techniques for pelvic control in terms of both motion realism and control accuracy. More importantly, OmniControl excels at incorporating spatial constraints over any joint at any moment. Additionally, as illustrated in Fig. 1, a single model can be trained to control numerous joints collectively rather than separately (for example, both the left and right wrists).

These features of OmniControl enable several downstream applications, such as tying generated human motion to the surrounding scene and objects, as seen in Fig. 1's last column. Their main contributions are: (1) As far as they are aware, OmniControl is the first approach capable of incorporating spatial control signals over any joint at any moment. (2) To effectively balance control precision and motion realism in the produced motion, they propose a novel control module that uses spatial and realism guidance. (3) Tests demonstrate that OmniControl can control additional joints using a single model in text-based motion creation, setting a new standard for controlling the pelvis and opening up various applications in human motion production.

Check out the Paper and Project.

Google Cloud Commits to Protect Customers for Generative AI Indemnification

In a forward-looking move, Google Cloud has reaffirmed its dedication to its customers’ interests, positioning them at the forefront of a journey characterized by shared innovation, support, and fate. This means that when businesses choose to partner with Google Cloud, they embark on a collaborative expedition that prioritizes the latest and best technology, all while safeguarding their safety and security. In the ever-evolving realm of generative AI, this commitment takes on paramount importance.

Earlier this year, Google Cloud integrated Duet AI, an always-on AI collaborator, across its suite of products, spanning from Google Workspace to Google Cloud Platform. This monumental stride was coupled with significant advancements to Vertex AI, affording customers the ability to experiment and construct with generative AI foundation models in a safe, secure, and responsible manner. The outcomes have been nothing short of remarkable, with innovative use cases emerging from a diverse array of industries.

One pivotal aspect addressed by Google Cloud is intellectual property indemnity in the context of generative AI. The company acknowledges the potential legal risks customers may encounter, particularly in instances where copyright challenges arise. In response, Google Cloud has devised a groundbreaking, two-pronged approach that sets a new industry standard. This approach aims to provide customers with a greater sense of security and confidence when deploying generative AI products.

The first prong centers on Google’s use of training data. While this indemnity is not a new protection, it underscores Google Cloud’s unwavering commitment to standing behind its services. It extends to all services, including generative AI offerings, and serves as a third-party intellectual property indemnity standard for all customers. This assurance addresses any allegations asserting that Google’s utilization of training data for generative models infringes upon a third party’s intellectual property rights. In essence, this indemnity serves as a powerful safeguard, ensuring that regardless of the training data underpinning the services, Google unequivocally indemnifies its customers.

The second prong introduces a layer of protection relating to the generated output, crafted by customers in response to prompts or inputs they provide to Google’s services. This additional indemnity fortifies the customer’s position by extending the indemnity obligations to allegations of intellectual property rights infringement pertaining to the generated output. This protection encompasses a range of Google Cloud services, including Duet AI in Workspace, Vertex AI Search, and other components. It assures customers that Google will stand by them in the event of third-party IP claims, including copyright, assuming responsible AI practices are adhered to.

These dual indemnities represent a robust shield for Google Cloud customers. They provide coverage against potential claims, including copyright infringement, emanating from both the generated output and Google’s use of training data to craft generative AI models. By introducing these comprehensive protections, Google Cloud aims to offer a balanced and practical solution for relevant types of potential claims. Importantly, customers will automatically benefit from these terms without the need for any amendments to their existing agreements.

This marks just the initial step in Google Cloud’s ongoing commitment to supporting customers on their shared journey into the realm of generative AI. With these safeguards in place, businesses can harness the power of generative AI for their operations with confidence, knowing that Google Cloud has their back, come what may.

Check out the Reference Page.

Meet FastEmbed: A Fast and Lightweight Text Embedding Generation Python Library

Words and phrases can be effectively represented as vectors in a high-dimensional space using embeddings, making them a crucial tool in the field of natural language processing (NLP). Machine translation, text classification, and question answering are just a few of the numerous applications that can benefit from the ability of this representation to capture semantic connections between words.

However, when dealing with large datasets, the computational requirements for generating embeddings can be daunting. This is primarily because constructing a large co-occurrence matrix is a prerequisite for traditional embedding approaches like Word2Vec and GloVe. For very large documents or vocabulary sizes, this matrix can become unmanageably enormous.

To address the challenges of slow embedding generation, the Python community has developed FastEmbed. FastEmbed is designed for speed, minimal resource usage, and precision. This is achieved through its cutting-edge embedding generation method, which eliminates the need for a co-occurrence matrix.

Rather than simply mapping words into a high-dimensional space, FastEmbed employs a technique called random projection. By utilizing the dimensionality reduction approach of random projection, it becomes possible to reduce the number of dimensions in a dataset while preserving its essential characteristics.

FastEmbed randomly projects words into a space where they are likely to be close to other words with similar meanings. This process is facilitated by a random projection matrix designed to preserve word meanings.
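
The following is a minimal NumPy sketch of the random projection idea described above (an illustration of the technique, not FastEmbed's internal implementation): a random Gaussian matrix maps high-dimensional vectors to a lower-dimensional space while approximately preserving pairwise distances.

import numpy as np

rng = np.random.default_rng(0)

# Toy "word vectors" in a high-dimensional space (e.g., rows of a co-occurrence matrix)
vocab_size, high_dim, low_dim = 1000, 10_000, 128
word_vectors = rng.random((vocab_size, high_dim))

# Random projection matrix, scaled so pairwise distances are roughly preserved
projection = rng.normal(size=(high_dim, low_dim)) / np.sqrt(low_dim)
embeddings = word_vectors @ projection  # shape: (vocab_size, low_dim)

# Distances before and after projection stay close on average
a, b = word_vectors[0], word_vectors[1]
print(np.linalg.norm(a - b), np.linalg.norm(embeddings[0] - embeddings[1]))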

Once words are mapped into the high-dimensional space, FastEmbed employs a straightforward linear transformation to learn embeddings for each word. This linear transformation is learned by minimizing a loss function designed to capture semantic connections between words.

It has been demonstrated that FastEmbed is significantly faster than standard embedding methods while maintaining a high level of accuracy. FastEmbed can also be used to create embeddings for extensive datasets while remaining relatively lightweight.

FastEmbed’s Advantages

Speed: Compared to other popular embedding methods like Word2Vec and GloVe, FastEmbed offers remarkable speed improvements.

Lightweight: FastEmbed is a compact yet powerful library for generating embeddings in large databases.

Accuracy: FastEmbed is as accurate as other embedding methods, if not more so.

Applications of FastEmbed

Machine Translation

Text Categorization

Answering Questions and Summarizing Documents

Information Retrieval and Summarization

FastEmbed is an efficient, lightweight, and precise toolkit for generating text embeddings. If you need to create embeddings for massive datasets, FastEmbed is an indispensable tool.

Check out the Project Page.

Meet MatFormer: A Universal Nested Transformer Architecture for Flexible Model Deployment Across Platforms

Transformer models find applications in various settings, ranging from powerful multi-accelerator clusters to individual mobile devices. The varied requirements for inference in these settings make developers train foundational models like PaLM 2, Llama, and ViTs in different sizes. However, the higher costs associated with training lead to a restricted set of supported model sizes.

Large foundational models are used in different situations, such as giving quick responses on mobile phones or handling batches on multi-cluster GPUs for large-scale web applications. Each model family therefore provides a selection of independently trained models in different sizes to accommodate various circumstances. To cover a wide range of applications, these model sizes are typically spaced roughly linearly on a logarithmic scale.

Consequently, a group of researchers from Google Research, the University of Texas at Austin, the University of Washington, and Harvard University have introduced MatFormer—a Transformer architecture explicitly crafted for adaptability, as outlined in their latest paper, which is titled MatFormer: Nested Transformer for Elastic Inference. MatFormer makes it easier to build an integrated model that can generate numerous smaller submodels without extra training.

They have incorporated a nested sub-structure within the standard Transformer and jointly optimized all the granularities to produce a single, universal elastic model.

The researchers emphasized that they have produced many accurate submodels without incurring additional training costs by deliberately mixing various levels of information in various layers of a universal MatFormer model. Each Feed Forward Network (FFN) block in the MatFormer architecture is optimized with a collection of smaller, nested FFN blocks. Through this training approach, they combined and adjusted the complexity of the model across different layers.

The nested structure is implemented on the hidden representations of the Feed Forward Network (FFN) block, amplifying the model's capabilities by placing the attention heads in order of significance, from the most significant to the least. Compared to independently training equivalent Transformer-based submodels, training is accelerated by 15% because the more significant heads are shared among a larger number of submodels. Additionally, this method aligns with the specifically optimized submodel curve and permits the extraction of several smaller submodels while maintaining accuracy.
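
A minimal sketch of the nested idea follows (an illustration, not the official MatFormer implementation), assuming a submodel of granularity g reuses only the first fraction g of the shared FFN hidden units.

import torch
import torch.nn as nn
import torch.nn.functional as F

class NestedFFN(nn.Module):
    """Sketch of a MatFormer-style nested feed-forward block: a submodel of
    granularity g uses only the first g * d_ff hidden units of the shared weights."""

    def __init__(self, d_model=512, d_ff=2048):
        super().__init__()
        self.w_in = nn.Linear(d_model, d_ff)
        self.w_out = nn.Linear(d_ff, d_model)

    def forward(self, x, granularity=1.0):
        k = max(1, int(self.w_in.out_features * granularity))  # nested prefix of hidden units
        h = F.relu(F.linear(x, self.w_in.weight[:k], self.w_in.bias[:k]))
        return F.linear(h, self.w_out.weight[:, :k], self.w_out.bias)

x = torch.randn(2, 16, 512)
ffn = NestedFFN()
y_full = ffn(x, granularity=1.0)    # largest submodel
y_small = ffn(x, granularity=0.25)  # smaller nested submodel, same shared weights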

The researchers found that they could produce a sizable number of accurate smaller models without further optimization by choosing different levels of detail for each MatFormer layer.

The team studied the effectiveness across a range of model types (decoders and encoders), modalities (language and vision), and scales (up to 2.6 billion parameters). The researchers emphasized that comparing these smaller models to their independently trained counterparts reveals comparable validation loss and one-shot downstream performance. Also, MatFormer exhibits robust generalization and works well as vision encoders (MatViT) and decoder-only language models (MatLM). In terms of accuracy and dependability, it scales similarly to the traditional Transformer. 

Check out the Paper.

Demystifying Generative Artificial Intelligence: An In-Depth Dive into Diffusion Models and Visual Computing Evolution

For decades, the computer graphics and 3D computer vision communities have been working to create physically realistic models for generating computer-generated visuals or deducing the physical characteristics of a scene from pictures. Several industries, including visual effects, gaming, image and video processing, computer-aided design, virtual and augmented reality, data visualization, robotics, autonomous vehicles, and remote sensing, among others, are built on this methodology, which includes rendering, simulation, geometry processing, and photogrammetry. An entirely new way of thinking about visual computing has emerged with the rise of generative artificial intelligence (AI). With only a written prompt or high-level human instruction as input, generative AI systems enable the creation and manipulation of photorealistic and styled photos, movies, or 3D objects.

These technologies automate several time-consuming tasks in visual computing that were previously only available to specialists with in-depth topic expertise. Foundation models for visual computing, such as Stable Diffusion, Imagen, Midjourney, or DALL-E 2 and DALL-E 3, have opened the unparalleled powers of generative AI. These models have “seen it all” after being trained on hundreds of millions to billions of text-image pairings, and they are incredibly vast, with just a few billion learnable parameters. These models were the basis for the generative AI tools mentioned above and were trained on an enormous cloud of powerful graphics processing units (GPUs). 

The diffusion models based on convolutional neural networks (CNN) frequently used to generate images, videos, and 3D objects integrate text calculated using transformer-based architectures, such as CLIP, in a multi-modal fashion. There is still room for the academic community to make significant contributions to the development of these tools for graphics and vision, even though well-funded industry players have used a significant amount of resources to develop and train foundation models for 2D image generation. For example, it needs to be clarified how to adapt current picture foundation models for use in other, higher-dimensional domains, such as video and 3D scene creation. 

A need for more specific kinds of training data mostly causes this. For instance, there are many more examples of low-quality and generic 2D photos on the web than of high-quality and varied 3D objects or settings. Furthermore, scaling 2D image creation systems to accommodate greater dimensions, as necessary for video, 3D scene, or 4D multi-view-consistent scene synthesis, is not immediately apparent. Another example of a current limitation is computation: even though an enormous amount of (unlabeled) video data is available on the web, current network architectures are frequently too inefficient to be trained in a reasonable amount of time or on a reasonable amount of compute resources. This results in diffusion models being rather slow at inference time. This is due to their networks’ large size and iterative nature. 

Figure 1: The theory and application of diffusion models for visual computing are covered in this cutting-edge paper. Recently, these models have taken over as the accepted norm for creating and modifying images, videos, and objects in 3D and 4D. 

Despite the unresolved issues, the number of diffusion models for visual computing has increased dramatically in the past year (see illustrative examples in Fig. 1). The objectives of this state-of-the-art report (STAR) developed by researchers from multiple universities are to offer an organized review of the numerous recent publications focused on applications of diffusion models in visual computing, to teach the principles of diffusion models, and to identify outstanding issues. 

Check out the Paper.

SalesForce AI Introduces CodeChain: An Innovative Artificial Intelligence Framework For Modular Code Generation Through A Chain of Self-Revisions With Representative Sub-Modules

A major objective in the study of Artificial Intelligence is the development of AI systems that can provide useful computer programs to address challenging issues. Much progress has been made in this direction in recent years, especially with the remarkable successes of massive pretrained Large Language Models (LLMs). These models were first created for natural language comprehension, but they have now expanded to include the ability to generate and comprehend code and text. Notable progress has been made in producing code from descriptions of natural language problems as a result of this development.

LLMs have already proven themselves capable of handling straightforward programming tasks, as seen by their achievements in benchmarks such as MBPP and HumanEval. However, these models encounter significant difficulties when trying to solve more difficult and competitive programming tasks. Their propensity to provide code solutions as monolithic blocks rather than decomposing them into logical subtasks and reusable sub-modules is one of the primary causes of their difficulties. On the other hand, when faced with complex problems, skilled human programmers instinctively write modular and abstract code. By reusing previously created modules, they effectively expand upon their current expertise.

In a recent study, a team of researchers from Salesforce Research has introduced CodeChain, an innovative framework for bridging the gap between LLMs and human developers. With a sequence of self-revisions driven by representative sub-modules developed in earlier iterations, this framework aims to improve the process of developing modularized code. CodeChain instructs the LLM to write modularized code using a chain-of-thought approach. The intention is to motivate the model to approach problem-solving in terms of logical subtasks and submodules.

A sequence of self-revisions forms the basis of CodeChain. There are two iterative phases in it, which are as follows.

Sub-Module Extraction and Clustering: In this stage, sub-modules are found by analyzing the code that the LLM produced. After that, these sub-modules are arranged into clusters. Representative sub-modules are chosen from each cluster. These representations are thought to be more widely applicable and reusable.

Prompt Augmentation and Re-Generation: The initial chain-of-thought prompt is enhanced and regenerated by integrating the chosen module implementations from the preceding stage. After that, the LLM is told to produce fresh modularized solutions once more. As a result, the model can effectively expand upon the information and understanding that it has obtained from earlier iterations.

CodeChain has a great impact on code generation. The team has shared that the modularity and accuracy of generated solutions are greatly improved by pushing the LLM to build upon and reuse pre-existing, verified sub-modules. The framework achieved relative pass@1 improvements of 35% on APPS and an astounding 76% on CodeContests. These gains are shown across a variety of LLMs, including open-source LLMs like WizardCoder and models from OpenAI. Comprehensive ablation studies have been carried out to gain a deeper understanding of the elements that have contributed to CodeChain's success. Aspects such as prompting techniques, the number of clusters employed, the sizes of the LLM models, and the caliber of the programs produced are all examined in these studies. The understanding obtained from these investigations clarifies why CodeChain is so successful in raising the caliber and modularity of code produced by LLMs.

To sum up, CodeChain is a revolutionary development in the field of large language model code generation. It achieves this by promoting modularity and facilitating self-revisions by reusing previously created sub-modules, hence bridging the gap between LLMs and seasoned human programmers.

Check out the Paper.

KAIST Researchers Propose SyncDiffusion: A Plug-and-Play Module that Synchronizes Multiple Diffusions through Gradient Descent from a Perceptual Similarity Loss

In a recent research paper, a team of researchers from KAIST introduced SYNCDIFFUSION, a groundbreaking module that aims to enhance the generation of panoramic images using pretrained diffusion models. The researchers identified a significant problem in panoramic image creation, primarily involving the presence of visible seams when stitching together multiple fixed-size images. To address this issue, they proposed SYNCDIFFUSION as a solution.

Creating panoramic images, those with wide, immersive views, poses challenges for image generation models, as they are typically trained to produce fixed-size images. When attempting to generate panoramas, the naive approach of stitching multiple images together often results in visible seams and incoherent compositions. This issue has driven the need for innovative methods to seamlessly blend images and maintain overall coherence.

Two prevalent methods for generating panoramic images are sequential image extrapolation and joint diffusion. The former involves generating a final panorama by extending a given image sequentially, fixing the overlapped region in each step. However, this method often struggles to produce realistic panoramas and tends to introduce repetitive patterns, leading to less-than-ideal results.

On the other hand, joint diffusion operates the reverse generative process simultaneously across multiple views and averages intermediate noisy images in overlapping regions. While this approach effectively generates seamless montages, it falls short in terms of maintaining content and style consistency across the views. As a result, it frequently combines images with different content and styles within a single panorama, resulting in incoherent outputs.

The researchers introduced SYNCDIFFUSION as a module that synchronizes multiple diffusions by employing gradient descent based on a perceptual similarity loss. The critical innovation lies in the use of the predicted denoised images at each denoising step to calculate the gradient of the perceptual loss. This approach offers meaningful guidance for creating coherent montages, as it ensures that the images blend seamlessly while maintaining content consistency.
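
The following is an illustrative sketch (not the authors' released code) of what one such synchronization step could look like; predict_denoised and perceptual_loss are hypothetical callables standing in for the diffusion model's denoised-image prediction and a perceptual metric such as LPIPS.

import torch

def sync_step(latents, t, predict_denoised, perceptual_loss, anchor_idx=0, weight=20.0):
    # One SyncDiffusion-style synchronization step over a batch of views (sketch)
    latents = latents.detach().requires_grad_(True)
    x0 = predict_denoised(latents, t)                # predicted clean images for every view (hypothetical helper)
    anchor = x0[anchor_idx:anchor_idx + 1].detach()  # anchor view's predicted image
    loss = perceptual_loss(x0, anchor.expand_as(x0)).mean()
    grad = torch.autograd.grad(loss, latents)[0]
    return (latents - weight * grad).detach()        # latents nudged toward perceptual coherence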

In a series of experiments using SYNCDIFFUSION with the Stable Diffusion 2.0 model, the researchers found that their method significantly outperformed previous techniques. The user study conducted showed a substantial preference for SYNCDIFFUSION, with a 66.35% preference rate, as opposed to the previous method’s 33.65%. This marked improvement demonstrates the practical benefits of SYNCDIFFUSION in generating coherent panoramic images.

SYNCDIFFUSION is a notable addition to the field of image generation. It effectively tackles the challenge of generating seamless and coherent panoramic images, which has been a persistent issue in the field. By synchronizing multiple diffusions and applying gradient descent from perceptual similarity loss, SYNCDIFFUSION enhances the quality and coherence of generated panoramas. As a result, it offers a valuable tool for a wide range of applications that involve creating panoramic images, and it showcases the potential of using gradient descent in improving image generation processes.

Check out the Paper and Project Page.

Meet ScaleCrafter: Unlocking Ultra-High-Resolution Image Synthesis with Pre-trained Diffusion Models

The development of image synthesis techniques has experienced a notable upsurge in recent years, garnering major interest from the academic and industry worlds. Text-to-image generation models and Stable Diffusion (SD) are the most widely used developments in this field. Although these models have demonstrated remarkable abilities, they can only currently produce images with a maximum resolution of 1024 x 1024 pixels, which is insufficient to satisfy the requirements of high-resolution applications like advertising.

Problems develop when trying to generate images larger than these training resolutions, mostly in the form of object repetition and deformed object structures. For example, if a Stable Diffusion model trained on 512 x 512 images is used to generate images beyond 512 x 512, such as at 1024 x 1024, object duplication becomes more problematic as the image size increases.

In the resulting graphics, these problems mostly show up as object duplication and incorrect object topologies. The existing methods for creating higher-resolution images, such as those based on joint-diffusion techniques and attention mechanisms, find it difficult to adequately address these issues. Researchers have examined the U-Net architecture's structural elements in diffusion models and pinpointed a crucial element causing the problems: the constrained receptive field of the convolutional kernels. Basically, issues like object recurrence arise because the model's convolutional procedures are limited in their capacity to see and comprehend the content of the input images.

A team of researchers has proposed ScaleCrafter for higher-resolution visual generation at inference time. It uses re-dilation, a simple yet incredibly powerful solution that enables the models to handle greater resolutions and varying aspect ratios more effectively by dynamically adjusting the convolutional receptive field throughout the image generation process, which enhances the coherence and quality of the generated images. The work presents two further advances: dispersed convolution and noise-damped classifier-free guidance. With these, the model can produce ultra-high-resolution photographs, up to 4096 x 4096 pixel dimensions. This method doesn't require any extra training or optimization stages, making it a workable solution for the repetition and structural problems of high-resolution image synthesis.
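
As an illustration of the re-dilation idea (a sketch, not the official ScaleCrafter code), the snippet below enlarges the receptive field of a U-Net's 3x3 convolutions at inference time by increasing their dilation, adjusting padding so stride-1 convolutions keep their output size.

import torch.nn as nn

def redilate(unet: nn.Module, factor: int = 2) -> nn.Module:
    # Increase the dilation of every 3x3 convolution so its receptive field grows
    # at inference time, without retraining (illustrative sketch)
    for module in unet.modules():
        if isinstance(module, nn.Conv2d) and module.kernel_size == (3, 3):
            module.dilation = (factor, factor)
            module.padding = (factor, factor)  # preserves output size for stride-1 convs
    return unet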

Comprehensive tests have been carried out for this study, which showed that the suggested method successfully addresses the object repetition issue and delivers cutting-edge results in producing images with higher resolution, especially excelling in displaying complex texture details. This work also sheds light on the possibility of using diffusion models that have already been trained on low-resolution images to generate high-resolution visuals without requiring a lot of retraining, which could guide future work in the field of ultra-high-resolution image and video synthesis.

The primary contributions have been summarized as follows.

The team has found that rather than the number of attention tokens, the primary cause of object repetition is the convolutional procedures’ constrained receptive field.

Based on these findings, the team has proposed a re-dilation approach that dynamically increases the convolutional receptive field while inference is underway, which tackles the root of the issue.

Two innovative strategies have been presented: dispersed convolution and noise-damped classifier-free guidance, specifically meant to be used in creating ultra-high-resolution images.

The method has been applied to a text-to-video model and has been comprehensively evaluated across a variety of diffusion models, including different iterations of Stable Diffusion. These tests include a wide range of aspect ratios and image resolutions, showcasing the model’s effectiveness in addressing the problem of object recurrence and improving high-resolution image synthesis.

Check out the Paper and Github.

6 Magic Commands for Jupyter Notebooks in Python Data Science

In the field of Python-based Data Science projects, the utilization of Jupyter Notebooks is ubiquitous. These interactive and user-friendly environments facilitate seamless integration of code and documentation, providing a conducive space for exploration and analysis. Within this framework exists a set of magic commands that prove invaluable tools. These commands enhance workflow efficiency and serve as time-saving instruments for the discerning data scientist.

1. %%ai: Conversing with Models in Jupyter

The command “%%ai” makes it possible to enter the world of natural language interactions with machine learning models. Users can choose a model using this command and then have natural language conversations with that model. This function expands the range of possibilities for model exploration and enhances the interactivity of Jupyter Notebooks.

2. %%latex: Elevating Visual Representations

The “%%latex” command is essential for including mathematical equations or symbols in notebooks. It renders LaTeX code directly in Jupyter Notebooks, enabling seamless integration of mathematical expressions for clearer and more professional presentations.

3. %%sql: Empowering Database Interactions

With the “%%sql” magic command, the integration of SQL queries into Jupyter Notebooks is simplified. It allows users to execute SQL queries directly inside the notebook environment. This functionality eliminates the need for external interfaces, which is useful for data scientists working with databases.

4. %run: Effortless Python File Execution

With the “%run” magic command, running external Python files inside a Jupyter Notebook is simpler. Only one command is needed to execute a Python file and access what it defines, whether it is a standalone script or a module. This improves the modularity of Jupyter-based work by making it easy to integrate external code.

5. %%writefile: Streamlining File Creation

The “%%writefile” magic command addresses the need for quick file creation within the notebook. Users can easily create new Python files by entering the desired file name and including the content within the cell. This functionality guarantees a simpler approach to file management while improving code organization.
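
For instance, the two cells sketched below first create a small helper file with “%%writefile” and then execute it with “%run” so its definitions become available in the notebook; the file name and function are made up for illustration, and each cell magic must be the first line of its own cell.

%%writefile helpers.py
def greet(name):
    # tiny module saved to helpers.py by the %%writefile cell magic
    return f"Hello, {name}!"

And in a following cell:

%run helpers.py
print(greet("data scientist"))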

6. %history -n: Retrieving Previous Commands

In Jupyter Notebooks, commands and their results sometimes get deleted by accident. The “%history” magic lists your past inputs, and its options control what you see: the -n flag prints line numbers, and a range such as 1-10 limits the output to those lines.
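Two typical invocations (the search pattern is just an example):

```python
%history -n 1-10          # show input lines 1 through 10 of this session, with line numbers
%history -n -g read_csv   # search past sessions for inputs containing "read_csv"
```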

For Python-based data science projects, these magic commands enhance the Jupyter Notebook experience: they streamline interactions with models, database access, and file management. As the data science landscape evolves, such tools become crucial for staying ahead in the search for insights. Using these commands, data scientists can simplify their projects and work more effectively, making their analyses stronger in the end.


Governing the ML lifecycle at scale, Part 1: A framework for architect …

Customers of every size and industry are innovating on AWS by infusing machine learning (ML) into their products and services. Recent developments in generative AI models have further accelerated the need for ML adoption across industries. However, implementing security, data privacy, and governance controls remains a key challenge for customers when implementing ML workloads at scale. Addressing those challenges builds the framework and foundations for mitigating risk and enabling responsible use of ML-driven products. Although generative AI may need additional controls in place, such as removing toxicity and preventing jailbreaking and hallucinations, it shares the same foundational components for security and governance as traditional ML.
We hear from customers that they require specialized knowledge and investment of up to 12 months for building out their customized Amazon SageMaker ML platform implementation to ensure scalable, reliable, secure, and governed ML environments for their lines of business (LOBs) or ML teams. If you lack a framework for governing the ML lifecycle at scale, you may run into challenges such as team-level resource isolation, scaling experimentation resources, operationalizing ML workflows, scaling model governance, and managing security and compliance of ML workloads.
Governing ML lifecycle at scale is a framework to help you build an ML platform with embedded security and governance controls based on industry best practices and enterprise standards. This framework addresses challenges by providing prescriptive guidance through a modular framework approach extending an AWS Control Tower multi-account AWS environment and the approach discussed in the post Setting up secure, well-governed machine learning environments on AWS.
It provides prescriptive guidance for the following ML platform functions:

Multi-account, security, and networking foundations – This function uses AWS Control Tower and well-architected principles for setting up and operating a multi-account environment and the associated security and networking services.
Data and governance foundations – This function uses a data mesh architecture for setting up and operating the data lake, central feature store, and data governance foundations to enable fine-grained data access.
ML platform shared and governance services – This function enables setting up and operating common services such as CI/CD, AWS Service Catalog for provisioning environments, and a central model registry for model promotion and lineage.
ML team environments – This function enables setting up and operating environments for ML teams for model development, testing, and deploying their use cases for embedding security and governance controls.
ML platform observability – This function helps with troubleshooting and identifying the root cause for problems in ML models through centralization of logs and providing tools for log analysis visualization. It also provides guidance for generating cost and usage reports for ML use cases.

Although this framework can provide benefits to all customers, it’s most beneficial for large, mature, regulated, or global enterprise customers that want to scale their ML strategies in a controlled, compliant, and coordinated way across the organization. It helps enable ML adoption while mitigating risks. This framework is useful for the following customers:

Large enterprise customers that have many LOBs or departments interested in using ML. This framework allows different teams to build and deploy ML models independently while providing central governance.
Enterprise customers with a moderate to high maturity in ML. They have already deployed some initial ML models and are looking to scale their ML efforts. This framework can help accelerate ML adoption across the organization. These companies also recognize the need for governance to manage things like access control, data usage, model performance, and unfair bias.
Companies in regulated industries such as financial services, healthcare, chemistry, and the private sector. These companies need strong governance and auditability for any ML models used in their business processes. Adopting this framework can help facilitate compliance while still allowing for local model development.
Global organizations that need to balance centralized and local control. This framework’s federated approach allows the central platform engineering team to set some high-level policies and standards, but also gives LOB teams flexibility to adapt based on local needs.

In the first part of this series, we walk through the reference architecture for setting up the ML platform. In a later post, we will provide prescriptive guidance for how to implement the various modules in the reference architecture in your organization.
The capabilities of the ML platform are grouped into four categories, as shown in the following figure. These capabilities form the foundation of the reference architecture discussed later in this post:

Build ML foundations
Scale ML operations
Observable ML
Secure ML

Solution overview
The framework for governing the ML lifecycle at scale enables organizations to embed security and governance controls throughout the ML lifecycle, which in turn helps organizations reduce risk and accelerate infusing ML into their products and services. The framework helps optimize the setup and governance of secure, scalable, and reliable ML environments that can scale to support an increasing number of models and projects. The framework enables the following features:

Account and infrastructure provisioning with organization policy compliant infrastructure resources
Self-service deployment of data science environments and end-to-end ML operations (MLOps) templates for ML use cases
LOB-level or team-level isolation of resources for security and privacy compliance
Governed access to production-grade data for experimentation and production-ready workflows
Management and governance for code repositories, code pipelines, deployed models, and data features
A model registry and feature store (local and central components) for improving governance
Security and governance controls for the end-to-end model development and deployment process

In this section, we provide an overview of prescriptive guidance to help you build this ML platform on AWS with embedded security and governance controls.
The functional architecture associated with the ML platform is shown in the following diagram. The architecture maps the different capabilities of the ML platform to AWS accounts.

The functional architecture with different capabilities is implemented using a number of AWS services, including AWS Organizations, SageMaker, AWS DevOps services, and a data lake. The reference architecture for the ML platform with various AWS services is shown in the following diagram.

This framework considers multiple personas and services to govern the ML lifecycle at scale. We recommend the following steps to organize your teams and services:

Using AWS Control Tower and automation tooling, your cloud administrator sets up the multi-account foundations such as Organizations and AWS IAM Identity Center (successor to AWS Single Sign-On) and security and governance services such as AWS Key Management Service (AWS KMS) and Service Catalog. In addition, the administrator sets up a variety of organization units (OUs) and initial accounts to support your ML and analytics workflows.
Data lake administrators set up your data lake and data catalog, and set up the central feature store working with the ML platform admin.
The ML platform admin provisions ML shared services such as AWS CodeCommit, AWS CodePipeline, Amazon Elastic Container Registry (Amazon ECR), a central model registry, SageMaker Model Cards, SageMaker Model Dashboard, and Service Catalog products for ML teams.
The ML team lead federates via IAM Identity Center, uses Service Catalog products, and provisions resources in the ML team’s development environment.
Data scientists from ML teams across different business units federate into their team’s development environment to build the model pipeline.
Data scientists search and pull features from the central feature store catalog, build models through experiments, and select the best model for promotion.
Data scientists create and share new features into the central feature store catalog for reuse.
An ML engineer deploys the model pipeline into the ML team test environment using a shared services CI/CD process.
After stakeholder validation, the ML model is deployed to the team’s production environment.
Security and governance controls are embedded into every layer of this architecture using services such as AWS Security Hub, Amazon GuardDuty, Amazon Macie, and more.
Security controls are centrally managed from the security tooling account using Security Hub.
ML platform governance capabilities such as SageMaker Model Cards and SageMaker Model Dashboard are centrally managed from the governance services account.
Amazon CloudWatch and AWS CloudTrail logs from each member account are made accessible centrally from an observability account using AWS native services.

Next, we dive deep into the modules of the reference architecture for this framework.
Reference architecture modules
The reference architecture comprises eight modules, each designed to solve a specific set of problems. Collectively, these modules address governance across various dimensions, such as infrastructure, data, model, and cost. Each module offers a distinct set of functions and interoperates with other modules to provide an integrated end-to-end ML platform with embedded security and governance controls. In this section, we present a short summary of each module’s capabilities.
Multi-account foundations
This module helps cloud administrators build an AWS Control Tower landing zone as a foundational framework. This includes building a multi-account structure, authentication and authorization via IAM Identity Center, a network hub-and-spoke design, centralized logging services, and new AWS member accounts with standardized security and governance baselines.
In addition, this module gives best practice guidance on OU and account structures that are appropriate for supporting your ML and analytics workflows. Cloud administrators will understand the purpose of the required accounts and OUs, how to deploy them, and key security and compliance services they should use to centrally govern their ML and analytics workloads.
A framework for vending new accounts is also covered, which uses automation for baselining new accounts when they are provisioned. By having an automated account provisioning process set up, cloud administrators can provide ML and analytics teams the accounts they need to perform their work more quickly, without sacrificing a strong foundation for governance.
Data lake foundations
This module helps data lake admins set up a data lake to ingest data, curate datasets, and use the AWS Lake Formation governance model for managing fine-grained data access across accounts and users using a centralized data catalog, data access policies, and tag-based access controls. You can start small with one account for your data platform foundations for a proof of concept or a few small workloads. For medium-to-large-scale production workload implementation, we recommend adopting a multi-account strategy. In such a setting, LOBs can assume the role of data producers and data consumers using different AWS accounts, and the data lake governance is operated from a central shared AWS account. The data producer collects, processes, and stores data from their data domain, in addition to monitoring and ensuring the quality of their data assets. Data consumers consume the data from the data producer after the centralized catalog shares it using Lake Formation. The centralized catalog stores and manages the shared data catalog for the data producer accounts.
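As a rough sketch of what fine-grained, cross-account access looks like in practice (the account IDs, database, and table names below are placeholders and not part of the framework’s code), a data lake admin in the central governance account might grant a consumer account SELECT on a curated table:

```python
import boto3

lf = boto3.client("lakeformation")

lf.grant_permissions(
    Principal={"DataLakePrincipalIdentifier": "222233334444"},  # consumer LOB account
    Resource={
        "Table": {
            "CatalogId": "111122223333",   # central data governance account
            "DatabaseName": "curated_sales",
            "Name": "orders",
        }
    },
    Permissions=["SELECT"],
)
```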
ML platform services
This module helps the ML platform engineering team set up shared services that are used by the data science teams on their team accounts. The services include a Service Catalog portfolio with products for SageMaker domain deployment, SageMaker domain user profile deployment, and data science model templates for model building and deployment. This module has functionalities for a centralized model registry, model cards, a model dashboard, and the CI/CD pipelines used to orchestrate and automate model development and deployment workflows.
In addition, this module details how to implement the controls and governance required to enable persona-based self-service capabilities, allowing data science teams to independently deploy their required cloud infrastructure and ML templates.
ML use case development
This module helps LOBs and data scientists access their team’s SageMaker domain in a development environment and instantiate a model building template to develop their models. In this module, data scientists work on a dev account instance of the template to interact with the data available on the centralized data lake, reuse and share features from a central feature store, create and run ML experiments, build and test their ML workflows, and register their models to a dev account model registry in their development environments.
Capabilities such as experiment tracking, model explainability reports, data and model bias monitoring, and model registry are also implemented in the templates, allowing for rapid adaptation of the solutions to the data scientists’ developed models.
ML operations
This module helps LOBs and ML engineers work on their dev instances of the model deployment template. After the candidate model is registered and approved, they set up CI/CD pipelines and run ML workflows in the team’s test environment, which registers the model into the central model registry running in a platform shared services account. When a model is approved in the central model registry, this triggers a CI/CD pipeline to deploy the model into the team’s production environment.
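A minimal sketch of the central-registry approval step described above (the model package ARN is a placeholder): in a setup like this, the approval status change is the event a CI/CD trigger, such as an EventBridge rule, would listen for to start the deployment pipeline.

```python
import boto3

sm = boto3.client("sagemaker")

# Approve a model package in the central SageMaker model registry.
sm.update_model_package(
    ModelPackageArn="arn:aws:sagemaker:us-east-1:111122223333:model-package/central-registry/3",
    ModelApprovalStatus="Approved",
    ApprovalDescription="Validated in the team test environment",
)
```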
Centralized feature store
After the first models are deployed to production and multiple use cases start to share features created from the same data, a feature store becomes essential to ensure collaboration across use cases and reduce duplicate work. This module helps the ML platform engineering team set up a centralized feature store to provide storage and governance for ML features created by the ML use cases, enabling feature reuse across projects.
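The following is only a sketch of registering a feature group in such a central account so other teams can discover and reuse it; the feature group name, feature definitions, S3 location, and role ARN are placeholders.

```python
import boto3

sm = boto3.client("sagemaker")

sm.create_feature_group(
    FeatureGroupName="customer-profile-v1",
    RecordIdentifierFeatureName="customer_id",
    EventTimeFeatureName="event_time",
    FeatureDefinitions=[
        {"FeatureName": "customer_id", "FeatureType": "String"},
        {"FeatureName": "event_time", "FeatureType": "String"},
        {"FeatureName": "lifetime_value", "FeatureType": "Fractional"},
    ],
    OnlineStoreConfig={"EnableOnlineStore": True},
    OfflineStoreConfig={"S3StorageConfig": {"S3Uri": "s3://central-feature-store/offline"}},
    RoleArn="arn:aws:iam::111122223333:role/feature-store-access",
)
```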
Logging and observability
This module helps LOBs and ML practitioners gain visibility into the state of ML workloads across ML environments through centralization of log activity such as CloudTrail, CloudWatch, VPC flow logs, and ML workload logs. Teams can filter, query, and visualize logs for analysis, which can help enhance security posture as well.
Cost and reporting
This module helps various stakeholders (cloud admin, platform admin, cloud business office) to generate reports and dashboards to break down costs at ML user, ML team, and ML product levels, and track usage such as number of users, instance types, and endpoints.
Customers have asked us to provide guidance on how many accounts to create and how to structure those accounts. In the next section, we provide guidance on that account structure as reference that you can modify to suit your needs according to your enterprise governance requirements.
Reference account structure
In this section, we discuss our recommendation for organizing your account structure. We share a baseline reference account structure; however, we recommend ML and data admins work closely with their cloud admin to customize this account structure based on their organization controls.

We recommend organizing accounts by OU for security, infrastructure, workloads, and deployments. Furthermore, within each OU, organize by non-production and production OU because the accounts and workloads deployed under them have different controls. Next, we briefly discuss those OUs.
Security OU
The accounts in this OU are managed by the organization’s cloud admin or security team for monitoring, identifying, protecting, detecting, and responding to security events.
Infrastructure OU
The accounts in this OU are managed by the organization’s cloud admin or network team for managing enterprise-level infrastructure shared resources and networks.
We recommend having the following accounts under the infrastructure OU:

Network – Set up a centralized networking infrastructure such as AWS Transit Gateway
Shared services – Set up centralized AD services and VPC endpoints

Workloads OU
The accounts in this OU are managed by the organization’s platform team admins. If you need different controls implemented for each platform team, you can nest other levels of OU for that purpose, such as an ML workloads OU, data workloads OU, and so on.
We recommend the following accounts under the workloads OU:

Team-level ML dev, test, and prod accounts – Set this up based on your workload isolation requirements
Data lake accounts – Partition accounts by your data domain
Central data governance account – Centralize your data access policies
Central feature store account – Centralize features for sharing across teams

Deployments OU
The accounts in this OU are managed by the organization’s platform team admins for deploying workloads and observability.
We recommend the following accounts under the deployments OU because the ML platform team can set up different sets of controls at this OU level to manage and govern deployments:

ML shared services accounts for test and prod – Hosts platform shared services CI/CD and model registry
ML observability accounts for test and prod – Hosts CloudWatch logs, CloudTrail logs, and other logs as needed

Next, we briefly discuss organization controls that need to be considered for embedding into member accounts for monitoring the infrastructure resources.
AWS environment controls
A control is a high-level rule that provides ongoing governance for your overall AWS environment. It’s expressed in plain language. In this framework, we use AWS Control Tower to implement the following controls that help you govern your resources and monitor compliance across groups of AWS accounts:

Preventive controls – A preventive control keeps your accounts compliant by disallowing actions that lead to policy violations; it is implemented using a service control policy (SCP). For example, you can set a preventive control that ensures CloudTrail is not deleted or stopped in AWS accounts or Regions (see the sketch after this list).
Detective controls – A detective control detects noncompliance of resources within your accounts (such as policy violations), provides alerts through the dashboard, and is implemented using AWS Config rules. For example, you can create a detective control that detects whether public read access is enabled on the Amazon Simple Storage Service (Amazon S3) buckets in the log archive shared account.
Proactive controls – A proactive control scans your resources before they are provisioned to make sure they are compliant with that control; it is implemented using AWS CloudFormation hooks. Resources that aren’t compliant will not be provisioned. For example, you can set a proactive control that checks that direct internet access is not allowed for a SageMaker notebook instance.
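As a sketch of the preventive control mentioned above (not the exact guardrail AWS Control Tower deploys on your behalf), the equivalent service control policy could be created through the Organizations API; the policy name is an example.

```python
import json

import boto3

org = boto3.client("organizations")

scp = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Deny",
        "Action": ["cloudtrail:StopLogging", "cloudtrail:DeleteTrail"],
        "Resource": "*",
    }],
}

org.create_policy(
    Name="deny-cloudtrail-tampering",
    Description="Preventive control: CloudTrail cannot be stopped or deleted",
    Type="SERVICE_CONTROL_POLICY",
    Content=json.dumps(scp),
)
```

Attaching the policy to an OU then blocks the denied CloudTrail actions in every account under that OU.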

Interactions between ML platform services, ML use cases, and ML operations
Different personas, such as the head of data science (lead data scientist), data scientist, and ML engineer, operate modules 2–6 as shown in the following diagram for different stages of ML platform services, ML use case development, and ML operations along with data lake foundations and the central feature store.

The following summary lists the ops flow activities and the corresponding setup flow steps for each persona. When a persona initiates an ML activity as part of the ops flow, the services run as described in the matching setup flow steps.

Lead Data Scientist or ML Team Lead
Ops flow activity 1 – Uses Service Catalog in the ML platform services account to deploy ML infrastructure, SageMaker projects, and the SageMaker model registry.
Setup flow step 1-A – Sets up the dev, test, and prod environments for LOBs and sets up SageMaker Studio in the ML platform services account.
Setup flow step 1-B – Sets up SageMaker Studio with the required configuration.

Data Scientist
Ops flow activity 2 – Conducts and tracks ML experiments in SageMaker notebooks.
Setup flow step 2-A – Uses data from Lake Formation and saves features in the central feature store.
Ops flow activity 3 – Automates successful ML experiments with SageMaker projects and pipelines.
Setup flow step 3-A – Initiates SageMaker pipelines (preprocess, train, evaluate) in the dev account and initiates the build CI/CD process with CodePipeline in the dev account.
Setup flow step 3-B – After the SageMaker pipelines run, saves the model in the local (dev) model registry.

Lead Data Scientist or ML Team Lead
Ops flow activity 4 – Approves the model in the local (dev) model registry.
Setup flow step 4-A – Writes the model metadata and model package from the local (dev) model registry to the central model registry.
Ops flow activity 5 – Approves the model in the central model registry.
Setup flow step 5-A – Initiates the deployment CI/CD process to create SageMaker endpoints in the test environment.
Setup flow step 5-B – Writes the model information and metadata from the local (dev) account to the ML governance module (model card, model dashboard) in the ML platform services account.

ML Engineer
Ops flow activity 6 – Tests and monitors the SageMaker endpoint in the test environment after CI/CD (no setup flow step).
Ops flow activity 7 – Approves deployment for SageMaker endpoints in the prod environment.
Setup flow step 7-A – Initiates the deployment CI/CD process to create SageMaker endpoints in the prod environment.
Ops flow activity 8 – Tests and monitors the SageMaker endpoint in the prod environment after CI/CD (no setup flow step).

Personas and interactions with different modules of the ML platform
Each module caters to particular target personas within specific divisions that utilize the module most often, granting them primary access. Secondary access is then permitted to other divisions that require occasional use of the modules. The modules are tailored towards the needs of particular job roles or personas to optimize functionality.
We discuss the following teams:

Central cloud engineering – This team operates at the enterprise cloud level across all workloads for setting up common cloud infrastructure services, such as setting up enterprise-level networking, identity, permissions, and account management
Data platform engineering – This team manages enterprise data lakes, data collection, data curation, and data governance
ML platform engineering – This team operates at the ML platform level across LOBs to provide shared ML infrastructure services such as ML infrastructure provisioning, experiment tracking, model governance, deployment, and observability

The following summary details which divisions have primary and secondary access to each module according to the module’s target personas, along with the number of accounts each module typically involves.

Module 1: Multi-account foundations – Primary access: central cloud engineering; secondary access: individual LOBs. Target personas: cloud admin, cloud engineers. Number of accounts: few.

Module 2: Data lake foundations – Primary access: central cloud or data platform engineering; secondary access: individual LOBs. Target personas: data lake admin, data engineers. Number of accounts: multiple.

Module 3: ML platform services – Primary access: central cloud or ML platform engineering; secondary access: individual LOBs. Target personas: ML platform admin, ML team lead, ML engineers, ML governance lead. Number of accounts: one.

Module 4: ML use case development – Primary access: individual LOBs; secondary access: central cloud or ML platform engineering. Target personas: data scientists, data engineers, ML team lead, ML engineers. Number of accounts: multiple.

Module 5: ML operations – Primary access: central cloud or ML engineering; secondary access: individual LOBs. Target personas: ML engineers, ML team leads, data scientists. Number of accounts: multiple.

Module 6: Centralized feature store – Primary access: central cloud or data engineering; secondary access: individual LOBs. Target personas: data engineers, data scientists. Number of accounts: one.

Module 7: Logging and observability – Primary access: central cloud engineering; secondary access: individual LOBs. Target personas: cloud admin, IT auditors. Number of accounts: one.

Module 8: Cost and reporting – Primary access: individual LOBs; secondary access: central platform engineering. Target personas: LOB executives, ML managers. Number of accounts: one.

Conclusion
In this post, we introduced a framework for governing the ML lifecycle at scale that helps you implement well-architected ML workloads embedding security and governance controls. We discussed how this framework takes a holistic approach for building an ML platform considering data governance, model governance, and enterprise-level controls. We encourage you to experiment with the framework and concepts introduced in this post and share your feedback.

About the authors
Ram Vittal is a Principal ML Solutions Architect at AWS. He has over 3 decades of experience architecting and building distributed, hybrid, and cloud applications. He is passionate about building secure, scalable, reliable AI/ML and big data solutions to help enterprise customers with their cloud adoption and optimization journey to improve their business outcomes. In his spare time, he rides his motorcycle and walks his three-year-old sheep-a-doodle!
Sovik Kumar Nath is an AI/ML solution architect with AWS. He has extensive experience designing end-to-end machine learning and business analytics solutions in finance, operations, marketing, healthcare, supply chain management, and IoT. Sovik has published articles and holds a patent in ML model monitoring. He has double master’s degrees from the University of South Florida and the University of Fribourg, Switzerland, and a bachelor’s degree from the Indian Institute of Technology, Kharagpur. Outside of work, Sovik enjoys traveling, taking ferry rides, and watching movies.
Maira Ladeira Tanke is a Senior Data Specialist at AWS. As a technical lead, she helps customers accelerate their achievement of business value through emerging technology and innovative solutions. Maira has been with AWS since January 2020. Prior to that, she worked as a data scientist in multiple industries focusing on achieving business value from data. In her free time, Maira enjoys traveling and spending time with her family someplace warm.
Ryan Lempka is a Senior Solutions Architect at Amazon Web Services, where he helps his customers work backwards from business objectives to develop solutions on AWS. He has deep experience in business strategy, IT systems management, and data science. Ryan is dedicated to being a lifelong learner, and enjoys challenging himself every day to learn something new.
Sriharsh Adari is a Senior Solutions Architect at Amazon Web Services (AWS), where he helps customers work backwards from business outcomes to develop innovative solutions on AWS. Over the years, he has helped multiple customers on data platform transformations across industry verticals. His core areas of expertise include Technology Strategy, Data Analytics, and Data Science. In his spare time, he enjoys playing sports, binge-watching TV shows, and playing Tabla.

How Meesho built a generalized feed ranker using Amazon SageMaker infe …

This is a guest post co-written by Rama Badrinath, Divay Jindal and Utkarsh Agrawal at Meesho.

Meesho is India’s fastest growing ecommerce company with a mission to democratize internet commerce for everyone and make it accessible to the next billion users of India. Meesho was founded in 2015 and today focuses on buyers and sellers across India. The Meesho marketplace provides micro, small, and medium businesses and individual entrepreneurs access to millions of customers, a selection from over 30 categories and more than 900 sub-categories, pan-India logistics, payment services, and customer support capabilities to efficiently run their businesses on the Meesho ecosystem.
As an ecommerce platform, Meesho aims to improve the user experience by offering personalized and relevant product recommendations. We wanted to create a generalized feed ranker that considers individual preferences and historical behavior to effectively display products in each user’s feed. Through this, we wanted to boost user engagement, conversion rates, and overall business growth by tailoring the shopping experience to each customer’s unique requirements and providing the best value for their money.
We used AWS machine learning (ML) services like Amazon SageMaker to develop a powerful generalized feed ranker (GFR). In this post, we discuss the key components of the GFR and how this ML-driven solution streamlined the ML lifecycle, ensuring efficient infra management, scalability, and reliability within the ecosystem.
Solution overview
To personalize users’ feeds, we analyzed extensive historical data, extracting insights into features that include browsing patterns and interests. These valuable features are used to construct ranking models. The GFR personalizes each user’s feed in real time, considering various factors like geography, prior shopping pattern, acquisition channels, and more. Several interaction-based features are also used to capture the affinity of the user towards an item, item category, or item properties like price, rating, or discount.
Several user-agnostic features and item-level scores are used as well, including an item popularity score and an item propensity-to-buy score. All these features are fed into a Learning to Rank (LTR) model that predicts the probability of click (PCTR) and the probability of purchase (PCVR).
For diverse and relevant recommendations, the GFR sources candidate products from multiple channels, including exploit (known user preferences), explore (novel and potentially interesting products), popularity (trending items), and recent (latest additions).
The following diagram illustrates the GFR architecture.

The architecture can be divided into two different components: model training and model deployment. In the following sections, we discuss each component and the AWS services used in more detail.
Model training
Meesho used Amazon EMR with Apache Spark to process hundreds of millions of data points, depending on the model’s complexity. One of the major challenges was to run distributed training at scale. We used Dask—a distributed data science computing framework that natively integrates with Python libraries—on Amazon EMR to scale out the training jobs across the cluster. The distributed training of the model helped cut down training time from days to hours and allowed us to schedule Spark jobs efficiently and cost-effectively. We used an offline feature store to maintain a historical record of all feature values that will be used for model training. Model artifacts from training are stored in Amazon Simple Storage Service (Amazon S3), providing convenient access and version management.
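The snippet below is an illustrative sketch of that kind of distributed feature-building job with Dask, not Meesho’s code; the scheduler address, bucket paths, and column names are placeholders.

```python
import dask.dataframe as dd
from dask.distributed import Client

# Connect to the Dask scheduler running on the EMR cluster.
client = Client("scheduler-address:8786")

# Build simple item-level interaction features from historical logs on S3.
interactions = dd.read_parquet("s3://example-bucket/interactions/")
item_stats = (
    interactions.groupby("item_id")
    .agg({"clicked": "mean", "purchased": "mean"})
    .rename(columns={"clicked": "item_ctr", "purchased": "item_cvr"})
)
item_stats.to_parquet("s3://example-bucket/features/item_stats/")
```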
We used a time sampling strategy to create training, validation, and test datasets for model training. We kept track of various metrics to evaluate the performance of the model—the most important ones being area under the ROC curve and area under the precision recall curve. We also tracked calibration of the model to prevent overconfidence and underconfidence issues while predicting the probability scores.
Model deployment
Meesho used SageMaker inference endpoints with auto scaling enabled for deploying the trained model. SageMaker offered ease of deployment with support for various ML frameworks, allowing models to be served with low latency. Although AWS offers standard inference images suitable for most use cases, we built a custom inference image that caters specifically to our needs and pushed it to Amazon Elastic Container Registry (Amazon ECR).
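A hedged sketch of how target-tracking auto scaling is typically attached to a SageMaker endpoint variant; the endpoint and variant names, capacities, and target value below are placeholders rather than Meesho’s settings.

```python
import boto3

aas = boto3.client("application-autoscaling")
resource_id = "endpoint/gfr-ranker/variant/AllTraffic"

# Register the endpoint variant's instance count as a scalable target.
aas.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=2,
    MaxCapacity=20,
)

# Scale on invocations per instance using a target-tracking policy.
aas.put_scaling_policy(
    PolicyName="ranker-invocations-per-instance",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 1000.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
        "ScaleInCooldown": 300,
        "ScaleOutCooldown": 60,
    },
)
```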
We built an in-house A/B testing platform that facilitated live monitoring of A/B metrics, enabling us to make data-driven decisions promptly. We also used the A/B testing feature of SageMaker to deploy multiple production variants on an endpoint. Through A/B experiments, we observed an approximate 3.5% enhancement in the platform’s conversion rate and an increase in app open frequency of the users, highlighting the effectiveness of this approach.
We kept track of various drifts such as feature drift and prior drift multiple times a day after model deployment to prevent the model performance from deteriorating.
We used AWS Lambda to set up various automations and triggers that are required during model retraining, endpoint updates, and monitoring processes.
The recommendation workflow after model deployment works as follows (as noted in the solution architecture diagram):

The input requests with user context and interaction features are received at the application layer from Meesho’s mobile and web app.
The application layer fetches additional features like historical data of the user from the online feature store and appends these to the input requests.
The appended features are sent to the real-time endpoints for generating recommendations.
The model predictions are sent back to the application layer.
The application layer uses these predictions to personalize the user feeds on the mobile or web application.

Conclusion
Meesho successfully implemented a generalized feed ranker using SageMaker, which resulted in highly personalized product recommendations for each customer based on their preferences and historical behavior. This approach significantly improved user engagement and led to higher conversion rates, contributing to the company’s overall business growth. As a result of utilizing AWS services, our ML lifecycle runtime reduced significantly, from taking months to just weeks, leading to increased efficiency and productivity for our team.
With this advanced feed ranker, Meesho continues to deliver tailored shopping experiences, adding more value to its customers and fulfilling its mission to democratize ecommerce for everyone.
The team is grateful for the continuous support and guidance from Ravindra Yadav, Director of Data Science at Meesho, and Debdoot Mukherjee, Head of AI at Meesho, who played a key role in enabling this success.
To learn more about SageMaker, refer to the Amazon SageMaker Developer Guide.

About the Authors
Utkarsh Agrawal is currently working as a Senior Data Scientist at Meesho. He previously worked with Fractal Analytics and Trell on various domains, including recommender systems, time series, NLP, and more. He holds a master’s degree in Mathematics and Computing from Indian Institute of Technology Kharagpur (IIT), India.
Rama Badrinath is currently working as a Principal Data Scientist at Meesho. He previously worked with Microsoft and ShareChat on various domains, including recommender systems, image AI, NLP, and more. He holds a master’s degree in Machine Learning from Indian Institute of Science (IISc), India. He has also published papers in renowned conferences such as KDD and ECIR.
Divay Jindal is currently working as a Lead Data Scientist at Meesho. He previously worked with Bookmyshow on various domains, including recommender systems and dynamic pricing.
Venugopal Pai is a Solutions Architect at AWS. He lives in Bengaluru, India, and helps digital-native customers scale and optimize their applications on AWS.

AI Email Writer: The Key to Marketing in 2023

For years, marketers have spent countless hours perfecting personalized targeted email campaigns. All the effort was worth it for the great engagement rates, but it was a lot of effort. Now, AI email writers seem like an exciting alternative: similar results with much less work. 

As with any new technology, it makes sense to be skeptical of the claims about just how life-changing these tools are. Are AI email writers as good as everyone says? 

In a word, yes. But don’t worry: I’ll explain! Read along and gain a comprehensive understanding of everything about AI email writers!

What is an AI email writer?

How does an AI email writer work?

What are the benefits of using an AI email writer?

How to use an AI email writer effectively

How different companies can use an AI email writer


What is an AI email writer? 

An AI email writer is a software tool that uses artificial intelligence to generate personalized emails at scale. AI email writers help businesses save time, improve efficiency, generate more leads, and generate more sales. 

How does an AI email writer work? 

It’s actually a really simple concept. 

AI email writers work by using a combination of machine learning and natural language processing to analyze your email list data and generate personalized emails for each recipient. 

In other words, it does exactly what you used to do, but it does it in seconds while you grab a cup of coffee!

Here’s what’s happening on a more minute level: 

Machine learning is a type of artificial intelligence that allows computers to learn without being explicitly programmed. AI email writers use machine learning to identify patterns in your email list data, such as the demographics, interests, and behaviors of your recipients.

Natural language processing is a type of artificial intelligence that allows computers to understand and process human language. AI email writers use natural language processing to generate personalized emails that are tailored to the individual interests and needs of each recipient.

What are the benefits of using an AI email writer? 

All of this sounds great, but the real question is, what does it actually do for you? 

Here are just a handful of the benefits: 

Save time: AI email writers can save you a significant amount of time by generating personalized emails at scale. This means that you can focus on other important tasks, such as building relationships with customers and growing your business.

Improve efficiency: AI email writers can help you improve your email efficiency by automating tasks such as sending follow-up emails, segmenting your audience, and tracking results. This allows you to focus on the most important aspects of your email marketing campaigns.

Generate more leads and sales: AI email writers can help you generate more leads and sales by sending personalized emails that are tailored to the needs of your target audience. This can lead to higher open rates, click-through rates, and conversion rates.

How to use an AI email writer effectively

Generate enriched leads. AI email writers are only as good as the information they’re given, and email campaigns are only as effective as the audiences they’re sent to. So filling your pipeline with high-intent leads that have enough targeting information is key. 

Customers.ai’s X-Ray tool is perfect for this! It’s easy to install on your site and it provides you with super valuable information on 20% of your website visitors! Learn their name, email, landing page visited, and more! Then use that data to build out the rest of your campaign.

Segment your audience. Segmenting your audience is the process of dividing your email list into groups of people who share similar characteristics. This allows you to send more targeted and relevant emails to your audience. AI email writers can segment your campaigns based on a number of intent signals, like landing page visited.

This lets you target people who visited each page based on what they’re interested in!

Generate personalized emails. To generate personalized emails, you will need to provide the AI email writer with some information about your recipients, such as their demographics, interests, and landing page visited. The AI email writer will then use this information to generate personalized emails for each recipient.

Track your results. Tracking your email marketing results is important because it allows you to see what is working and what is not. This information can help you improve your email campaigns over time.

Customers.ai keeps your analytics front and center so you can always stay on top of how many leads you’re getting, how many emails you’re sending, and what the engagement is like. Improvements or issues will never go unnoticed!

How different companies can use an AI email writer

AI email writers can be used by businesses of all sizes and industries to improve their email marketing results. Here are some examples of how different types of companies can use AI email writers:

E-commerce companies

E-commerce companies can use AI email writers to:

Send personalized product recommendations to customers based on their purchase history and browsing behavior.

Send abandoned cart emails to customers who have added items to their cart but have not yet completed the purchase process.

Send win-back emails to customers who have not made a purchase in a while.

Promote new products and special offers to customers.

Insurance companies

Insurance companies can use AI email writers to:

Send personalized quotes to potential customers.

Send policy reminders to existing customers.

Send educational content about insurance to customers.

Cross-sell and upsell additional insurance products to customers.

Travel and tourism companies

Travel and tourism companies can use AI email writers to:

Send personalized travel recommendations to customers based on their interests and budget.

Send flight and hotel booking reminders to customers.

Send special offers to customers, such as discounts on flights and hotels.

Promote tourist attractions and activities to customers.


Conclusion

AI email writers are such a powerful time-saving, engagement-boosting tool. Your email marketing won’t ever be the same. Give ’em a shot and see for yourself.

Frequently Asked Questions about AI Email Writers

What is an AI email writer?
An AI email writer is a software tool that uses artificial intelligence to generate personalized emails at scale. AI email writers can help you save time, improve efficiency, and generate more leads and sales.

How do AI email writers work?
AI email writers use a combination of machine learning and natural language processing to analyze your email list data and generate personalized emails for each recipient. Machine learning allows them to identify patterns in your email list data, such as the demographics, interests, and behaviors of your recipients. Natural language processing allows them to generate personalized emails that are tailored to the individual interests and needs of each recipient.

What are the benefits of using an AI email writer?
AI email writers help you save time, improve efficiency, generate more leads and sales, and improve customer engagement.

Can I use an AI email writer to send automated emails?
Yes. For example, you can use an AI email writer to send automated welcome emails to new subscribers, abandoned cart emails to customers who have added items to their cart but have not yet completed the purchase process, and birthday emails to customers.

Can I use an AI email writer to generate emails in different writing styles, such as formal, informal, or conversational?
Yes, AI email writers can adapt to various writing styles, including formal, informal, or conversational tones. You can customize the style to match the context of your email and target audience.

Can I use an AI email writer to create personalized email templates for different types of emails, such as sales outreach emails, welcome emails, and abandoned cart emails?
Absolutely! AI email writers are versatile and can help you generate personalized templates for a wide range of emails, from sales outreach and welcome messages to abandoned cart recovery emails. This saves you time and ensures consistency in your communication.

Can I use an AI email writer to test different email variations and see which ones perform best?
Yes, many AI email writing tools offer A/B testing features. You can use these to create different email variations and determine which ones yield the best results, such as higher open rates, click-through rates, and conversions.

Can I use an AI email writer to integrate with my other marketing tools, such as CRM systems and marketing automation platforms?
Yes, integration with other marketing tools is a common feature in AI email writers. You can connect them with CRM systems and marketing automation platforms, streamlining your workflow and ensuring seamless data exchange between tools for better campaign management.

Can AI email writers help with subject line optimization to increase email open rates?
Yes, many AI email writers offer subject line suggestions based on data analysis and industry best practices. These suggestions can help improve your email open rates by crafting compelling subject lines.

Researchers from UC Berkeley Propose RingAttention: A Memory-Efficient …

Transformers are a type of deep learning architecture at the heart of many state-of-the-art AI models. They have revolutionized the field of artificial intelligence, particularly natural language processing and many other machine learning tasks. They are based on a self-attention mechanism, in which the model weighs the importance of different parts of the input sequence when making predictions, and they consist of an encoder and a decoder to process the inputs.

However, scaling up the context length of Transformers takes a lot of work because of the inherent self-attention, whose memory cost is quadratic in the input sequence length; this makes it challenging to scale to longer input sequences. Researchers at UC Berkeley developed a method called Ring Attention to tackle this, based on a simple observation: when self-attention and feedforward network computations are performed blockwise, the sequence can be distributed across multiple devices and processed easily.

They distribute the outer loop of the blockwise attention computation among hosts, with each device managing its own input block. In the inner loop, every device computes the blockwise attention and feedforward operations for its designated block. The host devices form a conceptual ring: each sends a copy of the key-value blocks it is using for the current blockwise computation to the next device in the ring while simultaneously receiving key-value blocks from the previous one.

The block computations take longer than block transfers. The team overlapped these processes, resulting in no added overhead compared to standard transformers. By doing so, each device requires only memory proportional to the block size, independent of the original input sequence length. This effectively eliminates the memory constraints imposed by individual devices. 
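To make the mechanics concrete, here is a single-host NumPy simulation of the idea, not the authors' implementation: each simulated "device" keeps its query block fixed, key-value blocks rotate around a ring, and attention is accumulated with an online softmax so the full sequence-length-squared score matrix is never materialized.

```python
import numpy as np

def ring_attention(q_blocks, k_blocks, v_blocks):
    """Single-host simulation of Ring Attention.

    q_blocks, k_blocks, v_blocks: lists with one (block_len, d) array per
    simulated device. The result equals standard softmax(Q K^T / sqrt(d)) V,
    but each device only ever holds one K/V block at a time."""
    n_dev = len(q_blocks)
    d = q_blocks[0].shape[-1]
    # Per-device accumulators for a numerically stable online softmax.
    num = [np.zeros_like(q) for q in q_blocks]                  # running weighted sums of V
    den = [np.zeros((q.shape[0], 1)) for q in q_blocks]         # running softmax denominators
    mx = [np.full((q.shape[0], 1), -np.inf) for q in q_blocks]  # running max of scores

    kv = list(zip(k_blocks, v_blocks))
    for _ in range(n_dev):                       # one ring rotation per step
        for i in range(n_dev):                   # every "device" works on its current K/V block
            k, v = kv[i]
            s = q_blocks[i] @ k.T / np.sqrt(d)   # blockwise attention scores
            new_max = np.maximum(mx[i], s.max(axis=-1, keepdims=True))
            scale = np.exp(mx[i] - new_max)      # rescale previous accumulators
            p = np.exp(s - new_max)
            num[i] = num[i] * scale + p @ v
            den[i] = den[i] * scale + p.sum(axis=-1, keepdims=True)
            mx[i] = new_max
        kv = kv[-1:] + kv[:-1]                   # pass K/V blocks to the next device in the ring

    return np.concatenate([n / d_ for n, d_ in zip(num, den)], axis=0)
```

In the actual method, each step’s key-value transfer is overlapped with the blockwise computation, which is why the communication adds no overhead.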

Their experiments show that Ring Attention reduces the memory requirements of Transformers, enabling training on sequences more than 500 times longer than prior memory-efficient state-of-the-art methods and allowing training sequences that exceed 100 million tokens in length without making approximations to attention. Because Ring Attention eliminates the memory constraints imposed by individual devices, near-infinite context sizes become possible in principle; however, this requires a correspondingly large number of devices, since the achievable sequence length is proportional to the device count.

The research only evaluates the effectiveness of the method and does not include large-scale model training. Because the achievable context length depends on the number of devices, the method’s efficiency also depends on engineering optimization; so far the researchers have worked only on the low-level operations required for optimal compute performance. They say they would like to work on both maximum sequence length and maximum compute performance in the future. The possibility of near-infinite context opens up many exciting opportunities, such as large video-audio-language models, learning from extended feedback and trial and error, understanding and generating entire codebases, and adapting AI models to understand scientific data such as gene sequences.

Check out the Paper. All credit for this research goes to the researchers on this project.

The post Researchers from UC Berkeley Propose RingAttention: A Memory-Efficient Artificial Intelligence Approach to Reduce the Memory Requirements of Transformers appeared first on MarkTechPost.

Enhancing Reasoning in Large Language Models: Check Out the Hypotheses …

In the realm of reasoning tasks, large language models (LLMs) have displayed remarkable performance when provided with examples and intermediate steps. Nevertheless, approaches that depend on implicit knowledge within an LLM can sometimes produce erroneous answers when the implicit knowledge is incorrect or inconsistent with the task at hand. 

To address this issue, a team of researchers from Google, Mila – Québec AI Institute, Université de Montréal, HEC Montréal, University of Alberta, and CIFAR AI Chair introduce the Hypotheses-to-Theories (HtT) framework, which focuses on acquiring a rule library for LLM-based reasoning. HtT comprises two key stages: an induction stage and a deduction stage. In the induction stage, an LLM is initially tasked with generating and validating rules based on a set of training examples. 

The above image demonstrates the application of Hypotheses-to-Theories to the chain-of-thought method for solving base-9 arithmetic problems. To keep the figure concise, the few-shot examples have been omitted. In the induction stage, the chain-of-thought (CoT) technique is used to generate rules and validate them against training samples. 

Subsequently, the rules produced are gathered and refined to construct a rule library. In the deduction stage, the CoT prompt is enhanced with knowledge derived from the rule library. Correct rules are indicated with green markers, while incorrect ones are marked in red. Rules that frequently lead to correct answers are accumulated to establish a rule library. In the deduction stage, the LLM is subsequently prompted to utilize the acquired rule library for reasoning in order to answer test questions. 
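A schematic sketch of the two stages might look like the following; the llm() helper, thresholds, and prompt strings are hypothetical stand-ins for illustration, not the authors’ code.

```python
from collections import Counter

def induce_rule_library(train_examples, llm, min_count=3, min_acc=0.7):
    """Induction stage: collect rules from chain-of-thought traces and keep
    the ones that frequently appear in correct derivations."""
    hits, totals = Counter(), Counter()
    for question, answer in train_examples:
        prediction, rules = llm(question, prompt_style="chain-of-thought")
        for rule in rules:                       # e.g. "6 + 7 = 14" in base-9
            totals[rule] += 1
            if prediction == answer:
                hits[rule] += 1
    return [r for r in totals
            if totals[r] >= min_count and hits[r] / totals[r] >= min_acc]

def deduce(question, rule_library, llm):
    """Deduction stage: prepend the verified rules so the model retrieves rules
    from the library instead of re-deriving (and possibly hallucinating) them."""
    preamble = "Use the following verified rules:\n" + "\n".join(rule_library)
    prediction, _ = llm(question, prompt_style=preamble)
    return prediction
```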

In their evaluation of HtT, the researchers integrate it as an enhancement to pre-existing few-shot prompting techniques, such as chain-of-thought and least-to-most prompting. Performance is assessed on two challenging multi-step reasoning problems that have proven to be problematic for current few-shot prompting approaches.

Experimental results on both numerical reasoning and relational reasoning problems reveal that HtT enhances existing prompting methods, achieving an increase in accuracy ranging from 11% to 27%. Furthermore, the acquired rules can be effectively transferred to different models and various forms of the same problem. The introduced method paves the way for a novel approach to acquiring textual knowledge using LLMs. It is anticipated that HtT will enable a range of applications and inspire further research in the field of LLMs.

Check out the Paper. All credit for this research goes to the researchers on this project.

The post Enhancing Reasoning in Large Language Models: Check Out the Hypotheses-to-Theories (HtT) Framework for Accurate and Transferable Rule-Based Learning appeared first on MarkTechPost.