Meet PyRCA: An Open-Source Python Machine Learning Library Designed for Root Cause Analysis (RCA) in AIOps

The fields of artificial intelligence and machine learning are advancing rapidly, thanks to their impressive capabilities and use cases in almost every industry. But as AI is integrated into more fields, new problems and limitations come with it. Root cause analysis (RCA) is a method for discovering the root causes of issues in order to find the best solutions for them. It helps identify the underlying reasons for incidents or failures in a system. In domains such as IT operations and telecommunications, and especially in AI, growing system complexity frequently leads to incidents that reduce the reliability and effectiveness of production systems. RCA examines the relevant factors and establishes the causal links between them in an effort to explain these incidents.

Recently, a team of researchers from Salesforce AI has introduced PyRCA, an open-source Python Machine Learning library designed for Root Cause Analysis (RCA) in the field of Artificial Intelligence for IT Operations (AIOps). PyRCA provides a thorough framework that enables users to independently find complex causal relationships between metrics and incident root causes. The library offers both graph building and scoring operations with a unified interface that supports a variety of widely used RCA models, along with providing a streamlined method for quick model creation, testing, and deployment.

This holistic Python library for root cause analysis provides an end-to-end framework encompassing data loading, causal graph discovery, root cause localization, and RCA result visualization. It supports multiple models for building graphs and scoring root causes, and helps users quickly load relevant data and identify the causal connections between various system components. PyRCA also ships with a GUI dashboard that makes interactive RCA easier, offering a more streamlined user experience that better aligns with real-world conditions. The dashboard’s intuitive point-and-click interface lets users interact with the library and inject their expert knowledge into the RCA process.

With PyRCA, engineers and researchers can now easily analyze results, visualize causal linkages, and move through the RCA process with the help of the GUI dashboard. Some of the key features of PyRCA shared by the team are as follows:

PyRCA has been developed to offer a standardized and highly adaptable framework for loading metric data with the popular pandas.DataFrame format and benchmarking a diverse set of RCA models.
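Since the input format is plain pandas, here is a minimal sketch of the kind of time-indexed metric table such a framework might consume. The column names and the injected anomaly are illustrative; the exact schema PyRCA expects may differ, so consult its documentation.

```python
import numpy as np
import pandas as pd

# Hypothetical metric table: one column per system metric, indexed by timestamp.
rng = np.random.default_rng(0)
idx = pd.date_range("2023-06-01", periods=120, freq="1min")
metrics = pd.DataFrame(
    {
        "cpu_usage": rng.normal(50, 5, len(idx)),
        "memory_usage": rng.normal(70, 3, len(idx)),
        "request_latency": rng.normal(200, 20, len(idx)),
    },
    index=idx,
)

# Inject a latency spike over the last 20 minutes to mimic an incident window.
metrics.loc[idx[100]:, "request_latency"] += 300
print(metrics.shape)
```

A table in this shape can be handed to causal discovery and scoring models directly, which is what makes benchmarking different RCA models on the same data straightforward.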

Through a single interface, PyRCA provides access to a variety of models for both discovering causal graphs and locating underlying causes, including PC, GES, random walk, and hypothesis testing. Users also have the choice to fully customize each model to suit their unique requirements.

By incorporating user-provided domain knowledge, the RCA models offered in the library can be strengthened, making them more resilient when dealing with noisy metric data.

By implementing a single class that is inherited from the RCA base class, developers can quickly add new RCA models to PyRCA.
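The extension pattern can be sketched as follows. The class and method names below (`BaseRCA`, `find_root_causes`) are stand-ins, not PyRCA’s actual base class or API; the point is only the shape of the pattern: subclass one base class, implement one method.

```python
from abc import ABC, abstractmethod

import pandas as pd


class BaseRCA(ABC):
    """Illustrative stand-in for an RCA base class (not PyRCA's real API)."""

    @abstractmethod
    def find_root_causes(self, metrics: pd.DataFrame) -> list[str]:
        """Return the names of metrics suspected to be root causes."""


class ThresholdRCA(BaseRCA):
    """Toy model: flag metrics whose latest value deviates strongly from the mean."""

    def __init__(self, z_threshold: float = 3.0):
        self.z_threshold = z_threshold

    def find_root_causes(self, metrics: pd.DataFrame) -> list[str]:
        z = (metrics.iloc[-1] - metrics.mean()) / metrics.std()
        return z[z.abs() > self.z_threshold].index.tolist()


model = ThresholdRCA()
incident = pd.DataFrame({"a": [1.0, 1.1] * 10, "b": [10.0] * 19 + [100.0]})
print(model.find_root_causes(incident))  # ['b']
```

Because every model exposes the same entry point, a new model plugged in this way is immediately usable by the rest of the pipeline and the dashboard.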

The PyRCA package provides a visualization tool that enables users to compare multiple models, review RCA results, and quickly include domain knowledge without the need for any code.

The team has explained the architecture and major functionalities of PyRCA in the technical report in detail. It provides an overview of the library’s design and its core capabilities.

Check Out The Paper and Github. Don’t forget to join our 25k+ ML SubReddit, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more. If you have any questions regarding the above article or if we missed anything, feel free to email us at Asif@marktechpost.com

Check Out 100’s AI Tools in AI Tools Club
The post Meet PyRCA: An Open-Source Python Machine Learning Library Designed for Root Cause Analysis (RCA) in AIOps appeared first on MarkTechPost.

Researchers from Princeton Introduce Infinigen: A Procedural Generator of Photorealistic 3D Scenes of the Natural World

The research team from Princeton University has introduced Infinigen, a groundbreaking procedural generator for photorealistic 3D scenes, in their recent paper titled “Infinite Photorealistic Worlds using Procedural Generation.” This work addresses the limitations of existing synthetic datasets that offer limited diversity and fail to capture the complexity of real-world objects.

Infinigen is a fully procedural system that enables the generation of an infinite number of shapes, textures, materials, and scene compositions from scratch. Its key feature lies in its ability to produce high levels of photorealism by procedurally generating both coarse and fine geometric and textural details. Infinigen sets itself apart in that all the geometric information it generates is based on real-world references, enhancing the authenticity of the synthetic scenes.

The architecture of Infinigen is built upon Blender, a widely used graphics system known for its capabilities in procedural generation. The research team has designed and implemented a library of procedural rules to expand the coverage of natural objects and scenes. These rules leverage the useful primitives available in Blender. Moreover, the team has developed utilities that simplify the creation of procedural rules, including an automatic conversion tool that transforms Blender node graphs into Python code. Additionally, utilities have been developed to render synthetic images with ground truth labels, providing information such as depth, occlusion boundaries, bounding boxes, optical flow, surface normals, object categories, and instance segmentation.
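The core idea of a procedural rule is easy to illustrate outside Blender. The toy generator below is not Infinigen’s code; it only shows how a single seeded rule expands into an unbounded family of distinct, reproducible assets (here, parameter sets describing a hypothetical "tree").

```python
import random


def generate_tree(seed: int) -> dict:
    """Toy procedural rule: a seed deterministically expands into asset parameters."""
    rng = random.Random(seed)
    return {
        "trunk_height": round(rng.uniform(2.0, 10.0), 2),
        "branch_count": rng.randint(3, 12),
        "leaf_density": round(rng.uniform(0.1, 1.0), 2),
        "bark_roughness": round(rng.uniform(0.0, 1.0), 2),
    }


# Different seeds yield different assets; the same seed always reproduces its asset.
print(generate_tree(1))
print(generate_tree(2))
```

Infinigen applies this principle at vastly larger scale, with rules for geometry, materials, and scene composition, which is why it can produce unlimited variation without any external asset library.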

To evaluate the quality of the synthetic data generated by Infinigen, the team conducted extensive experiments and compared it with existing synthetic datasets and generators. The results of these experiments demonstrate Infinigen’s remarkable capability to produce photorealistic and original assets and scenes without relying on external sources. This showcases its potential for generating a diverse and expansive training dataset that more accurately reflects the complexity of the real world.

Infinigen is an open-source project that the researchers intend to nurture as a collaborative effort with the wider community. They are committed to expanding its coverage to encompass all real-world elements, ensuring its continued development and growth. By offering Infinigen as a freely available resource, the research team hopes to foster collaboration and inspire further advancements in procedural generation.

Overall, the introduction of Infinigen marks a significant advancement in generating synthetic data for computer vision tasks. Its procedural approach, coupled with its ability to produce photorealistic scenes, promises to bridge the gap between existing synthetic datasets and the complexity of real-world objects, making it an invaluable tool for training models in various computer vision applications.

Check Out The Paper, Github, and Project Page.

The post Researchers from Princeton Introduce Infinigen: A Procedural Generator of Photorealistic 3D Scenes of the Natural World appeared first on MarkTechPost.

Researchers from Allen Institute for AI Introduce VISPROG: A Neuro-Symbolic Approach to Solving Complex and Compositional Visual Tasks Given Natural Language Instructions

The search for general-purpose AI systems has driven the development of capable end-to-end trainable models, many of which aim to provide a simple natural language interface for users to engage with the model. Massive-scale unsupervised pretraining followed by supervised multitask training has been the most common method for developing these systems. The goal is for such systems to eventually scale to the indefinitely long tail of difficult tasks, but this strategy requires a carefully curated dataset for each task. In this work, the authors instead study the use of large language models to handle the long tail of complex tasks by breaking down difficult tasks stated in natural language into simpler steps that can be handled by specialized end-to-end trained models or other programs.

Tell a computer vision program to “Tag the seven main characters from the TV show Big Bang Theory in this image.” The system must first comprehend the purpose of the instruction before carrying out the following steps: detecting faces, retrieving the list of Big Bang Theory’s main characters from a knowledge base, classifying faces using the list of characters, and tagging the image with the names and faces of the characters that were recognized. While several vision and language systems can carry out each task, natural language task execution is outside the purview of end-to-end trained systems. 

Figure 1: A modular and interpretable neuro-symbolic system for compositional visual reasoning – VISPROG. Given a few examples of natural language instructions and the corresponding high-level programs, VISPROG generates a program for every new instruction using in-context learning in GPT-3, and then runs the program on the input image(s) to obtain the prediction. Additionally, VISPROG condenses the intermediate outputs into an interpretable visual rationale. We use VISPROG for tasks that require composing a variety of modules for knowledge retrieval, arithmetic, and logical operations, as well as for analyzing and manipulating images.

Researchers from Allen Institute for AI propose VISPROG, a program that takes as input visual information (a single picture or a collection of images) and a natural language command, creates a series of instructions, or a visual program, as they can be called, and then executes these instructions to produce the required result. Each line of a visual program calls one of the many modules the system now supports. Modules can be pre-built language models, OpenCV image processing subroutines, or arithmetic and logical operators. They can also be pre-built computer vision models. The inputs created by running earlier lines of code are consumed by modules, producing intermediate outputs that can be used later.
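The execution model described above can be sketched with a toy interpreter. The program syntax and the modules below are invented for illustration and are not VISPROG’s actual format; they only show the mechanism: each line calls a named module, and its output is stored under a name that later lines can consume.

```python
def run_program(program: str, modules: dict, state: dict) -> dict:
    """Execute lines of the form "OUT=MODULE(arg, ...)" against a module registry.

    Arguments are resolved from `state` if they name an earlier output,
    otherwise treated as literals. Each line's result is stored back in `state`.
    """
    for line in program.strip().splitlines():
        out, call = line.split("=", 1)
        name, rest = call.split("(", 1)
        args = [a.strip() for a in rest.rstrip(")").split(",") if a.strip()]
        values = [state.get(a, a) for a in args]  # resolve names, else literal
        state[out.strip()] = modules[name.strip()](*values)
    return state


# Hypothetical modules standing in for face detection, knowledge retrieval, etc.
modules = {
    "UPPER": str.upper,
    "CONCAT": lambda a, b: f"{a} {b}",
}
state = run_program("A=UPPER(hello)\nB=CONCAT(A, world)", modules, {})
print(state["B"])  # HELLO world
```

Because every intermediate result survives in `state`, the whole trace can be rendered afterward, which is what gives VISPROG its step-by-step visual rationale.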

In the example mentioned earlier, the visual program created by VISPROG uses a face detector, GPT-3 as a knowledge retrieval system, and CLIP as an open-vocabulary image classifier to produce the required output (see Fig. 1). VISPROG improves both the generation and the execution of programs for vision applications. Neural Module Networks (NMNs) combine specialized, differentiable neural modules to create a question-specific, end-to-end trainable network for the visual question answering (VQA) problem. These methods either train a layout generator with REINFORCE using weak answer supervision, or rely on brittle, pre-built semantic parsers to generate the layout of modules deterministically.

In contrast, VISPROG allows users to build complicated programs without any training, using a powerful language model (GPT-3) and a small number of in-context examples. VISPROG programs can invoke trained state-of-the-art models as well as non-neural Python subroutines, and they operate at a higher level of abstraction than NMNs. Due to these benefits, VISPROG is a quick, effective, and versatile neuro-symbolic system. Additionally, VISPROG is highly interpretable. First, it creates easy-to-understand programs whose logical correctness the user can verify. Second, by breaking the prediction down into manageable steps, VISPROG enables the user to examine the outputs of intermediate stages to spot flaws and, if necessary, correct the logic.

A completed program with intermediate step outputs (such as text, bounding boxes, segmentation masks, generated images, etc.) connected to show the flow of information serves as a visual rationale for the prediction. The authors apply VISPROG to four distinct tasks to show off its versatility. These tasks involve common skills (such as image parsing) but also require specialized reasoning and visual manipulation abilities. They include:

Answering compositional visual questions.

Zero-shot NLVR on image pairs.

Factual knowledge object labeling from NL instructions.

Language-guided image manipulation. 

They stress that none of the modules or the language model have been altered in any way. Adapting VISPROG to a new task takes only a few in-context examples pairing natural language commands with the corresponding programs. VISPROG is simple to use and achieves a 2.7-point gain over a base VQA model on the compositional VQA test, a strong zero-shot accuracy of 62.4% on NLVR, and pleasing qualitative and quantitative results on knowledge tagging and image editing tasks.

Check Out The Paper, Github, and Project Page.

The post Researchers from Allen Institute for AI Introduce VISPROG: A Neuro-Symbolic Approach to Solving Complex and Compositional Visual Tasks Given Natural Language Instructions appeared first on MarkTechPost.

Define customized permissions in minutes with Amazon SageMaker Role Manager via the AWS CDK

Machine learning (ML) administrators play a critical role in maintaining the security and integrity of ML workloads. Their primary focus is to ensure that users operate with the utmost security, adhering to the principle of least privilege. However, accommodating the diverse needs of different user personas and creating appropriate permission policies can sometimes impede agility. To address this challenge, AWS introduced Amazon SageMaker Role Manager in December 2022. SageMaker Role Manager is a powerful tool that you can use to swiftly develop persona-based roles, which can be easily customized to meet specific requirements.
With SageMaker Role Manager, administrators can efficiently define persona-based roles tailored to distinct user groups. This approach ensures that individuals have access only to the resources and actions essential for their tasks, reducing the risk of unauthorized actions or breaches. SageMaker Role Manager also allows for fine-grained customization. ML administrators can tailor the roles to meet specific requirements by modifying the permissions associated with each persona. This flexibility ensures that the permissions align precisely with the tasks and responsibilities of individual users, providing a robust security framework while accommodating unique use cases.
SageMaker Role Manager is currently available in the Amazon SageMaker console in all commercial Regions. Today, we are launching the ability to define customized permissions in minutes with SageMaker Role Manager via the AWS Cloud Development Kit (AWS CDK). This addresses a critical obstacle to wider adoption because ML administrators can now automate their tasks programmatically. With the power of the AWS CDK, ML administrators can streamline workflows, reduce manual efforts, and ensure consistency in managing permissions for their ML infrastructure.
Solution overview
With the release of the SageMaker Role Manager CDK, we are launching two new infrastructure as code (IaC) capabilities:

Create fine-grained permissions for ML personas
Create fine-grained permissions for automated jobs through Amazon SageMaker Pipelines, AWS Lambda, and other AWS services

You can create fine-grained AWS Identity and Access Management (IAM) roles for ML personas such as data scientist, ML engineer, or data engineer. SageMaker Role Manager offers predefined personas and ML activities combined to streamline your permission generation process, allowing your ML practitioners to perform their responsibilities with the least privilege permissions. For secure access to your ML resources, SageMaker Role Manager allows you to specify networking and encryption permissions for Amazon Virtual Private Cloud (Amazon VPC) resources and AWS Key Management Service (AWS KMS) encryption keys. Furthermore, you can customize permissions by attaching your own customer managed policies.
The SageMaker Role Manager CDK lets you define custom permissions for SageMaker users in minutes. It comes with a set of predefined policy templates for different personas and ML activities. Personas represent the different types of users that need permissions to perform ML activities in SageMaker, such as data scientists or MLOps engineers. ML activities are a set of permissions to accomplish a common ML task, such as running Amazon SageMaker Studio applications or managing experiments, models, or pipelines. After you have selected the persona type and the set of ML activities, the SageMaker Role Manager CDK automatically creates the required IAM role and policies that you can assign to SageMaker users. Similarly, you can also create IAM roles with fine-grained permissions for automated jobs such as running SageMaker Pipelines.
Prerequisites
To start using the SageMaker Role Manager CDK, you need to complete the following prerequisite steps:

Set up a role for your ML administrator to create and manage personas, as well as the IAM permissions for those users. For a sample admin policy, refer to the prerequisite section in the Define customized permissions in minutes with Amazon SageMaker Role Manager blog post.
Create a compute-only persona role (if you don’t have any) for passing to jobs and endpoints. For instructions to set up that role, refer to Using the role manager.
Set up your AWS CDK development environment. For instructions, refer to Getting started with the AWS CDK.

Install and run the SageMaker Role Manager CDK
Complete the following steps to set up the SageMaker Role Manager CDK:

Create your AWS CDK app and give it a name; for example, RoleManager.
Navigate to the RoleManager folder and run the following command to create a blank typescript AWS CDK project:

cdk init app --language typescript

Open package.json and add the @cdklabs/cdk-aws-sagemaker-role-manager package as shown in the following code:

"dependencies": {
  "aws-cdk-lib": "2.85.0",
  "@cdklabs/cdk-aws-sagemaker-role-manager": "0.0.15",
  "constructs": "^10.0.0",
  "source-map-support": "^0.5.21"
}

Run the following command to install the new cdk-aws-sagemaker-role-manager package:

npm install

Navigate to the lib folder and replace role_manager_stack.ts with the following code:

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as iam from 'aws-cdk-lib/aws-iam';
import { Activity } from '@cdklabs/cdk-aws-sagemaker-role-manager';

export class RoleManagerStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const activity = Activity.manageJobs(this, 'id1', {
      rolesToPass: [iam.Role.fromRoleName(this, 'passRoleId', 'passRoleName')],
    });

    activity.createRole(this, 'newRoleId', 'newRoleName', 'newRoleDescription');
  }
}

Replace passRoleId, passRoleName, newRoleId, newRoleName, and newRoleDescription based on your requirements for role creation.
Navigate back to your AWS CDK app home folder and run the following command to verify the generated AWS CloudFormation template:

cdk synth

Finally, run the following command to run the CloudFormation stack in your AWS account:

cdk deploy

You should see an AWS CDK deployment output similar to the one in the following screenshot.

More SageMaker Role Manager CDK examples are available in the following GitHub repo.
ML persona and activity CDK reference
Administrators can define ML activities using one of the ML activity static functions of the ML activity class. For a list of the latest versions, refer to ML activity reference.
The ML persona class supports the following methods:

customizeVPC(subnets, securityGroups) – Customizes the VPC of all the persona’s activities that support VPC customization.
customizeKMS(dataKeys, volumeKeys) – Customizes the KMS keys of all the persona’s activities that support KMS key customization.
createRole(scope, id, roleNameSuffix, roleDescription) – Creates a role carrying the permissions of the persona’s activities (as the UI does) in the given scope with the given ID, named SageMaker-${roleNameSuffix}, and optionally with the passed role description.
grantPermissionsTo(identity) – Updates the role of the passed identity to grant it the persona’s activities’ permissions. The identity can be a role or an AWS resource associated with a role (for example, a Lambda function, whose role describes which resources the function can access).

The ML activity class supports the same set of functions as the ML persona class; the difference is that an ML activity is constrained to a single activity when using this interface to create IAM roles.
Conclusion
SageMaker Role Manager enables you to create customized roles based on personas, pre-built ML activities, and custom policies, significantly reducing the time required. Now, with this latest AWS CDK support, the ability to define roles is further expanded to support infrastructure as code. This empowers ML practitioners to work programmatically in SageMaker, enhancing efficiency and enabling seamless integration into their workflows.
We would like to hear from you on how this new feature is helping you. Try out the new AWS CDK support for SageMaker Role Manager and send us your feedback!
To learn more about how to use SageMaker Role Manager, refer to the SageMaker Role Manager Developer Guide.

About The Authors
Akash Bhatia is a Principal Solution Architect with experience spanning multiple industries, including Manufacturing, Automotive, Retail, and Space and Technology. Currently working in Amazon Web Services Enterprise Segments, Akash works closely with a diverse range of clients, including Fortune 100 companies and start-ups, to facilitate their cloud migration journey. In addition to his technical expertise, Akash has led product and program management, having successfully overseen numerous large-scale initiatives throughout his career.
Ram Vittal is a Principal ML Solutions Architect at AWS. He has over 20 years of experience architecting and building distributed, hybrid, and cloud applications. He is passionate about building secure and scalable AI/ML and big data solutions to help enterprise customers with their cloud adoption and optimization journey to improve their business outcomes. In his spare time, he enjoys riding his motorcycle, playing tennis, and photography.
Ozan Eken is a Senior Product Manager at Amazon Web Services. He has over 15 years of experience in consulting and product management. He is passionate about building governance products, and Admin capabilities in Machine Learning for enterprise customers. Outside of work, he likes exploring different outdoor activities and watching soccer.

AI Marketing Agency: The Best Tools and Strategies

There are so many tools that are supposed to make working easier than ever. Why does it often feel like it’s actually harder? One reason: there are so many new tools and strategies that it can be difficult to know what actually provides value! Clients crave cutting-edge solutions but don’t always know what those are. To be an effective AI Marketing Agency, you do have to know what they are.

Don’t worry, though. We’re here to help. AI marketing tools can automate tasks, personalize marketing messages, analyze data, and identify new opportunities. 

AI marketing can help businesses in a variety of ways, including:

Improved website SEO: AI marketing can be used to improve a website’s SEO by identifying and fixing any technical issues, optimizing content for search engines, and building backlinks.

More effective PPC campaigns: AI marketing can be used to create more effective PPC campaigns by targeting the right keywords, setting the right budgets, and optimizing bids.

Engaged social media audiences: AI marketing can be used to engage with a business’s audience on social media by creating engaging content, responding to comments and questions, and running targeted social media ads.

High-quality content: AI marketing can be used to create high-quality content that attracts and retains customers by generating ideas, writing content, and optimizing content for search engines.

Effective email marketing: AI marketing can be used to create more effective email marketing campaigns by segmenting lists, personalizing messages, and tracking results.

All of this and more will be covered in our blog post!

Email Marketing

Social Media Marketing

Search Engine Marketing

Content Marketing

Influencer Marketing

AI Marketing Agency Tools for Different Channels

It’s a big understatement, but Artificial intelligence (AI) is rapidly changing the world of marketing. Businesses of all sizes are beginning to realize the benefits of using AI to reach their target audience more effectively. Marketing agencies can use AI to blow those customers away. 

Here’s how you can use AI across all your different channels:

AI Email Marketing for AI Marketing Agency

There are so many ways AI can help marketing agencies with email marketing. 

The most powerful and effective way to use AI in email marketing is personalization. 

Everyone knows that more personalized emails are more effective. But personalization used to be a lot of work. Even with fields that autofill, you can only get so far. 

That’s why Customers.ai released a suite of tools designed to be more effective and save you time. 

The best, and easiest, way to build a powerful email marketing campaign? Combine Customers.ai’s prospecting tools with our AI marketing tools. Here’s how to do it: 

The first step is to build an email list that includes useful targeting data. You can do this for your clients with Customers.ai’s powerful X-Ray tool, which converts web traffic into leads. 

Best practice is to install it on high-intent pages.

Then set up a targeted email automation for each page, designed specifically around what the browser was looking at! 

Instead of writing the targeted emails yourself (time-consuming and difficult!), use our AI automated email generator to do it for you.

You can set up an entire sequence of emails this way. This makes sure that you get the most out of your warm leads! 

Social Media Marketing for AI Marketing Agency

You can use AI to create engaging content, respond to comments and questions, and run targeted social media ads. 

AI marketing agencies can help businesses create engaging social media content that will capture the attention of their target audience. 

They can also help businesses respond to comments and questions in a timely and informative manner. 

Additionally, AI marketing agencies can help businesses run targeted social media ads that reach their target audience with the right message at the right time.

Search Engine Marketing for AI Marketing Agency

You should use AI to improve website SEO, create more effective PPC campaigns, and track results. 

AI marketing agencies can help businesses improve their website’s SEO by identifying and fixing any technical issues, optimizing content for search engines, and building backlinks. 

They can also help businesses create more effective PPC campaigns by targeting the right keywords, setting the right budgets, and optimizing bids. 

Additionally, AI marketing agencies can help businesses track the results of their SEM campaigns so that they can see what’s working and what’s not. Use this information to improve future campaigns.

Content Marketing for AI Marketing Agency

Struggle with generating ideas for content? Need help optimizing content for search engines? Use AI to help!

AI marketing agencies can help businesses generate ideas for content that will be of interest to their target audience. They can also help businesses write content that is informative, engaging, and optimized for search engines. 

Additionally, AI marketing agencies can help businesses track the results of their content marketing efforts so that they can see what’s working and what’s not. You can use this information to improve future marketing campaigns!

Influencer Marketing AI Marketing Agency

Use AI to identify and reach influencers, track results, and measure ROI. AI marketing agencies can help businesses identify influencers who have a large following of their target audience. 

They can also help businesses reach out to influencers and collaborate on marketing campaigns. Additionally, AI marketing agencies can help businesses track the results of their influencer marketing campaigns so that they can see what’s working and what’s not. This information can be used to improve future influencer marketing campaigns.

If you’re a smart marketer, you’re using AI marketing efforts across a variety of channels. By using AI, you can reach your target audience more effectively, improve your results, and save time and money. Adding AI to your marketing agency’s offerings is a great way to become more attractive to potential clients. 

The Benefits of Using an AI Marketing Agency

For clients, there are many benefits to using an AI marketing agency, including:

Expertise: AI marketing agencies have the expertise and experience in using AI to achieve results. They can help you to identify the right AI tools and technologies for your business, and they can help you to implement and use these tools and technologies effectively.

Time savings: AI marketing agencies can save you time by automating many of the tasks involved in marketing, such as creating and scheduling social media posts, responding to customer inquiries, and generating leads. This frees you up to focus on other aspects of your business, such as product development or customer service.

Cost savings: AI marketing agencies can help you to save money by using AI to optimize your marketing campaigns. For example, AI can be used to target your marketing messages more effectively, which can lead to a higher return on investment (ROI).

Improved results: AI marketing agencies can help you to improve the results of your marketing campaigns. For example, AI can be used to create more engaging content, target your marketing messages more effectively, and track the results of your campaigns more accurately.

If you’re looking for a way to improve your marketing efforts, consider using an AI marketing agency. An AI marketing agency can help you save time and money and improve your results.

Here are some additional benefits of using an AI marketing agency:

Increased reach: AI marketing agencies can help you to reach a wider audience by using AI to target your marketing messages to the right people at the right time.

Improved customer experience: AI marketing agencies can help you to improve the customer experience by using AI to personalize your marketing messages and provide better customer service.

Increased brand awareness: AI marketing agencies can help you to increase brand awareness by using AI to create more engaging content and target your marketing messages to a wider audience.

If you’re looking for a way to improve your marketing efforts and achieve your business goals, consider using an AI marketing agency.

Conclusion

In conclusion, AI marketing is a powerful tool that can be used to improve your marketing efforts across a variety of channels. By using AI, you can reach your target audience more effectively, improve your results, and save time and money. If you’re looking for an AI marketing agency to help you with your marketing efforts, be sure to do your research and choose an agency that has experience and expertise in using AI to achieve results.

Here are some tips for choosing an AI marketing agency:

Ask for references: Ask the agency for references from other businesses that they have worked with. This will give you an idea of the agency’s work and their ability to deliver results.

Get a proposal: Get a proposal from the agency that outlines their services, costs, and timeline. This will help you to compare different agencies and make an informed decision.

Meet with the agency: Meet with the agency in person to get to know them and their team. This will help you to determine if they are a good fit for your business and your needs.

AI marketing is a rapidly growing field, and there are many new and innovative AI marketing solutions available. By working with an AI marketing agency, you can stay ahead of the curve and take advantage of the latest AI marketing technologies to improve your marketing efforts and achieve your business goals.

Frequently Asked Questions about AI Marketing Agencies

What is an AI marketing agency?

An AI marketing agency is a company that leverages artificial intelligence technologies and strategies to help businesses enhance their marketing efforts. These agencies specialize in using AI tools and techniques to analyze data, automate tasks, personalize campaigns, optimize targeting, and improve overall marketing performance.

What services do AI marketing agencies offer?

AI marketing agencies offer a wide range of services to assist businesses in their marketing endeavors. These services may include AI-powered customer segmentation, predictive analytics, chatbot development, content optimization, programmatic advertising, AI-driven recommendation engines, data-driven decision making, and more. The exact services offered may vary from one agency to another.

Why should I hire an AI marketing agency?

Hiring an AI marketing agency can provide several benefits for your business. These agencies have expertise in AI technologies and can help you leverage advanced data analysis and automation tools to improve your marketing strategies. They can enhance your targeting and personalization efforts, optimize your campaigns, and provide valuable insights using AI algorithms. This can lead to increased efficiency, higher ROI, better customer experiences, and a competitive edge in the market.

How do I choose the right AI marketing agency for my business?

Choosing the right AI marketing agency requires careful consideration. Here are a few key factors to consider:

Experience and expertise in AI marketing

Track record and client testimonials

Services offered and alignment with your needs

Ability to understand and cater to your target audience

Transparent pricing and contract terms

Communication and collaboration processes

Evaluating agencies based on these criteria can help you find the one that aligns best with your business goals and requirements.
The post AI Marketing Agency: The Best Tools and Strategies appeared first on Customers.ai.

Microsoft AI Introduces an Advanced Communication Optimization Strateg …

Microsoft researchers have introduced a new system called ZeRO++, developed to optimize the training of large AI models by addressing the challenges of high data transfer overhead and limited bandwidth. ZeRO++ builds upon the existing ZeRO optimizations and offers enhanced communication strategies that improve training efficiency and reduce training time and cost.

Training large models like Turing-NLG, ChatGPT, and GPT-4 requires substantial memory and computing resources across multiple GPU devices. ZeRO++, developed by DeepSpeed, introduces communication optimization strategies to overcome the limitations of ZeRO in scenarios with a small batch size per GPU or when training on low-bandwidth clusters.

The ZeRO family of optimizations, including ZeRO-Inference, enables the partitioning of model states across GPUs instead of replication, using the collective GPU memory and compute power. However, ZeRO can incur high communication overheads during training. ZeRO++ addresses this by incorporating three sets of communication optimizations: quantized weight communication (qwZ), hierarchical weight partition (hpZ), and quantized gradient communication (qgZ).

To reduce parameter communication volume, quantized weight communication (qwZ) applies block-based quantization to the weights while preserving training precision; this optimized quantization process is faster and more accurate than basic quantization. To minimize communication overhead during backward propagation, hierarchical weight partition (hpZ) trades GPU memory for communication by maintaining a full model copy within each machine. For gradient communication, quantized gradient communication (qgZ) offers a novel paradigm that reduces cross-node traffic and latency.
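The block-based quantization idea behind ZeRO++’s weight communication can be illustrated with a small sketch. The block size, the symmetric int8 scaling, and the example values below are assumptions for illustration only, not DeepSpeed’s actual implementation:

```python
# Illustrative sketch of block-based int8 quantization, the idea behind
# ZeRO++'s quantized weight communication (qwZ). Block size and symmetric
# per-block scaling are assumptions, not DeepSpeed's exact scheme.

def quantize_blockwise(values, block_size=4):
    """Quantize a flat list of floats to int8 per block, one scale per block."""
    blocks = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        # One scale per block keeps quantization error local to that block.
        scale = max(abs(v) for v in block) / 127 or 1.0
        q = [round(v / scale) for v in block]
        blocks.append((scale, q))
    return blocks

def dequantize_blockwise(blocks):
    """Reconstruct approximate floats from (scale, int8 values) blocks."""
    out = []
    for scale, q in blocks:
        out.extend(v * scale for v in q)
    return out

weights = [0.5, -1.2, 0.03, 2.4, -0.7, 0.9, 1.1, -2.0]
restored = dequantize_blockwise(quantize_blockwise(weights))
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

Communicating one int8 value per weight plus a single scale per block is what shrinks the transmitted volume several-fold relative to full-precision weights, while the per-block scales keep the reconstruction error small.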

These communication optimizations result in a significant reduction in communication volume. ZeRO++ achieves up to a 4x reduction compared to ZeRO, improving training throughput and efficiency. ZeRO++ offers 28% to 36% throughput improvement over ZeRO-3 in high-bandwidth clusters when using small batch sizes per GPU. ZeRO++ achieves an average of 2x speedup in low-bandwidth clusters compared to ZeRO-3, making large model training more accessible across a wider variety of clusters.

ZeRO++ is not limited to training scenarios but extends to reinforcement learning from human feedback (RLHF) training used in dialogue models. By integrating ZeRO++ with DeepSpeed-Chat, RLHF training can benefit from improved generation and training phases, achieving up to 2.25x better generation throughput and 1.26x better training throughput than ZeRO.

DeepSpeed has released ZeRO++ to make large model training more efficient and accessible to the AI community. The system is designed to accelerate training, reduce communication overhead, and enable larger batch sizes, ultimately saving time and resources. Researchers and practitioners can leverage ZeRO++ to train models like ChatGPT more effectively and explore new possibilities in AI.

Check Out The Blog Article and Paper. Don’t forget to join our 25k+ ML SubReddit, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more. If you have any questions regarding the above article or if we missed anything, feel free to email us at Asif@marktechpost.com

Check Out 100’s AI Tools in AI Tools Club
The post Microsoft AI Introduces an Advanced Communication Optimization Strategy Built on ZeRO for Efficient Large Model Training, Unhindered by Batch Size or Bandwidth Limitations appeared first on MarkTechPost.

10+ AI Tools for Public Relations (PR) 2023

ChatGPT 

To put it simply, ChatGPT is an AI-driven conversational user interface. It accepts input from the user, analyzes it, and generates an answer. The OpenAI technology allows the machine to understand both written and verbal language. It can give predetermined answers or ask the user to fill in the blanks. Because it employs machine learning and natural language processing, the technology can potentially have meaningful interactions with consumers. The system’s flexibility means it can be applied in various settings, including customer service, virtual agents, and chatbots. ChatGPT leverages OpenAI technology to provide users with a conversational A.I. system to understand and fulfill their requests.

Midjourney

Midjourney is among the best artificial intelligence image generators because of its powerful capabilities and quick image synthesis. Send a text command to Midjourney, and it will take care of the rest. Many creative professionals use Midjourney to generate images that inspire their work. The artificial intelligence piece “Théâtre d’Opéra Spatial,” made with Midjourney, beat out 20 other painters to take first place in the fine art category at the Colorado State Fair. However, the present home for Midjourney is a Discord server. You must join the Midjourney Discord server and utilize the bot’s commands to make images. That said, it’s easy to get started right away.

Brandwatch

Brandwatch is your artificial intelligence social listening solution if media monitoring for your clients is your priority. Brandwatch uses A.I. to monitor written references to your firm and visual representations of your logo and items in the wild. Their sophisticated text analysis tools can also tell you whether a user’s comments about your brand are positive, negative, or neutral, and they make it easy to keep tabs on all of these indicators.

Cleanup.pictures

Remove undesirable objects, people, text, and flaws from your photos with the help of Cleanup.pictures, an AI-powered photo editing application. It’s quick and simple to learn, and it lets anyone retouch photos in seconds without sacrificing quality. Photographers, advertising firms, real estate agents, online retailers, and anybody looking to get rid of text, logos, or watermarks are just some people who can benefit from this tool. Unlike Adobe Photoshop’s clone tool, this program can accurately identify what is hiding behind undesirable text, persons, and objects. You can import and edit images of any resolution, and while the export resolution is capped at 720px in the free edition, there is no such restriction in the Pro version.

Looka 

Create a polished logo and brand identity with minimal effort using Looka, an AI-powered brand identity platform. Looka, the rebranded version of Logojoy, is available without cost. The process begins with a Logo Maker that, using artificial intelligence, can produce hundreds of potential logo designs quickly. The user can then alter the layout to their liking. The Brand Kit then leverages the logo, colors, and fonts to quickly and easily produce dozens, if not hundreds, of unified promotional pieces. Business cards, social media profiles, email signatures, and other sample documents can all be found in the Brand Kit. Users of Looka, a platform powered by artificial intelligence, can alter their profile and cover images across many social media platforms, including YouTube, Twitter, and Facebook.

Canva 

It’s easy to see how a product manager can benefit from using Canva’s free image creator. It has always been challenging to source relevant images for presentations and decks at stakeholder meetings, product launches, etc. Sometimes you have a perfect vision of what you want, but the available stock images just don’t match it. Canva’s AI-driven editor allows you to plan your content in advance, generate ideas, and refine your search results until you find the perfect graphic, depending on your inputs.

TLDR 

TLDR This is a cutting-edge AI-powered web tool that can automatically summarize lengthy content like articles, documents, essays, and papers into concise, informative paragraphs. Students cramming for exams, writers looking to summarize their articles quickly, teachers who need to summarize a long document or chapter for their students, and journalists who need to summarize a long article for their newspaper or magazine can all benefit from the tool. TLDR This provides a clean and focused reading experience by removing adverts, pop-ups, graphics, and other online distractions, selecting the key ideas from a text, and eliminating irrelevant material such as weak arguments, unsupported conjecture, flashy phrases, and attention wasters.

Hints 

Hints is an artificial intelligence (A.I.)-powered productivity tool that syncs with your other apps to help you stay on top of your to-do list, notes, deals, and schedule. Notion, Obsidian, Trello, ClickUp, Hubspot, Pipedrive, Google Calendar, and Jira are just some of the services that can be integrated. You can reach Hints from your preferred messaging apps, including Telegram, WhatsApp, SMS, and more. It’s also possible to leave voicemails. Hints’s connections to these services let it create, update, and pull data on the fly wherever that data lives, allowing for streamlined management of business and personal life via a single interface. Some of Hints’s many potential applications are project management, sales and CRM management, note-taking and information management, and personal organizing. Hints seeks to help you save time and effort by integrating with other popular services and using A.I. to improve the efficiency of your daily tasks.

DeepL 

If you or your team need a reliable translator, look no further than DeepL Translate, an AI-powered tool that delivers precise results. It can translate text and entire files, including PDFs, Word documents, and PowerPoint presentations, into 31 other languages. Since the technology can recognize the language rapidly and automatically, the translation process is short, and the results are trustworthy. DeepL also has a lexicon and glossary for quick definitions. DeepL is a great tool for users on the go because it can be accessed from a desktop computer, a mobile device, or a Chrome extension. DeepL is one of the most widely used translation tools, relied on by millions daily.

Otter.AI

Otter.ai is an artificial intelligence-driven platform for accurately recording and transcribing meetings and conversations. Real-time, encrypted, searchable, and shareable notes from any conversation are taken using automatic speech recognition. Otter can join your Zoom, Microsoft Teams, or Google Meet meeting and start recording immediately. By focusing on the most important points and assigned tasks, it produces summaries that can be readily distributed and recalled. People in professional, academic, and personal settings have all benefited from using Otter, a time-saving tool available on iOS, Android, and Chrome. The automatic slide-capture function and its capacity to transcribe from several speakers have both been praised.

Beautiful.ai 

Creating stunning presentations in record time is a breeze with the help of Beautiful.ai, a web-based presentation maker. Presenters can get their points across quickly and easily using hundreds of AI-designed smart slides. The online hub provides access to free resources such as stock images, videos, templates, and animation software. Audio narration, secure file-sharing and collaboration tools, and in-depth analytics are also included to help users craft flawless presentations.

Ellie 

Ellie, an intelligent email assistant, analyzes a user’s writing pattern and delivers customized responses that sound like them. This Chrome and Firefox browser extension supports Gmail, and support for other web-based email applications is promised. Ellie replies in multiple languages and reads email threads. Users can regenerate Ellie’s responses up to five times and add context for more personalized answers. Ellie uses user-provided sample content and does not read user emails. Responses are limited per subscription tier due to the high cost of developing A.I. and generating content. Ellie can make emailing easier for people with dyslexia. Ellie’s developers are self-funded and don’t sell email addresses.

Copy.ai

Copy.ai is a copywriting tool powered by artificial intelligence that helps organizations produce high-quality, persuasive content. There is no sign-up fee or minimum purchase required. The tool employs cookies to customize the user experience and for advertising purposes. The website uses cookies for both GDPR compliance and bot detection. The application also tracks user navigation and actions on the website, which are then utilized to generate statistical reports and heat maps. Additionally, cookies save the user’s language and server cluster preferences. This improves the quality of the user experience and the ads they see.

Synthesia 

Synthesia is a video-creation platform using artificial intelligence to create high-quality videos cheaply. It’s a browser add-on that eliminates the need for video editing software, allowing anyone to create polished videos with their friends and family. In Synthesia, you can choose from more than 85 premade A.I. avatars or create your own from scratch using the platform’s 55 premade design templates or any of the 120 supported languages and dialects. The platform can be used for various purposes, from external uses like marketing and customer service to internal benefits like onboarding and training. Over 30,000 organizations have trusted Synthesia due to its capacity to reduce video production expenses by as much as 80%.

Grammarly 

Grammarly is a web-based writing tutor powered by artificial intelligence. It immediately corrects any grammar, spelling, punctuation, clarity, style, or tone errors you may have made. Over half a million programs and websites are compatible with Grammarly on Windows, Mac, iOS, and Android. It has several useful tools, including a citation maker, an essay checker, and a grammar, spelling, punctuation, and plagiarism detector. Grammarly is made to be used by anyone: people, groups, organizations, and even schools. It offers several distinct packages to meet a variety of requirements. In addition to its editing software, Grammarly provides a wealth of additional resources, including a developer’s blog, an education blog, a business blog, and a tech blog.


Affiliate: This post contains affiliate links. If you use these links to buy something, we may earn a commission. Thanks.
The post 10+ AI Tools for Public Relations (PR) 2023 appeared first on MarkTechPost.

Meet CoDi: A Novel Cross-Modal Diffusion Model For Any-to-Any Synthesi …

In the past few years, there has been a notable emergence of robust cross-modal models capable of generating one type of information from another, such as transforming text into text, images, or audio. An example is the notable Stable Diffusion, which can generate stunning images from an input prompt describing the expected outcome.

Despite delivering realistic results, these models face limitations in their practical application when multiple modalities coexist and interact. Let us assume we want to generate an image from a text description like “cute puppy sleeping on a leather couch.” That is, however, not enough. After receiving the output image from a text-to-image model, we also want to hear what such a situation would sound like with, for instance, the puppy snoring on the couch. In this case, we would need another model to transform the text or the resulting image into a sound. Therefore, although connecting multiple specific generative models in a multi-step generation scenario is possible, this approach can be cumbersome and slow. Additionally, independently generated unimodal streams will lack consistency and alignment when combined in a post-processing manner, such as synchronizing video and audio. 

A comprehensive and versatile any-to-any model could simultaneously generate coherent video, audio, and text descriptions, enhancing the overall experience and reducing the required time.

In pursuit of this goal, Composable Diffusion (CoDi) has been developed for simultaneously processing and generating arbitrary combinations of modalities. 

The architecture overview is reported in the paper (https://arxiv.org/abs/2305.11846).

Training a model to handle any mixture of input modalities and flexibly generate various output combinations entails significant computational and data requirements.

This is due to the exponential growth in possible combinations of input and output modalities. Additionally, aligned training data is scarce or nonexistent for many groups of modalities, making it infeasible to train the model on all possible input-output combinations. To address this challenge, a strategy is proposed to align multiple modalities in both the input conditioning and the generation diffusion step. Furthermore, a “Bridging Alignment” strategy for contrastive learning efficiently models the exponential number of input-output combinations with a linear number of training objectives.

To achieve a model with the ability to generate any-to-any combinations and maintain high-quality generation, a comprehensive model design and training approach is necessary, leveraging diverse data resources. The researchers have adopted an integrative approach to building CoDi. Firstly, they train a latent diffusion model (LDM) for each modality, such as text, image, video, and audio. These LDMs can be trained independently and in parallel, ensuring excellent generation quality for each individual modality using available modality-specific training data. This data consists of inputs with one or more modalities and an output modality.

For conditional cross-modality generation, where combinations of modalities are involved, such as generating images using audio and language prompts, the input modalities are projected into a shared feature space. This multimodal conditioning mechanism prepares the diffusion model to condition on any modality or combination of modalities without requiring direct training for specific settings. The output LDM then attends to the combined input features, enabling cross-modality generation. This approach allows CoDi to handle various modality combinations effectively and generate high-quality outputs.

The second stage of training in CoDi facilitates the model’s ability to handle many-to-many generation strategies, allowing for the simultaneous generation of diverse combinations of output modalities. To the best of current knowledge, CoDi stands as the first AI model to possess this capability. This achievement is made possible by introducing a cross-attention module to each diffuser and an environment encoder V, which projects the latent variables from different LDMs into a shared latent space.

During this stage, the parameters of the LDM are frozen, and only the cross-attention parameters and V are trained. As the environment encoder aligns the representations of different modalities, an LDM can cross-attend with any set of co-generated modalities by interpolating the output representation using V. This seamless integration enables CoDi to generate arbitrary combinations of modalities without the need to train on every possible generation combination. Consequently, the number of training objectives is reduced from exponential to linear, providing significant efficiency in the training process.
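As a toy illustration of this shared-space idea, the sketch below projects two hypothetical modality latents into a common space with made-up linear maps standing in for the environment encoder V, then averages them, a simple stand-in for the interpolated representation a cross-attention module would attend over. All dimensions, matrices, and vectors here are invented for illustration; CoDi’s actual encoder and attention are learned.

```python
# Toy sketch of CoDi's shared latent space: an "environment encoder" V
# projects each modality's latent into a common space, and a diffuser can
# attend to an interpolation (here, a plain average) of any subset of
# co-generated modalities.

def project(latent, weight):
    """Project a latent vector into the shared space via a linear map."""
    return [sum(w * x for w, x in zip(row, latent)) for row in weight]

def interpolate(shared_latents):
    """Average the projected latents of the co-generated modalities."""
    n = len(shared_latents)
    return [sum(vals) / n for vals in zip(*shared_latents)]

# Hypothetical 2-D latents for two modalities, projected into a 3-D shared space.
V_text  = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
V_audio = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]

text_latent, audio_latent = [0.2, 0.8], [0.6, 0.4]
shared = [project(text_latent, V_text), project(audio_latent, V_audio)]
context = interpolate(shared)  # what a cross-attention module would attend to
```

Because every modality lands in the same space, adding a new output combination only requires that its modalities project through V, which is how the number of training objectives stays linear rather than exponential.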

Some output samples produced by the model for each generation task can be found in the paper (https://arxiv.org/abs/2305.11846).

This was the summary of CoDi, an efficient cross-modal generation model for any-to-any generation with state-of-the-art quality. If you are interested, you can learn more about this technique in the links below.

Check Out The Paper and Github.

The post Meet CoDi: A Novel Cross-Modal Diffusion Model For Any-to-Any Synthesis appeared first on MarkTechPost.

Meet vLLM: An Open-Source LLM Inference And Serving Library That Accel …

Large language models, or LLMs for short, have emerged as a groundbreaking advancement in the field of artificial intelligence (AI). These models, such as GPT-3, have completely revolutionized natural language understanding. With their capacity to interpret vast amounts of existing data and generate human-like text, these models hold immense potential to shape the future of AI and open up new possibilities for human-machine interaction and communication. However, despite the massive success achieved by LLMs, one significant challenge often associated with such models is their computational inefficiency, leading to slow performance even on the most powerful hardware. Since these models comprise millions or billions of parameters, training them demands extensive computational resources, memory, and processing power, which are not always accessible. Moreover, these complex architectures with slow response times can make LLMs impractical for real-time or interactive applications. As a result, addressing these challenges becomes essential to unlocking the full potential of LLMs and making their benefits more widely accessible.

Tackling this problem, researchers from the University of California, Berkeley, have developed vLLM, an open-source library that is a simpler, faster, and cheaper alternative for LLM inference and serving. The Large Model Systems Organization (LMSYS) currently uses the library to power its Vicuna and Chatbot Arena. By switching to vLLM as their backend, in contrast to the initial HuggingFace Transformers-based backend, the research organization has managed to handle peak traffic efficiently (5 times more than before) while using limited computational resources and reducing high operational costs. Currently, vLLM supports several HuggingFace models, such as GPT-2, GPT BigCode, and LLaMA, to name a few. It achieves throughput levels that are 24 times higher than those of HuggingFace Transformers while maintaining the same model architecture and without necessitating any modifications.

As a part of their preliminary research, the Berkeley researchers determined that memory-related issues pose the primary constraint on the performance of LLMs. LLMs use input tokens to generate attention key and value tensors, which are then cached in GPU memory for generating subsequent tokens. These dynamic key and value tensors, known as the KV cache, occupy a substantial portion of memory, and managing them becomes a cumbersome task. To address this challenge, the researchers introduced the innovative concept of PagedAttention, a novel attention algorithm that extends the conventional idea of paging in operating systems to LLM serving. PagedAttention offers a more flexible approach to managing key and value tensors by storing them in non-contiguous memory spaces, eliminating the requirement for long contiguous memory blocks. These blocks can be independently retrieved using a block table during attention computation, leading to more efficient memory utilization. Adopting this clever technique reduces memory wastage to less than 4%, resulting in near-optimal memory usage. Moreover, PagedAttention can batch 5x more sequences together, thereby enhancing GPU utilization and throughput.
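The block-table idea can be simulated in a few lines. The sketch below is an illustrative toy, not vLLM’s actual implementation; the block size, class names, and string KV entries are all assumptions made for the example.

```python
# Minimal simulation of the block-table idea behind PagedAttention: KV-cache
# entries live in fixed-size blocks scattered anywhere in a pool, and a
# per-sequence block table maps logical block order to physical blocks.

BLOCK_SIZE = 4  # tokens per physical block (illustrative)

class PagedKVCache:
    def __init__(self):
        self.pool = []          # physical blocks, each a list of token KV entries
        self.block_table = {}   # seq_id -> list of physical block indices

    def append(self, seq_id, kv_entry):
        table = self.block_table.setdefault(seq_id, [])
        # Allocate a new physical block only when the last one is full,
        # so at most one partially filled block per sequence is wasted.
        if not table or len(self.pool[table[-1]]) == BLOCK_SIZE:
            self.pool.append([])
            table.append(len(self.pool) - 1)
        self.pool[table[-1]].append(kv_entry)

    def sequence_kv(self, seq_id):
        """Gather a sequence's KV entries by walking its block table."""
        return [kv for idx in self.block_table.get(seq_id, [])
                for kv in self.pool[idx]]

cache = PagedKVCache()
for t in range(6):          # 6 tokens span two blocks of size 4
    cache.append("seq-A", f"kv{t}")
```

Because allocation happens one block at a time rather than as one long contiguous region, waste is bounded by a single partial block per sequence, which is where the under-4% waste figure comes from; parallel sampling can additionally point several sequences’ tables at the same physical prompt blocks to share memory.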

PagedAttention offers the additional benefit of efficient memory sharing. During parallel sampling, i.e., when multiple output sequences are created simultaneously from a single prompt, PagedAttention enables the sharing of computational resources and memory associated with that prompt. This is accomplished by utilizing a block table, where different sequences within PagedAttention can share blocks by mapping logical blocks to the same physical block. By employing this memory-sharing mechanism, PagedAttention not only minimizes memory usage but also ensures secure sharing. The experimental evaluations conducted by the researchers revealed that parallel sampling could reduce memory usage by a whopping 55%, resulting in a 2.2 times increase in throughput.

To summarize, vLLM effectively handles the management of attention key and value memory through the implementation of the PagedAttention mechanism. This results in exceptional throughput performance. Moreover, vLLM seamlessly integrates with well-known HuggingFace models and can be utilized alongside different decoding algorithms, such as parallel sampling. The library can be installed using a simple pip command and is currently available for both offline inference and online serving.

Check Out The Blog Article and Github.

The post Meet vLLM: An Open-Source LLM Inference And Serving Library That Accelerates HuggingFace Transformers By 24x appeared first on MarkTechPost.

Researchers From LinkedIn And UC Berkeley Propose A New Method To Dete …

The sophistication of false profiles has increased alongside the proliferation of artificial intelligence (AI)-produced synthetic and text-to-image generated media. LinkedIn partnered with UC Berkeley to study cutting-edge detection methods. Their recent detection method accurately identifies artificially generated profile pictures 99.6% of the time while misidentifying genuine pictures as fake only 1% of the time.

Two types of forensic methods can be used to investigate this issue. 

Hypothesis-based methods can spot oddities in synthetically made faces; they benefit from blatant semantic outliers. The problem, however, is that modern learning-based synthesis engines already seem to have eliminated these features.

Data-driven methods, such as machine learning classifiers, can tell natural faces apart from computer-generated ones. However, when presented with images outside its domain of expertise, a trained system often struggles with classification.

The proposed work adopts a hybrid approach, first identifying a unique geometric attribute in computer-generated faces and then employing data-driven methods to measure and detect it. This method uses a lightweight, quickly trainable classifier and requires training on only a small set of synthetic faces. Five distinct synthesis engines are used to build 41,500 synthetic faces, and 100,000 real LinkedIn profile pictures serve as additional data.

To see how actual (publicly available) LinkedIn profile pictures stack up against synthetically generated (StyleGAN2) faces, the researchers averaged 400 images of each and compared the results side by side. Since people’s actual photos are so different from one another, the average of the real profile pictures is just a generic headshot. In comparison, the average StyleGAN face has very clear features and sharp eyes. This is because the ocular location and interocular distance of StyleGAN faces are standardized. Real profile pictures typically include the upper body and shoulders, whereas StyleGAN faces are generally synthesized from the neck up. The researchers wanted to make use of the similarities and differences that exist within and between the two groups.

To identify deepfake face swaps in the FaceForensics++ dataset, the researchers combine a one-class variational autoencoder (VAE) with a baseline one-class autoencoder. In contrast to earlier work focusing on face-swap deepfakes, this work emphasizes synthetic faces (e.g., StyleGAN). The researchers also use a considerably simpler and easier-to-train classifier on a relatively small number of synthetic images while achieving comparable overall classification performance. 

Using images generated with Generated.photos and Stable Diffusion, they evaluate the models’ generalization ability. Generated.photos faces, produced by a generative adversarial network (GAN), generalize relatively well under their method, whereas Stable Diffusion faces do not.

TPR stands for “true positive rate” and measures how successfully fake images are identified as such. FPR, the false positive rate, is the fraction of genuine images wrongly labeled as fake. The findings show that the proposed method misidentifies only 1% (FPR) of authentic LinkedIn profile pictures as fake while correctly identifying 99.6% (TPR) of synthetic StyleGAN, StyleGAN2, and StyleGAN3 faces.
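For readers unfamiliar with these metrics, the sketch below computes TPR and FPR from raw labels and predictions; the example numbers are synthetic, not the paper’s data.

```python
# TPR = fraction of actual fakes correctly flagged as fake.
# FPR = fraction of actual real photos wrongly flagged as fake.

def tpr_fpr(labels, predictions):
    """labels/predictions: 1 = fake, 0 = real."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    positives = sum(labels)                 # actual fakes
    negatives = len(labels) - positives     # actual real photos
    return tp / positives, fp / negatives

# Synthetic example: 5 fakes (4 caught) and 5 real photos (1 flagged).
labels      = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
predictions = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
tpr, fpr = tpr_fpr(labels, predictions)     # 0.8 and 0.2 here
```

Under this definition, the paper’s reported figures correspond to a TPR of 0.996 on synthetic faces and an FPR of 0.01 on real profile pictures.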

They also evaluate the method against a state-of-the-art convolutional neural network (CNN) model used for forensic picture classification and find that their methods perform better. 

According to the team, their method can be easily compromised by a cropping attack, which is a major disadvantage. StyleGAN-generated images are already closely cropped around the face, so this attack might lead to unusual profile pictures. They plan to use advanced techniques and may be able to learn scale and translation invariant representations. 

Check Out The Paper and Reference Article. Don’t forget to join our 25k+ ML SubReddit, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more. If you have any questions regarding the above article or if we missed anything, feel free to email us at Asif@marktechpost.com

Check Out 100’s AI Tools in AI Tools Club
The post Researchers From LinkedIn And UC Berkeley Propose A New Method To Detect AI-Generated Profile Photos appeared first on MarkTechPost.

Revolutionizing Cancer Detection: the University of Surrey Unleashes G …

Since prehistoric times, people have used sketches for communication and documentation. Over the past decade, researchers have made great strides in understanding how to use sketches, from classification and synthesis to more novel applications like modeling visual abstraction, style transfer, and continuous stroke fitting. However, only sketch-based image retrieval (SBIR) and its fine-grained counterpart (FGSBIR) have investigated the expressive potential of sketches. Recent systems are already mature enough for commercial adoption, a testament to how developing sketch expressiveness can have a significant effect.

Sketches are incredibly evocative because they automatically capture nuanced and personal visual clues. However, the study of these inherent qualities of human sketching has been confined to the field of image retrieval. For the first time, scientists are training systems to use the evocative power of sketches for the most fundamental task in vision: detecting objects in a scene. The final product is a sketch-based object detection framework, so one can zero in on a specific “zebra” (e.g., one eating grass) in a herd of zebras. In addition, the researchers require that the model succeed without:

Going into testing with an idea of what kind of results to expect (zero-shot).

Requiring extra bounding boxes or class labels (as fully supervised methods do).

Researchers further stipulate that the sketch-based detector also operates in a zero-shot fashion, increasing the system’s novelty. In the sections that follow, they detail how they switch object detection from a closed-set to an open-vocabulary configuration. For instance, the object detector uses prototype learning instead of classification heads, with encoded query-sketch features serving as the support set. The model is then trained with a multi-category cross-entropy loss across the prototypes of all conceivable categories or instances in a weakly supervised object detection (WSOD) setting. Object detection operates at the image level, while SBIR is trained with pairs of sketches and photos of individual objects. Because of this, training an SBIR-based object detector requires a bridge between object-level and image-level features.
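The prototype idea can be sketched in a few lines. The setup below, where encoded sketch features act as class prototypes and a region embedding is scored by softmax cross-entropy over cosine similarities, is an illustrative assumption, not the paper’s implementation:

```python
# Sketch of prototype learning (assumed setup, not the paper's code):
# encoded query-sketch features act as class prototypes, and a
# candidate region embedding is classified by softmax cross-entropy
# over its cosine similarity to each prototype.
import numpy as np

def prototype_logits(region, prototypes):
    region = region / np.linalg.norm(region)
    protos = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return protos @ region            # one similarity score per category

def cross_entropy(logits, label):
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return -np.log(p[label])

sketch_protos = np.array([[1.0, 0.0], [0.0, 1.0]])   # two "categories"
region = np.array([0.9, 0.1])                        # resembles class 0
logits = prototype_logits(region, sketch_protos)
```

Swapping the prototype set at test time is what lets such a detector handle categories (or specific instances) it never saw during training.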

Researchers’ contributions are:

Cultivating the expressiveness of human sketching for object detection.

An object detector built on top of sketches that can figure out what one is trying to convey.

A detector for objects capable of traditional category-level and instance- and part-level detection.

A novel prompt learning configuration that combines CLIP and SBIR to produce a sketch-aware detector that can function in a zero-shot fashion without bounding box annotations or class labels.

Findings that surpass SOD and WSOD in a zero-shot setting.

Instead of starting from scratch, researchers have demonstrated an intuitive synergy between foundation models (like CLIP) and existing sketch models built for sketch-based image retrieval (SBIR), which can already elegantly solve the task. In particular, they first conduct separate prompting on an SBIR model’s sketch and photo branches, then use CLIP’s generalization capability to construct highly generalizable sketch and photo encoders. To ensure that the region embeddings of detected boxes match those of the SBIR sketches and photos, they design a training paradigm to adapt the learned encoders for object detection. The framework outperforms supervised (SOD) and weakly supervised (WSOD) object detectors on zero-shot setups when tested on industry-standard object detection datasets, including PASCAL-VOC and MS-COCO.

To sum it up

To improve object detection, researchers actively encourage humans’ expressiveness in sketching. The suggested sketch-enabled object detection framework is an instance-aware and part-aware object detector that can understand what one is trying to convey in a sketch. To that end, they devise an innovative prompt learning setup that brings together CLIP and SBIR to train a sketch-aware detector that functions without bounding-box annotations or class labels. The detector is also specified to operate in a zero-shot fashion. SBIR, however, is trained on pairs of sketches and photos of a single object, so to help bridge the gap between the object and image levels, they use a data augmentation approach that increases robustness to corruption and generalization to out-of-vocabulary categories. The resulting framework outperforms supervised and weakly supervised object detectors in a zero-shot setting.

Check Out The Paper and Reference Article.

The post Revolutionizing Cancer Detection: the University of Surrey Unleashes Game-Changing Sketch-Based Object Detection Tool in Machine Learning appeared first on MarkTechPost.

Google Researchers Introduce AudioPaLM: A Game-Changer in Speech Techn …

Large Language Models (LLMs) have been in the limelight for a few months. As one of the best advancements in the field of Artificial Intelligence, these models are transforming the way humans interact with machines. With every industry adopting them, they are the best example of how AI is taking over the world. LLMs excel at producing text for tasks involving complex interactions and knowledge retrieval, the best-known example being ChatGPT, the chatbot developed by OpenAI based on the Transformer architecture of GPT-3.5 and GPT-4. Beyond text generation, models like CLIP (Contrastive Language-Image Pretraining) have been developed to connect images and text, enabling text to be generated based on the content of an image.

To progress in audio generation and understanding, a team of researchers from Google has introduced AudioPaLM, a large language model that can tackle speech understanding and generation tasks. AudioPaLM combines the advantages of two existing models, i.e., the PaLM-2 model and the AudioLM model, in order to produce a unified multimodal architecture that can process and produce both text and speech. This allows AudioPaLM to handle a variety of applications, ranging from voice recognition to voice-to-text conversion.

While AudioLM is excellent at maintaining paralinguistic information like speaker identity and tone, PaLM-2, which is a text-based language model, specializes in text-specific linguistic knowledge. By combining these two models, AudioPaLM takes advantage of PaLM-2’s linguistic expertise and AudioLM’s paralinguistic information preservation, leading to a more thorough comprehension and creation of both text and speech.

AudioPaLM makes use of a joint vocabulary that can represent both speech and text with a limited number of discrete tokens. Combining this joint vocabulary with markup task descriptions enables training a single decoder-only model on a variety of voice- and text-based tasks. Tasks like speech recognition, text-to-speech synthesis, and speech-to-speech translation, which were traditionally addressed by separate models, can now be unified into a single architecture and training process.
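The joint-vocabulary idea can be sketched as follows; the ID ranges and task tags below are illustrative assumptions, not AudioPaLM’s actual tokenizer:

```python
# Sketch of a joint speech-text token space (illustrative assumptions,
# not AudioPaLM's real vocabulary): text tokens and discrete audio
# tokens share one ID space, and a task tag at the start of the
# sequence tells the single decoder-only model which task to perform.
N_TEXT = 1000                          # text tokens occupy IDs [0, N_TEXT)
TASK_IDS = {"[ASR]": 2000, "[TTS]": 2001, "[S2ST]": 2002}

def audio_token(code):
    """Map a discrete audio code into the shared ID space."""
    return N_TEXT + code

def build_example(task, input_ids, target_ids):
    """One training sequence: task tag, then input, then target tokens."""
    return [TASK_IDS[task]] + input_ids + target_ids

# e.g. speech recognition: audio tokens in, text tokens out
seq = build_example("[ASR]", [audio_token(5), audio_token(9)], [0, 1])
```

Because every task is just a token sequence in the same space, one model can switch between recognition, synthesis, and translation by reading the leading task tag.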

Upon evaluation, AudioPaLM outperformed existing systems in speech translation by a significant margin. It demonstrated zero-shot speech-to-text translation for unseen language combinations, meaning it can accurately translate speech into text for languages it has never encountered before, opening up possibilities for broader language support. AudioPaLM can also transfer voices across languages based on short spoken prompts and can capture and reproduce distinct voices in different languages, enabling voice conversion and adaptation.

The key contributions mentioned by the team are:

AudioPaLM inherits the capabilities of both PaLM and PaLM-2 from text-only pretraining.

It has achieved SOTA results on Automatic Speech Translation and Speech-to-Speech Translation benchmarks and competitive performance on Automatic Speech Recognition benchmarks.

The model performs Speech-to-Speech Translation with voice transfer of unseen speakers, surpassing existing methods in speech quality and voice preservation.

AudioPaLM demonstrates zero-shot capabilities by performing Automatic Speech Translation with unseen language combinations.

In conclusion, AudioPaLM, which is a unified LLM that handles both speech and text by using the capabilities of text-based LLMs and incorporating audio prompting techniques, is a promising addition to the list of LLMs.

Check Out The Paper and Project.

Featured Tools From AI Tools Club

AdCreative.ai

tinyEinstein

Notion

SaneBox

Aragon AI

The post Google Researchers Introduce AudioPaLM: A Game-Changer in Speech Technology – A New Large Language Model That Listens, Speaks, and Translates with Unprecedented Accuracy appeared first on MarkTechPost.

15+ Best AI Tools To Help You Land Your Next Dream Role (2023)

Resumaker.ai

Resumaker.ai is a website that helps people make resumes in minutes. The portal provides users with several customizable, designer-made resume templates and intuitive tools to help them land their dream jobs. Unlike other resume builders, Resumaker.ai’s artificial intelligence (AI) engine streamlines the resume-building process by automatically completing and filling in data for users. Resumaker.ai uses SSL encryption and other measures to safeguard user data against unauthorized access. You can use the tool’s writing guides and recommendations to design a resume that stands out from the competition. Users can modify their resumes to reflect the requirements of the posted position, provide an overview of who they are, and utilize numbers to back up claims about their qualifications.

Interviewsby.ai

Job-seekers can use Interviewsby.ai, a platform driven by artificial intelligence, to get ready for interviews. ChatGPT, a language model that can recognize and interpret human language, provides real-time feedback during mock interviews customized to the user. By inputting information about the desired employment, the application can generate appropriate and realistic interview questions for the user. The ability to create questions eliminates the possibility of users training with obsolete or irrelevant material. Users may hone their interviewing skills in a controlled setting using interviewsby.ai and get instant feedback on how they did. Each user receives specific feedback that draws attention to their strengths and weaknesses.

Existential

By assessing a user’s interests, talents, and values, Existential, an AI-powered job exploration tool, makes specific suggestions about the user’s professional path. Its purpose is to point its customers toward occupations that will provide them stimulation, challenge, and satisfaction. The application has a straightforward discovery process: after answering certain questions about their ideal job, the program will provide users with recommendations that best fit their interests. Before committing to anything, users can learn more about these choices and see if they mesh with their objectives. Existential aims to empower individuals to shape their destinies and discover meaning in their work.

Jobscan 

To improve their odds of securing interviews, job seekers should use Jobscan ATS Resume Checker and Job Search Tools powered by artificial intelligence (AI). The program uses a proprietary artificial intelligence algorithm to examine the job description and the applicant’s résumé to isolate the relevant qualifications. After analyzing the applicant’s resume, the program generates a match rate report that details the applicant’s strengths and areas for improvement. You can optimize your resume for Applicant Tracking Systems (ATS) and increase your chances of getting noticed with the help of Jobscan ATS Resume Checker.

Aragon 

Artificial intelligence (AI) powers Aragon Professional Headshots, a program that lets users take polished headshots without visiting a photographer, spending time on hair and makeup, or waiting days for retouching. The user uploads 10 selfies, and the tool instantly returns 40 high-definition photographs. Moreover, the application protects users’ privacy by encrypting data with AES256 and only storing it with service providers who have earned SOC 2 and ISO 27001 certifications. Please note that this service is not intended for use by anyone under the age of 18 since doing so is a violation of the terms of service.

Practice Interview 

Job-seekers can use Practice Mock Interviews with AI to prepare for an interview with a potential employer. Using chatbot technology powered by artificial intelligence, the app helps users prepare for interviews for over a hundred occupations. There is no need to create an account or provide personal information to use the practice interview. Users can sign up for a mailing list and practice interviews for various positions in marketing, software engineering, administration, construction, sales, customer service, operations, finance and accounting, engineering, analysis, teaching, the arts, hospitality, and food service.

NetworkAI 

To help its customers rapidly and efficiently grow their professional networks, Wonsulting has developed NetworkAI, an AI-powered networking platform. NetworkAI employs cutting-edge machine learning technology to construct personalized LinkedIn introduction messages that sound like they were written by a real person for the user’s desired career, present position, and desired business. In addition, it lets people keep tabs on their progress and star items they like. Users can further strengthen their networking abilities by accessing resources such as templates, courses, and success stories. NetworkAI offers three different token packages for users to choose from to create greeting messages. Trying out the product costs nothing initially (10 tokens are provided at no cost). NetworkAI is a helpful tool for those wanting to expand their professional network and build meaningful contacts.

FutureFinder AI

FutureFinder AI is a cutting-edge resource that facilitates global career and education research, personalized guidance, and application process facilitation for students and professionals. The platform provides individualized suggestions for schools and courses, deep AI-driven insights, and specialized support from seasoned education advisers. Key features of FutureFinder AI include a personalized AI recommendation engine based on a user’s preferences, academic background, and aspirations; comprehensive assessment tools powered by GPT-4; an AI-powered essay analyzer that offers personalized feedback on college application essays; and mock AI interviews for college admissions purposes; all of which help users to unlock their academic and career potential and make the pursuit of academic excellence accessible and affordable.

Mimir

Mimir’s AI companions, such as the philosopher Aristotle, provide individualized instruction. Mimir users can select a guide and receive individualized training without investing in networking or financial resources. Conversations with the AI guide will prompt it to ask pertinent questions informed by previous research and improve the quality of its advice-giving. Mimir is a novel solution for those who want to improve themselves through mentoring but need help finding a suitable mentor due to financial or time constraints. Mimir gives you access to low-priced guidance powered by AI. Users log on, select an AI character, and start working with Mimir. Mimir’s advice is superior to other systems that charge more for less individualized support. Mimir employs machine learning techniques to analyze and resolve user difficulties. It is a powerful resource that acts as a mentor to its users on both a personal and professional level.

Engage AI

The artificial intelligence–driven FILT Pod aims to expand a company’s influence on LinkedIn through strategic partnerships. FILT Pod’s primary function is to enable users to automate and scale their engagement on LinkedIn with a feature called “Engage AI.” The AI-driven tool saves users time and effort and generates engaging and relevant comments for each post. In addition, it lets users keep tabs on the posting habits of an unlimited number of leads, shortening the time it takes to make the seven to thirteen touches (or more) required to close a sale. You can use the device with Chrome, Microsoft Edge, or Firefox. It doesn’t cost anything to sign up for or use.

Yoodli 

Yoodli is a one-on-one, AI-powered speech trainer that gives users constructive feedback on enhancing their communication abilities. The software offers immediate feedback on a user’s filler words, tempo, and word choice, in addition to monitoring their visual and verbal delivery in real-time. Users can either record a speech on Yoodli’s protected website or incorporate it into a Zoom call. Once users have built up their self-assurance, they can record their progress and share it with friends or instructors. Yoodli is the best public speaking software, and it is used by people at some of the best firms in the world. Yoodli also provides classes to help users improve their self-assurance for job interviews, public speaking, regular discussions, and corporate presentations.

Kickresume 

Kickresume is cutting-edge resume and cover letter technology, allowing users to create both in an automated fashion. Its advanced AI system uses cutting-edge technology to aid users in creating compelling, professional documents suited to their specific career goals. There is a large selection of professionally designed resume templates available on Kickresume. Kickresume is the best place to start when making a resume from scratch or updating an old one. Learn how Kickresume’s AI-powered features can improve your job search immediately. Kickresume is the future of writing your resume and cover letter.

WonsultingAI

WonsultingAI is an AI-powered platform designed to assist users in finding satisfying employment. You can use the platform’s resume builder to make an AI-powered résumé optimized for your job search and the cover letter generator to quickly and easily craft a letter of application unique to each position you’re applying for. There’s also a built-in networking feature for connecting with future employers and industry peers. The platform is free, but a paid subscription unlocks premium features, including unrestricted usage of the ResumAI and CoverLetterAI tools. WonsultingAI is a fantastic resource for anyone in the market for new employment. The software is user-friendly, and it can help you prepare a professional resume and cover letter, both of which will interest hiring managers.

Thejobforme

TheJobForMe is an online resource that connects people looking for work with employers. The website uses AI to help users find jobs that are a good fit for them in terms of their interests, background, and skills. There are several ways in which TheJobForMe stands out from competing employment boards. First, it utilizes AI to find the most suitable employment opportunities for each individual. As a result, job-seekers have a better shot at landing positions that truly utilize their talents and interests. Second, TheJobForMe offers job-seekers individualized guidance for their professional development. Job-seekers can use this information to hone their application materials and prepare for interviews. Third, TheJobForMe facilitates communication between job-seekers and hiring organizations. This aids anyone looking for work in gaining an introduction to potential employers.

CareerCircles

Job-seekers and employers can meet each other through CareerCircle. It helps users find jobs, learn about companies, create resumes and cover letters, practice for interviews, and get direction in their careers. Job-seekers and companies alike can benefit from CareerCircle. You can find a job using CareerCircle. CareerCircle is a tool for finding and hiring the best employees. The job search engine on CareerCircle allows you to narrow your results by location, keyword, and more. The employment profiles on CareerCircle also feature information about the company’s culture, perks, and potential for advancement in one’s career. It allows you to create unique cover letters for each job application by combining a resume builder with a cover letter writer. You can prepare for interviews and make a good impression on potential employers with the help of CareerCircle’s interview tools and career advice.

JobHunnT

JobHunnT is a job search engine that can match you with the ideal employer. JobHunnT’s job search engine lets you search for jobs by area, keyword, and other parameters, just one of the many tools and resources available on the platform. It offers in-depth analyses of businesses, covering office life, perks, and career growth potential. JobHunnT also provides a cover letter writer and a resume builder to assist users in creating effective application materials. JobHunnT provides job-seekers with career guidance and resources to help them build their talents and advance their careers, as well as interview preparation tools and materials to help them succeed in their job interviews. Wage negotiating might be difficult, but JobHunnT provides resources and tools to help you achieve the wage you deserve.

Rec;less

Recless is a convenient tool that streamlines the process of looking for a new job. The program employs AI to help you find suitable work for your interests and abilities. Recless stands out from the crowd of job-searching apps in several significant ways. One of its primary functions is to help you find work that fits your talents and interests through AI. As a result, you should have an easier time locating suitable employment opportunities. Secondly, Recless facilitates the application process for jobs with less effort. Because of this, looking for a job takes much less time and effort. Finally, Recless critiques your application materials, including your résumé and cover letter. Your chances of getting recruited will be boosted as a result of this.

Jobprofile.io

JobProfile.io is a free service that can be used to make a resume or cover letter that is sure to impress potential employers. Among the many tools available on the site are a resume builder and a cover letter writer, both designed to assist users in preparing application materials unique to each job for which they are applying. The job search engine on JobProfile.io lets users look for work in specific areas or by using particular keywords. It offers in-depth analyses of businesses, covering office life, perks, and career growth potential. Additionally, JobProfile.io provides users with career advice and resources to assist them in developing their talents and advancing their careers, as well as interview preparation tools and materials to help users succeed in interviews.

CareerHub AI

CareerHub AI is an AI-powered career advising platform designed to assist users in identifying and acquiring desirable employable skills and occupations. Numerous tools, such as career guidance tailored to your unique interests, experiences, and abilities, are available on the site. In addition to aiding in the job search process, CareerHub AI provides access to a wealth of tools designed to help you become the best professional you can be. You may earn the compensation you deserve by using the wage-negotiating resources provided by CareerHub AI and nailing your next interview with their guidance. Anyone looking for work, regardless of their degree of expertise, can benefit from using CareerHub AI. It’s an excellent resource for locating desirable employment opportunities and learning relevant skills to help you succeed in your chosen field.

Jobinterview.coach

Job Interview Coach is an online resource designed to aid candidates in getting ready for interviews. To help you be prepared for your interview, Job Interview Coach provides a wide range of sample questions and simulated scenarios. This can give you a sense of calm and assurance before your interview. Gain insight into how you performed during your interview with the help of a Job Interview Coach. You can use the information provided to fine-tune your approach to future interviews. Job Interview Coach offers a wealth of knowledge to help you perform well in your interview. If you follow this guidance, you will have a far better chance of impressing your interviewer and landing the job you want. It’s a terrific strategy to improve your chances of landing the job you want if you’re going in for an interview for a new position.

The post 15+ Best AI Tools To Help You Land Your Next Dream Role (2023) appeared first on MarkTechPost.

Shaping the Future of AI: A Comprehensive Survey on Vision-Language Pr …

In the latest release of published papers in Machine Intelligence Research, a team of researchers dives deep into the area of vision-language pretraining (VLP) and its applications in multi-modal tasks. The paper explores the idea of uni-modal training and how it differs from multi-modal adaptations, then examines five important areas of VLP: feature extraction, model architecture, pretraining objectives, pretraining datasets, and downstream tasks. The researchers also review existing VLP models and how they adapt and evolve in the field on different fronts.

The field of AI has always tried to train models to perceive, think, and understand patterns and nuances as humans do. Various attempts have been made to incorporate as many input modalities as possible, such as visual, audio, or textual data, but most of these approaches have tried to solve the problem of “understanding” in a uni-modal sense.

A uni-modal approach assesses a situation through only one aspect of it: in a video, for instance, focusing only on its audio or its transcript. In a multi-modal approach, you instead target as many available features as you can and incorporate them into the model. For example, while analyzing a video, you consider the audio, the transcript, and the speaker’s facial expressions to truly “understand” the context.

The multi-modal approach is challenging because it is resource-intensive and because the large amounts of labeled data needed to train capable models are difficult to obtain. Pretraining models based on transformer architectures have addressed this issue by leveraging self-supervised learning and auxiliary tasks to learn universal representations from large-scale unlabeled data.

Pretraining models in a uni-modal fashion, starting with BERT in NLP, have shown remarkable effectiveness by fine-tuning with limited labeled data for downstream tasks. Researchers have explored the viability of vision-language pretraining (VLP) by extending the same design philosophy to the multi-modal field. VLP uses pretraining models on large-scale datasets to learn semantic correspondences between modalities.

The researchers review the advancements made in the VLP approach across five major areas. First, they discuss how VLP models preprocess and represent images, videos, and text to obtain corresponding features, highlighting the various models employed. Second, they examine model architectures from two perspectives: single-stream versus dual-stream fusion, and encoder-only versus encoder-decoder designs.

The paper then explores the pretraining objectives of VLP models, categorizing them into completion, matching, and particular types. These objectives are important because they help define universal vision-language representations. The researchers also provide an overview of the two main categories of pretraining datasets: those for image-language models and those for video-language models. The paper emphasizes how the multi-modal approach achieves a better grasp of context and produces better-mapped content. Finally, the article presents the goals and details of downstream tasks in VLP, emphasizing their significance in evaluating the effectiveness of pretrained models.
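As a concrete illustration of a “matching” objective (a generic sketch, not code from the survey), an image-text pair can be scored with a sigmoid over an embedding dot product and trained with binary cross-entropy:

```python
# Sketch of an image-text matching (ITM) objective: the pair's
# embeddings are scored with a sigmoid over their dot product, and
# binary cross-entropy pushes aligned pairs high and mismatched pairs
# low. Illustrative only, not code from the survey.
import numpy as np

def itm_loss(img_emb, txt_emb, match):
    """match is 1 for an aligned pair, 0 for a mismatched one."""
    score = 1.0 / (1.0 + np.exp(-img_emb @ txt_emb))   # sigmoid similarity
    return -(match * np.log(score) + (1 - match) * np.log(1 - score))

img = np.array([1.0, 0.5])
txt_good = np.array([1.0, 0.5])        # embedding of an aligned caption
txt_bad = np.array([-1.0, -0.5])       # embedding of an unrelated caption
```

Completion objectives (masked language or region modeling) are trained alongside losses like this one so that the model learns both to reconstruct each modality and to align the two.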

https://link.springer.com/content/pdf/10.1007/s11633-022-1369-5.pdf


The paper provides a detailed overview of the SOTA VLP models. It lists those models and highlights their key features and performance. The models mentioned and covered are a solid foundation for cutting-edge technological advancement and can serve as a benchmark for future development.

Based on the research paper, the future of VLP architectures seems promising. The authors propose various areas of improvement, such as incorporating acoustic information, knowledgeable and cognitive learning, prompt tuning, model compression and acceleration, and out-of-domain pretraining. These directions are meant to inspire a new generation of researchers to advance the field of VLP and arrive at breakthrough approaches.

Check Out The Paper and Reference Article.

The post Shaping the Future of AI: A Comprehensive Survey on Vision-Language Pre-Training Models and their Role in Uni-Modal and Multi-Modal Tasks appeared first on MarkTechPost.

Accelerate time to business insights with the Amazon SageMaker Data Wr …

Amazon SageMaker Data Wrangler is a single visual interface that reduces the time required to prepare data and perform feature engineering from weeks to minutes with the ability to select and clean data, create features, and automate data preparation in machine learning (ML) workflows without writing any code.
SageMaker Data Wrangler supports Snowflake, a popular data source for users who want to perform ML. To improve the customer experience, we launched a direct Snowflake connection from SageMaker Data Wrangler. Before this feature, administrators had to set up an initial storage integration to connect with Snowflake in order to create features for ML in Data Wrangler. This included provisioning Amazon Simple Storage Service (Amazon S3) buckets, AWS Identity and Access Management (IAM) access permissions, Snowflake storage integration for individual users, and an ongoing mechanism to manage or clean up data copies in Amazon S3. This process does not scale for customers with strict data access control and a large number of users.
In this post, we show how Snowflake’s direct connection in SageMaker Data Wrangler simplifies the administrator’s experience and data scientist’s ML journey from data to business insights.
Solution overview
In this solution, we use SageMaker Data Wrangler to speed up data preparation for ML and Amazon SageMaker Autopilot to automatically build, train, and fine-tune the ML models based on your data. Both services are designed specifically to increase productivity and shorten time to value for ML practitioners. We also demonstrate the simplified data access from SageMaker Data Wrangler to Snowflake with direct connection to query and create features for ML.
Refer to the diagram below for an overview of the low-code ML process with Snowflake, SageMaker Data Wrangler, and SageMaker Autopilot.

The workflow includes the following steps:

Navigate to SageMaker Data Wrangler for your data preparation and feature engineering tasks.

Set up the Snowflake connection with SageMaker Data Wrangler.
Explore your Snowflake tables in SageMaker Data Wrangler, create an ML dataset, and perform feature engineering.

Train and test the models using SageMaker Data Wrangler and SageMaker Autopilot.
Load the best model to a real-time inference endpoint for predictions.
Use a Python notebook to invoke the launched real-time inference endpoint.

Prerequisites
For this post, the administrator needs the following prerequisites:

A Snowflake user with administrator permission to create a Snowflake virtual warehouse, user, and role, and grant access to this user to create a database. For more details on the administration setup, refer to Import data from Snowflake.
An AWS account with admin access.
A Snowflake Enterprise Account in your preferred AWS Region with ACCOUNTADMIN access.
Optionally, if you’re using Snowflake OAuth access in SageMaker Data Wrangler, refer to Import data from Snowflake to set up an OAuth identity provider.
Familiarity with Snowflake, basic SQL, the Snowsight UI, and Snowflake objects.

Data scientists should have the following prerequisites:

Access to Amazon SageMaker, an instance of Amazon SageMaker Studio, and a user for SageMaker Studio. For more information about prerequisites, see Get Started with Data Wrangler.
Familiarity with AWS services, networking, and the AWS Management Console. Basic knowledge of Python, Jupyter notebooks, and ML.

Lastly, you should prepare your data for Snowflake:

We use credit card transaction data from Kaggle to build ML models for detecting fraudulent credit card transactions, so customers are not charged for items that they didn’t purchase. The dataset includes credit card transactions in September 2013 made by European cardholders.
Install the SnowSQL client on your local machine so you can use it to upload the dataset to a Snowflake table.

The following steps show how to prepare and load the dataset into the Snowflake database. This is a one-time setup.
Snowflake table and data preparation
Complete the following steps for this one-time setup:

First, as the administrator, create a Snowflake virtual warehouse, user, and role, and grant access to other users such as the data scientists to create a database and stage data for their ML use cases:

-- Use the role SECURITYADMIN to create the role and user
USE ROLE SECURITYADMIN;

-- Create a new role 'ML_ROLE'
CREATE OR REPLACE ROLE ML_ROLE COMMENT='ML Role';
GRANT ROLE ML_ROLE TO ROLE SYSADMIN;

-- Create a new user and password and grant the role to the user
CREATE OR REPLACE USER ML_USER PASSWORD='<REPLACE_PASSWORD>'
DEFAULT_ROLE=ML_ROLE
DEFAULT_WAREHOUSE=ML_WH
DEFAULT_NAMESPACE=ML_WORKSHOP.PUBLIC
COMMENT='ML User';
GRANT ROLE ML_ROLE TO USER ML_USER;

-- Grant privileges to the role
USE ROLE ACCOUNTADMIN;
GRANT CREATE DATABASE ON ACCOUNT TO ROLE ML_ROLE;

-- Create a warehouse for AI/ML work
USE ROLE SYSADMIN;

CREATE OR REPLACE WAREHOUSE ML_WH
WITH WAREHOUSE_SIZE = 'XSMALL' AUTO_SUSPEND = 120 AUTO_RESUME = TRUE INITIALLY_SUSPENDED = TRUE;

GRANT ALL ON WAREHOUSE ML_WH TO ROLE ML_ROLE;

As the data scientist, let’s now create a database and import the credit card transactions into the Snowflake database to access the data from SageMaker Data Wrangler. For illustration purposes, we create a Snowflake database named SF_FIN_TRANSACTION:

-- Select the role and the warehouse
USE ROLE ML_ROLE;
USE WAREHOUSE ML_WH;

-- Create the DB to import the financial transactions
CREATE DATABASE IF NOT EXISTS sf_fin_transaction;

-- Create CSV file format
create or replace file format my_csv_format
type = csv
field_delimiter = ','
skip_header = 1
null_if = ('NULL', 'null')
empty_field_as_null = true
compression = gzip;

Download the dataset CSV file to your local machine and create a stage to load the data into the database table. Update the file path to point to the downloaded dataset location before running the PUT command for importing the data to the created stage:

-- Create a Snowflake named internal stage to store the transactions CSV file
CREATE OR REPLACE STAGE my_stage
FILE_FORMAT = my_csv_format;

-- Import the file into the stage
-- This command needs to be run from the SnowSQL client, not the web UI
PUT file:///Users/*******/Downloads/creditcard.csv @my_stage;

-- Check whether the import was successful
LIST @my_stage;

Create a table named credit_card_transactions:

-- Create table and define the columns mapped to the CSV transactions file
create or replace table credit_card_transaction (
Time integer,
V1 float, V2 float, V3 float,
V4 float, V5 float, V6 float,
V7 float, V8 float, V9 float,
V10 float,V11 float,V12 float,
V13 float,V14 float,V15 float,
V16 float,V17 float,V18 float,
V19 float,V20 float,V21 float,
V22 float,V23 float,V24 float,
V25 float,V26 float,V27 float,
V28 float,Amount float,
Class varchar(5)
);

Import the data into the created table from the stage:

-- Import the transactions into a new table named 'credit_card_transaction'
copy into credit_card_transaction from @my_stage ON_ERROR = CONTINUE;

-- Check whether the table was successfully created
select * from credit_card_transaction limit 100;

Set up the SageMaker Data Wrangler and Snowflake connection
After we prepare the dataset to use with SageMaker Data Wrangler, let us create a new Snowflake connection in SageMaker Data Wrangler to connect to the sf_fin_transaction database in Snowflake and query the credit_card_transaction table:

Choose Snowflake on the SageMaker Data Wrangler Connection page.
Provide a name to identify your connection.
Select your authentication method to connect with the Snowflake database:

If using basic authentication, provide the user name and password shared by your Snowflake administrator. For this post, we use basic authentication to connect to Snowflake using the user credentials we created in the previous step.
If you are using OAuth, provide your identity provider credentials.

SageMaker Data Wrangler by default queries your data directly from Snowflake without creating any data copies in S3 buckets. SageMaker Data Wrangler’s new usability enhancement uses Apache Spark to integrate with Snowflake to prepare and seamlessly create a dataset for your ML journey.
So far, we have created the database on Snowflake, imported the CSV file into the Snowflake table, created Snowflake credentials, and created a connector on SageMaker Data Wrangler to connect to Snowflake. To validate the configured Snowflake connection, run the following query on the created Snowflake table:

select * from credit_card_transaction;

Note that the storage integration option that was required before is now optional in the advanced settings.
Explore Snowflake data
After you validate the query results, choose Import to save the query results as the dataset. We use this extracted dataset for exploratory data analysis and feature engineering.

You can choose to sample the data from Snowflake in the SageMaker Data Wrangler UI. Another option is to download complete data for your ML model training use cases using SageMaker Data Wrangler processing jobs.

Perform exploratory data analysis in SageMaker Data Wrangler
The data within Data Wrangler needs to be engineered before it can be used to train a model. In this section, we demonstrate how to perform feature engineering on the data from Snowflake using SageMaker Data Wrangler's built-in capabilities.
First, let’s use the Data Quality and Insights Report feature within SageMaker Data Wrangler to generate reports to automatically verify the data quality and detect abnormalities in the data from Snowflake.
You can use the report to help you clean and process your data. It gives you information such as the number of missing values and the number of outliers. If you have issues with your data, such as target leakage or imbalance, the insights report can bring those issues to your attention. To understand the report details, refer to Accelerate data preparation with data quality and insights in Amazon SageMaker Data Wrangler.
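To get a feel for two of the signals the report surfaces, missing values and class imbalance, here is a minimal pandas sketch on a hypothetical sample (this is an illustration, not the report's actual implementation):

```python
import pandas as pd

# Hypothetical sample mirroring part of the credit_card_transaction schema
df = pd.DataFrame({
    "V1": [2.02, None, -1.62, 0.35],
    "Amount": [0.27, 149.62, 2.69, 378.66],
    "Class": ["0", "0", "1", "0"],
})

# Missing values per column, as the insights report would flag them
missing = df.isna().sum()

# Class imbalance check: fraud datasets are typically highly skewed
class_ratio = df["Class"].value_counts(normalize=True)

print(missing["V1"])       # prints 1
print(class_ratio["0"])    # majority class share, 0.75 here
```

In the real dataset, fraudulent transactions are a tiny fraction of the total, which is exactly the kind of imbalance the insights report warns about.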
After you check out the data type matching applied by SageMaker Data Wrangler, complete the following steps:

Choose the plus sign next to Data types and choose Add analysis.
For Analysis type, choose Data Quality and Insights Report.
Choose Create.
Refer to the Data Quality and Insights Report details to check out high-priority warnings.

You can choose to resolve the warnings reported before proceeding with your ML journey.

The target column Class to be predicted is classified as a string. First, let's apply a transformation to remove the stray whitespace characters.

Choose Add step and choose Format string.
In the list of transforms, choose Strip left and right.
Enter the characters to remove and choose Add.

Next, we convert the target column Class from the string data type to Boolean because the transaction is either legitimate or fraudulent.

Choose Add step.
Choose Parse column as type.
For Column, choose Class.
For From, choose String.
For To, choose Boolean.
Choose Add.
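For intuition, the two transforms above amount to the following pandas operations, sketched here on hypothetical sample values rather than the Data Wrangler implementation:

```python
import pandas as pd

# Hypothetical slice of the target column with stray whitespace
df = pd.DataFrame({"Class": [" 0", "1 ", " 0 ", "1"]})

# Format string -> Strip left and right (remove surrounding whitespace)
df["Class"] = df["Class"].str.strip()

# Parse column as type: String -> Boolean ("1" marks a fraudulent transaction)
df["Class"] = df["Class"] == "1"

print(df["Class"].tolist())  # [False, True, False, True]
```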

After the target column transformation, we reduce the number of feature columns, because there are over 30 features in the original dataset. We use Principal Component Analysis (PCA) to reduce the dimensions based on feature importance. To understand more about PCA and dimensionality reduction, refer to Principal Component Analysis (PCA) Algorithm.

Choose Add step.
Choose Dimensionality Reduction.
For Transform, choose Principal component analysis.
For Input columns, choose all the columns except the target column Class.
Choose the plus sign next to Data flow and choose Add analysis.
For Analysis type, choose Quick Model.
For Analysis name, enter a name.
For Label, choose Class.
Choose Run.

Based on the PCA results, you can decide which features to use for building the model. In the following screenshot, the graph shows the features (or dimensions) ordered based on highest to lowest importance to predict the target class, which in this dataset is whether the transaction is fraudulent or valid.

You can choose to reduce the number of features based on this analysis, but for this post, we leave the defaults as is.
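The dimensionality reduction itself is straightforward to sketch outside Data Wrangler. The following illustrative example, using synthetic data rather than the credit card dataset, computes PCA via SVD and keeps enough components to cover 95% of the variance:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical feature matrix standing in for V1..V28 + Amount (rows x features)
X = rng.normal(size=(200, 5))
X[:, 0] *= 10.0  # make one direction dominate the variance

# PCA via SVD on the mean-centered matrix
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Fraction of variance explained by each principal component
explained = (S ** 2) / (S ** 2).sum()

# Keep enough components to cover 95% of the variance
k = int(np.searchsorted(np.cumsum(explained), 0.95) + 1)
X_reduced = Xc @ Vt[:k].T
print(X_reduced.shape)
```

Because one synthetic feature dominates, a single component captures most of the variance here; on real data, the explained-variance curve tells you how many dimensions are worth keeping.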
This concludes our feature engineering process, although you may choose to run the quick model and create a Data Quality and Insights Report again to understand the data before performing further optimizations.
Export data and train the model
In the next step, we use SageMaker Autopilot to automatically build, train, and tune the best ML models based on your data. With SageMaker Autopilot, you still maintain full control and visibility of your data and model.
Now that we have completed the exploration and feature engineering, let’s train a model on the dataset and export the data to train the ML model using SageMaker Autopilot.

On the Training tab, choose Export and train.

We can monitor the export progress while we wait for it to complete.

Let’s configure SageMaker Autopilot to run an automated training job by specifying the target we want to predict and the type of problem. In this case, because we’re training the dataset to predict whether the transaction is fraudulent or valid, we use binary classification.

Enter a name for your experiment, provide the S3 location data, and choose Next: Target and features.
For Target, choose Class as the column to predict.
Choose Next: Training method.

Let’s allow SageMaker Autopilot to decide the training method based on the dataset.

For Training method and algorithms, select Auto.

To understand more about the training modes supported by SageMaker Autopilot, refer to Training modes and algorithm support.

Choose Next: Deployment and advanced settings.
For Deployment option, choose Auto deploy the best model with transforms from Data Wrangler, which loads the best model for inference after the experimentation is complete.
Enter a name for your endpoint.
For Select the machine learning problem type, choose Binary classification.
For Objective metric, choose F1.
Choose Next: Review and create.
Choose Create experiment.

This starts a SageMaker Autopilot job that creates a set of training jobs that use combinations of hyperparameters to optimize the objective metric.
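The same experiment can also be launched programmatically. Here is a minimal sketch of the `create_auto_ml_job` request mirroring the console choices above (target `Class`, binary classification, F1 objective); the bucket paths, job name, and role ARN are hypothetical placeholders you would replace with your own:

```python
# Hypothetical names and ARNs; replace with your account's values
JOB_NAME = "fraud-detection-autopilot"
ROLE_ARN = "arn:aws:iam::111122223333:role/SageMakerExecutionRole"
S3_INPUT = "s3://my-bucket/data-wrangler-export/"
S3_OUTPUT = "s3://my-bucket/autopilot-output/"

# Request mirroring the console configuration used in this post
request = {
    "AutoMLJobName": JOB_NAME,
    "InputDataConfig": [{
        "DataSource": {
            "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": S3_INPUT}
        },
        "TargetAttributeName": "Class",
    }],
    "OutputDataConfig": {"S3OutputPath": S3_OUTPUT},
    "ProblemType": "BinaryClassification",
    "AutoMLJobObjective": {"MetricName": "F1"},
    "RoleArn": ROLE_ARN,
}

# To actually launch the job (requires AWS credentials):
# import boto3
# boto3.client("sagemaker").create_auto_ml_job(**request)
print(request["ProblemType"])
```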

Wait for SageMaker Autopilot to finish building the models and evaluating the best ML model.
Launch a real-time inference endpoint to test the best model
SageMaker Autopilot runs experiments to determine the best model that can classify credit card transactions as legitimate or fraudulent.
When SageMaker Autopilot completes the experiment, we can view the training results with the evaluation metrics and explore the best model from the SageMaker Autopilot job description page.

Select the best model and choose Deploy model.

We use a real-time inference endpoint to test the best model created through SageMaker Autopilot.

Select Make real-time predictions.

When the endpoint is available, we can pass the payload and get inference results.

Let’s launch a Python notebook to use the inference endpoint.

On the SageMaker Studio console, choose the folder icon in the navigation pane and choose Create notebook.
Use the following Python code to invoke the deployed real-time inference endpoint:

# Library imports
import boto3

#: Define the endpoint's name.
ENDPOINT_NAME = 'SnowFlake-FraudDetection'  # replace the endpoint name as per your config
runtime = boto3.client('runtime.sagemaker')

#: Define a test payload to send to your endpoint.
payload = {
    "TIME": 152895,
    "V1": 2.021155535,
    "V2": 0.05372872624,
    "V3": -1.620399104,
    "V4": 0.3530165253,
    "V5": 0.3048483853,
    "V6": -0.6850955461,
    "V7": 0.02483335885,
    "V8": -0.05101346021,
    "V9": 0.3550896835,
    "V10": -0.1830053153,
    "V11": 1.148091498,
    "V12": 0.4283365505,
    "V13": -0.9347237892,
    "V14": -0.4615291327,
    "V15": -0.4124343184,
    "V16": 0.4993445934,
    "V17": 0.3411548305,
    "V18": 0.2343833846,
    "V19": 0.278223588,
    "V20": -0.2104513475,
    "V21": -0.3116427235,
    "V22": -0.8690778214,
    "V23": 0.3624146958,
    "V24": 0.6455923598,
    "V25": -0.3424913329,
    "V26": 0.1456884618,
    "V27": -0.07174890419,
    "V28": -0.040882382,
    "AMOUNT": 0.27,
}

#: The endpoint expects CSV input, so serialize the feature values as a
#: single comma-separated row instead of sending the dict as a string.
csv_body = ','.join(str(value) for value in payload.values())

#: Submit an API request and capture the response object.
response = runtime.invoke_endpoint(
    EndpointName=ENDPOINT_NAME,
    ContentType='text/csv',
    Body=csv_body
)

#: Print the model endpoint's output.
print(response['Body'].read().decode())

The output shows the result as false, which implies the sample feature data is not fraudulent.

Clean up
To make sure you don’t incur charges after completing this tutorial, shut down the SageMaker Data Wrangler application and shut down the notebook instance used to perform inference. You should also delete the inference endpoint you created using SageMaker Autopilot to prevent additional charges.
Conclusion
In this post, we demonstrated how to bring your data from Snowflake directly without creating any intermediate copies in the process. You can either sample or load your complete dataset to SageMaker Data Wrangler directly from Snowflake. You can then explore the data, clean the data, and perform feature engineering using SageMaker Data Wrangler's visual interface.
We also highlighted how you can easily train and tune a model with SageMaker Autopilot directly from the SageMaker Data Wrangler user interface. With SageMaker Data Wrangler and SageMaker Autopilot integration, we can quickly build a model after completing feature engineering, without writing any code. Then we referenced SageMaker Autopilot’s best model to run inferences using a real-time endpoint.
Try out the new Snowflake direct integration with SageMaker Data Wrangler today to easily build ML models with your data using SageMaker.

About the authors
Hariharan Suresh is a Senior Solutions Architect at AWS. He is passionate about databases, machine learning, and designing innovative solutions. Prior to joining AWS, Hariharan was a product architect, core banking implementation specialist, and developer, and worked with BFSI organizations for over 11 years. Outside of technology, he enjoys paragliding and cycling.
Aparajithan Vaidyanathan is a Principal Enterprise Solutions Architect at AWS. He helps enterprise customers migrate and modernize their workloads on the AWS Cloud. He is a Cloud Architect with 23+ years of experience designing and developing enterprise, large-scale, and distributed software systems. He specializes in machine learning and data analytics, with a focus on data and feature engineering. He is an aspiring marathon runner, and his hobbies include hiking, bike riding, and spending time with his wife and two boys.
Tim Song is a Software Development Engineer at AWS SageMaker. With 10+ years of experience as a software developer, consultant, and tech lead, he has a demonstrated ability to deliver scalable and reliable products and solve complex problems. In his spare time, he enjoys nature, outdoor running, and hiking.
Bosco Albuquerque is a Sr. Partner Solutions Architect at AWS and has over 20 years of experience in working with database and analytics products from enterprise database vendors and cloud providers. He has helped large technology companies design data analytics solutions and has led engineering teams in designing and implementing data analytics platforms and data products.