This AI Paper Proposes COPlanner: A Machine Learning-based Plug-and-Play Framework that can be Applied to any Dyna-Style Model-based Methods

One of the central challenges in model-based reinforcement learning (MBRL) is managing imperfect dynamics models. This limitation becomes particularly evident in complex environments, where learning an accurate dynamics model is crucial yet difficult, and prediction errors often lead to suboptimal policy learning. The challenge is twofold: achieving accurate predictions and ensuring the learned models adapt and perform effectively in varied, unpredictable scenarios. There is therefore a clear need for MBRL methods that better compensate for these model inaccuracies.

Recent research in MBRL has explored various ways to address dynamics model inaccuracies. Plan to Predict (P2P) learns an uncertainty-foreseeing model to avoid uncertain regions during rollouts. Branched and bidirectional rollouts use shorter horizons to mitigate compounding model errors, though this can limit planning capability. Notably, Model-Ensemble Exploration and Exploitation (MEEE) expands the dynamics model into an ensemble and reduces the impact of model errors during rollouts by incorporating uncertainty into the loss calculation, a significant advancement in the field.

Researchers from the University of Maryland and Tsinghua University, together with JPMorgan AI Research and the Shanghai Qi Zhi Institute, have introduced COPlanner, a novel plug-and-play approach within the MBRL paradigm. Its core component is uncertainty-aware policy-guided model predictive control (UP-MPC), which estimates model uncertainty and uses it to select actions: conservative rollouts steer imagined trajectories away from uncertain model regions, while optimistic exploration steers real-environment interaction toward them. The paper also includes a detailed ablation on the Hopper-hop task from the visual-control DeepMind Control (DMC) suite, comparing different uncertainty estimation methods and their computational cost.
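To make the UP-MPC idea concrete, the sketch below shows the general shape of an uncertainty-aware, policy-guided MPC step. It is an illustrative assumption, not the authors' implementation: the policy and ensemble interfaces, the ensemble-disagreement uncertainty proxy, and the single kappa weighting (positive for optimistic exploration, negative for conservative rollouts) are all placeholders.

import numpy as np

def up_mpc_action(state, policy, ensemble, horizon=5, n_candidates=8, kappa=1.0):
    """Illustrative uncertainty-aware policy-guided MPC step (not the paper's exact algorithm).

    policy(state)          -> a sampled action
    ensemble.predict(s, a) -> list of (next_state, reward) pairs, one per dynamics model
    kappa > 0 scores candidates optimistically (exploration); kappa < 0 conservatively (rollouts).
    """
    best_action, best_score = None, -np.inf
    for _ in range(n_candidates):
        s, first_action = state, None
        total_reward, total_uncertainty = 0.0, 0.0
        for t in range(horizon):
            a = policy(s)
            if t == 0:
                first_action = a
            predictions = ensemble.predict(s, a)                 # one prediction per ensemble member
            next_states = np.stack([p[0] for p in predictions])
            rewards = np.array([p[1] for p in predictions])
            total_reward += rewards.mean()
            total_uncertainty += next_states.std(axis=0).mean()  # disagreement as uncertainty proxy
            s = next_states.mean(axis=0)
        score = total_reward + kappa * total_uncertainty         # optimistic or conservative objective
        if score > best_score:
            best_score, best_action = score, first_action
    return best_action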

A key part of the evaluation is a comparative analysis with existing methods. The paper visualizes trajectories from real-environment evaluations, highlighting the performance differences between DreamerV3 and COPlanner-DreamerV3 on tasks such as Hopper-hop and Quadruped-walk, and providing a clear picture of COPlanner’s improvements over standard approaches. This visual comparison underscores COPlanner’s ability to handle tasks of varying complexity and demonstrates its practical value in model-based reinforcement learning.

The research demonstrates that COPlanner significantly enhances sample efficiency and asymptotic performance in proprioceptive and visual continuous control tasks. This improvement is particularly notable in challenging visual tasks, where optimistic exploration and conservative rollouts yield the best outcomes. Results have demonstrated how model prediction error and rollout uncertainty change as the environment step increases. The study also presents the ablation results on different hyperparameters of COPlanner, such as optimistic rate, conservative rate, action candidate number, and planning horizon. 

The COPlanner framework marks a substantial advancement in the field of MBRL. Its innovative integration of conservative planning and optimistic exploration addresses a fundamental challenge in the discipline. This research contributes to the theoretical understanding of MBRL and offers a pragmatic solution with potential applications in various real-world scenarios, underscoring its significance in advancing the field.


Revolutionizing Fluid Dynamics: Integrating Physics-Informed Neural Networks with Tomo-BOS for Advanced Flow Analysis

Background Oriented Schlieren (BOS) imaging is an effective technique for visualizing and quantifying fluid flow. BOS is cost-effective and flexible compared with other methods such as Particle Image Velocimetry (PIV) and Laser-Induced Fluorescence (LIF). It relies on the apparent distortion of background objects viewed through a density-varying medium, caused by light refraction, with digital image correlation or optical flow algorithms used for analysis. Despite advancements, quantifying complete fluid velocity and pressure fields from BOS images remains challenging. Existing algorithms, mostly based on cross-correlation, are optimized for PIV and provide only sparse velocity vectors, and direct pressure estimation requires additional methods. Reconstructing three-dimensional velocity fields from Tomographic BOS (Tomo-BOS) remains an open problem in experimental fluid mechanics.

Researchers from the Division of Applied Mathematics at Brown University, LaVision GmbH (Germany), and LaVision Inc. (Ypsilanti, Michigan, USA) have developed a method employing Physics-Informed Neural Networks (PINNs) to deduce complete 3D velocity and pressure fields from 3D temperature snapshots obtained through Tomo-BOS imaging. PINNs integrate fluid flow physics and visualization data seamlessly, enabling inference with limited experimental data. The method is validated using synthetic data and applied successfully to Tomo-BOS data, accurately inferring velocity and pressure fields over an espresso cup.

The study examines the use of Schlieren features in sequential images and the sensitivity of the PINN's 2-D pressure-field estimates to the assumed physical properties. The researchers conduct a Tomo-BOS PINN experiment with downsampled data to investigate this sensitivity. The training data are sampled at a time interval of 0.1 s, and the relative L2-norm temperature error is computed on unseen data using the trained parameters. The inferred velocity field agrees well with the displacement determined from Schlieren tracking, and the proposed Tomo-BOS PINN method accurately recovers the full temperature and velocity fields.

The PINN algorithm, functioning as a  data assimilation technique, predicts velocity and pressure fields by analyzing visualization data across a spatio-temporal domain. Unlike conventional data assimilation methods, the efficiency of which relies heavily on accurately choosing initial guesses for velocity and pressure conditions, the PINN algorithm doesn’t require such information. In PINN, the trainable variables are the parameters of the neural network, not the conventional control variables. This distinction eliminates the need to specify initial and boundary conditions for velocity or pressure, simplifying the implementation of the algorithm.
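To illustrate this idea, the following sketch shows how a composite loss can couple observed temperature data with physics residuals computed by automatic differentiation. It assumes PyTorch; the network architecture, the simplified 2D governing equations (momentum and diffusion terms are omitted for brevity), and all variable names are illustrative assumptions, not the authors' implementation.

import torch

net = torch.nn.Sequential(                       # maps (x, y, t) -> (u, v, p, T)
    torch.nn.Linear(3, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 4),
)

def pinn_loss(xyt, T_obs):
    """Temperature data loss plus simplified physics residuals (illustrative only)."""
    xyt = xyt.detach().clone().requires_grad_(True)
    u, v, _p, T = net(xyt).unbind(dim=1)         # _p is used by the momentum residuals, omitted here

    def grads(f):
        return torch.autograd.grad(f.sum(), xyt, create_graph=True)[0].unbind(dim=1)

    u_x, _, _ = grads(u)
    _, v_y, _ = grads(v)
    T_x, T_y, T_t = grads(T)

    continuity = u_x + v_y                       # incompressibility residual
    transport = T_t + u * T_x + v * T_y          # temperature advection residual (diffusion omitted)

    data_loss = torch.mean((T - T_obs) ** 2)     # fit the reconstructed temperature snapshots
    physics_loss = torch.mean(continuity ** 2) + torch.mean(transport ** 2)
    return data_loss + physics_loss

In the full method, momentum residuals couple the pressure as well; minimizing such a loss over the network parameters forces the predicted fields to be consistent with both the observed temperature data and the governing equations, which is how velocity and pressure can be inferred without being measured directly.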

The study presents the results of the Tomo-BOS PINN experiment, which uses Schlieren features in sequential images to estimate 2-D pressure fields. The researchers report the residuals of the momentum equations in the x, y, and z directions, with an average residual on the order of 10^-4 m s^-2. Velocity profiles along a horizontal line at various time instances are compared between the Tomo-BOS PINN and planar PIV results. The researchers acknowledge support from the PhILMS grant (grant number DE-SC0019453).

In conclusion, the researchers have developed a machine-learning algorithm based on PINNs for estimating velocity and pressure fields from temperature data in Tomo-BOS experiments. PINNs integrate governing equations and temperature data without requiring CFD solvers, allowing simultaneous inference of velocity and pressure without initial or boundary conditions. The method is evaluated through a 2D buoyancy-driven flow simulation, demonstrating accurate performance with sparse and noisy data. A Tomo-BOS experiment on flow over an espresso cup successfully infers 3D velocity and pressure fields from reconstructed temperature data, showing the versatility of PINNs with either planar or tomographic BOS data. The flexibility of the proposed method suggests its potential for various fluid mechanics problems, marking a promising direction in experimental fluid mechanics.


Meet RAGxplorer: An interactive AI Tool to Support the Building of Retrieval Augmented Generation (RAG) Applications by Visualizing Document Chunks and the Queries in the Embedding Space

Understanding how well advanced language models comprehend and organize information is crucial. A common challenge lies in visualizing the intricate relationships between different parts of a document, especially when working with Retrieval Augmented Generation (RAG) applications. Existing tools rarely provide a clear picture of how chunks of information relate to each other and to specific queries.

Several attempts have been made to address this issue, but they often fail to deliver an intuitive, interactive solution. Existing tools struggle to break documents down into manageable pieces and to visualize their semantic landscape effectively. As a result, users find it hard to assess how well RAG applications genuinely understand the content or to identify biases and gaps in their knowledge.

Meet RAGxplorer: An interactive AI Tool to Support the Building of Retrieval Augmented Generation (RAG) Applications by Visualizing Document Chunks and the Queries in the Embedding Space. RAGxplorer takes a document, breaks it into smaller, overlapping chunks, and converts each into a mathematical representation called an embedding. This unique approach captures the meaning and context of each chunk in a high-dimensional space, laying the foundation for insightful visualizations.

The critical feature of RAGxplorer is its ability to display these embeddings in a 2D or 3D space, creating an interactive map of the document’s semantic landscape. Users can see how different chunks relate to each other and specific queries, represented as dots in the embedding space. This visualization allows for a quick assessment of how well the RAG models understand the document, with closer dots indicating more similar meanings.
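As a rough illustration of this idea (not RAGxplorer's actual API: the chunking parameters, embedding model, file path, and UMAP projection below are all illustrative assumptions), a document can be chunked, embedded, and projected into 2D alongside a query like this:

from sentence_transformers import SentenceTransformer
import umap
import matplotlib.pyplot as plt

def chunk(text, size=200, overlap=50):
    """Split text into overlapping word-based chunks."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]

model = SentenceTransformer("all-MiniLM-L6-v2")   # any embedding model could be used here
document = open("document.txt").read()            # placeholder document
chunks = chunk(document)
query = "What are the key findings?"

embeddings = model.encode(chunks + [query])       # embed the chunks and the query together
coords = umap.UMAP(n_components=2).fit_transform(embeddings)

plt.scatter(coords[:-1, 0], coords[:-1, 1], label="chunks")
plt.scatter(coords[-1, 0], coords[-1, 1], marker="*", s=200, label="query")
plt.legend()
plt.show()

Chunks that land close to the query in this projection are the ones a retriever is most likely to surface, which is exactly the relationship the visualization is meant to expose.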

One notable capability of RAGxplorer is its flexibility in handling various document formats. Users can easily upload PDF documents for analysis and configure the chunk size and overlap, providing adaptability to different types of content. The tool also allows users to build a vector database for efficient retrieval and visualization, enhancing the overall user experience.

Users can experiment with different query expansion techniques and observe how the retrieval of relevant chunks is affected. The tool’s effectiveness is evident in its ability to reveal the semantic relationships within a document, helping users identify biases, gaps in knowledge, and overall model performance.

In conclusion, RAGxplorer is a powerful solution to the challenges of visualizing complex language models like RAG. Its unique approach to chunking, embedding, and visualizing the semantic landscape provides users with a valuable tool for understanding model behavior and improving overall comprehension. As the landscape of language models continues to evolve, tools like RAGxplorer become essential for researchers, developers, and practitioners seeking more profound insights into the workings of these advanced systems.


Deploy a Microsoft Teams gateway for Amazon Q, your business expert

Amazon Q is a new generative AI-powered application that helps users get work done. Amazon Q can become your tailored business expert and let you discover content, brainstorm ideas, or create summaries using your company’s data safely and securely. You can use Amazon Q to have conversations, solve problems, generate content, gain insights, and take action by connecting to your company’s information repositories, code, data, and enterprise systems. For more information, see Introducing Amazon Q, a new generative AI-powered assistant (preview).
In this post, we show you how to bring Amazon Q, your business expert, to users in Microsoft Teams. (If you use Slack, refer to Deploy a Slack gateway for Amazon Q, your business expert.)
You’ll be able to converse with Amazon Q business expert using Teams direct messages (DMs) to ask questions and get answers based on company data, get help creating new content such as email drafts, summarize attached files, and perform tasks.
You can also invite Amazon Q business expert to participate in your Teams channels. In a channel, users can ask Amazon Q business expert questions in a new message, or tag it in an existing thread at any point, to provide additional data points, resolve a debate, or summarize the conversation and capture the next steps.
Solution overview
Amazon Q business expert is amazingly powerful. Check out the following demo—seeing is believing!

In the demo, our Amazon Q business expert application is populated with some Wikipedia pages. You can populate your Amazon Q business expert application with your own company’s documents and knowledge base articles, so it will be able to answer your specific questions!
Everything you need is provided as open source in our GitHub repo.
In this post, we walk you through the process to deploy Amazon Q business expert in your AWS account and add it to Microsoft Teams. When you’re done, you’ll wonder how you ever managed without it!
The following are some of the things it can do:

Respond to messages – In DMs, it responds to all messages. In channels, it responds only to @mentions and responds in a conversation thread.
Render answers containing markdown – This includes headings, lists, bold, italics, tables, and more.
Track sentiment – It provides thumbs up and thumbs down buttons to track user sentiment.
Provide source attribution – It provides references and hyperlinks to sources used by Amazon Q business expert.
Understand conversation context – It tracks the conversation and responds based on the context.
Stay aware of multiple users – When it’s tagged in a thread, it knows who said what, and when, so it can contribute in context and accurately summarize the thread when asked.
Process attached files – It can process up to five attached files for document question answering, summaries, and more.
Start new conversations – You can reset and start new conversations in DM chats by using /new_conversation.

In the following sections, we show how to deploy the project to your own AWS account and Teams account, and start experimenting!
Prerequisites
You need to have an AWS account and an AWS Identity and Access Management (IAM) role and user with permissions to create and manage the necessary resources and components for this application. If you don’t have an AWS account, see How do I create and activate a new Amazon Web Services account?
You also need to have an existing, working Amazon Q business expert application. If you haven’t set one up yet, see Creating an Amazon Q application.
Lastly, you need a Microsoft account and a Microsoft Teams subscription to create and publish the app using the steps outlined in this post. If you don’t have these, see if your company can create sandboxes for you to experiment, or create a new account and trial subscription as needed to complete the steps.
Deploy the solution resources
We’ve provided pre-built AWS CloudFormation templates that deploy everything you need in your AWS account.
If you’re a developer and you want to build, deploy, or publish the solution from code, refer to the Developer README.
Complete the following steps to launch the CloudFormation stack:

Log in to the AWS Management Console.
Choose one of the following Launch Stack buttons for your desired AWS Region to open the AWS CloudFormation console and create a new stack.

Launch Stack buttons are provided for the following Regions:

N. Virginia (us-east-1)
Oregon (us-west-2)

For Stack name, enter a name for your app (for example, AMAZON-Q-TEAMS-GATEWAY).
For AmazonQAppId, enter your existing Amazon Q business expert application ID (for example, 80xxxxx9-7xx3-4xx0-bxx4-5baxxxxx2af5). You can copy it from the Amazon Q business expert console.
For AmazonQRegion, choose the Region where you created your Amazon Q business expert application (us-east-1 or us-west-2).
For AmazonQUserId, enter an Amazon Q business expert user ID email address (leave blank to use a Teams user email as the user ID).
For ContextDaysToLive, enter the length of time to keep conversation metadata cached in Amazon DynamoDB (you can leave this as the default).

When your CloudFormation stack status is CREATE_COMPLETE, choose the Outputs tab, and keep it open—you’ll need it in later steps.
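If you prefer to script the deployment, the same stack can also be created with the AWS CLI. The command below is a sketch: the template URL is a placeholder for wherever you host the template from the GitHub repo, and the parameter values must be replaced with your own.

aws cloudformation create-stack \
  --region us-east-1 \
  --stack-name AMAZON-Q-TEAMS-GATEWAY \
  --template-url https://<your-template-bucket>.s3.amazonaws.com/amazon-q-teams-gateway.yaml \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameters \
      ParameterKey=AmazonQAppId,ParameterValue=<your-amazon-q-app-id> \
      ParameterKey=AmazonQRegion,ParameterValue=us-east-1 \
      ParameterKey=AmazonQUserId,ParameterValue=<amazon-q-user-email> \
      ParameterKey=ContextDaysToLive,ParameterValue=<days>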
Register a new app in the Microsoft Azure portal
Complete the following steps to register a new app in the Microsoft Azure portal:

Go to the Azure Portal and log in with your Microsoft account.
Choose New registration.

For Name, provide the name for your app. You can keep things simple by using the stack name you used for the CloudFormation stack.
For Who can use this application or access this API?, choose Accounts in this organizational directory only (AWS only – Single tenant).
Choose Register.
Note down the Application (client) ID value and the Directory (tenant) ID from the Overview page. You’ll need them later when asked for MicrosoftAppId and MicrosoftAppTenantId.

Choose Select API permissions in the navigation pane.

Choose Add a permission.
Choose Microsoft Graph.
Choose Application permissions.
Select User.Read.All.
Select ChannelMessage.Read.All.
Select Team.ReadBasic.All.
Select Files.Read.All.
Choose Add permissions. These permissions allow the app to read user, team, channel message, and file data in your organization’s directory.
Use the options menu (three dots) on the right to choose Remove permission.
Remove the original User.Read – Delegated permission.
Choose Grant admin consent for Default Directory.

Choose Certificates & secrets in the navigation pane.

Choose New client secret.
For Description, provide a value, such as description of my client secret.
Choose a value for Expires. Note that in production, you’ll need to manually rotate your secret before it expires.
Choose Add.
Note down the value for your new secret. You’ll need it later when asked for MicrosoftAppPassword.

Optionally, choose Owners to add any additional owners for the application.

Register your new app in the Microsoft Bot Framework
Complete the following steps to register your app in the Microsoft Bot Framework:

Go to the Microsoft Bot Framework and log in with your Microsoft account.
Optionally, you can create and upload a custom icon for your new Amazon Q business expert bot. For example, we created the following using Amazon Bedrock image playground.

Enter your preferred display name, bot handle, and description.
For Messaging endpoint, copy and paste the value of TeamsEventHandlerApiEndpoint from your stack Outputs tab.
Do not select Enable Streaming Endpoint.
For App type, choose Single Tenant.
For Paste your app ID below to continue, enter the MicrosoftAppId value you noted earlier.
For App Tenant ID, enter the MicrosoftAppTenantId value you noted earlier.
Leave the other values as they are, agree to the terms, and choose Register.
On the Channels page, under Add a featured channel, choose Microsoft Teams.
Choose Microsoft Teams Commercial (most common), then choose Save.
Agree to the Terms of Service and choose Agree.

Configure your secrets in AWS
Let’s configure your Teams secrets in order to verify the signature of each request and post on behalf of your Amazon Q business expert bot.
In this example, we are not enabling Teams token rotation. You can enable it for a production app by implementing rotation via AWS Secrets Manager. Create an issue (or, better yet, a pull request) in the GitHub repo if you want this feature added to a future version.
Complete the following steps to configure a secret in Secrets Manager:

On the AWS CloudFormation console, navigate to your stack Outputs tab and choose the link for TeamsSecretConsoleUrl to be redirected to the Secrets Manager console.
Choose Retrieve secret value.
Choose Edit.
Replace the values of MicrosoftAppId, MicrosoftAppPassword, and MicrosoftAppTenantId with the values you noted in the previous steps.
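If you prefer to script this step, the same values can be written with the AWS CLI. This is a sketch: the secret ID is a placeholder for the TeamsSecret name or ARN from your stack outputs, and it assumes the secret stores the three values under the key names shown.

aws secretsmanager put-secret-value \
  --secret-id <your-teams-gateway-secret-arn> \
  --secret-string '{"MicrosoftAppId":"<app-id>","MicrosoftAppPassword":"<client-secret>","MicrosoftAppTenantId":"<tenant-id>"}'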

Deploy your app into Microsoft Teams
Complete the following steps to deploy the app to Teams:

Go to the Developer Portal for Teams and log in with your Microsoft Teams user account.
Choose Apps in the navigation pane, then choose New app.

For Name, enter your bot name.
Enter a name for Full name, and provide both short and full descriptions (you can use the bot name for all of them if you want; just don’t leave them empty).
Enter values for Developer information and App URLs. For testing, you can use placeholder values and URLs such as https://www.anycompany.com/; use real ones for production.
For Application (client) ID*, enter the value of MicrosoftAppId from earlier.
Choose Save.

Under Branding, you can upload AI-generated icons, different icons, or none at all; it’s up to you. The following are some examples:

Color icon 192×192
Outline icon 32×32

Under App features, choose Bot.

Select Enter a bot ID, and enter the MicrosoftAppId value from the earlier steps.
Under What can your bot do?, select Upload and download files.
Under Select the scopes in which people can use this command, select Personal, Team, and Group chat.
Choose Save.

Choose Publish.
Choose Download the app package to download a .zip file to your computer.
Choose Preview in Teams to launch Microsoft Teams (work or school) app.

In the navigation pane, choose Apps, then Manage your apps, then Upload an app.
Choose Upload an app to your org’s app catalog, and select the .zip file you downloaded. This adds the app to Teams.
Select the card for your new app, choose Add, and wait for it to complete (10–20 seconds).

Add your bot to one or more teams
Complete the following steps to add your bot to a team:

In the Teams app, select your team and choose Manage team.
On the Apps tab, choose the new Amazon Q business expert app, and choose Add.

Now you can test your bot in Microsoft Teams!
Start using Amazon Q business expert
Complete the following steps to start using Amazon Q business expert in Teams:

Open your Teams client.
Under Apps, add your new Amazon Q business expert app to a chat.
Optionally, add your Amazon Q business expert app to one or more Teams channels.
In the app DM chat, enter Hello.

You have now deployed a powerful new AI assistant into your sandbox Teams environment.
Play with it, try all the features discussed in this post, and copy the things you saw in the demo video. Most importantly, you can ask about topics related to the documents that you have ingested into your own Amazon Q business expert application. But don’t stop there. You can find additional ways to make it useful, and when you do, let us know by posting a comment.
Once you are convinced how useful it is, talk to your Teams admins (show them this post) and work with them to deploy it in your company’s Teams organizations. Your fellow employees will thank you!
Clean up
When you’re finished experimenting with this solution, delete your app in Microsoft Teams, Bot Framework, and Azure portal. Then clean up your AWS resources by opening the AWS CloudFormation console and deleting the AMAZON-Q-TEAMS-GATEWAY stack that you deployed. This deletes the resources that you created by deploying the solution.
Conclusions
The sample Amazon Q business expert Teams application discussed in this post is provided as open source—you can use it as a starting point for your own solution, and help us make it better by contributing back fixes and features via GitHub pull requests. Explore the code, choose Watch in the GitHub repo to be notified of new releases, and check back for the latest updates. We’d also love to hear your suggestions for improvements and features.
For more information on Amazon Q business expert, refer to the Amazon Q (For Business Use) Developer Guide.

About the Authors
Gary Benattar is a Senior Software Development Manager in AWS HR. Gary started at Amazon in 2012 as an intern, focusing on building scalable, real-time outlier detection systems. He worked in Seattle and Luxembourg and is now based in Tel Aviv, Israel, where he dedicates his time to building software to revolutionize the future of Human Resources. He co-founded a startup, Zengo, with a focus on making digital wallets secure through multi-party computation. He received his MSc in Software Engineering from Sorbonne University in Paris.

Bob Strahan is a Principal Solutions Architect in the AWS Language AI Services team.

This AI Paper from the University of Washington, CMU, and Allen Institute for AI Unveils FAVA: The Next Leap in Detecting and Editing Hallucinations in Language Models

Large Language Models (LLMs), among the most remarkable recent developments in Artificial Intelligence (AI), have gained massive popularity. With their human-like ability to answer questions, complete code, and summarize long passages of text, these models have shown the potential of Natural Language Processing (NLP) and Natural Language Generation (NLG).

Despite these impressive capabilities, producing content that is both factually correct and fluent remains a challenge. LLMs can produce extremely realistic and cohesive text, but they also tend to generate factually false information, i.e., hallucinations. These hallucinations hamper the practical use of these models in real-world applications.

Previous studies on hallucinations in natural language generation have mostly concentrated on settings where a reference text is available, examining how closely the generated text adheres to that reference. Separately, concerns have been raised about hallucinations that arise when a model draws on its general world knowledge rather than on a particular source text.

To overcome this, a team of researchers has recently released a study on a unique task: automatic fine-grained hallucination detection. The team has proposed a comprehensive taxonomy consisting of six hierarchically defined forms of hallucinations. Automated systems for modifying or detecting hallucinations have been developed. 

Current systems frequently focus on particular domains or error types, oversimplifying factual errors into binary categories such as factual or not factual. This oversimplification fails to capture the variety of hallucination types, such as entity-level contradictions and the invention of entities that do not exist. To overcome these drawbacks, the team suggests a more detailed approach to hallucination identification, introducing a new task, benchmark, and model.

The objectives are precise detection of hallucination sequences, differentiation of mistake types, and recommendations for possible improvements. The team has focused on hallucinations in information-seeking contexts when grounding in world knowledge is vital. They have also provided a unique taxonomy that divides factual errors into six kinds.

The team has presented a new benchmark that incorporates human judgments on outputs from two language models (LMs), ChatGPT and Llama2-Chat 70B, across multiple domains to support the evaluation of fine-grained hallucination identification. The benchmark study found that a considerable share of ChatGPT’s and Llama2-Chat’s outputs, 60% and 75%, respectively, contain hallucinations.

The benchmark indicated an average of 1.9 and 3.4 hallucinations per response for ChatGPT and Llama2-Chat, respectively. A large proportion of these hallucinations fall into categories that have not been properly examined before: errors other than entity-level faults, such as fabricated concepts or unverifiable statements, were present in more than 60% of LM-generated hallucinations.

The team has also trained FAVA, a retrieval-augmented LM, as a potential solution. The training procedure involved carefully designed synthetic data generation for identifying and correcting fine-grained hallucinations. Both automated and human evaluations on the benchmark show that FAVA outperforms ChatGPT at fine-grained hallucination identification. FAVA’s proposed edits simultaneously detect hallucinations and improve the factuality of LM-generated text, yielding 5–10% FActScore improvements.
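To make the detect-and-edit idea concrete, here is a minimal sketch of how a retrieval-augmented detector could be invoked. The retriever and editor interfaces, the prompt wording, and the span-tagging format are purely illustrative assumptions and do not reflect FAVA's actual interface or training.

def detect_and_edit(passage, retriever, editor_lm, top_k=3):
    """Illustrative retrieval-augmented detection/editing loop (not FAVA's real API)."""
    evidence = retriever.search(passage, top_k=top_k)      # fetch supporting documents
    prompt = (
        "Evidence:\n" + "\n".join(evidence) +
        "\n\nMark unsupported spans in the passage with <error type=...> tags "
        "and suggest a corrected version.\n\nPassage:\n" + passage
    )
    return editor_lm.generate(prompt)                      # tagged spans plus suggested edits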

In conclusion, this study has proposed a unique task of automatic fine-grained hallucination identification in order to address the common problem of hallucinations in text generated by Language Models. The paper’s thorough taxonomy and benchmark have provided insight into the degree of hallucinations in popular LMs. Promising results have been shown in detecting and correcting fine-grained hallucinations using FAVA, the proposed retrieval-augmented LM, highlighting the necessity for further developments in this area.


This AI Paper from UCLA Revolutionizes Uncertainty Quantification in Deep Neural Networks Using Cycle Consistency

With its rapid growth, deep learning is now used in many fields, including data mining and natural language processing, and it is widely applied to inverse imaging problems such as image denoising and super-resolution imaging. Image denoising techniques are used to generate high-quality images from raw data. However, deep neural networks can be inaccurate and produce unreliable outcomes.

To address this challenge, researchers have studied the problem thoroughly and found that incorporating uncertainty quantification (UQ) into deep learning models allows them to gauge their confidence in their predictions, enabling the detection of unusual situations such as anomalous data and malicious attacks. However, many deep learning models lack robust UQ capabilities for detecting data distribution shifts at test time.

Consequently, researchers at the University of California, Los Angeles, have proposed a new UQ technique based on cycle consistency that improves the reliability of deep neural networks in inverse imaging problems. The method quantitatively estimates the uncertainty of neural network outputs and automatically detects unknown input data corruption and distribution shifts. It works by executing forward–backward cycles between a physical forward model and a trained neural network applied iteratively, and it estimates uncertainty by accumulating the discrepancies that build up as the data cycle between the input and output spaces.
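The following is a minimal sketch of the forward-backward cycle idea, assuming NumPy, a trained reconstruction network net, and a known forward operator forward_model; the accumulated-discrepancy uncertainty proxy shown here is illustrative and not the authors' exact estimator.

import numpy as np

def cycle_consistency_uncertainty(y, net, forward_model, n_cycles=3):
    """Accumulate forward-backward cycle discrepancies as an uncertainty proxy.

    y                : the measured input (e.g., a noisy or low-resolution image)
    net(y)           : neural network that reconstructs an estimate x from y
    forward_model(x) : physical model mapping a reconstruction back to a measurement
    """
    y_current = y
    discrepancies = []
    for _ in range(n_cycles):
        x_hat = net(y_current)                 # backward step: reconstruct the image
        y_hat = forward_model(x_hat)           # forward step: re-simulate the measurement
        discrepancies.append(np.mean((y_hat - y_current) ** 2))
        y_current = y_hat                      # feed the cycle output back as the next input
    return float(np.sum(discrepancies))        # larger values suggest a less trustworthy output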

The researchers derive upper and lower bounds for cycle consistency that clarify its link to the output uncertainty of a given neural network. These bounds are obtained from expressions for converging and diverging cycle outputs, and they allow uncertainty to be estimated even when the ground truth is unavailable. The researchers also developed a machine learning model that categorizes images by the type of disturbance they contain, using features from the forward–backward cycles, and they emphasize that the cycle-consistency metrics improved the precision of the final classification.

To tackle the problem of identifying out-of-distribution (OOD) images in image super-resolution, they gathered three categories of low-resolution images: anime, microscopy, and human faces. They trained a separate super-resolution neural network for each image category and then evaluated each network across all three. A machine learning algorithm applied to the forward–backward cycle results was used to detect data distribution mismatches. The model correctly flagged microscopy and facial images as OOD when those inputs were fed to the anime-image super-resolution network, and the other two networks showed similar results. Overall accuracy in identifying OOD images was higher than that of other approaches.

In conclusion, this cycle-consistency-based UQ method developed by researchers at the University of California, Los Angeles, can increase the dependability of neural networks in inverse imaging. The method can also be applied in other fields where uncertainty estimates are necessary. It represents a significant step toward addressing the challenge of uncertainty in neural network predictions and paves the way for more reliable deployment of deep learning models in real-world applications.


Researchers from ByteDance and Sun Yat-Sen University Introduce DiffusionGPT: LLM-Driven Text-to-Image Generation System

In image generation, diffusion models have advanced significantly, and top-tier models are now widely available on open-source platforms. Despite these strides, challenges in text-to-image systems persist, particularly in handling diverse input prompts and in being confined to the output of a single model. Efforts to unify these systems commonly address two distinct facets: parsing the varied prompts at the input stage and activating expert models to generate the output.

Recent years have seen the rise of diffusion models such as DALL-E 2 and Imagen, which have transformed image editing and stylization, but their closed-source nature impedes widespread adoption. Stable Diffusion (SD), an open-source text-to-image model, and its latest iteration, SDXL, have gained popularity. Remaining challenges include model limitations and prompt constraints, which approaches such as SD1.5+LoRA, prompt engineering, and fixed templates only partially address, so optimal performance has yet to be achieved. The lack of a comprehensive solution prompts the question: can a unified framework be devised to lift prompt constraints and activate domain-expert models?

ByteDance and Sun Yat-Sen University researchers have proposed DiffusionGPT, which employs a Large Language Model (LLM) to create an all-encompassing generation system. Using a Tree-of-Thought (ToT) structure, it integrates various generative models based on prior knowledge and human feedback. The LLM parses the prompt and guides the ToT search to select the most suitable model for generating the desired output. Advantage Databases enrich the ToT with valuable human feedback, aligning the model selection process with human preferences and providing a comprehensive, user-informed solution.

The system (DiffusionGPT) follows a four-step workflow: Prompt Parse, Tree-of-Thought of Models Build and Search, Model Selection with Human Feedback, and Execution of Generation. The Prompt Parse stage extracts salient information from diverse prompts, while the Tree-of-Thought of Models constructs a hierarchical model tree for efficient searching. Model Selection leverages human feedback through Advantage Databases, ensuring alignment with user preferences. The chosen generative model then carries out the Execution of Generation, with a Prompt Extension Agent enhancing prompt quality for improved outputs.
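A high-level sketch of this workflow might look like the following; every function and object name here is a hypothetical stand-in used only to illustrate the four stages, not DiffusionGPT's actual code.

def diffusion_gpt(prompt, llm, model_tree, advantage_db, extend_prompt, n_candidates=3):
    """Illustrative four-stage pipeline (hypothetical interfaces, not the released system)."""
    # 1. Prompt Parse: the LLM extracts the salient intent from the raw prompt
    parsed = llm.parse(prompt)

    # 2. Tree-of-Thought Build and Search: walk the model tree to shortlist candidate models
    candidates = model_tree.search(parsed, top_k=n_candidates)

    # 3. Model Selection with Human Feedback: rerank candidates using the Advantage Databases
    best_model = max(candidates, key=lambda m: advantage_db.score(m, parsed))

    # 4. Execution of Generation: extend the prompt, then run the chosen expert model
    return best_model.generate(extend_prompt(prompt))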

Researchers employed ChatGPT as the LLM controller in the experimental setup, integrating it into the LangChain framework for precise guidance. DiffusionGPT showcased superior performance compared to baseline models such as SD1.5 and SD XL across various prompt types. Notably, DiffusionGPT addressed semantic limitations and enhanced image aesthetics, outperforming SD1.5 in both image-reward and aesthetic scores by 0.35% and 0.44%, respectively.

To conclude, DiffusionGPT, proposed by researchers from ByteDance and Sun Yat-Sen University, introduces a comprehensive framework that seamlessly integrates high-quality generative models and effectively handles a variety of prompts. Using LLMs and a ToT structure, DiffusionGPT adeptly interprets input prompts and selects the most suitable model. This adaptable, training-free solution shows exceptional performance across diverse prompts and domains. It also incorporates human feedback through Advantage Databases, offering an efficient, easily integrable plug-and-play solution conducive to community development in the field.


Build enterprise-ready generative AI solutions with Cohere foundation …

Generative AI solutions have the potential to transform businesses by boosting productivity and improving customer experiences, and using large language models (LLMs) with these solutions has become increasingly popular. Building proofs of concept is relatively straightforward because cutting-edge foundation models are available from specialized providers through a simple API call. Therefore, organizations of various sizes and across different industries have begun to reimagine their products and processes using generative AI.
Despite their wealth of general knowledge, state-of-the-art LLMs only have access to the information they were trained on. This can lead to factual inaccuracies (hallucinations) when the LLM is prompted to generate text based on information it didn’t see during training. Therefore, it’s crucial to bridge the gap between the LLM’s general knowledge and your proprietary data to help the model generate more accurate and contextual responses while reducing the risk of hallucinations. The traditional method of fine-tuning, although effective, can be compute-intensive, expensive, and requires technical expertise. Another option to consider is called Retrieval Augmented Generation (RAG), which provides LLMs with additional information from an external knowledge source that can be updated easily.
Additionally, enterprises must ensure data security when handling proprietary and sensitive data, such as personal data or intellectual property. This is particularly important for organizations operating in heavily regulated industries, such as financial services and healthcare and life sciences. Therefore, it’s important to understand and control the flow of your data through the generative AI application: Where is the model located? Where is the data processed? Who has access to the data? Will the data be used to train models, eventually risking the leak of sensitive data to public LLMs?
This post discusses how enterprises can build accurate, transparent, and secure generative AI applications while keeping full control over proprietary data. The proposed solution is a RAG pipeline using an AI-native technology stack, whose components are designed from the ground up with AI at their core, rather than having AI capabilities added as an afterthought. We demonstrate how to build an end-to-end RAG application using Cohere’s language models through Amazon Bedrock and a Weaviate vector database on AWS Marketplace. The accompanying source code is available in the related GitHub repository hosted by Weaviate. Although AWS will not be responsible for maintaining or updating the code in the partner’s repository, we encourage customers to connect with Weaviate directly regarding any desired updates.
Solution overview
The following high-level architecture diagram illustrates the proposed RAG pipeline with an AI-native technology stack for building accurate, transparent, and secure generative AI solutions.

Figure 1: RAG workflow using Cohere’s language models through Amazon Bedrock and a Weaviate vector database on AWS Marketplace

As a preparation step for the RAG workflow, a vector database, which serves as the external knowledge source, is ingested with the additional context from the proprietary data. The actual RAG workflow follows the four steps illustrated in the diagram:

The user enters their query.
The user query is used to retrieve relevant additional context from the vector database. This is done by generating the vector embeddings of the user query with an embedding model to perform a vector search to retrieve the most relevant context from the database.
The retrieved context and the user query are used to augment a prompt template. The retrieval-augmented prompt helps the LLM generate a more relevant and accurate completion, minimizing hallucinations.
The user receives a more accurate response based on their query.

The AI-native technology stack illustrated in the architecture diagram has two key components: Cohere language models and a Weaviate vector database.
Cohere language models in Amazon Bedrock
The Cohere Platform brings language models with state-of-the-art performance to enterprises and developers through a simple API call. There are two key types of language processing capabilities that the Cohere Platform provides—generative and embedding—and each is served by a different type of model:

Text generation with Command – Developers can access endpoints that power generative AI capabilities, enabling applications such as conversational, question answering, copywriting, summarization, information extraction, and more.
Text representation with Embed – Developers can access endpoints that capture the semantic meaning of text, enabling applications such as vector search engines, text classification and clustering, and more. Cohere Embed comes in two forms, an English language model and a multilingual model, both of which are now available on Amazon Bedrock.

The Cohere Platform empowers enterprises to customize their generative AI solution privately and securely through the Amazon Bedrock deployment. Amazon Bedrock is a fully managed cloud service that enables development teams to build and scale generative AI applications quickly while helping keep your data and applications secure and private. Your data is not used for service improvements, is never shared with third-party model providers, and remains in the Region where the API call is processed. The data is always encrypted in transit and at rest, and you can encrypt the data using your own keys. Amazon Bedrock supports security requirements, including U.S. Health Insurance Portability and Accountability Act (HIPAA) eligibility and General Data Protection Regulation (GDPR) compliance. Additionally, you can securely integrate and easily deploy your generative AI applications using the AWS tools you are already familiar with.
Weaviate vector database on AWS Marketplace
Weaviate is an AI-native vector database that makes it straightforward for development teams to build secure and transparent generative AI applications. Weaviate is used to store and search both vector data and source objects, which simplifies development by eliminating the need to host and integrate separate databases. Weaviate delivers subsecond semantic search performance and can scale to handle billions of vectors and millions of tenants. With a uniquely extensible architecture, Weaviate integrates natively with Cohere foundation models deployed in Amazon Bedrock to facilitate the convenient vectorization of data and use its generative capabilities from within the database.
The Weaviate AI-native vector database gives customers the flexibility to deploy it as a bring-your-own-cloud (BYOC) solution or as a managed service. This showcase uses the Weaviate Kubernetes Cluster on AWS Marketplace, part of Weaviate’s BYOC offering, which allows container-based scalable deployment inside your AWS tenant and VPC with just a few clicks using an AWS CloudFormation template. This approach ensures that your vector database is deployed in your specific Region close to the foundation models and proprietary data to minimize latency, support data locality, and protect sensitive data while addressing potential regulatory requirements, such as GDPR.
Use case overview
In the following sections, we demonstrate how to build a RAG solution using the AI-native technology stack with Cohere, AWS, and Weaviate, as illustrated in the solution overview.
The example use case generates targeted advertisements for vacation stay listings based on a target audience. The goal is to use the user query for the target audience (for example, “family with small children”) to retrieve the most relevant vacation stay listing (for example, a listing with playgrounds close by) and then to generate an advertisement for the retrieved listing tailored to the target audience.

Figure 2: First few rows of vacation stay listings available from Inside Airbnb.

The dataset is available from Inside Airbnb and is licensed under a Creative Commons Attribution 4.0 International License. You can find the accompanying code in the GitHub repository.
Prerequisites
To follow along and use any AWS services in the following tutorial, make sure you have an AWS account.
Enable components of the AI-native technology stack
First, you need to enable the relevant components discussed in the solution overview in your AWS account. Complete the following steps:

On the Amazon Bedrock console, choose Model access in the navigation pane.
Choose Manage model access on the top right.
Select the foundation models of your choice and request access.

Figure 3: Manage model access in Amazon Bedrock console.

Next, you set up a Weaviate cluster.

Subscribe to the Weaviate Kubernetes Cluster on AWS Marketplace.
Launch the software using a CloudFormation template according to your preferred Availability Zone.

The CloudFormation template is pre-populated with default values.

For Stack name, enter a stack name.
For helmauthenticationtype, it is recommended to enable authentication by setting helmauthenticationtype to apikey and defining a helmauthenticationapikey.
For helmauthenticationapikey, enter your Weaviate API key.
For helmchartversion, enter your version number. It must be at least v.16.8.0. Refer to the GitHub repo for the latest version.
For helmenabledmodules, make sure text2vec-aws and generative-aws are present in the list of enabled modules within Weaviate.

Figure 4: CloudFormation template.

This template takes about 30 minutes to complete.
Connect to Weaviate
Complete the following steps to connect to Weaviate:

In the Amazon SageMaker console, choose Notebook, then Notebook instances in the navigation pane.
Create a new notebook instance.
Install the Weaviate client package with the required dependencies:

$ pip install weaviate-client

Connect to your Weaviate instance with the following code:

import weaviate

client = weaviate.Client(
    url="http://<YOUR-WEAVIATE-URL>",
    auth_client_secret=weaviate.AuthApiKey(api_key="YOUR-WEAVIATE-API-KEY"),
    additional_headers={
        "X-AWS-Access-Key": "<YOUR-AWS-ACCESS-KEY>",
        "X-AWS-Secret-Key": "<YOUR-AWS-SECRET-ACCESS-KEY>"
    }
)

Provide the following information:

Weaviate URL – Access Weaviate via the load balancer URL. In the Amazon Elastic Compute Cloud (Amazon EC2) console, choose Load balancers in the navigation pane and find the load balancer. Look for the DNS name column and add http:// in front of it.
Weaviate API key – This is the key you set earlier in the CloudFormation template (helmauthenticationapikey).
AWS access key and secret access key – You can retrieve the access key and secret access key for your user in the AWS Identity and Access Management (IAM) console.

Figure 5: AWS Identity and Access Management (IAM) console to retrieve AWS access key and secret access key.

Configure the Amazon Bedrock module to enable Cohere models
Next, you define a data collection (class) called Listings to store the listings’ data objects, which is analogous to creating a table in a relational database. In this step, you configure the relevant modules to enable the usage of Cohere language models hosted on Amazon Bedrock natively from within the Weaviate vector database. The vectorizer (“text2vec-aws“) and generative module (“generative-aws“) are specified in the data collection definition. Both of these modules take three parameters:

“service” – Use “bedrock” for Amazon Bedrock (alternatively, use “sagemaker” for Amazon SageMaker JumpStart)
“Region” – Enter the Region where your model is deployed
“model” – Provide the foundation model’s name

See the following code:

collection_definition = {
    "class": "Listings",
    "moduleConfig": {
        "text2vec-aws": {
            "service": "bedrock",
            "region": "us-east-1",
            "model": "cohere.embed-english-v3",
        },
        "generative-aws": {
            "service": "bedrock",
            "region": "us-east-1",
            "model": "cohere.command-text-v14"
        }
    },
    "vectorizer": "text2vec-aws"
}

Ingest data into the Weaviate vector database
In this step, you define the structure of the data collection by configuring its properties. Aside from the property’s name and data type, you can also configure if only the data object will be stored or if it will be stored together with its vector embeddings. In this example, host_name and property_type are not vectorized:

collection_definition["properties"] = [
    { "name": "host_name", "dataType": ["text"],
      "moduleConfig": {"text2vec-aws": {"skip": True}}
    },
    { "name": "property_type", "dataType": ["text"],
      "moduleConfig": {"text2vec-aws": {"skip": True}}
    },
    { "name": "description", "dataType": ["text"] },
    { "name": "neighborhood_overview", "dataType": ["text"] },
]

Run the following code to create the collection in your Weaviate instance:

client.schema.create_class(collection_definition)

You can now add objects to Weaviate. You use a batch import process for maximum efficiency. Run the following code to import data. During the import, Weaviate will use the defined vectorizer to create a vector embedding for each object. The following code loads objects, initializes a batch process, and adds objects to the target collection one by one:

from weaviate.util import generate_uuid5
import pandas as pd

# Read CSV file
csv_file = './data/listings.csv'
df = pd.read_csv(csv_file, usecols=['host_name',
                                    'property_type',
                                    'description',
                                    'neighborhood_overview',
                                    ])

df.fillna('', inplace=True)

# Configure batch
client.batch.configure(batch_size=100)

# Initialize batch process
with client.batch as batch:
    for _, row in df.iterrows():
        listing_object = {
            "host_name": row["host_name"],
            "property_type": row["property_type"],
            "description": row["description"],
            "neighborhood_overview": row["neighborhood_overview"],
        }
        batch.add_data_object(
            class_name="Listings",
            data_object=listing_object,
            uuid=generate_uuid5(listing_object)
        )

Retrieval Augmented Generation
You can build a RAG pipeline by implementing a generative search query on your Weaviate instance. For this, you first define a prompt template in the form of an f-string that can take in the user query ({target_audience}) directly and the additional context ({{host_name}}, {{property_type}}, {{description}}, and {{neighborhood_overview}}) from the vector database at runtime:

prompt_template = f"""You are a copywriter.
Write a short advertisement for the following vacation stay.
Host: {{host_name}}
Property type: {{property_type}}
Description: {{description}}
Neighborhood: {{neighborhood_overview}}
Target audience: {target_audience}
"""

Next, you run a generative search query. This prompts the defined generative model with a prompt that comprises the user query as well as the retrieved data. The following query retrieves one listing object (.with_limit(1)) from the Listings collection that is most similar to the user query (.with_near_text({"concepts": target_audience})). Then the user query (target_audience) and the retrieved listing's properties (["description", "neighborhood_overview", "host_name", "property_type"]) are fed into the prompt template. See the following code:

result = (
    client.query
    .get("Listings", ["description", "neighborhood_overview", "host_name", "property_type"])
    .with_near_text({"concepts": target_audience})
    .with_limit(1)
    .with_generate(single_prompt=prompt_template)
    .do()
)

In the following example, you can see that the preceding piece of code for target_audience = “Family with small children” retrieves a listing from the host Marre. The prompt template is augmented with Marre’s listing details and the target audience:

“You are a copywriter.
Write a short advertisement for the following vacation stay.
Host: Marre
Property type: Entire townhouse
Description: Welcome to our lovely home! You’ve come to the right place …
Neighborhood: THE NEIGHBORHOOD:<br /><br />We are in the city centre …
Target audience: Family with small children”

Based on the retrieval-augmented prompt, Cohere’s Command model generates the following targeted advertisement:

“Looking for a kid-friendly home away from home in one of the trendiest areas of
Amsterdam? Look no further than this stylish townhouse in the heart of the city!
Our 120ft² space is perfect for a family of four or a group of adults, with two
bedrooms featuring elevated beds suitable for kids and one bedroom with a single
bed. The ground floor features a spacious living room, a kitchen with a large
dining table, and a half bath while heading upstairs leads you to a master bedroom
and a full bathroom. Our central location means you’re just steps away from the
best cafes, restaurants, and bars that the city has to offer, and the Vondelpark
and other attractions are only a short walk away! Supermarkets and paid parking
are also conveniently located nearby. Experience the best of Amsterdam in a
laid-back,local way and create unforgettable memories with your family at our
cozy townhouse.”

Alternative customizations
You can make alternative customizations to different components in the proposed solution, such as the following:

Cohere’s language models are also available through Amazon SageMaker JumpStart, which provides access to cutting-edge foundation models and enables developers to deploy LLMs to Amazon SageMaker, a fully managed service that brings together a broad set of tools to enable high-performance, low-cost machine learning for any use case. Weaviate is integrated with SageMaker as well.
A powerful addition to this solution is the Cohere Rerank endpoint, available through SageMaker JumpStart. Rerank can improve the relevance of search results from lexical or semantic search. Rerank works by computing semantic relevance scores for documents that are retrieved by a search system and ranking the documents based on these scores. Adding Rerank to an application requires only a single line of code change.
To cater to the deployment requirements of different production environments, Weaviate can be deployed in various additional ways. For example, it is available as a direct download from the Weaviate website and can run on Amazon Elastic Kubernetes Service (Amazon EKS) or locally via Docker or Kubernetes. It’s also available as a managed service that can run securely within a VPC or as a public cloud service hosted on AWS with a 14-day free trial.
You can serve your solution in a VPC using Amazon Virtual Private Cloud (Amazon VPC), which enables organizations to launch AWS services in a logically isolated virtual network, resembling a traditional network but with the benefits of AWS’s scalable infrastructure. Depending on the classified level of sensitivity of the data, organizations can also disable internet access in these VPCs.
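As a rough illustration of where Rerank fits, the following sketch re-scores a few retrieved listing descriptions against the target audience using Cohere's Python SDK. This is a hedged example rather than part of the deployed solution above: it calls Cohere's public API directly instead of a SageMaker JumpStart endpoint, and the candidate listings are made-up placeholders standing in for the results of a Weaviate near_text query.

import cohere

co = cohere.Client("<YOUR_COHERE_API_KEY>")  # placeholder key; assumes direct Cohere API access

# Hypothetical candidates, e.g., the top results of a Weaviate near_text query.
candidate_listings = [
    {"description": "Cozy townhouse near Vondelpark with a crib and high chair available."},
    {"description": "Industrial loft with rooftop bar access, adults only."},
    {"description": "Spacious canal-side apartment with a fenced garden and kids' toys."},
]
target_audience = "Family with small children"

reranked = co.rerank(
    model="rerank-english-v2.0",
    query=target_audience,
    documents=[listing["description"] for listing in candidate_listings],
    top_n=3,
)

# Print each candidate's original index and its semantic relevance score.
for hit in reranked.results:
    print(hit.index, round(hit.relevance_score, 3))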

Clean up
To prevent unexpected charges, delete all the resources you deployed as part of this post. If you launched the CloudFormation stack, you can delete it via the AWS CloudFormation console. Note that some AWS resources, such as Amazon Elastic Block Store (Amazon EBS) volumes and AWS Key Management Service (AWS KMS) keys, may not be deleted automatically when the CloudFormation stack is deleted.

Figure 6: Delete all resources via the AWS CloudFormation console.

Conclusion
This post discussed how enterprises can build accurate, transparent, and secure generative AI applications while still having full control over their data. The proposed solution is a RAG pipeline using an AI-native technology stack as a combination of Cohere foundation models in Amazon Bedrock and a Weaviate vector database on AWS Marketplace. The RAG approach enables enterprises to bridge the gap between the LLM’s general knowledge and the proprietary data while minimizing hallucinations. An AI-native technology stack enables fast development and scalable performance.
You can start experimenting with RAG proofs of concept for your enterprise-ready generative AI applications using the steps outlined in this post. The accompanying source code is available in the related GitHub repository. Thank you for reading. Feel free to provide comments or feedback in the comments section.

About the authors
James Yi is a Senior AI/ML Partner Solutions Architect in the Technology Partners COE Tech team at Amazon Web Services. He is passionate about working with enterprise customers and partners to design, deploy, and scale AI/ML applications to derive business value. Outside of work, he enjoys playing soccer, traveling, and spending time with his family.
Leonie Monigatti is a Developer Advocate at Weaviate. Her focus area is AI/ML, and she helps developers learn about generative AI. Outside of work, she also shares her learnings in data science and ML on her blog and on Kaggle.
Meor Amer is a Developer Advocate at Cohere, a provider of cutting-edge natural language processing (NLP) technology. He helps developers build cutting-edge applications with Cohere’s Large Language Models (LLMs).
Shun Mao is a Senior AI/ML Partner Solutions Architect in the Emerging Technologies team at Amazon Web Services. He is passionate about working with enterprise customers and partners to design, deploy, and scale AI/ML applications to derive business value. Outside of work, he enjoys fishing, traveling, and playing Ping-Pong.

How to Create a Better Lead Nurturing Strategy for X-Ray Contacts

If you’re reading this, you probably know about our Website Visitor ID X-Ray pixel. And that’s because we’re proud of our game-changing tool, which provides businesses with the contact and intent information of ~20% of their anonymous website visitors. 

Anyone who does digital marketing knows how valuable that data is. 

But anyone who does digital marketing also knows that data, on its own, doesn’t really sell anything. Even companies with mountains of accurate, precise data need the right lead nurturing strategy. 

So that’s what we’re going to talk about today: How to create a better lead nurturing strategy for your X-Ray contacts!  

If you’re new to this process, don’t worry–we’ll cover the basics! And if you’re an old pro, we’ve got product-specific tips you’ll need too! 


What is lead nurturing?

At its core, lead nurturing is like tending a garden. It’s the process of cultivating relationships with potential customers at every stage of the sales funnel, and throughout the buyer’s journey. It’s not just about making a sale; it’s about creating a meaningful dialogue and a lasting relationship.

Imagine meeting someone for the first time. You wouldn’t ask them to marry you right away, right? Similarly, in the world of digital marketing, we don’t push for a sale at the first interaction. Instead, we foster that connection, provide value, and gently guide them towards deciding that purchasing from us is in their best interest.

Lead nurturing involves understanding the needs and interests of your leads and providing them with relevant information at the right time. This means sending targeted content, personalized emails, engaging social media posts, and more, all tailored to address the specific needs and pain points of your leads. By doing this, we establish trust and credibility, which are essential in converting leads into loyal customers.

Why is a lead nurturing strategy important?

Now, you might be wondering, “What makes lead nurturing so important?” Here’s the deal: not all visitors to your website are ready to buy. In fact, most of them aren’t. They’re in the process of researching, considering their options, or might not even realize they need your product yet. Lead nurturing helps you to stay connected with these potential customers, gently nudging them towards a purchase decision over time.

With our Website Visitor ID X-Ray pixel, you’re not just shooting in the dark. You have access to valuable data about who your visitors are and what they’re interested in. This goldmine of information is your starting point for effective lead nurturing. By leveraging this data, you can create a nurturing strategy that resonates with your audience, addresses their needs, and ultimately guides them down the sales funnel.

Audience Segmentation

So, what is the valuable data the X-Ray pixel gives you and how do you use it to build an effective lead-nurturing campaign? Let’s dig in. 

Audience segmentation is another fancy-sounding term, but don't be intimidated. It just means dividing your potential customers into groups based on their common characteristics!

You want to segment your audience to make sure that your X-Ray contacts are seeing the right type of communication about the right thing at the right time.

This process can be very difficult without the right information, but thankfully, your X-Ray contacts have lots of useful information!   


Principles of Audience Segmentation in Lead Nurturing Strategy

Segmentation is the key to creating personalized marketing campaigns that resonate with your audience. With the rich data provided by our X-Ray leads, you can segment your audience in various meaningful ways. Let’s dive into how you can use each piece of information to segment and target your leads effectively:

1. Contact Information

Personalization: Use names and other contact details to personalize your communications. Addressing leads by their names in emails or messages can significantly boost engagement.

2. Landing Page

Interest-Based Segmentation: Group leads based on the landing pages they visited. This indicates their specific interests or the solutions they are seeking. Tailor your content and offers based on these interests.

3. Clicks and Page Depth

Engagement Level Segmentation: Leads who click through multiple pages or delve deeper into your site are showing higher engagement. Segment these active users and target them with more detailed information or special offers.

4. Time Spent on Page

Interest Intensity Segmentation: Longer time spent on a page suggests a higher level of interest in that topic. Segment these users to receive more comprehensive content or offers related to the pages they spent the most time on.

5. Location

Geographic Segmentation: Tailor your marketing efforts based on the geographic location of your leads. This can involve regional offers, events, or content that resonates with local trends and preferences.

6. Company Size

Business Segmentation: Segment leads based on their company size. Different sizes of companies have different needs and pain points. Tailor your messaging to address these unique needs.

7. Industry

Industry-Specific Segmentation: Different industries require different approaches. Segment your leads by industry to provide highly relevant content, solutions, and case studies.

Tips for Effective Segmentation

Combine Segments for Precision: Don't hesitate to combine multiple data points for more precise segmentation. For instance, you might target leads from a specific industry, within a certain company size range, who spent a lot of time on a specific page. A short sketch of this kind of combined filter appears at the end of this section.

Test and Refine: Use A/B testing to see which segments respond best to certain types of content and offers. Continuously refine your segments based on lead behavior and feedback.

Keep it Dynamic: As leads move through the sales funnel, their interests and needs might change. Ensure your segments are dynamic and update your strategies accordingly.

By effectively segmenting your audience with the data provided by our X-Ray leads, you can create highly targeted and personalized marketing campaigns. This not only enhances the user experience but also significantly increases the chances of converting leads into loyal customers. Remember, the goal is to deliver the right message, to the right person, at the right time.
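If you export your X-Ray contacts and work with them in Python, combining segments is a simple filtering exercise. The sketch below is illustrative only and assumes a CSV export with hypothetical column names (industry, company_size, time_on_page_seconds, landing_page_url); substitute whatever fields your actual export contains.

import pandas as pd

# Hypothetical export of X-Ray contacts; the file and column names are illustrative only.
contacts = pd.read_csv("xray_contacts.csv")

# Combine multiple data points for a precise segment:
# software companies with 51-200 employees that spent 2+ minutes on the pricing page.
segment = contacts[
    (contacts["industry"] == "Software")
    & (contacts["company_size"] == "51-200")
    & (contacts["time_on_page_seconds"] >= 120)
    & (contacts["landing_page_url"].str.contains("/pricing", na=False))
]

# Save the segment for upload to an ad platform or email tool.
segment.to_csv("pricing_interested_software_leads.csv", index=False)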

How to Create a Better Lead Nurturing Strategy for X-Ray Contacts

Now that you’ve got an understanding of the fundamentals, it’s time to apply the strategies to your X-ray contacts.

There are a number of things you can do with these contacts but the most important thing is to not spam them!

Your X-ray contacts should be broken up into different audiences based on intent.

From there, you can create specific lead flows that segment them into the right place. Here’s how to break it out:

Send Every Contact to your Digital Ad Networks

Your digital ad platforms, like Meta Ads, are a safe place for all types of users regardless of their intent. Digital advertising is an unintrusive way to keep your business in front of potential customers which makes it a safe landing spot for people who aren’t so familiar with you yet! 

Once you’ve sent these contacts to your digital ad platforms, you can build retargeting and lookalike campaigns depending on your needs! 

Create Page-Specific Digital Ad Audiences

Even better than general retargeting is product-specific retargeting! 

Imagine, for example, that you sell both Homeowners and Renters insurance. 

You could put everyone who visits your site into one bucket and advertise insurance to all of them the same way. But because your X-Ray data shows you which page they landed on, you can get more specific! 

This means that you can (say it with me) segment them!

So you can show renter-specific ads to the renters and homeowner-specific ads to the homeowners! Personalization is always more appealing! 

Creating audiences like this is dead simple, too. 

In your Customers.ai account, navigate to your My Leads tab and click on “Audiences.” 

You'll see a big list of all your contacts. Then select "Add Filter."

In the Attribute drop-down, select “Landing Page URL” and in the Operator drop-down select “Equals.” Paste the landing page URL in the Value section. 

Then just save it as an audience and you’re good to send it to your Facebook account! 

Add High-Intent Contacts to Email Campaigns

Contacts who are showing more interest in you are great fits for email campaigns! 

The exact parameters of what indicates high-intent will depend on your specific business but, as we’ve covered, strong indicators include: 

Spending a lot of time on your website

Clicking several pages

Visiting high-intent pages, like your Pricing page

Adding an item to their cart

With Customers.ai, it is easy to set up automations based on a combination of high-intent signals so that you can send personalized, timely campaigns to the right people. 

When you're setting up your X-Ray automation, pick a high-intent page, like Pricing.

Then, add other intent signals in the Advanced Capture Settings.

Once you've got that set up, write a S.o.L.D. email personalized to their specific interests with our AI Email generator!

Once they’ve engaged with that email, you can send them to your Klaviyo, SendGrid, or other CRM to further nurture them! 

Send the Highest-of-High Intent Contacts to Your Sales Team

Customers who visit your site repeatedly, spend a lot of time on high-intent pages, meet your ICP, and engage with your email marketing are in the distinguished group of people who you might send over to your sales team to finally close the deal! 

You only want to do this with the cream of the crop, so to speak! 

If you’ve got Salesforce, Customers.ai has a simple integration that makes this a snap. You can also notify your sales team on Slack via our Zapier integration! 

Create an X-Ray Strategy That Works for You

The data that X-Ray provides is invaluable when it comes to sales and marketing, and creating the right lead-nurturing strategy is imperative.

It’s important to remember that all leads are not created equal and therefore shouldn’t be treated equally.

With Customers.ai website visitor identification and the right lead flows, you can 10x your conversions and improve ROI. Two things we know businesses love to hear.

Want to get started? See how many contacts X-ray can pick up for you with our form below.

Convert Website Visitors into Real Contacts!

Identify who is visiting your site with name, email and more. Get 500 contacts for free!


Important Next Steps

See what targeted outbound marketing is all about. Capture and engage your first 500 website visitor leads with Customers.ai X-Ray website visitor identification for free.

Talk and learn about sales outreach automation with other growth enthusiasts. Join Customers.ai Island, our Facebook group of 40K marketers and entrepreneurs who are ready to support you.

Advance your marketing performance with Sales Outreach School, a free tutorial and training area for sales pros and marketers.

The post How to Create a Better Lead Nurturing Strategy for X-Ray Contacts appeared first on Customers.ai.

MIT and Google Researchers Propose Health-LLM: A Groundbreaking Artificial Intelligence Framework Designed to Adapt LLMs for Health Prediction Tasks Using Data from Wearable Sensor

The realm of healthcare has been revolutionized by the advent of wearable sensor technology, which continuously monitors vital physiological data such as heart rate variability, sleep patterns, and physical activity. This advancement has paved the way for a novel intersection with large language models (LLMs), traditionally known for their linguistic prowess. The challenge, however, lies in effectively harnessing this non-linguistic, multi-modal time-series data for health predictions, requiring a nuanced approach beyond the conventional capabilities of LLMs.

This research pivots around adapting LLMs to interpret and utilize wearable sensor data for health predictions. The complexity of this data, characterized by its high dimensionality and continuous nature, demands an LLM’s ability to understand individual data points and their dynamic relationships over time. Traditional health prediction methods, predominantly involving models like Support Vector Machines or Random Forests, have been effective to a certain extent. However, the recent emergence of advanced LLMs, such as GPT-3.5 and GPT-4, has shifted the focus towards exploring their potential in this domain.

MIT and Google researchers introduced Health-LLM, a groundbreaking framework designed to adapt LLMs for health prediction tasks using data from wearable sensors. This study comprehensively evaluates eight state-of-the-art LLMs, including notable models like GPT-3.5 and GPT-4. The researchers meticulously selected thirteen health prediction tasks across five domains: mental health, activity tracking, metabolism, sleep, and cardiology. These tasks were chosen to cover a broad spectrum of health-related challenges and to test the models’ capabilities in diverse scenarios.

The methodology employed in this research is both rigorous and innovative. The study involved four distinct steps: zero-shot prompting, few-shot prompting augmented with chain-of-thought and self-consistency techniques, instructional fine-tuning, and an ablation study focusing on context enhancement in a zero-shot setting. Zero-shot prompting tested the models’ inherent capabilities without task-specific training, while few-shot prompting utilized limited examples to facilitate in-context learning. Chain-of-thought and self-consistency techniques were integrated to enhance the models’ understanding and coherence. Instructional fine-tuning further tailored the models to the specific nuances of health prediction tasks.

The Health-Alpaca model, a fine-tuned version of the Alpaca model, emerged as a standout performer, achieving the best results in five out of thirteen tasks. This achievement is particularly noteworthy because Health-Alpaca is substantially smaller than models like GPT-3.5 and GPT-4. The study's ablation component revealed that including context enhancements (user profile, health knowledge, and temporal context) could yield up to a 23.8% improvement in performance. This finding highlights the significant role of contextual information in optimizing LLMs for health predictions.
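To make the idea of context enhancement concrete, here is a generic, illustrative zero-shot prompt for a wearable-sensor prediction task. It is not the paper's actual prompt or data; every value below is a made-up placeholder that simply mirrors the categories the study describes (user profile, health knowledge, temporal context, plus a summary of sensor readings).

# All values are hypothetical placeholders for illustration only.
user_profile = "Age: 34, female, generally active"
health_knowledge = "Resting heart rates well above a person's baseline can indicate stress."
temporal_context = "Readings taken on a weekday evening."
sensor_summary = "Avg HR: 88 bpm, HRV: 32 ms, Sleep: 5.2 h, Steps: 3,100"

prompt = f"""You are a health assistant.
User profile: {user_profile}
Health knowledge: {health_knowledge}
Temporal context: {temporal_context}
Wearable data (last 24 h): {sensor_summary}
Question: On a scale of 0-10, estimate this user's current stress level and briefly explain why."""

print(prompt)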

In summary, this research marks a significant stride in integrating LLMs with wearable sensor data for health predictions. The study demonstrates the feasibility of this approach and underscores the importance of context in enhancing model performance. The success of the Health-Alpaca model, in particular, suggests that smaller, more efficient models can be equally, if not more, effective in health prediction tasks. This opens up new possibilities for applying advanced healthcare analytics in a more accessible and scalable manner, thereby contributing to the broader goal of personalized healthcare.

Check out the Paper. All credit for this research goes to the researchers of this project. Also, don’t forget to follow us on Twitter. Join our 36k+ ML SubReddit, 41k+ Facebook Community, Discord Channel, and LinkedIn Group.

If you like our work, you will love our newsletter..

Don’t Forget to join our Telegram Channel
The post MIT and Google Researchers Propose Health-LLM: A Groundbreaking Artificial Intelligence Framework Designed to Adapt LLMs for Health Prediction Tasks Using Data from Wearable Sensor appeared first on MarkTechPost.

Researchers from Washington University in St. Louis Propose Visual Active Search (VAS): An Artificial Intelligence Framework for Geospatial Exploration

In the challenging fight against illegal poaching and human trafficking, researchers from Washington University in St. Louis’s McKelvey School of Engineering have devised a smart solution to enhance geospatial exploration. The problem at hand is how to efficiently search large areas to find and stop such activities. The current methods for local searches are limited by constraints, like the number of times one can search in a specific location.

Currently, there are methods to conduct local searches, but they face challenges regarding efficiency and adaptability. The challenge lies in deciding which areas to search first, given limited opportunities, and how to determine the next search location based on the findings. A team of researchers from Washington University in St. Louis sought to address this by developing a novel Visual Active Search (VAS) framework that combines computer vision and adaptive learning to improve search techniques.

The VAS framework consists of three main components: an image of the entire search area divided into regions, a local search function to check if a specific object is present in a given region, and a fixed search budget that regulates the frequency of the local search function’s execution. This framework aims to maximize the detection of objects within the allocated search budget. It builds on prior research in the field, combining active search with visual reasoning and harnessing the synergy between human efforts and artificial intelligence (AI).
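To make these components concrete, the sketch below shows the kind of budget-constrained, adaptive search loop that VAS formalizes: score regions, spend one unit of the search budget on the most promising region, then update the remaining scores based on what was found. It is a toy illustration under stated assumptions; the initial scores and the neighbour-update rule are placeholders, not the authors' learned search policy.

import numpy as np

def visual_active_search(region_scores, is_target_present, budget):
    """Greedy illustration of budget-constrained adaptive search over regions."""
    scores = np.array(region_scores, dtype=float)  # placeholder priors, e.g., from a vision model
    found = []
    for _ in range(budget):
        region = int(np.argmax(scores))      # query the most promising unvisited region
        hit = is_target_present(region)      # costly local search (patrol, drone flight, ...)
        if hit:
            found.append(region)
        scores[region] = -np.inf             # never re-query the same region
        # Placeholder adaptation step: nudge neighbouring regions up or down based on the outcome.
        for nb in (region - 1, region + 1):
            if 0 <= nb < len(scores) and np.isfinite(scores[nb]):
                scores[nb] += 0.1 if hit else -0.1
    return found

# Toy usage: 10 regions, 3 hidden targets, a budget of 4 local searches.
rng = np.random.default_rng(0)
targets = set(rng.choice(10, size=3, replace=False).tolist())
print(visual_active_search(rng.random(10), lambda r: r in targets, budget=4))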

The researchers introduced a spatial correlation between regions to scale up and adapt the active search to cover large areas efficiently. They presented their findings at a conference, showcasing that their approach outperformed existing methods. The metrics demonstrated their VAS framework’s capabilities in maximizing object detection within the given search constraints.

Looking ahead, the researchers plan to explore ways to expand the application of their framework. They aim to tailor the model for different domains, including wildlife conservation, search and rescue operations, and environmental monitoring. They have also presented a highly adaptable version of their search framework, capable of efficiently searching for various objects, even when they differ significantly from the ones the model is trained on.

In conclusion, the researchers have developed a promising solution to the challenges of geospatial exploration in combating illegal activities. Their VAS framework combines computer vision and adaptive learning, effectively guiding physical search processes in large areas with constrained search opportunities. The scalability and adaptability of their approach demonstrate its promise for practical use in different fields, meeting the demand for efficient and impactful search methods. 

Check out the Paper. All credit for this research goes to the researchers of this project. Also, don’t forget to follow us on Twitter. Join our 36k+ ML SubReddit, 41k+ Facebook Community, Discord Channel, and LinkedIn Group.

If you like our work, you will love our newsletter..

Don’t Forget to join our Telegram Channel
The post Researchers from Washington University in St. Louis Propose Visual Active Search (VAS): An Artificial Intelligence Framework for Geospatial Exploration  appeared first on MarkTechPost.

Meet VMamba: An Alternative to Convolutional Neural Networks (CNNs) and Vision Transformers for Enhanced Computational Efficiency

There are two major challenges in visual representation learning: the computational inefficiency of Vision Transformers (ViTs) and the limited capacity of Convolutional Neural Networks (CNNs) to capture global contextual information. ViTs suffer from quadratic computational complexity while excelling in fitting capability and global receptive field. On the other hand, CNNs offer scalability and linear complexity with respect to image resolution but lack the dynamic weighting and global perspective of ViTs. These issues highlight the need for a model that brings together the strengths of both CNNs and ViTs without inheriting their respective computational and representational limitations.

Significant research exists on the evolution of machine visual perception. CNNs and ViTs have emerged as the dominant visual foundation models, each with unique strengths in processing visual information. State Space Models (SSMs) have gained prominence for their efficiency in modeling long sequences, influencing both the NLP and computer vision domains.

A team of researchers at UCAS, in collaboration with Huawei Inc. and Pengcheng Lab, introduced the Visual State Space Model (VMamba), a novel architecture for visual representation learning. VMamba is inspired by the state space model and aims to address the computational inefficiencies of ViTs while retaining their advantages, such as global receptive fields and dynamic weights. The research emphasizes VMamba’s innovative approach to tackling the direction-sensitive issue in visual data processing, proposing the Cross-Scan Module (CSM) for efficient spatial traversal. 

CSM is used to transform visual images into patch sequences and utilizes a 2D state space model as its core. VMamba’s selective scan mechanism and discretization process enhance its capabilities. The model’s effectiveness is validated through extensive experiments, comparing its effective receptive fields with models like ResNet50 and ConvNeXt-T and its performance in semantic segmentation on the ADE20K dataset.
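As a rough intuition for the cross-scan idea, the toy sketch below flattens a grid of patch embeddings along four traversal orders. This is not the authors' implementation: it only shows the traversal step and omits the selective state-space computation and the merge back into a 2D feature map.

import numpy as np

def cross_scan(patches):
    """Toy version of the Cross-Scan Module's traversal step.

    patches: array of shape (H, W, C) holding patch embeddings.
    Returns four 1D sequences of shape (H*W, C), one per scan direction. In VMamba,
    each sequence would then be processed by a selective state space model and the
    four outputs merged back into a 2D feature map (both omitted here).
    """
    h, w, c = patches.shape
    rowwise = patches.reshape(h * w, c)                      # left-to-right, top-to-bottom
    colwise = patches.transpose(1, 0, 2).reshape(h * w, c)   # top-to-bottom, left-to-right
    return [rowwise, rowwise[::-1], colwise, colwise[::-1]]  # plus both reversed orders

# Toy usage: a 4x4 grid of 8-dimensional patch embeddings.
sequences = cross_scan(np.random.rand(4, 4, 8))
print([s.shape for s in sequences])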

In terms of benchmark results, VMamba achieved 48.5-49.7 mAP in object detection and 43.2-44.0 mIoU in instance segmentation on the COCO dataset, surpassing established models. On the ADE20K semantic segmentation benchmark, the VMamba-T model achieved 47.3 mIoU (48.3 mIoU with multi-scale inputs), outperforming competitors such as ResNet, DeiT, Swin, and ConvNeXt, and it maintained superior accuracy across various input resolutions. The comparative analysis also highlighted VMamba's global effective receptive fields (ERFs), distinguishing it from models with only local ERFs.

The research on VMamba marks a significant leap in visual representation learning. It successfully integrates the strengths of CNNs and ViTs, offering a solution to their limitations. The novel CSM enhances VMamba's efficiency, making it adept at handling various visual tasks with improved computational effectiveness. This model demonstrates its robustness across multiple benchmarks and suggests a new direction for future developments in visual foundation models. VMamba's approach to maintaining global receptive fields while ensuring linear complexity underscores its potential as a groundbreaking tool in computer vision.

Check out the Paper and Github. All credit for this research goes to the researchers of this project. Also, don’t forget to follow us on Twitter. Join our 36k+ ML SubReddit, 41k+ Facebook Community, Discord Channel, and LinkedIn Group.

If you like our work, you will love our newsletter..

Don’t Forget to join our Telegram Channel
The post Meet VMamba: An Alternative to Convolutional Neural Networks (CNNs) and Vision Transformers for Enhanced Computational Efficiency appeared first on MarkTechPost.

B2B Marketers Aren't Ready for Google's New Email Guidelines [Report]

Back in October, Google released new email guidelines. Set to take effect in a few weeks (February 2024), the guidelines, which impact bulk senders (anyone sending over 5,000 emails a day to Gmail addresses) and general Gmail senders, introduce changes like authentication enforcement and spam complaint rate thresholds.

Given that this is one of the biggest email updates we've seen in some time, and certainly one of the first times (if not the first time) Google has given us specific numbers, we felt it was going to have a significant impact on email marketers, and especially on outbound email marketers, aka B2B sales and marketing teams.

Our initial assessment was this:

“The challenge will be for start-ups or smaller, less established companies. Specifically, those in the B2B space that may be using more aggressive outbound strategies or have been leaning on ABM to establish their brand. These companies must adhere pretty closely to best practices to avoid hitting the 0.3% threshold.” 

And it turns out…we were right.

I had the team dig into spam complaint rates across the B2B space and found that they are well beyond the 0.3% threshold laid out by Google and Yahoo.

It’s not even close! The average spam complaint rate across the B2B space was 2.01%, with a range between 1.1% and 3.1%.

When we break it down by industry, we get an even clearer picture.

Key Takeaway: B2B Outbound Marketing is in Trouble

We looked at the B2B space as that’s where the majority of cold outreach takes place. 

Unfortunately for B2B marketers and sales teams, there doesn’t seem to be a way to do outbounding with complaint rates below 0.3%.

In fact, for the top 9 spammiest verticals, we were unable to find a single sender that was able to score below the 0.3% proposed threshold.

This poses a big problem, especially for businesses that rely on outbound emails to generate leads and drive business. 

Tips for Lowering B2B Spam Complaint Rates

All is not lost when it comes to the new spam complaint threshold. There are plenty of things email marketers and sales teams can do to reach the inbox:

Focus on Warm Leads. Move away from cold lists and focus on high-intent active site users. Website visitor identification can tell you who these people are. 

Make Unsubscribe Options Clear. Don’t hide your unsubscribe. If a user can’t find the unsubscribe button, they are more likely to mark you as spam.

Add Multiple Unsubscribe Links. We recommend giving your users several options – include unsubscribes in the body and the footer.

Tighten Your Audience Segments. You can create better audiences and better messaging by layering more data filters. 

Enhance Email Customization. The more customized your email is to the individual, the less likely they are to complain.

Increase Helpful Transactional Emails. Emails like order confirmations, tracking information, or purchase follow-ups can help increase email volume and won’t result in complaints. 

How Customers.ai Can Help

The main problem with all of this is the cold outreach.

Like I said before, it’s not the big businesses with established customer bases and large inbound lists that will be impacted. This update specifically impacts those who are using more aggressive outbound strategies or have been leaning on ABM to establish their brand.

Companies need to shift away from the old tactics of buying and emailing cold lists if they are going to succeed with outbound.

Companies need to focus on capturing first-party data and shifting their outbound strategies to warm leads.

That’s where Customers.ai comes in.

With the Customers.ai Website Visitor ID X-Ray Pixel, you can capture visitors to your site.

That doesn’t mean you should email all of them but it does allow you to nurture them. Perhaps you add them to your retargeting campaign or add them to your CRM.

For those who take high-intent actions, it may make sense to put them into your email automation.

The difference between cold emails and this is that these people are already familiar with you. Maybe they haven't given you their info, but perhaps they visited your pricing or request-a-demo page. They've shown enough interest that it won't feel like spam.

B2B Spam Complaint Rate Threshold: Report Methodology

The data on spam complaint rates across various industries was compiled by Customers.ai's analytics team, leveraging a systematic approach focused on targeted outbound email campaigns. This comprehensive analysis was conducted over a four-week period, from November 20th to December 20th, encompassing a diverse range of industries known for their reliance on email marketing.

The methodology adopted for data collection involved a two-pronged approach:

Data Aggregation: Utilizing API integrations with multiple email service providers (ESPs), we aggregated a large dataset (1 million+ records) of email campaign metrics. This dataset encompassed key parameters such as the number of emails sent, open rates, click-through rates, and the frequency of spam complaints. The data aggregation process was automated to capture real-time data, ensuring the timeliness and relevance of the information collected.

Statistical Analysis: The team employed advanced statistical techniques, including regression analysis and variance estimation, to identify patterns and derive insights from the aggregated data. This included calculating the mean spam complaint rates and their ranges for each industry and adjusting for outliers and anomalies to ensure accuracy. (A simplified sketch of this per-industry aggregation appears after the methodology details below.)

To refine the accuracy of our analysis, we incorporated:

Segmentation by Industry: Emails were categorized based on the industry of the sender, enabling a granular analysis of spam complaint rates across different sectors.

Temporal Analysis: The study accounted for variations in email campaign performance across different times of the week and day, recognizing the impact of timing on recipient engagement and spam complaint likelihood.

Compliance and Legal Framework Consideration: The analysis was conducted with a keen awareness of varying spam regulations across different regions, ensuring that the data reflects a global perspective on email marketing practices.
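As an illustration of the per-industry aggregation described above, the sketch below computes the mean, minimum, and maximum spam complaint rate by industry with pandas. The input file and column names are hypothetical placeholders, not the actual dataset or schema behind the report.

import pandas as pd

# Hypothetical aggregated ESP export; the file and column names are illustrative only.
campaigns = pd.read_csv("email_campaign_metrics.csv")

# Spam complaint rate per campaign = complaints / delivered emails.
campaigns["complaint_rate"] = campaigns["spam_complaints"] / campaigns["emails_delivered"]

# Per-industry mean and range, mirroring the report's headline figures.
by_industry = (
    campaigns.groupby("industry")["complaint_rate"]
    .agg(["mean", "min", "max"])
    .sort_values("mean", ascending=False)
)
print(by_industry.head(9))  # e.g., the nine "spammiest" verticals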

Get The Full Infographic


The post B2B Marketers Aren’t Ready for Google’s New Email Guidelines [Report] appeared first on Customers.ai.

Researchers from CMU, Bosch, and Google Unite to Transform AI Security: Simplifying Adversarial Robustness in a Groundbreaking Achievement

In a remarkable breakthrough, researchers from Google, Carnegie Mellon University, and Bosch Center for AI have introduced a pioneering method for enhancing the adversarial robustness of deep learning models, showcasing significant advancements and practical implications. To start, the key takeaways from this research can be summarized as follows:

Effortless Robustness through Pretrained Models: The research demonstrates a streamlined approach to achieving top-tier adversarial robustness against 2-norm bounded perturbations, exclusively using off-the-shelf pretrained models. This innovation drastically simplifies the process of fortifying models against adversarial threats.

Breakthrough with Denoised Smoothing: Merging a pretrained denoising diffusion probabilistic model with a high-accuracy classifier, the team achieves a groundbreaking 71% accuracy on ImageNet for adversarial perturbations. This result marks a substantial 14 percentage point improvement over prior certified methods.

Practicality and Accessibility: The results are attained without the need for complex fine-tuning or retraining, making the method highly practical and accessible for various applications, especially those requiring defense against adversarial attacks.

Denoised Smoothing Technique Explained: The technique involves a two-step process – first applying a denoiser model to eliminate added noise, followed by a classifier to determine the label for the treated input. This process makes it feasible to apply randomized smoothing to pretrained classifiers.

Leveraging Denoising Diffusion Models: The research highlights the suitability of denoising diffusion probabilistic models, acclaimed in image generation, for the denoising step in defense mechanisms. These models effectively recover high-quality denoised inputs from noisy data distributions.

Proven Efficacy on Major Datasets: The method shows impressive results on ImageNet and CIFAR-10, outperforming previously trained custom denoisers, even under stringent perturbation norms.

Open Access and Reproducibility: Emphasizing transparency and further research, the researchers link to a GitHub repository containing all necessary code for experiment replication.

Now, let's dive into the detailed analysis of this research and its potential real-life applications. Adversarial robustness in deep learning models is a burgeoning field and is crucial for ensuring the reliability of AI systems against deceptive inputs. This aspect of AI research holds significant importance across various domains, from autonomous vehicles to data security, where the integrity of AI interpretations is paramount.

A pressing challenge is the susceptibility of deep learning models to adversarial attacks. These subtle manipulations of input data, often undetectable to human observers, can lead to incorrect outputs from the models. Such vulnerabilities pose serious threats, especially when security and accuracy are critical. The goal is to develop models that maintain accuracy and reliability, even when faced with these crafted perturbations.

Earlier methods to counter adversarial attacks have focused on enhancing the model’s resilience. Techniques like bound propagation and randomized smoothing were at the forefront, aiming to provide robustness against adversarial interference. These methods, though effective, often demanded complex, resource-intensive processes, making them less viable for widespread application.

The current research introduces a groundbreaking approach, Diffusion Denoised Smoothing (DDS), representing a significant shift in tackling adversarial robustness. This method uniquely combines pretrained denoising diffusion probabilistic models with standard high-accuracy classifiers. The innovation lies in utilizing existing, high-performance models, circumventing the need for extensive retraining or fine-tuning. This strategy enhances efficiency and broadens the accessibility of robust adversarial defense mechanisms.

The authors provide the full implementation of the DDS approach in the GitHub repository referenced above.
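For a concrete picture of the denoise-then-classify step, here is a minimal, illustrative sketch in PyTorch. It is not the authors' implementation: the denoiser and classifier are placeholders (DDS plugs in a pretrained denoising diffusion model and an off-the-shelf high-accuracy classifier), and the mapping from the smoothing noise level to a diffusion timestep is omitted for brevity.

import torch

def denoised_smoothing_predict(x, sigma, denoiser, classifier, n_samples=16):
    """Illustrative denoise-then-classify prediction with randomized-smoothing noise.

    x: input image batch of shape (1, C, H, W).
    sigma: standard deviation of the Gaussian smoothing noise.
    denoiser: any model mapping a noisy image to a clean estimate (a diffusion model in DDS).
    classifier: any image classifier returning logits.
    Returns the majority-vote class over n_samples noisy copies of x.
    """
    votes = []
    with torch.no_grad():
        for _ in range(n_samples):
            noisy = x + sigma * torch.randn_like(x)   # add randomized-smoothing noise
            denoised = denoiser(noisy)                # remove the added noise
            votes.append(classifier(denoised).argmax(dim=1))
    return torch.cat(votes).mode().values.item()

# Toy usage with placeholder models (identity "denoiser", untrained linear classifier).
torch.manual_seed(0)
x = torch.randn(1, 3, 32, 32)
denoiser = lambda img: img  # stand-in only; DDS uses a pretrained diffusion model here
classifier = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
print(denoised_smoothing_predict(x, sigma=0.25, denoiser=denoiser, classifier=classifier))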

The DDS approach counters adversarial attacks by applying a sophisticated denoising process to the input data. This process involves reversing a diffusion process, typically used in state-of-the-art image generation techniques, to recover the original, undisturbed data. This method effectively cleanses the data of adversarial noise, preparing it for accurate classification. The application of diffusion techniques, previously confined to image generation, to adversarial robustness is a notable innovation bridging two distinct areas of AI research.

The performance on the ImageNet dataset is particularly noteworthy, where the DDS method achieved a remarkable 71% accuracy under specific adversarial conditions. This figure represents a 14 percentage point improvement over previous state-of-the-art methods. Such a leap in performance underscores the method’s capability to maintain high accuracy, even when subjected to adversarial perturbations.

This research marks a significant advancement in adversarial robustness by ingeniously combining existing denoising and classification techniques. The DDS method presents a more efficient and accessible way to achieve robustness against adversarial attacks; its strong performance, requiring no additional training, sets a new benchmark in the field and opens avenues for more streamlined and effective adversarial defense strategies.

The applications of this innovative approach to adversarial robustness in deep learning models can be applied across various sectors:

Autonomous Vehicle Systems: Enhances safety and decision-making reliability by improving resistance to adversarial attacks that could mislead navigation systems.

Cybersecurity: Strengthens AI-based threat detection and response systems, making them more effective against sophisticated cyber attacks designed to deceive AI security measures.

Healthcare Diagnostic Imaging: Increases the accuracy and reliability of AI tools used in medical diagnostics and patient data analysis, ensuring robustness against adversarial perturbations.

Financial Services: Bolsters fraud detection, market analysis, and risk assessment models in finance, maintaining integrity and effectiveness against adversarial manipulation in financial predictions and analyses.

These applications demonstrate the potential of leveraging advanced robustness techniques to enhance the security and reliability of AI systems in critical and high-stakes environments.

Check out the Paper. All credit for this research goes to the researchers of this project. Also, don’t forget to follow us on Twitter. Join our 36k+ ML SubReddit, 41k+ Facebook Community, Discord Channel, and LinkedIn Group.

If you like our work, you will love our newsletter..

Don’t Forget to join our Telegram Channel
The post Researchers from CMU, Bosch, and Google Unite to Transform AI Security: Simplifying Adversarial Robustness in a Groundbreaking Achievement appeared first on MarkTechPost.

Best Image Annotation Tools in 2024

Image annotation is the process of labeling or categorizing an image with descriptive data that helps identify and classify objects, people, and situations included within the image.

Since it helps machines understand and interpret visual input, image annotation is vital in computer vision, robotics, and autonomous driving. An image may be annotated in many ways, such as by drawing bounding boxes around items, titling them, or segmenting them according to their visual characteristics.

After human annotation is complete, a machine-learning model automatically examines the tagged pictures to generate the same annotations. Since the image annotation defines the standard the model attempts to meet, any labeling mistakes are likewise replicated.

Here are some of the best image annotation tools to check out in 2024:

Markup Hero

Using Markup Hero, you can add labels and explanations to photos, highlight important details, draw attention to certain areas, and more. The application also allows users to resize, flip, and rotate images, making it easy to get the desired results.

Annotated images may be easily shared between users for commenting and discussion. Markup Hero is an easy-to-use, flexible, and powerful image annotation tool ideal for real-time collaborative and visual communication.

Keylabs

Keylabs enables users to annotate images with captions, tags, and other information, such as bounding boxes, key points, and semantic segmentation. Keylabs helps AI researchers and developers save time annotating photos, and its comprehensive support for all image annotation styles and methods gives developers a wide range of options. Because of the software's intuitive design, users can efficiently sort images into several categories. It also allows users to collaborate and provides tools for managing processes and monitoring progress.

The program is also highly adjustable, with capabilities like developing unique annotation templates and personalized processes. Along with its annotation features, Keylabs provides in-built quality control measures to ensure the accuracy and consistency of annotations.

Labelbox

Labelbox is a powerful vector labeling tool known for its ease of use, speed, and versatility. It lets teams get up and running quickly, scale to any size, and generate high-quality training data with little effort. Annotations can be tailored to the needs of a given project, whether for object detection, semantic segmentation, or image classification.

Labelbox enables several people to work together by assigning tasks, reviewing notes, and monitoring progress. It also provides quality assurance tools to make sure the labels are accurate and trustworthy. Dynamic filters that work on the content, data, or text embeddings allow you to effectively and rapidly tag relevant results at scale before sending them to a review queue.

Scale

The Scale image annotation tool allows users to add scale bars or rulers to a picture to provide a visual reference for item sizes. This is particularly useful for seeing photos of intricate structures, such as tiny animals or geological formations. Users may add text labels, arrows, and other shapes to photographs in the program to highlight certain aspects.

Pre-labeling, active tools like superpixel segmentation, and ML-based quality checks allow for large pictures’ accurate, rapid, and high-quality annotation. Combining image scaling tasks is possible. Image tasks may be set up to automatically construct a classification job with consensus if the target object is unknown.

Supervisely

Supervisely is a useful tool for annotating and labeling images and videos for use in machine vision applications. Object identification, segmentation, classification, and tracking are some annotation types that can be performed with the help of the platform's intuitive interface. Supervisely's powerful annotating engine simplifies annotation with features like automated polygonal segmentation, shape and text manipulation, and basic labeling.

Users of Supervisely may collaborate on a project by posting and reviewing annotations, comments, and drafts. Thanks to its compatibility with popular deep learning frameworks like TensorFlow, PyTorch, and Caffe, the platform allows users to export their annotations in several formats.

Scalabel

Scalabel was designed with scalability, flexibility, and ease of use as priorities. It supports automatic annotations, which increases precision. Scalabel's collaboration and version-control support allows several developers to work concurrently on the same project. Quality assurance elements, including review, validation, and correction tools, are also included.

Scalabel stands out because it can communicate with other machine learning frameworks like TensorFlow, PyTorch, and Caffe to facilitate in-app model training. Annotation prediction between frames is seamless thanks to its combination of 3D point cloud and 2D video tracking support.

RectLabel

RectLabel is an image labeling tool that helps annotate pictures for use in machine learning. The tool supports many annotations, including bounding boxes, polygons, and lines. The program is simple enough for anybody to use, allowing users to annotate photographs by drawing bounding boxes around interesting features.

RectLabel provides several features that enhance the reliability and productivity of the annotation process. High-quality annotations may be achieved since the tool offers fine-grained control over bounding box size and position. A smart tagging system also helps users save time while labeling by making suggestions based on past selections.

MakeSense.AI

MakeSense.AI is a no-download, no-install web application for labeling photographs. It is browser-based, so there is no need for elaborate setup procedures. It is built on TensorFlow.js, one of the most popular frameworks for running neural networks in the browser. The program is great for quickly testing image annotation or for small projects because of its accessible and simple features. MakeSense.AI is free under the GPLv3 license.

CVAT – Computer Vision Annotation Tool

CVAT is a widely used open-source program for annotating images, originally developed by Intel researchers, with roughly 5.7k stars on GitHub. CVAT is also available to companies as part of the Viso Suite computer vision application suite. Because it is distributed via GitHub, the tool needs to be installed manually. Once it is set up, it offers more features and tools than many alternatives, such as keyboard shortcuts and the ability to create custom shapes for labels. CVAT is compatible with many plugins, including TensorFlow Object Detection and the Deep Learning Deployment Toolkit.

LabelImg

LabelImg is a popular, rudimentary graphical image annotation tool written in Python, with around 14.7k stars on GitHub. The setup process is straightforward and can be completed in a few minutes using just a command prompt or terminal. The tool is helpful for annotating datasets for object detection models, although it works best with smaller datasets (fewer than 10,000 photos) and needs a lot of human intervention. Because of its user-friendliness and extensive documentation, it is a great tool for novice ML programmers.

VGG Image Annotator (VIA)

The Visual Geometry Group (VGG) at Oxford University has released VGG Image Annotator (VIA), a free and open-source program for annotating images. It offers a straightforward user interface for drawing shapes on photos, whether points, lines, polygons, rectangles, or anything else. In addition, VIA enables users to annotate characteristics, providing further context for annotations.

Object identification, image segmentation, and classification are just some of the many uses for VIA. Annotation formats such as CSV, JSON, and PASCAL VOC may be imported and exported. VIA may be deployed locally or on a web server and can be modified to accommodate various annotation needs.

Dataturks

Dataturks is a cloud-based service that allows users to annotate images and label data. Bounding boxes, polygons, and semantic segmentation are just some of the available annotation types. To further guarantee precise annotations, it is equipped with quality assurance tools.

Integrations with well-known ML libraries like TensorFlow, PyTorch, and Keras are also available in Dataturks. Dataturks’ overarching goal is to improve the efficiency, simplicity, and accuracy with which data is annotated so that ML teams may devote their efforts to developing more effective models.

Roboflow

Roboflow is a cloud-based service that can be used to annotate and tag data. Annotation choices range from polygons to semantic segmentation to bounding boxes. To further guarantee precise annotations, it is equipped with quality assurance tools.

Eagle

Eagle is the best program for arranging pictures and concepts. To speed up the training of computer vision models, the tool streamlines the annotation of large datasets. Annotations may be seen and edited, progress can be tracked, and annotator quality can be verified inside the software.

Eagle has a welcoming interface that promotes collaboration. Several useful tools are available, such as importing and exporting datasets and organizing labeling activities. Eagle’s efficient organization features make locating a certain photo collection easy anytime. In addition, it allows you to preview the films without having to open them separately, thanks to its audio and video management features.

Hasty

Hasty is an online annotation tool that utilizes AI to annotate photographs. The German firm employs a "using AI to teach AI" approach that incorporates active learning to improve your predicted labels over time. The firm uses cutting-edge technology to develop superior algorithms and models.

Amazon SageMaker Ground Truth

Amazon SageMaker Ground Truth is an artificial intelligence-driven service that helps customers generate high-quality training data for ML algorithms. Image annotation, object recognition, and semantic segmentation are just a few of the features it provides.

Labellerr

Labellerr is a smart-feedback product driven by AI that automates the data pipeline of AI-first companies via computer vision AI. Bounding boxes, polygons, automatic item identification, and automated semantic segmentation are just some of the annotation options available. To further guarantee precise annotations, it is equipped with quality assurance tools.

Don’t forget to follow us on Twitter. Join our 36k+ ML SubReddit, 41k+ Facebook Community, Discord Channel, and LinkedIn Group.

If you like our work, you will love our newsletter..

Don’t Forget to join our Telegram Channel
The post Best Image Annotation Tools in 2024 appeared first on MarkTechPost.