Meet SelFee: An Iterative Self-Revising LLM Empowered By Self-Feedback Generation

A recent study has highlighted the effectiveness of natural language feedback in improving the performance of language models. A team of researchers from KAIST has introduced SelFee, a new model designed explicitly for self-feedback and self-revision generation. Unlike previous approaches, SelFee does not require external large language models or task-specific models to generate high-quality responses.

SelFee is a fine-tuned LLaMA-based instruction-following model that continuously revises its answers until it achieves a high-quality response within a single inference. Based on the given instruction, the model generates an initial solution and self-feedback sequences. By analyzing the content of the generated feedback, the model determines if a revision is needed. If so, it generates a revised answer based on the feedback. This iterative revision process is completed within a single inference, resulting in improved solutions compared to existing LLaMA-based models.

The researchers collected diverse instruction data from various sources, such as ShareGPT, Alpaca, Math, Code, and Flan Collection. To address the scarcity of feedback and revision data, they augmented the dataset using a distillation process from a teacher model called ChatGPT. This approach allowed them to generate more instances of feedback and revision at a more affordable cost.

To train the model, the researchers utilized data augmentation techniques using OpenAI API calls. They collected instructions from multiple sources and input them into ChatGPT to generate corresponding answers. They then obtained feedback on the generated answers by querying ChatGPT again. If a revision was deemed necessary, ChatGPT revised the answer based on self-generated feedback. This process was repeated until no further modifications were required.
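As a rough illustration only, a distillation loop of this kind might be scripted against the OpenAI chat completion API as sketched below. The prompts, the "no revision needed" stopping phrase, and the round limit are assumptions for illustration, not the authors' exact pipeline.

import openai  # assumes the legacy openai<1.0 client and OPENAI_API_KEY set in the environment

def chat(prompt):
    """Single-turn helper around the ChatGPT chat completion endpoint."""
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp["choices"][0]["message"]["content"]

def distill_example(instruction, max_rounds=3):
    """Collect an answer plus a feedback/revision chain from the teacher model."""
    answer = chat(instruction)
    chain = [{"answer": answer}]
    for _ in range(max_rounds):
        feedback = chat(
            f"Instruction: {instruction}\nAnswer: {answer}\n"
            "Give feedback on the answer. If it needs no changes, reply 'No revision needed.'"
        )
        chain.append({"feedback": feedback})
        if "no revision needed" in feedback.lower():
            break  # the teacher judged the answer good enough
        answer = chat(
            f"Instruction: {instruction}\nAnswer: {answer}\n"
            f"Feedback: {feedback}\nRewrite the answer so it addresses the feedback."
        )
        chain.append({"answer": answer})
    return chain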

SelFee was trained using the FastChat framework. Based on the instruction, the model was fine-tuned to generate the answer and feedback chain, including revisions. The researchers observed that increasing the minimum required revisions during the inference process improved answer quality. They found that a minimum of three revisions yielded the best performance, and even a 7B SelFee model that generated at least three revisions outperformed a 13B SelFee model that did not require modifications.

In terms of evaluation, the researchers adopted the Vicuna evaluation setting, which involved 80 diverse queries. Instead of conducting a human evaluation, they performed a pilot evaluation using GPT-4 as the evaluator. The relative scores compared to ChatGPT were reported, considering the positional bias of GPT-4.

While SelFee demonstrated comparable performance to ChatGPT in the Vicuna evaluation setting, it was found to lack knowledge in areas such as math, reasoning, factuality, and coding compared to ChatGPT.

Overall, SelFee introduces a novel approach to self-feedback and self-revision generation in language models. By fine-tuning the model to revise its answers continuously, SelFee achieves improved performance compared to existing models. The research findings highlight the importance of iterative revision in enhancing the quality of language model responses and suggest that increasing the inference computation of a model may be more effective than simply increasing its size.


Use Amazon SageMaker Canvas to build machine learning models using Parquet data from Athena and AWS Lake Formation

Data is the foundation for machine learning (ML) algorithms. One of the most common formats for storing large amounts of data is Apache Parquet due to its compact and highly efficient format. This means that business analysts who want to extract insights from the large volumes of data in their data warehouse must frequently use data stored in Parquet.
To simplify access to Parquet files, Amazon SageMaker Canvas has added data import capabilities from over 40 data sources, including Amazon Athena, which supports Apache Parquet.
Canvas provides connectors to AWS data sources such as Amazon Simple Storage Service (Amazon S3), Athena, and Amazon Redshift. In this post, we describe how to query Parquet files with Athena using AWS Lake Formation and use the output in Canvas to train a model.
Solution overview
Athena is a serverless, interactive analytics service built on open-source frameworks, supporting open table and file formats. Many teams are turning to Athena to enable interactive querying and analyze their data in the respective data stores without creating multiple data copies.
Athena allows applications to use standard SQL to query massive amounts of data on an S3 data lake. Athena supports various data formats, including:

CSV
TSV
JSON
text files
Open-source columnar formats, such as ORC and Parquet
Compressed data in Snappy, Zlib, LZO, and GZIP formats

Parquet files organize the data into columns and use efficient data compression and encoding schemes for fast data storage and retrieval. You can reduce the import time in Canvas by using Parquet files for bulk data imports and by importing only the specific columns you need.
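As a quick aside (not part of the AWS walkthrough), the snippet below shows what the columnar layout buys you: writing a small table to Parquet and reading back only the columns you need. The file and column names are made up, and pandas needs pyarrow or fastparquet installed for Parquet I/O.

import pandas as pd

# Write a small demand table to Parquet (columnar, Snappy-compressed by default).
df = pd.DataFrame({
    "item_id": ["tv-001", "tv-002", "cam-001"],
    "demand": [120, 98, 45],
    "price": [499.0, 649.0, 329.0],
})
df.to_parquet("demand.parquet", index=False)

# Read back only the columns you need; column pruning is what keeps Parquet imports fast.
subset = pd.read_parquet("demand.parquet", columns=["item_id", "demand"])
print(subset)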
Lake Formation is an integrated data lake service that makes it easy for you to ingest, clean, catalog, transform, and secure your data and make it available for analysis and ML. Lake Formation automatically manages access to the registered data in Amazon S3 through services including AWS Glue, Athena, Amazon Redshift, Amazon QuickSight, and Amazon EMR using Zeppelin notebooks with Apache Spark to ensure compliance with your defined policies.
In this post, we show you how to import Parquet data to Canvas from Athena, where Lake Formation enables data governance.
To illustrate, we use the operations data of a consumer electronics business. We create a model to estimate the demand for electronic products using their historical time series data.
This solution is illustrated in four steps:

Set up the Lake Formation database.
Grant Lake Formation access permissions to Canvas.
Import the Parquet data to Canvas using Athena.
Use the imported Parquet data to build ML models with Canvas.

The following diagram illustrates the solution architecture.

Set up the Lake Formation database
The steps listed here are a one-time setup that creates the data lake hosting the Parquet data, which analysts can then consume to gain insights using Canvas. These prerequisites are best performed by cloud engineers or administrators. Analysts can go directly to Canvas and import the data from Athena.
The data used in this post consists of two datasets sourced from Amazon S3. These datasets have been generated synthetically for this post.

Consumer Electronics Target Time Series (TTS) – The historical data of the quantity to forecast is called the Target Time Series (TTS). In this case, it’s the demand for an item.
Consumer Electronics Related Time Series (RTS) – Other historical data that is known at exactly the same time as every sales transaction is called the Related Time Series (RTS). In our use case, it’s the price of an item. An RTS dataset includes time series data that isn’t included in a TTS dataset and might improve the accuracy of your predictor.

Upload data to Amazon S3 as Parquet files from these two folders:

ce-rts – Contains Consumer Electronics Related Time Series (RTS).
ce-tts – Contains Consumer Electronics Target Time Series (TTS).

Create a data lake with Lake Formation.
On the Lake Formation console, create a database called consumer-electronics.

Create two tables for the consumer electronics dataset with the names ce-rts-Parquet and ce-tts-Parquet with the data sourced from the S3 bucket.

We use the database we created in this step in a later step to import the Parquet data into Canvas using Athena.
Grant Lake Formation access permissions to Canvas
This is a one-time setup to be done by either cloud engineers or administrators.

Grant data lake permissions so that Canvas can access the consumer-electronics Parquet data.
In the SageMaker Studio domain, view the Canvas user’s details.
Copy the execution role name.
Make sure the execution role has enough permissions to access the following services:

Canvas.
The S3 bucket where Parquet data is stored.
Athena to connect from Canvas.
AWS Glue to access the Parquet data using the Athena connector.

In Lake Formation, choose Data Lake permissions in the navigation pane.
Choose Grant.

For Principals, select IAM users and roles to provide Canvas access to your data artifacts.
Specify your SageMaker Studio domain user’s execution role.
Specify the database and tables.
Choose Grant.

You can grant granular actions on the tables, columns, and data. This option lets you configure granular access to your sensitive data based on the separation of roles you have defined.

After you set up the required environment for the Canvas and Athena integration, proceed to the next step to import the data into Canvas using Athena.
Import data using Athena
Complete the following steps to import the Lake Formation-managed Parquet files:

In Canvas, choose Datasets in the navigation pane.

Choose + Import to import the Parquet datasets managed by Lake Formation.

Choose Athena as the data source.

Choose the consumer-electronics dataset in Parquet format from the Athena data catalog and table details in the menu.
Import the two datasets. Drag and drop the data source to select the first one.

When you drag and drop the dataset, the data preview appears in the bottom frame of the page.

Choose Import data.
Enter consumer-electronics-rts as the name for the dataset you’re importing.

Data import takes time based on the data size. The dataset in this example is small, so the import takes a few seconds. When the data import is completed, the status turns from Processing to Ready.

Repeat the import process for the second dataset (ce-tts).

When the ce-tts Parquet data is imported, the Datasets page shows both datasets.
The imported datasets contain target and related time series data. The RTS dataset can help deep learning models improve forecast accuracy.
Let’s join the datasets to prepare for our analysis.

Select the datasets.
Choose Join data.

Select and drag both the datasets to the center pane, which applies an inner join.
Choose the Join icon to see the join conditions applied and to make sure the inner join is applied and the right columns are joined.
Choose Save & close to apply the join condition.

Provide a name for the joined dataset.
Choose Import data.

Joined data is imported and created as a new dataset. The joined dataset source is shown as Join.

Use the Parquet data to build ML models with Canvas
The Parquet data from Lake Formation is now available on Canvas. Now you can run your ML analysis on the data.

After the data is imported successfully, choose Create a custom model under Ready-to-use models in Canvas.

Enter a name for the model.
Select your problem type (for this post, Predictive analysis).
Choose Create.

Select the consumer-electronic-joined dataset to train the model to predict the demand for electronic items.

Select demand as the target column to forecast demand for consumer electronic items.

Based on the data provided to Canvas, the Model type is automatically derived as Time series forecasting and provides a Configure time series model option.

Choose the Configure time series model link to provide time series model options.
Enter forecasting configurations as shown in the following screenshot.
Exclude the group column because no logical grouping is needed for this dataset.

For building the model, Canvas offers two build options. Choose the option as per your preference. Quick build generally takes around 15–20 minutes, whereas Standard takes around 4 hours.

Quick build – Builds a model in a fraction of the time compared to a standard build; potential accuracy is exchanged for speed
Standard build – Builds the best model from an optimized process powered by AutoML; speed is exchanged for greatest accuracy

For this post, we choose Quick build for illustrative purposes.

When the quick build is completed, the model evaluation metrics are presented in the Analyze section.

Choose Predict to run a single prediction or batch prediction.

Clean up
Log out from Canvas to avoid future charges.
Conclusion
Enterprises have data in data lakes in various formats, including the highly efficient Parquet format. Canvas supports importing data from more than 40 data sources, including Athena, so you can easily pull data in various formats from data lakes. To learn more, refer to Import data from over 40 data sources for no-code machine learning with Amazon SageMaker Canvas.
In this post, we took Lake Formation-managed Parquet files and imported them into Canvas using Athena. The Canvas ML model forecasted the demand of consumer electronics using historical demand and price data. Thanks to a user-friendly interface and vivid visualizations, we completed this without writing a single line of code. Canvas now allows business analysts to use Parquet files from data engineering teams and build ML models, conduct analysis, and extract insights independently of data science teams.
To learn more about Canvas, refer to Predict types of machine failures with no-code machine learning using Canvas. Refer to Announcing Amazon SageMaker Canvas – a Visual, No Code Machine Learning Capability for Business Analysts for more information on creating ML models with a no-code solution.

About the authors
Gopi Mudiyala is a Senior Technical Account Manager at AWS. He helps customers in the Financial Services industry with their operations in AWS. As a machine learning enthusiast, Gopi works to help customers succeed in their ML journey. In his spare time, he likes to play badminton, spend time with family, and travel.
Hariharan Suresh is a Senior Solutions Architect at AWS. He is passionate about databases, machine learning, and designing innovative solutions. Prior to joining AWS, Hariharan was a product architect, core banking implementation specialist, and developer, and worked with BFSI organizations for over 11 years. Outside of technology, he enjoys paragliding and cycling.

Amazon SageMaker Automatic Model Tuning now automatically chooses tuning configurations to improve usability and cost efficiency

Amazon SageMaker Automatic Model Tuning has introduced Autotune, a new feature to automatically choose hyperparameters on your behalf. This provides an accelerated and more efficient way to find hyperparameter ranges, and can provide significant optimized budget and time management for your automatic model tuning jobs.
In this post, we discuss this new capability and some of the benefits it brings.
Hyperparameter overview
When training any machine learning (ML) model, you generally deal with three types of data: input data (also called the training data), model parameters, and hyperparameters. You use the input data to train your model, which in effect learns your model parameters. During the training process, your ML algorithm tries to find the optimal model parameters based on the data while meeting the goals of your objective function. For example, when a neural network is trained, the weights of the network nodes are learned during training and indicate how much influence each node has on the final prediction. These weights are the model parameters.
Hyperparameters, on the other hand, are parameters of a learning algorithm and not the model itself. The number of hidden layers and the number of nodes are some of the examples of hyperparameters you can set for a neural network. The difference between model parameters and hyperparameters is that model parameters are learned during the training process, whereas hyperparameters are set prior to the training and remain constant during the training process.
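As a small, self-contained illustration (separate from the SageMaker examples that follow), the scikit-learn snippet below makes the distinction concrete: the constructor arguments are hyperparameters fixed before training, while the weight matrices learned by fit() are the model parameters.

from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Hyperparameters: chosen before training and held constant while it runs.
clf = MLPClassifier(hidden_layer_sizes=(32, 16), learning_rate_init=0.01,
                    max_iter=300, random_state=0)

# Model parameters: the weights learned from the data during training.
clf.fit(X, y)
print([w.shape for w in clf.coefs_])  # one learned weight matrix per layer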
Pain points
SageMaker automatic model tuning, also called hyperparameter tuning, runs many training jobs on your dataset using a range of hyperparameters that you specify. It can accelerate your productivity by trying many variations of a model. It looks for the best model automatically by focusing on the most promising combinations of hyperparameter values within the ranges that you specify. However, to get good results, you must choose the right ranges to explore.
But how do you know what the right range is to begin with? With hyperparameter tuning jobs, we are assuming that the optimal set of hyperparameters lies within the range that we specified. What happens if the chosen range is not right, and the optimal hyperparameter actually falls outside of the range?
Choosing the right hyperparameters requires experience with the ML technique you are using and understanding how its hyperparameters behave. It’s important to understand the hyperparameter implications because every hyperparameter that you choose to tune has the potential to increase the number of trials required for a successful tuning job. You need to strike an optimal trade-off between resources allocated to the tuning job and achieving the goals you’ve set.
The SageMaker Automatic Model Tuning team is constantly innovating on behalf of our customers to optimize their ML workloads. AWS recently announced support for a new completion criterion for hyperparameter optimization: a max runtime criterion, a budget control completion criterion that can be used to bound cost and runtime. Desired target metrics, improvement monitoring, and convergence detection monitor the performance of the model and assist with early stopping if the models don’t improve after a defined number of training jobs. Autotune is a new feature of automatic model tuning that helps save you time and reduce wasted resources on finding optimal hyperparameter ranges.
Benefits of Autotune and how automatic model tuning alleviates those pain points
Autotune is a new configuration in the CreateHyperParameterTuningJob API and in the HyperparameterTuner SageMaker Python SDK that alleviates the need to specify the hyperparameter ranges, tuning strategy, objective metrics, or the number of jobs that were required as part of the job definition. Autotune automatically chooses the optimal configurations for your tuning job, helps prevent wasted resources, and accelerates productivity.
The following code creates a hyperparameter tuner using the SageMaker Python SDK without Autotune:

estimator = PyTorch(
    entry_point="mnist.py",
    instance_type="ml.p4d.24xlarge",
    hyperparameters={
        "epochs": 1, "backend": "gloo"
    },
)

tuner = HyperparameterTuner(
    estimator,
    objective_metric_name='validation:rmse',
    objective_type='Minimize',
    hyperparameter_ranges={
        "lr": ContinuousParameter(0.001, 0.1),
        "batch-size": CategoricalParameter([32, 64, 128, 256, 512])
    },
    metric_definitions=[{...}],
    max_jobs=10,
    strategy="Random"
)

tuner.fit(...)

The following example showcases how many of the parameters are not necessary when using Autotune:

estimator = PyTorch(
    entry_point="mnist.py",
    instance_type="ml.p4d.24xlarge",
    hyperparameters={
        "epochs": 1, "backend": "gloo", "lr": 0.01, "batch-size": 32
    },
)

tuner = HyperparameterTuner(
    estimator,
    objective_metric_name='validation:rmse',
    objective_type='Minimize',
    autotune=True
)

If you are using the API directly, the equivalent code would be as follows:

create_hyper_parameter_tuning_job(
    HyperParameterTuningJobName=tuning_job_name,
    HyperParameterTuningJobConfig=tuning_job_config,
    TrainingJobDefinition=training_job_definition,
    Autotune={'Mode': 'Enabled'},
)

The code example illustrates some of the key benefits of Autotune:

A key choice for a tuning job is which hyperparameters to tune and their ranges. Autotune makes this choice for you based on a list of hyperparameters that you provide. Using the previous example, the hyperparameters that Autotune can choose to be tunable are lr and batch-size.
Autotune will automatically select the hyperparameter ranges on your behalf. Autotune uses best practices as well as internal benchmarks for selecting the appropriate ranges.
Autotune automatically selects the strategy on how to choose the combinations of hyperparameter values to use for the training job.
Early stopping is enabled by default when using Autotune. When using early stopping, SageMaker stops training jobs launched by the hyperparameter tuning job when they are unlikely to perform better than previously completed training jobs to avoid additional resource utilization.
Maximum expected resources to be consumed by the tuning job (parallel jobs, max runtime, and so on) will be calculated and set in the tuning job record as soon as the tuning job is created. Such reserved resources will not increase during the course of the tuning job; this will maintain an upper bound of cost and duration of the tuning job that is easily predictable by the user. A max runtime of 48 hours will be used by default.

You can override any settings chosen automatically by Autotune. For example, if you specify your own hyperparameter ranges, those will be used alongside the inferred ranges. Any user-specified hyperparameter range takes precedence over the inferred range of the same name:

estimator = PyTorch(
    # ... other arguments as before
    hyperparameters={
        "epochs": 100, "backend": "gloo", "lr": 0.01, "beta1": 0.8
    },
)

tuner = HyperparameterTuner(
    # ... other arguments as before
    hyperparameter_ranges={
        "lr": ContinuousParameter(0.001, 0.01)  # takes precedence over the inferred "lr" range
    },
)

Autotune generates a set of settings as part of the tuning job. Any customer-specified settings that have the same name override the Autotune-selected settings. Any customer-provided settings with different names are added on top of the Autotune-selected settings.
Inspecting parameters chosen by Autotune
Autotune reduces the time you would normally have spent in deciding on the initial set of hyperparameters to tune. But how do you get insights into what hyperparameter values Autotune ended up choosing? You can get information about decisions made for you in the description of the running tuning job (in the response of the DescribeHyperParameterTuningJob operation). After you submit a request to create a tuning job, the request is processed, and all missing fields are set by Autotune. All set fields are reported in the DescribeHyperParameterTuningJob operation.
Alternatively, you can inspect HyperparameterTuner class fields to see the settings chosen by Autotune.
The following is an XGBoost example of how you can use DescribeHyperParameterTuningJob to inspect the hyperparameters chosen by Autotune.
First, we create a tuning job with automatic model tuning:

hyperparameters = {
    "objective": "reg:squarederror",
    "num_round": "50",
    "verbosity": "2",
    "max_depth": "5",  # overlap with ranges is ok when Autotune is enabled
}
estimator = XGBoost(hyperparameters=hyperparameters, ...)

hp_tuner = HyperparameterTuner(estimator, autotune=True)
hp_tuner.fit(wait=False)

After the tuning job is created successfully, we can discover what settings Autotune chose. For example, we can describe the tuning job using the name available from hp_tuner:

import boto3

sm = boto3.client('sagemaker')

response = sm.describe_hyper_parameter_tuning_job(
    HyperParameterTuningJobName=hp_tuner.latest_tuning_job.name
)

print(response)

Then we can inspect the generated response to review the settings chosen by Autotune on our behalf.
If the current tuning job settings are not satisfactory, you can stop the tuning job:
hp_tuner.stop()
Conclusion
SageMaker Automatic Model Tuning allows you to reduce the time to tune a model by automatically searching for the best hyperparameter configuration within the ranges that you specify. However, choosing the right hyperparameter ranges can be a time-consuming process and can have direct implications on your training cost and duration.
In this post, we discussed how you can now use Autotune, a new feature introduced as part of automatic model tuning, to automatically pick an initial set of hyperparameter ranges on your behalf. This can reduce the time it takes for you to get started with your model tuning process. Additionally, you can evaluate the ranges picked by Autotune and adjust them according to your needs.
We also showed how Autotune can automatically pick the optimal parameter settings on your behalf, such as the number of training jobs, the strategy to choose the hyperparameter combinations, and enabling early stopping by default. This can result in significantly optimized budget and time bounds that are easily predictable.
To learn more, refer to Perform Automatic Model Tuning with SageMaker.

About the Authors
Jas Singh is a Senior Solutions Architect helping public sector customers achieve their business outcomes through architecting and implementing innovative and resilient solutions at scale. Jas has over 20 years of experience in designing and implementing mission-critical applications and holds a master’s degree in computer science from Baylor University.
Gopi Mudiyala is a Senior Technical Account Manager at AWS. He helps customers in the Financial Services industry with their operations in AWS. As a machine learning enthusiast, Gopi works to help customers succeed in their ML journey. In his spare time, he likes to play badminton, spend time with family, and travel.
Raviteja Yelamanchili is an Enterprise Solutions Architect with Amazon Web Services based in New York. He works with large financial services enterprise customers to design and deploy highly secure, scalable, reliable, and cost-effective applications on the cloud. He brings over 11 years of risk management, technology consulting, data analytics, and machine learning experience. When he is not helping customers, he enjoys traveling and playing PS5.
Iaroslav Shcherbatyi is a Machine Learning Engineer at AWS. He works mainly on improvements to the Amazon SageMaker platform and helping customers best use its features. In his spare time, he likes to go to gym, do outdoor sports such as ice skating or hiking, and to catch up on new AI research.

Train a Large Language Model on a single Amazon SageMaker GPU with Hugging Face and LoRA

This post is co-written with Philipp Schmid from Hugging Face.
We have all heard about the progress being made in the field of large language models (LLMs) and the ever-growing number of problem sets where LLMs are providing valuable insights. Large models, when trained over massive datasets and several tasks, are also able to generalize well over tasks that they aren’t trained specifically for. Such models are called foundation models, a term first popularized by the Stanford Institute for Human-Centered Artificial Intelligence. Even though these foundation models are able to generalize well, especially with the help of prompt engineering techniques, often the use case is so domain specific, or the task is so different, that the model needs further customization. One approach to improve performance of a large model for a specific domain or task is to further train the model with a smaller, task-specific dataset. Although this approach, known as fine-tuning, successfully improves the accuracy of LLMs, it requires modifying all of the model weights. Fine-tuning is much faster than the pre-training of a model thanks to the much smaller dataset size, but still requires significant computing power and memory. Fine-tuning modifies all the parameter weights of the original model, which makes it expensive and results in a model that is the same size as the original.
To address these challenges, Hugging Face introduced the Parameter-Efficient Fine-Tuning library (PEFT). This library allows you to freeze most of the original model weights and replace or extend model layers by training an additional, much smaller, set of parameters. This makes training much less expensive in terms of required compute and memory.
In this post, we show you how to train the 7-billion-parameter BloomZ model using just a single graphics processing unit (GPU) on Amazon SageMaker, Amazon’s machine learning (ML) platform for preparing, building, training, and deploying high-quality ML models. BloomZ is a general-purpose natural language processing (NLP) model. We use PEFT to optimize this model for the specific task of summarizing messenger-like conversations. The single-GPU instance that we use is a low-cost example of the many instance types AWS provides. Training this model on a single GPU highlights AWS’s commitment to being the most cost-effective provider of AI/ML services.
The code for this walkthrough can be found on the Hugging Face notebooks GitHub repository under the sagemaker/24_train_bloom_peft_lora folder.
Prerequisites
In order to follow along, you should have the following prerequisites:

An AWS account.
A Jupyter notebook within Amazon SageMaker Studio or SageMaker notebook instances.
You will need access to the SageMaker ml.g5.2xlarge instance type, which contains a single NVIDIA A10G GPU. On the AWS Management Console, navigate to Service Quotas for SageMaker and request a 1-instance increase for the following quota: ml.g5.2xlarge for training job usage.
After your requested quota is applied to your account, you can use the default Studio Python 3 (Data Science) image with an ml.t3.medium instance to run the notebook code snippets. For the full list of available kernels, refer to Available Amazon SageMaker Kernels.

Set up a SageMaker session
Use the following code to set up your SageMaker session:

import sagemaker
import boto3

sess = sagemaker.Session()
# sagemaker session bucket -> used for uploading data, models and logs
# sagemaker will automatically create this bucket if it does not exist
sagemaker_session_bucket = None
if sagemaker_session_bucket is None and sess is not None:
    # set to default bucket if a bucket name is not given
    sagemaker_session_bucket = sess.default_bucket()

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']

sess = sagemaker.Session(default_bucket=sagemaker_session_bucket)

print(f"sagemaker role arn: {role}")
print(f"sagemaker bucket: {sess.default_bucket()}")
print(f"sagemaker session region: {sess.boto_region_name}")

Load and prepare the dataset
We use the samsum dataset, a collection of 16,000 messenger-like conversations with summaries. The conversations were created and written down by linguists fluent in English. The following is an example of the dataset:

{
    "id": "13818513",
    "summary": "Amanda baked cookies and will bring Jerry some tomorrow.",
    "dialogue": "Amanda: I baked cookies. Do you want some?\r\nJerry: Sure!\r\nAmanda: I'll bring you tomorrow :-)"
}

To train the model, you need to convert the inputs (text) to token IDs. This is done by a Hugging Face Transformers tokenizer. For more information, refer to Chapter 6 of the Hugging Face NLP Course.
Convert the inputs with the following code:

from transformers import AutoTokenizer

model_id = "bigscience/bloomz-7b1"

# Load the tokenizer of BLOOMZ
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.model_max_length = 2048  # overwrite wrong value

Before starting training, you need to process the data. Once it’s trained, the model will take a set of text messages as the input and generate a summary as the output. You need to format the data as a prompt (the messages) with a correct response (the summary). You also need to chunk examples into longer input sequences to optimize the model training. See the following code:

from random import randint
from itertools import chain
from functools import partial

from datasets import load_dataset

# load the samsum training split used throughout this example (not shown in the original snippet)
dataset = load_dataset("samsum", split="train")

# custom instruct prompt start
prompt_template = f"Summarize the chat dialogue:\n{{dialogue}}\n---\nSummary:\n{{summary}}{{eos_token}}"

# template dataset to add prompt to each sample
def template_dataset(sample):
    sample["text"] = prompt_template.format(dialogue=sample["dialogue"],
                                            summary=sample["summary"],
                                            eos_token=tokenizer.eos_token)
    return sample

# apply prompt template per sample
dataset = dataset.map(template_dataset, remove_columns=list(dataset.features))

print(dataset[randint(0, len(dataset))]["text"])

# empty lists to save remainder from batches to use in next batch
remainder = {"input_ids": [], "attention_mask": []}

def chunk(sample, chunk_length=2048):
    # define global remainder variable to save remainder from batches to use in next batch
    global remainder
    # Concatenate all texts and add remainder from previous batch
    concatenated_examples = {k: list(chain(*sample[k])) for k in sample.keys()}
    concatenated_examples = {k: remainder[k] + concatenated_examples[k] for k in concatenated_examples.keys()}
    # get total number of tokens for batch
    batch_total_length = len(concatenated_examples[list(sample.keys())[0]])

    # get max number of chunks for batch
    if batch_total_length >= chunk_length:
        batch_chunk_length = (batch_total_length // chunk_length) * chunk_length

    # Split by chunks of max_len.
    result = {
        k: [t[i : i + chunk_length] for i in range(0, batch_chunk_length, chunk_length)]
        for k, t in concatenated_examples.items()
    }
    # add remainder to global variable for next batch
    remainder = {k: concatenated_examples[k][batch_chunk_length:] for k in concatenated_examples.keys()}
    # prepare labels
    result["labels"] = result["input_ids"].copy()
    return result

# tokenize and chunk dataset
lm_dataset = dataset.map(
    lambda sample: tokenizer(sample["text"]), batched=True, remove_columns=list(dataset.features)
).map(
    partial(chunk, chunk_length=2048),
    batched=True,
)

# Print total number of samples
print(f"Total number of samples: {len(lm_dataset)}")

Now you can use the FileSystem integration to upload the dataset to Amazon Simple Storage Service (Amazon S3):

# save train_dataset to s3
training_input_path = f's3://{sess.default_bucket()}/processed/samsum-sagemaker/train'
lm_dataset.save_to_disk(training_input_path)

print("uploaded data to:")
print(f"training dataset to: {training_input_path}")


Fine-tune BLOOMZ-7B with LoRA and bitsandbytes int-8 on SageMaker
The Hugging Face BLOOMZ-7B model card indicates its initial training was distributed over 8 nodes, each with 8 A100 80 GB GPUs and 512 GB of CPU memory. This computing configuration is not readily accessible, is cost-prohibitive to consumers, and requires expertise in distributed training performance optimization. SageMaker lowers the barriers to replicating this setup through its distributed training libraries; however, the cost of eight comparable on-demand ml.p4de.24xlarge instances would be $376.88 per hour. Furthermore, the fully trained model consumes about 40 GB of memory, which exceeds the available memory of many individual consumer GPUs and requires strategies to handle large-model inference. As a result, full fine-tuning of the model for your task over multiple model runs and deployment would require significant compute, memory, and storage costs on hardware that isn’t readily accessible to consumers.
Our goal is to find a way to adapt BLOOMZ-7B to our chat summarization use case in a more accessible and cost-effective way while maintaining accuracy. To enable our model to be fine-tuned on a SageMaker ml.g5.2xlarge instance with a single consumer-grade NVIDIA A10G GPU, we employ two techniques to reduce the compute and memory requirements for fine-tuning: LoRA and quantization.
LoRA (Low-Rank Adaptation) is a technique that significantly reduces the number of trainable parameters and the compute needed to fine-tune a model for a new task without a loss in predictive performance. It freezes the original model weights and instead trains small rank-decomposition weight matrices for the new task, then injects these adapted weights back into the original model. Because far fewer weights receive gradient updates, fine-tuning needs less compute and GPU memory. The intuition behind this approach is that the weight updates needed to adapt a pre-trained model to a new task have low intrinsic rank, so a small number of additional parameters can capture them. To deepen your understanding of the LoRA technique, refer to the original paper LoRA: Low-Rank Adaptation of Large Language Models.
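As a toy numerical sketch of that idea (ours, not the PEFT implementation), a frozen weight matrix W is augmented with a trainable low-rank product BA, so only r*(d_in + d_out) values are updated instead of d_in*d_out:

import torch

d_in, d_out, r = 1024, 1024, 8

W = torch.randn(d_out, d_in)                   # frozen pre-trained weight, never updated
A = torch.randn(r, d_in, requires_grad=True)   # small trainable factors; in LoRA, A starts random
B = torch.zeros(d_out, r, requires_grad=True)  # and B starts at zero, so training begins from W

x = torch.randn(d_in)
y = W @ x + B @ (A @ x)                        # adapted layer output: Wx + BAx

full, lora = d_in * d_out, r * (d_in + d_out)
print(f"trainable values: {lora:,} instead of {full:,} ({lora / full:.2%})")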
In addition to the LoRA technique, you use the bitsandbytes Hugging Face integration LLM.int8() method to quantize the frozen BloomZ model, that is, reduce the precision of the weight and bias values by rounding them from float16 to int8. Quantization reduces the needed memory for BloomZ by about four times, which enables you to fit the model on the A10G GPU instance without a significant loss in predictive performance. To deepen your understanding of how int8 quantization works, its implementation in the bitsandbytes library, and its integration with the Hugging Face Transformers library, see A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using Hugging Face Transformers, Accelerate and bitsandbytes.
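For reference, loading a causal language model with 8-bit weights through the bitsandbytes integration typically looks like the sketch below (the model ID matches this post; exact arguments can vary with the transformers and bitsandbytes versions, and a GPU is required):

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bigscience/bloomz-7b1"

tokenizer = AutoTokenizer.from_pretrained(model_id)

# load_in_8bit quantizes the frozen weights to int8 via bitsandbytes;
# device_map="auto" places the layers on the available GPU(s).
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_8bit=True,
    device_map="auto",
)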
Hugging Face has made LoRA and quantization accessible across a broad range of transformer models through the PEFT library and its integration with the bitsandbytes library. The create_peft_config() function in the prepared script run_clm.py illustrates their usage in preparing your model for training:

def create_peft_config(model):
    from peft import (
        get_peft_model,
        LoraConfig,
        TaskType,
        prepare_model_for_int8_training,
    )

    peft_config = LoraConfig(
        task_type=TaskType.CAUSAL_LM,
        inference_mode=False,
        r=8,  # LoRA attention dimension (rank)
        lora_alpha=32,  # the alpha parameter for LoRA scaling
        lora_dropout=0.05,  # the dropout probability for LoRA layers
        target_modules=["query_key_value"],
    )

    # prepare int-8 model for training
    model = prepare_model_for_int8_training(model)
    model = get_peft_model(model, peft_config)
    model.print_trainable_parameters()
    return model

With LoRA, the output from print_trainable_parameters() indicates we were able to reduce the number of trainable parameters from 7 billion to about 3.9 million. This means that only about 0.056% of the original model parameters need to be updated. This significant reduction in compute and memory requirements allows us to fit and train our model on the GPU without issues.
To create a SageMaker training job, you will need a Hugging Face estimator. The estimator handles end-to-end SageMaker training and deployment tasks. SageMaker takes care of starting and managing all the required Amazon Elastic Compute Cloud (Amazon EC2) instances for you. Additionally, it provides the correct Hugging Face training container, uploads the provided scripts, and downloads the data from our S3 bucket into the container at the path /opt/ml/input/data. Then, it starts the training job. See the following code:

import time

# define Training Job Name
job_name = f'huggingface-peft-{time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())}'

from sagemaker.huggingface import HuggingFace

# hyperparameters, which are passed into the training job
hyperparameters = {
    'model_id': model_id,                           # pre-trained model
    'dataset_path': '/opt/ml/input/data/training',  # path where sagemaker will save training dataset
    'epochs': 3,                                    # number of training epochs
    'per_device_train_batch_size': 1,               # batch size for training
    'lr': 2e-4,                                     # learning rate used during training
}

# create the Estimator
huggingface_estimator = HuggingFace(
    entry_point='run_clm.py',        # train script
    source_dir='scripts',            # directory which includes all the files needed for training
    instance_type='ml.g5.2xlarge',   # instance type used for the training job
    instance_count=1,                # the number of instances used for training
    base_job_name=job_name,          # the name of the training job
    role=role,                       # IAM role used in training job to access AWS resources, e.g. S3
    volume_size=300,                 # the size of the EBS volume in GB
    transformers_version='4.26',     # the transformers version used in the training job
    pytorch_version='1.13',          # the pytorch version used in the training job
    py_version='py39',               # the python version used in the training job
    hyperparameters=hyperparameters
)

You can now start your training job using the .fit() method, passing the S3 path of the training data:

# define a data input dictionary with our uploaded s3 uris
data = {'training': training_input_path}

# starting the train job with our uploaded datasets as inputs
huggingface_estimator.fit(data, wait=True)

Using LoRA and quantization makes fine-tuning BLOOMZ-7B to our task affordable and efficient with SageMaker. When using SageMaker training jobs, you only pay for GPUs for the duration of model training. In our example, the SageMaker training job took 20,632 seconds, which is about 5.7 hours. The ml.g5.2xlarge instance we used costs $1.515 per hour for on-demand usage. As a result, the total cost for training our fine-tuned BLOOMZ-7B model was only $8.63. Comparatively, full fine-tuning of the model’s 7 billion weights would cost an estimated $600, or 6,900% more per training run, assuming linear GPU scaling on the original computing configuration outlined in the Hugging Face model card. In practice, this would further vary depending upon your training strategy, instance selection, and instance pricing.
We could also further reduce our training costs by using SageMaker managed Spot Instances. However, there is a possibility this would result in the total training time increasing due to Spot Instance interruptions. See Amazon SageMaker Pricing for instance pricing details.
Deploy the model to a SageMaker endpoint for inference
With LoRA, you previously adapted a smaller set of weights to your new task. You need a way to combine these task-specific weights with the pre-trained weights of the original model. In the run_clm.py script, the PEFT library merge_and_unload() method takes care of merging the base BLOOMZ-7B model with the updated adapter weights fine-tuned to your task to make them easier to deploy without introducing any inference latency compared to the original model.
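If you fine-tune elsewhere or keep the adapter separate, a merge along these lines can also be done manually with the PEFT library; the adapter path and output directory below are placeholders:

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloomz-7b1", torch_dtype=torch.float16, device_map="auto"
)

# Attach the fine-tuned LoRA adapter weights (path is a placeholder).
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")

# Fold the adapter into the base weights so inference needs no extra modules.
model = model.merge_and_unload()
model.save_pretrained("bloomz-7b1-samsum-merged")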
In this section, we go through the steps to create a SageMaker model from the fine-tuned model artifact and deploy it to a SageMaker endpoint for inference. First, you create a Hugging Face model using your new fine-tuned model artifact for deployment to a SageMaker endpoint. Because you previously trained the model with a SageMaker Hugging Face estimator, you can deploy the model immediately. You could instead upload the trained model to an S3 bucket and use it to create a model package later. See the following code:

from sagemaker.huggingface import HuggingFaceModel

# 1. create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    model_data=huggingface_estimator.model_data,
    # model_data="s3://hf-sagemaker-inference/model.tar.gz",  # Change to your model path
    role=role,
    transformers_version="4.26",
    pytorch_version="1.13",
    py_version="py39",
    model_server_workers=1
)

As with any SageMaker estimator, you can deploy the model using the deploy() method from the Hugging Face estimator object, passing in the desired number and type of instances. In this example, we use a G5 instance equipped with a single NVIDIA A10G GPU, the same GPU family that the model was fine-tuned on in the previous step:

# 2. deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.4xlarge"
)

It may take 5–10 minutes for the SageMaker endpoint to bring your instance online and download your model in order to be ready to accept inference requests.
When the endpoint is running, you can test it by sending a sample dialog from the dataset test split. First, load the test split using the Hugging Face Datasets library. Next, select a random integer to index a single test sample from the dataset. Using string formatting, combine the test sample with a prompt template into a structured input to guide our model’s response. This structured input can then be combined with additional model input parameters into a formatted sample JSON payload. Finally, invoke the SageMaker endpoint with the formatted sample and print the model’s output summarizing the sample dialog. See the following code:

from random import randint
from datasets import load_dataset

# 1. Load dataset from the hub
test_dataset = load_dataset("samsum", split="test")

# 2. select a random test sample
sample = test_dataset[randint(0, len(test_dataset))]

# 3. format the sample
prompt_template = f"Summarize the chat dialogue:\n{{dialogue}}\n---\nSummary:\n"

formatted_sample = {
    "inputs": prompt_template.format(dialogue=sample["dialogue"]),
    "parameters": {
        "do_sample": True,      # sample from the output distribution
        "top_p": 0.9,           # nucleus sampling (Fan et al., 2018)
        "temperature": 0.1,     # increase the likelihood of high-probability words and decrease the likelihood of low-probability words
        "max_new_tokens": 100,  # maximum number of tokens to generate
    }
}

# 4. Invoke the SageMaker endpoint with the formatted sample
res = predictor.predict(formatted_sample)

# 5. Print the model output
print(res[0]["generated_text"].split("Summary:")[-1])
# Sample model output: Kirsten and Alex are going bowling this Friday at 7 pm. They will meet up and then go together.

Now let’s compare the model’s summarized dialog output to the reference summary from the test sample:

print(sample["summary"])
# Sample reference summary: Kirsten reminds Alex that the youth group meets this Friday at 7 pm to go bowling.

Clean up
Now that you’ve tested your model, make sure that you clean up the associated SageMaker resources to prevent continued charges:

predictor.delete_model()
predictor.delete_endpoint()

Summary
In this post, you used the Hugging Face Transformers, PEFT, and bitsandbytes libraries with SageMaker to fine-tune a BloomZ large language model on a single GPU for under $9 and then deployed the model to a SageMaker endpoint for inference on a test sample. SageMaker offers multiple ways to use Hugging Face models; for more examples, check out the AWS Samples GitHub.
To continue using SageMaker to fine-tune foundation models, try out some of the techniques in the post Architect personalized generative AI SaaS applications on Amazon SageMaker. We also encourage you to learn more about Amazon Generative AI capabilities by exploring JumpStart, Amazon Titan models, and Amazon Bedrock.

About the Authors
Philipp Schmid is a Technical Lead at Hugging Face with the mission to democratize good machine learning through open source and open science. Philipp is passionate about productionizing cutting-edge and generative AI machine learning models. He loves to share his knowledge on AI and NLP at various meetups such as Data Science on AWS, and on his technical blog.
Robert Fisher is a Sr. Solutions Architect for Healthcare and Life Sciences customers. He works closely with customers to understand how AWS can help them solve problems, especially in the AI/ML space. Robert has many years of experience in software engineering across a range of industry verticals including medical devices, fintech, and consumer-facing applications.
Doug Kelly is an AWS Sr. Solutions Architect that serves as a trusted technical advisor to top machine learning startups in verticals ranging from machine learning platforms, autonomous vehicles, to precision agriculture. He is member of the AWS ML technical field community where he specializes in supporting customers with MLOps and ML inference workloads.

Best 10+ Password Managers in 2023

The ability to remember even a single lengthy password is impressive. The human mind isn’t wired to create and remember dozens of complex passwords for each of our online accounts. Many people fall into the risky practice of reusing the same password for all their internet accounts because it’s easier to remember.

Criminals can crack a weak password as easily as you can remember it. In a credential-stuffing attack, passwords that have already been leaked are used to try to gain access to a user’s other accounts. Reusing one password effectively broadcasts it to every service you use.

A password manager is a web-based tool that provides a secure vault for storing user names and passwords for various websites and services. A single master password grants access to the encrypted vault, where all of your other passwords are stored, and is all you need to know. The protected vault of a password manager service is typically accessed via a user-friendly online interface, mobile app, or browser plugin. 

Here are some of the best password managers to use

Bitwarden 

If you’re looking for a premium password manager that checks all the boxes regarding security, usability, transparency, cost, and convenience, then Bitwarden is a great option. For various reasons, Bitwarden is our top pick for password management. Bitwarden is the first choice since it is completely free, safe, and audited by independent cybersecurity organizations annually. Bitwarden is unique among competitors in a field where trust is crucial because of its commitment to complete openness. 

1Password 

You need to go no further than 1Password if you want a password manager that syncs between devices and has additional features. 1Password’s new Travel Mode is ideal for freelancers, digital nomads, and business travelers. The 1Password password manager is reliable, secure, and packed with useful features. Operating systems like Mac OS X, Windows, Linux, Android, iOS, and browsers like Chrome, Safari, Firefox, and Brave all offer clean, intuitive user interfaces. The ability to have it autofill your information streamlines online shopping and account access. With a family plan, you and up to five close friends or relatives and five more guests can all use the same vault to store and access your valuables. It’s simple to disseminate both passwords and vault contents. Safely share items with those who aren’t using 1Password.

NordPass 

By the same people that brought you NordVPN (one of CNET’s top VPN selections) comes NordPass, a password management tool. Although Nord’s password manager is still in its infancy, the company has made significant improvements over the past year, bringing it to the industry standard and earning it a slot among our recommended password managers. If you’re already familiar with NordVPN or the rest of the Nord Security ecosystem and want a top-tier password manager, NordPass is an easy choice.

Keeper 

Keeper is a great option for households in search of either an offline vault or a cloud storage and dark web surveillance family membership. Keeper is a trustworthy, well-known password manager with an intuitive user interface, all the features you need, and some extras. Like the above password managers, Keeper allows you to sync your vault across unlimited devices. The service is not as cross-platform as other similar tools. Your locker is accessible via the web interface or native Windows, MacOS, Linux, Android, and iOS apps with Keeper. Keeper features fewer browser extensions than other premium password managers, only supporting Chrome, Firefox, Safari, Edge, and Opera. 

Dashlane 

The Dashlane desktop software is incredibly user-friendly. The ability to reset many passwords at once is its main selling point. The password organizer is well-made, simple, and effective at completing web forms with your data. Email inboxes are also scanned to unearth forgotten web profiles. The expensive cost is the main downside of Dashlane. The yearly cost of the Premium plan is $60, or $78 if paid monthly. Dashlane’s free plan only supports one computer, but it can hold unlimited passwords.

Bitdefender 

If you’re looking for a trustworthy password manager, Bitdefender Password Manager is a steal at $20 for the first year and $30 for renewal. Bitdefender licensed the technology from SaferPass and integrated it into its Central web portal to speed up the launch of its new password manager. Bitdefender Password Manager is a user-friendly password manager compatible with Windows, Mac, Android, and iOS (but not Linux). The password manager doesn’t yet support passkeys; however, this feature is on the roadmap. Bitdefender also provides browser add-ons for Chrome, Firefox, Safari, and Microsoft Edge.

LastPass 

Despite a significant decrease in the quality of its free tier, LastPass remains one of the best password managers due to its wide platform support, intuitive interface, and many useful features. The free edition of LastPass no longer synchronizes between computers and mobile devices; it syncs only within one device type. However, it still includes many of the same capabilities as the premium edition, including the ability to generate strong passwords and safely store an unlimited number of them. The subscription edition includes dark web account monitoring, 1GB of cloud storage space, unlimited synchronization across all devices, and priority technical assistance.

Enpass 

While the free desktop version of Enpass is powerful and has no storage limits, the free versions of the iOS and Android mobile apps have a limit of 25 passwords each. Even though Enpass simplifies the process, device synchronization is still required. Since Enpass lacks cloud-syncing capabilities, we recommend using Dropbox, Microsoft OneDrive, or any similar service. (That could be viewed as a perk by some in terms of safety.) The desktop version of Enpass syncs locally with a mini-file server. It connects to adjacent Wi-Fi devices, which is especially useful for users concerned about their data usage. Using Enpass on a personal computer is a breeze because of the streamlined UI. The incredibly polished design went into creating these smartphone applications. Logins are performed with biometrics across the board. 

LogMeOnce 

LogMeOnce eliminates memorizing a master password by storing all your credentials in an encrypted vault. A personal identification number (PIN), biometric data, or a photograph can unlock the safe. Because of this, LogMeOnce stands apart from other password managers. Aside from this one key difference, LogMeOnce functions similarly to its competitors. It provides end-to-end encrypted storage and synchronization of sensitive information like passwords and payment cards across many devices. It has more functions, like monitoring the dark web and cyber threats, although they cost a little more. LogMeOnce is one of the most innovative password managers to try.

KeePass 

KeePass is the password manager of choice for perfectionists. The fact that it’s open source and doesn’t have a sleek, all-encompassing user interface like other password managers can put off the ordinary user. However, tech-savvy tinkerers will appreciate the adaptability. While the core functionality stands on its own, the program’s full potential requires some technical knowledge to use available extensions. KeePass’s lack of online storage is a huge positive for privacy-conscious users. Since everything is kept locally, you won’t need to rely on the safety measures taken by an internet service (like LastPass) to safeguard your sensitive information. A knowledgeable user will use a private cloud account to share the file with other gadgets. If you want a password manager that you can tailor to your needs without breaking the bank or being subject to a service provider’s rules and regulations, this is the product for you.

IronVest 

IronVest is a simple and uncomplicated approach when making purchases online to safeguard your passwords, identities, credit cards, and email addresses. IronVest does more than store your passwords securely; it also aims to make your online experience more secure, which is why it stands apart from other password managers. IronVest, a still-young firm, has wowed with more than simply reliable password management software; it can also obscure personal data and prevent tracking. IronVest produces and submits a masked version of your information when you enter your credit card number, email address, or other sensitive data on a site. It does this by hiding your identity when you shop online. It’s a cool extra that sets IronVest apart from similar products.


This AI Research Dives Into The Limitations and Capabilities of Transformer Large Language Models (LLMs), Empirically and Theoretically, on Compositional Tasks

ChatGPT is trending, and millions of people are using it every day. With its incredible capabilities of imitating humans, such as question answering, generating unique and creative content, summarizing massive textual data, code completion, and developing highly useful virtual assistants, ChatGPT is making our lives easier. Developed by OpenAI, ChatGPT is based on GPT 3.5 (Generative Pre-Trained Transformer) and GPT 4’s transformer architecture. GPT 4, the latest version of language models released by OpenAI, is multimodal in nature, i.e., it takes in input in the form of text and images, unlike the previous versions. Even other Large Language Models (LLMs) like PaLM, LLaMA, and BERT are being used in applications of various domains involving healthcare, E-commerce, finance, education, etc.

A team of researchers has highlighted the difference between the impressive performance of LLMs like GPT on complex tasks and their struggles with simple tasks in a recently released research paper. Diving into the limitations and capabilities of Transformer LLMs, the team has conducted experiments on three representative compositional tasks: multi-digit multiplication, logic grid puzzles, and a classic dynamic programming problem. These tasks involve breaking down problems into smaller steps and combining those steps to produce an exact solution.

With the aim of studying the limits of Transformers in solving compositional tasks that require multi-step reasoning, the authors have proposed two hypotheses. The first is that the Transformers accomplish tasks by linearizing multi-step reasoning into path matching, thus relying on pattern-matching and shortcut learning rather than actually comprehending and implementing the underlying computational rules required to develop proper solutions. This approach enables fast and accurate predictions in similar patterns during training but fails to generalize to uncommon complex examples. The second hypothesis states that Transformers may have inherent limitations while trying to solve high-complexity compositional tasks having unique patterns. Early computational errors might spread and result in severe compounding errors in later steps, preventing the models from arriving at the right solution.

The authors have formulated the compositional tasks as computation graphs in order to investigate the two hypotheses. These graphs decompose the process of solving problems into smaller, more manageable submodular functional steps, enabling structured measures of problem complexity and verbalization of computing steps as input sequences to language models. They even use information gain to make predictions about the patterns that models would probably learn based on the underlying task distribution without running full computations within the graph.
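To make the computation-graph framing concrete, here is a minimal, hypothetical Python sketch (not the authors' code) of how multi-digit multiplication can be decomposed into single-digit sub-steps whose outputs feed later steps; the node names and graph representation are illustrative assumptions, and the size of such a graph is the kind of structured complexity measure the article refers to.

```python
# Hypothetical illustration of the computation-graph framing:
# multi-digit multiplication decomposed into single-digit sub-steps.
def multiplication_graph(a: int, b: int):
    a_digits = [int(d) for d in str(a)][::-1]   # least-significant digit first
    b_digits = [int(d) for d in str(b)][::-1]
    graph, order = {}, []

    # Leaf steps: one single-digit partial product per digit pair.
    for i, da in enumerate(a_digits):
        for j, db in enumerate(b_digits):
            node = f"p_{i}{j}"
            graph[node] = ("mul", da, db, da * db)
            order.append(node)

    # Final step: shift each partial product to its decimal place and sum.
    total = sum(graph[f"p_{i}{j}"][3] * 10 ** (i + j)
                for i in range(len(a_digits)) for j in range(len(b_digits)))
    graph["sum"] = ("add_shifted", total)
    order.append("sum")
    return graph, order

graph, order = multiplication_graph(47, 36)
print(order)            # the multi-step "reasoning" path a model must follow
print(graph["sum"][1])  # 1692 -- every intermediate step must be right to get here
```

Verbalizing the nodes of such a graph in order yields the step-by-step input sequences the researchers feed to the models, and deeper or wider graphs correspond to harder compositional instances.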

Based on the empirical findings, the authors have proposed that the Transformers handle compositional challenges by reducing multi-step reasoning into linearized subgraph matching. They have provided theoretical arguments based on abstract multi-step reasoning problems, which highlight that as the task complexity increases, Transformers’ performance rapidly deteriorates. This shows that the models might already be constrained in their ability to handle compositional problems of great complexity.

In conclusion, the empirical and theoretical results imply that rather than a thorough comprehension of the underlying thinking processes, Transformers’ performance is mostly driven by pattern matching and subgraph matching, which also supports the idea that Transformers would find it difficult to do increasingly difficult tasks.

Check Out The Paper.

10+ Best URL Shorteners For Business And Marketing 2023

Digital tools exist to shorten lengthy URLs, characterized variously as link shorteners, URL shorteners, link compressors, and URL condensers. These URL shorteners are useful for individuals or startups in several ways. Whether using social media to promote your business or sending out a weekly newsletter, most people won’t click on a link to a website if it’s full of random numbers and letters. Long URLs may discourage people from clicking on them. To improve your marketing’s clickthrough rates, try using a link shortener to shorten the long URLs you’ll be using.

Here are a few URL shorteners on the market:

Hopp.co

If you’re a content creator who publishes online and would like to expand your audience and income, Hopp.co is your free URL shortener. Hopp.co’s remarkable tools are available to all creators, from Instagram influencers to food bloggers. Features such as unrestricted link usage, detailed analytics, and domain-specific short link generation are all included. What sets Hopp.co apart from other link shorteners is its “pre-roll” customizable promotional page.

Bitly

Bitly offers everything you need in a professional URL shortener. Compared to other link management services, this one is user-friendly and feature-rich. It provides a complete dashboard where you can monitor the performance of your social links and other activities, and it makes it easy to create short links. While Bitly’s free plan used to be highly recommended, things have changed: there is now a monthly cap of 10 links, with five allotted custom URL tails (thus, bit.ly/CUSTOM). With its QR code generator and new link-in-bio feature, you can still keep track of all your short links in one spot. 

TinyURL

TinyURL has existed since 2002 because it efficiently and effectively shortens URLs with minimal user input. When you’re in a pinch and need an always-active short link, this is your option. Copy and paste your long link into the provided box, alter the second half of the URL as desired, then hit the Make TinyURL button! After that, there’s no need to worry about whether or not the link will eventually go inactive. You can use TinyURL without creating an account, but if you do create one, you’ll have access to a log of the links you’ve shortened. A premium account is available if you require advanced features like tracking and analytics, custom domains, and post-creation edits to your TinyURLs. If your requirements are minimal, the free version is all you need.

BL.INK

BL.INK, a fully-featured URL shortener for businesses of all sizes, can shorten long URLs and track their traffic. The dashboard shows popular links and broad information, while the analytics page goes into device, location, and referrers. Clicks can also be categorized by time. Tag your shortened URLs to measure campaign clicks. BL.INK’s four premium plan tiers offer customizable link production and tracking pricing for small organizations, teams, and large enterprises, starting at $48/month. A free account lets you track up to 1,000 clicks on branded links with a custom domain. BL.INK is an excellent URL shortener for your entire firm. This link shortener includes features like multiple users for less than $50 a month, while others charge hundreds.

Zapier’s URL Shortener

Zapier’s URL Shortener is the way to go if you want a shortened link created and saved automatically whenever you do something in any of more than 5,000 applications, such as uploading an image to your Instagram account or adding a new product to your Shopify store. Zapier can then send the shortened URL to another app or save it in a Google Sheet. Zapier allows you to set up automated workflows that run whenever certain conditions are satisfied in the apps you use most, for example, when a new post is uploaded or a new product is created.

Short.io

Short.io goes above and beyond the capabilities of other URL shorteners by allowing you to target visitors based on their location or device type and redirect them to a different website. This is helpful if you sell to customers in Canada, the United States, Australia, and Singapore and need to show them the appropriate currency for their purchases. When adding a link, you can specify who you wish to send to that link by clicking the iPhone, Android, or World icons. You could send iOS and PC users to entirely different destinations, but that would likely confuse everyone involved. Instead, you should only employ this function if you have a compelling reason to direct certain categories of visitors to slightly distinct web pages. 

Tinycc 

Tinycc shortens long URLs and tracks traffic. This tool helps bloggers, marketers, and others generate more hits and shares on their links, and it works on smartphones and tablets. Join Tinycc for free, paste your long URL into the box, and Tinycc will create a unique short link for you. You can personalize the short link by adding a word or phrase, then use it on social networking sites, in email signatures, and elsewhere. After clicking your abbreviated link, users will be forwarded to your full URL. Tinycc also offers link monitoring tools: link clickthrough rates, geography breakdowns, and device clickthroughs are all visible. This data can improve your advertising and link performance.

T2M 

T2M is a potent URL shortener that lets you make branded links for your online content, whether a website, blog, or social media post. To measure the success of your links, you may monitor their clickthrough and conversion rates with T2M. The use of T2M is simple. Enter the long URL you want to be shortened, and T2M will return a short, personalized link. Adding a term or phrase to your short link is another way to personalize it. T2M is an excellent tool for shortening lengthy URLs. Clickthrough and conversion rates provide insight into how well your links are performing. T2M is a perfect choice if you need a robust URL shortener that supports branded links. It’s simple to implement and comes loaded with tools for gauging your links’ performance.

Cutt.ly

For your website, blog, or social network postings, Cutt.ly is a potent URL shortener that may help you produce short, memorable URLs. You can see your links’ performance by tracking their clickthroughs and conversions using Cutt.ly. Using Cutt.ly is a breeze. To create a short, personalized link, input the long URL you wish to shorten into Cutt.ly. Adding a term or phrase to your short link is another way to personalize it. Cutt.ly is an excellent tool for reducing the length of lengthy URLs. Your links’ clickthrough and conversion rates are metrics you can use to gauge their efficacy. Cutt.ly is perfect if you need a robust URL shortener with in-depth analytics. It’s simple to implement and comes loaded with tools for measuring your links’ performance.

ClickMeter 

ClickMeter is a powerful link-tracking tool that can track clickthroughs, conversions, and bounce rates, letting you monitor link clicks and optimize your marketing. Getting started is easy: join ClickMeter and add your links to track their success. ClickMeter then analyzes your links’ performance, tracking who clicks on your links, who converts, and who leaves. You can customize reports with the information you need, and ClickMeter integrates with tools such as Google Analytics and HubSpot. The smartphone app enables you to track links anywhere. If you require a powerful link monitoring tool, ClickMeter is a great choice. It’s easy to use and generates detailed reports to improve your advertising.

Pixel

Pixel lets you customize and share trackable short links and QR Codes. Capture data and make smarter decisions on where to focus your time and money.


Researchers From UT Austin and UC Berkeley Introduce Ambient Diffusion: An AI Framework To Train/Finetune Diffusion Models Given Only Corrupted Data As Input

For learning high-dimensional distributions and resolving inverse problems, generative diffusion models are emerging as flexible and potent frameworks. Text conditional foundation models like Dalle-2, Latent Diffusion, and Imagen have achieved remarkable performance in generic picture domains due to several recent advancements. Diffusion models have recently shown their ability to memorize samples from their training set. Moreover, an adversary with simple query access to the model can obtain dataset samples, raising privacy, security, and copyright concerns.

The researchers present the first diffusion-based framework that can learn an unknown distribution from heavily contaminated samples. This issue emerges in scientific contexts where obtaining clean samples is difficult or costly. Because the generative models are never exposed to clean training data, they are less likely to memorize particular training samples. The central concept is to further corrupt the original distorted image during diffusion by introducing additional measurement distortion and then challenging the model to predict the original corrupted image from the other corrupted image. Scientific investigation verifies that the approach generates models capable of acquiring the conditional expectation of the complete uncorrupted image in light of this additional measurement corruption. Inpainting and compressed sensing are two corruption methods that fall under this generalization. By training them on industry-standard benchmarks, scientists show that their models can learn the distribution even when all training samples are missing 90% of their pixels. They also demonstrate that foundation models can be fine-tuned on small corrupted datasets, and the clean distribution can be learned without memorization of the training set.
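The core training idea can be sketched in a few lines of PyTorch-style code. This is a minimal illustration under assumed interfaces (a masking-style corruption, a toy noise schedule, and a model that accepts the mask as input); it is not the authors' implementation, and the exact operators and loss weighting in the paper may differ.

```python
import torch

def ambient_diffusion_step(model, x_corrupt, mask_A, t, extra_drop=0.1):
    """One illustrative training step (assumed interfaces, not the authors' code).

    x_corrupt : images already corrupted by the dataset mask `mask_A`
                (unobserved pixels zeroed), shape (B, C, H, W)
    mask_A    : float binary mask of originally observed pixels, same shape
    t         : diffusion timesteps, shape (B,)
    """
    # Further corrupt: randomly drop an extra fraction of the observed pixels.
    extra = (torch.rand_like(mask_A) > extra_drop).float()
    mask_A_tilde = mask_A * extra                    # A_tilde observes a subset of A
    x_tilde = x_corrupt * mask_A_tilde

    # Standard forward diffusion applied to the further-corrupted image.
    noise = torch.randn_like(x_tilde)
    alpha_bar = (1.0 - t.float() / 1000).view(-1, 1, 1, 1)   # toy schedule
    x_t = alpha_bar.sqrt() * x_tilde + (1 - alpha_bar).sqrt() * noise

    # The model only ever sees A_tilde; the loss is measured on the pixels A
    # observed, including those A_tilde hid, so the model cannot just copy input.
    x0_hat = model(x_t, t, mask_A_tilde)
    loss = ((mask_A * (x0_hat - x_corrupt)) ** 2).mean()
    return loss
```

Because the model never sees a clean image, yet must predict pixels it was not shown, it is pushed toward learning the underlying distribution rather than memorizing individual training samples.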

Notable Features

The central concept of this research is to corrupt the image even further during training and force the model to predict the originally corrupted image from the further-corrupted one. 

Their approach trains diffusion models using corrupted training data on popular benchmarks (CelebA, CIFAR-10, and AFHQ).

Researchers give a rough sampler for the desired distribution p0(x0) based on the learned conditional expectations.

As demonstrated by the research, one can learn a fair amount about the distribution of original photos, even if up to 90% of the pixels are absent. They have better results than both the prior best AmbientGAN and natural baselines.

Never seeing a clean image during training, the models are shown to perform similarly to or better than state-of-the-art diffusion models for handling certain inverse problems. While the baselines necessitate many diffusion stages, the models only need a single prediction step to accomplish their task.

The approach is used to further refine standard pretrained diffusion models in the research community. Learning distributions from a small number of tainted samples is possible, and the fine-tuning process only takes a few hours on a single GPU.

Some corrupted samples on a different domain can also be used to fine-tune foundation models like Deepfloyd’s IF. 

To quantify the learning effect, researchers compare models trained with and without corruption by showing the distribution of top-1 similarities to training samples.

Models trained on sufficiently distorted data are shown not to retain any knowledge of the original training data. They evaluate the compromise between corruption (which determines the level of memorization), training data, and the quality of the learned generator.

Limitations

The level of corruption trades off against the quality of the generator: as corruption increases, the generator is less likely to memorize training data, but sample quality suffers. Precisely characterizing this trade-off remains an open research question. To estimate E[x0|xt] with the trained models, the researchers used basic approximation algorithms in this work.

Furthermore, making any stringent privacy assurance about the protection of a given training sample requires assumptions about the data distribution. The supplementary material shows that a restoration oracle could recover E[x0|xt] exactly, although the researchers do not provide such a technique. 

This method will not work if the measurements also contain noise. Using SURE regularization may help future research get around this restriction.

Check Out The Paper and Github link.

How Should We Maximize the Planning Ability of LLMs While Reducing the Computation Cost? Meet SwiftSage: A Novel Generative Agent for Complex Interactive Reasoning Tasks, Inspired by the Dual-Process Theory of Human Cognition

Artificial Intelligence is rapidly gaining popularity, and for good reason. With the introduction of Large Language Models like GPT, BERT, and LLaMA, almost every industry, including healthcare, finance, E-commerce, and media, is making use of these models for tasks like Natural Language Understanding (NLU), Natural Language Generation (NLG), question answering, programming, information retrieval, and so on. The very famous ChatGPT, which has been in the headlines ever since its release, has been built on GPT 3.5 and GPT 4’s transformer technology.

These AI systems imitating humans are heavily dependent on the development of agents that are capable of exhibiting problem-solving abilities similar to humans. The three primary approaches for developing agents that can address complex interactive reasoning tasks are – Deep Reinforcement Learning (RL), which involves training agents through a process of trial and error, Behavior Cloning (BC) through Sequence-to-Sequence (seq2seq) Learning which involves training agents by imitating the behavior of expert agents and Prompting LLMs in which generative agents based on prompting LLMs produce reasonable plans and actions for complex tasks. 

RL-based and seq2seq-based BC approaches have some limitations, such as difficulty with task decomposition, an inability to maintain long-term memory, poor generalization to unknown tasks, and weak exception handling. The prompting-based approaches, meanwhile, are computationally expensive due to repeated LLM inference at each time step.

Recently, a framework called SWIFTSAGE has been proposed to address these challenges and enable agents to imitate how humans solve complex, open-world tasks. SWIFTSAGE aims to integrate the strengths of behavior cloning and prompt LLMs to enhance task completion performance in complex interactive tasks. The framework draws inspiration from the dual process theory, which suggests that human cognition involves two distinct systems: System 1 and System 2. System 1 involves rapid, intuitive, and automatic thinking, while System 2 entails methodical, analytical, and deliberate thought processes.

The SWIFTSAGE framework consists of two modules – the SWIFT module and the SAGE module. Similar to System 1, the SWIFT module represents quick and intuitive thinking. It is implemented as a compact encoder-decoder language model that has been fine-tuned on the action trajectories of an oracle agent. The SWIFT module encodes short-term memory components like previous actions, observations, visited locations, and the current environment state, followed by decoding the next individual action, thus aiming to simulate the rapid and instinctive decision-making process shown by humans.

The SAGE module, on the other hand, imitates thought processes similar to System 2 and utilizes LLMs such as GPT-4 for subgoal planning and grounding. In the planning stage, LLMs are prompted to locate necessary items, plan, track subgoals, and detect and rectify potential mistakes, while in the grounding stage, LLMs are employed to transform the output subgoals derived from the planning stage into a sequence of executable actions.

The SWIFT and SAGE modules have been integrated through a heuristic algorithm that determines when to activate or deactivate the SAGE module and how to combine the outputs of both modules using an action buffer mechanism. Unlike previous methods that generate only the immediate next action, SWIFTSAGE engages in longer-term action planning. 
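As a rough mental model, the fast/slow interplay might look like the loop below. Everything here is a placeholder sketch: the environment API, the module methods, the buffer size, and the activation heuristic are my assumptions, not the released SwiftSage code.

```python
from collections import deque

def run_episode(env, swift, sage, max_steps=100):
    """Illustrative SWIFT/SAGE control loop (assumed interfaces, not the released code)."""
    obs = env.reset()
    memory = deque(maxlen=10)   # short-term memory of recent (action, observation) pairs
    buffer = deque()            # longer-term plan of actions produced by SAGE
    reward, done = 0.0, False

    for _ in range(max_steps):
        if buffer:                                        # follow SAGE's existing plan first
            action = buffer.popleft()
        else:
            action = swift.predict(obs, list(memory))     # fast, intuitive proposal (System 1)
            if needs_deliberation(obs, memory, action):   # placeholder activation heuristic
                subgoals = sage.plan(obs, list(memory))          # deliberate planning (System 2, e.g. GPT-4)
                buffer.extend(sage.ground(subgoals, obs))        # grounding: subgoals -> executable actions
                if buffer:
                    action = buffer.popleft()
        obs, reward, done = env.step(action)
        memory.append((action, obs))
        if done:
            break
    return reward

def needs_deliberation(obs, memory, action):
    # Placeholder heuristic: escalate to SAGE on exceptions or when SWIFT repeats itself.
    recent_actions = [a for a, _ in list(memory)[-3:]]
    return "No known action" in obs or recent_actions.count(action) >= 2
```

The action buffer is what allows the slow module's plan to be executed over several cheap steps before the expensive LLM is consulted again, which is where the computational savings over per-step prompting come from.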

For evaluating the performance of SWIFTSAGE, experiments have been conducted on 30 tasks from the ScienceWorld benchmark. The results have shown that SWIFTSAGE significantly outperforms other existing methods, such as SayCan, ReAct, and Reflexion. It achieves higher scores and demonstrates superior effectiveness in solving complex real-world tasks.

In conclusion, SWIFTSAGE is a promising framework that combines the strengths of behavior cloning and prompting LLMs. It thus can be really beneficial in enhancing action planning and improving performance in complex reasoning tasks.

Check Out The Paper, Github link, and Project Page.

Researchers from Imperial College London Propose FitMe: An AI Model that Turns Arbitrary Facial Images to Relightable Facial Avatars, Directly Usable in Common Gaming and Rendering Engines

Despite the enormous advancements of the previous ten years, 3D facial reconstruction from a single unconstrained image remains a significant research issue with a vibrant computer vision community. Its uses are now numerous and diverse, including but not limited to human digitization for applications in virtual and augmented reality, social media and gaming, the generation of synthetic datasets, and health applications. Recent methods, however, frequently fail to produce assets that can be utilized for photorealistic rendering and fall short of precisely recreating the identities of different people. 

3D Morphable Models (3DMM) are a popular method for obtaining face form and appearance from a single “in-the-wild” shot. This can be attributed to several factors, including the need for comprehensive datasets of scanned human geometry and reflectance, the limited and muddled information found in a single facial image, and the limitations of current statistical and machine-learning methods. To model face shape and appearance with variable identity and expression, which were learned from over 200 participants, Principal Component Analysis (PCA) was used in the initial 3DMM study. 

Since then, more complex models comprising thousands of individuals, such as the LSFM, Basel Face Model, and Facescape, have been developed. Additionally, 3DMMs of entire human heads or other facial features, including the ears and tongue, have been developed recently. Finally, subsequent publications have included expansions that range from directly regressing 3DMM parameters to non-linear models. Such models, however, are unable to create textures with photorealistic realism. Deep generative models have witnessed significant advancements during the past ten years. Progressive GAN architectures, in particular, have produced outstanding results in learning distributions of high-resolution 2D photographs of human faces using Generative Adversarial Networks (GANs). 

Recently, meaningful latent regions that may be traversed to reconstruct and control various aspects of the produced samples have been learned using style-based progressive generative networks. Some techniques, like UV mapping, have also successfully acquired a 2D representation of 3D face features. To produce 2D facial pictures, rendering functions can use 3D facial models produced by 3DMMs. Iterative optimization also necessitates differentiating the rendering process. Recent developments in the photorealistic differentiable rendering of such assets are made possible by differentiable rasterization, photorealistic face shading, and rendering libraries. 

Unfortunately, the Lambertian shading model used in 3DMM works falls short of accurately representing the intricacy of face reflectance. The problem is that more than a single RGB texture is needed for lifelike facial representation, which calls for various facial reflectance factors. Although recent attempts have been made to simplify such settings, such datasets are few, tiny, and challenging to acquire. High-fidelity and relightable facial reflectance reconstructions have been made possible by several modern methods, including infrared ones. However, these reconstructions still need to be discovered. Furthermore, it has been demonstrated that strong models can capture facial looks using deep models but cannot display single or multiple picture reconstructions. 

In a contemporary alternative paradigm that relies on learned neural rendering, implicit representations capture avatar appearance and shape. Despite their excellent performance, standard renderers cannot employ such implicit representations and are typically not relightable. The most current Albedo Morphable Model (AlbedoMM) also uses a linear PCA model to record facial reflectance and shape. Still, the per-vertex colour and normal reconstruction are too low-resolution for photorealistic depiction. From a single “in-the-wild” photograph, AvatarMe++ can rebuild high-resolution texture maps of facial reflectance. However, the three steps of the process—reconstruction, upsampling, and reflectance—cannot be directly optimized with the input image. 

Researchers from Imperial College London introduce FitMe which is a fully renderable 3DMM that can be fitted on free facial pictures using precise differentiable renderings based on high-resolution face reflectance texture maps. FitMe establishes identity similarity and produces highly realistic, fully renderable reconstructions that may be used immediately by rendering programs that are available off the shelf. The texture model is built as a multimodal style-based progressive generator that simultaneously creates the face’s surface normals, specular albedo, and diffuse albedo. A painstakingly crafted branching discriminator allows easy training with various statistics modalities. 

They optimize AvatarMe++ on the publicly available MimicMe dataset to build a capture-quality facial reflectance dataset of 5k people, which they further modify to balance skin-tone representation. A face and a head PCA model, trained on sizable geometry datasets, are used interchangeably for the shape. They create a single- or multi-image fitting approach based on style-based generator projection and 3DMM fitting. The rendering function must be differentiable and fast enough for effective iterative fitting (in less than one minute), which rules out methods such as path tracing. Prior research has relied on slower optimization or simpler shading models (such as Lambertian). 

They improve on previous work by adding more lifelike shading with convincing diffuse and specular rendering, so that the acquired shape and reflectance can be rendered photorealistically in common rendering engines (Fig. 1). FitMe can rebuild high-fidelity facial reflectance and achieve remarkable identity similarity while precisely capturing features in diffuse albedo, specular albedo, and normals, thanks to the flexibility of the generator’s expanded latent space and the photorealistic fitting. 
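At a high level, this kind of fitting is an analysis-by-synthesis optimization loop. The sketch below is schematic only: the generator, renderer, identity network, loss weights, and optimizer settings are stand-in assumptions rather than FitMe's actual components.

```python
import torch

def fit_avatar(photo, generator, renderer, id_net, shape_init, steps=300, lr=5e-3):
    """Schematic analysis-by-synthesis fitting loop (stand-in components, not FitMe itself).

    photo     : target image tensor (1, 3, H, W)
    generator : latent -> (diffuse albedo, specular albedo, normals) UV maps
    renderer  : differentiable renderer (shape, reflectance maps) -> image
    id_net    : face-recognition embedder used for an identity similarity term
    """
    latent = torch.zeros(1, generator.latent_dim, requires_grad=True)
    shape = shape_init.clone().requires_grad_(True)      # 3DMM shape coefficients
    optimizer = torch.optim.Adam([latent, shape], lr=lr)

    for _ in range(steps):
        diffuse, specular, normals = generator(latent)
        rendered = renderer(shape, diffuse, specular, normals)

        photo_loss = (rendered - photo).abs().mean()                    # photometric term
        id_loss = 1 - torch.cosine_similarity(
            id_net(rendered), id_net(photo), dim=-1).mean()             # identity term
        reg_loss = latent.pow(2).mean() + shape.pow(2).mean()           # stay near the prior

        loss = photo_loss + 0.2 * id_loss + 1e-3 * reg_loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    return shape.detach(), generator(latent)
```

The loop simply conveys the structure of projecting into the generator's latent space while matching the photo through a differentiable renderer; a fast, differentiable shading model is what makes running hundreds of such iterations in under a minute plausible.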

Figure 1: FitMe uses a reflectance model and differentiable rendering to reconstruct relightable form and reflectance maps for facial avatars from a single (left) or several (right) unconstrained face pictures. In typical engines, the findings can be displayed in photorealistic detail.

Overall, in this work, they present the following: 

• The first 3DMM capable of producing high-resolution facial reflectance and shape, with an increasing level of detail, that can be rendered in a photorealistic manner

• A technique to acquire and augment a capture-quality facial reflectance dataset of 5k identities, built by optimizing AvatarMe++ on MimicMe and balanced for skin-tone representation

• The first branched multimodal style-based progressive generator of high-resolution 3D facial assets (diffuse albedo, specular albedo, and normals), as well as a suitable multimodal branched discriminator

Check Out The Paper and Project Page.

Meet ClarifyDelphi: An Interactive System That Elicits Context From Social And Moral Situations Using Reinforcement Learning

Context is the most important element of effective communication. It shapes the meaning of words so that they are correctly heard and interpreted by the listener. Context matters because it tells the speaker and the listener how much weight to give something, what inferences to make about what is being communicated, and, most significantly, the meaning behind the message. Context is equally crucial when making moral decisions and applying common sense and moral reasoning to social circumstances and actions.

An earlier model, called Delphi, models people’s moral judgments in a variety of everyday situations, but it lacks the necessary knowledge of the surrounding context. To overcome this limitation, the team of researchers has proposed CLARIFYDELPHI, an interactive system that learns to ask clarification questions in order to draw salient context out of a situation and improve moral judgments. The authors note that the model asks questions like ‘Why did you lie to your friend?’ to obtain the missing context.

According to the authors, the most instructive questions are those that could result in answers that lead to moral judgments that differ. In other words, it shows that context is very important in determining moral judgment if different responses to a question lead to varied moral assessments. To achieve this, the team has created a reinforcement learning framework with a defeasibility reward. This framework maximizes the divergence between moral judgments associated with hypothetical answers to a question. The authors have suggested that Proximal Policy Optimization (PPO) can be used to optimize the generation of questions that obtain responses with context.
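One way to picture the defeasibility reward is as a divergence between the judgments that two hypothetical answers would lead to. The sketch below is a simplified illustration with placeholder model interfaces, and Jensen-Shannon divergence standing in for whatever divergence the released system actually optimizes.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence between two judgment distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def defeasibility_reward(situation, question, answer_model, judge_model):
    """Illustrative reward: how far apart do hypothetical answers pull the moral judgment?

    answer_model : generates a weakening and a strengthening hypothetical answer
    judge_model  : returns a distribution over {bad, okay, good} for a situation
    (both interfaces are placeholders, not the released models)
    """
    ans_weaken, ans_strengthen = answer_model.hypothetical_answers(situation, question)
    p = judge_model.judge(situation + " " + ans_weaken)
    q = judge_model.judge(situation + " " + ans_strengthen)
    return js_divergence(p, q)   # larger divergence -> more informative question
```

In the full reinforcement learning setup described in the article, a scalar of this kind would serve as the reward signal that PPO uses to update the question generator.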

Upon evaluation, CLARIFYDELPHI outperforms other baseline methods for generating clarification questions by giving more relevant, informative, and defeasible questions. The questions that CLARIFYDELPHI generates have meaningful conclusions, demonstrating the efficiency of their method for obtaining crucial contextual data. The authors have also quantified the amount of supervised clarification question training data required for a good initial policy and have demonstrated that questions contribute to generating defeasible updates.

The contributions of the team can be summarized as follows –

The team has proposed a Reinforcement Learning-based technique that defines defeasibility as a new form of relevance for clarification questions in order to introduce the task of clarification question generation for social and moral situations.

The team has publicly released δ-CLARIFY, which is a dataset of 33k crowdsourced clarification questions.

It has also released δ-CLARIFYsilver, which contains generated questions conditioned on a defeasible inference dataset.

The trained models, along with their codes, can be accessed.

The adaptability of human moral reasoning involves deciding when a moral rule should apply and recognizing legitimate exceptions in light of contextual requirements. CLARIFYDELPHI generates questions that reveal missing context and allow for more accurate moral judgments. Compared to the other approaches, ClarifyDelphi generates more questions whose answers either weaken or strengthen the original judgment.

Consequently, CLARIFYDELPHI seems promising and an incredible model for generating informative and relevant questions that are capable of revealing diverging moral judgments.

Check Out The Paper and Github link.

Intel Unveils Aurora genAI: A Trillion-Parameter AI Model to Revolutionize Scientific Breakthroughs and Predict the Unseen

At the ISC23 keynote, Intel announced Aurora genAI – a science-focused generative AI model with a trillion parameters, almost six times more than in the free and public versions of ChatGPT. This news has sparked conversations about all the possibilities this model can unlock.

It has always been well understood that to train and build models near human standards, companies require a humongous amount of computational power, where fine-tuning begins at the hardware level.

Intel’s bold vision makes a solid case, backed by one of the largest chip manufacturers, which has already proved its capability to produce chips that are often treated as a gold standard among hardware for scaling up AI.

Backed by Intel’s 2-exaflop Aurora supercomputer and built on the Megatron and DeepSpeed frameworks, the Aurora genAI model will be trained on scientific data, general data, scientific and machine code, and other texts related primarily to the scientific domain, with 1 trillion parameters, almost six times the parameter count of the open and publicly accessible versions of ChatGPT.

Intel is focusing on building this model to serve the science community and accelerate advances in systems biology, cancer research, climate science, cosmology, polymer chemistry, and materials science.

The deep learning models we use today are well trained in solving systematic problems: they can handle anything you can write down in a step-by-step manner, and you can scale them up and use them on the fly to solve real-world problems. Beyond the obvious existing use cases, people now expect these models to find patterns in complex data, such as molecular biology and formulation data, revealing things like molecular binding patterns and compatibility that are hard to wrap one’s head around with conventional methods.

What is more interesting is that Intel aims to predict the patterns and bottlenecks we currently miss due to a lack of vision and understanding of the use case at hand, especially when time is involved. In other words, the model will aim to predict problematic scenarios we cannot yet estimate or see but that are likely to come up at some point, given the data about that particular problem.

People are taking this announcement with lots of excitement and positivity in the AI community. People are more invested in knowing how it would perceive and understand the topics that are, by nature, more sensitive and challenging, for example, political scenarios and policy-making, prevalent social issues, climate changes, cosmological predictions, and its take on it to solve them to some level.

It is worth noting that this project is still a work in progress and describes a future commitment; for now, it remains an announcement. The project will be developed in collaboration with Argonne National Laboratory and HPE.

In conclusion, this news brings a lot of hope not only to the AI community but also to retail investors. This news drives positive sentiments up for Intel, making it a promising stock option to explore, which certainly puts Intel in a good spot. It would be interesting to see how Intel will play out against some of its closest competitors in the market, such as Nvidia, and how well its model will adapt to the commitments made.

Check Out The Announcement.

Best Telegram AI Chatbots in 2023

The advent of chatbots driven by artificial intelligence has significantly impacted how humans communicate with one another, acquire new knowledge, and carry out repetitive tasks. Telegram, one of the most popular messaging systems, has embraced this trend by providing a home for numerous innovative AI chatbots that cater to a wide range of needs and interests. From the latest weather reports and language lessons to financial assistance and guided meditation, AI chatbots put a wide range of answers within reach. 

Chatbots powered by artificial intelligence can read and comprehend human speech and respond appropriately. These chatbots commonly use natural language processing (NLP) and machine learning methods to learn from user interactions and refine their performance over time. The following conversational interfaces all run on AI (Artificial Intelligence).

Free ChatGPT Bot

Free ChatGPT Bot is an artificially intelligent chatbot programmed to hold in-depth conversations on various topics, field inquiries, and supply answers. It can learn from its users’ inputs and have deep, contextual discussions thanks to its natural language understanding skills. In addition, modern advances in natural language generation are built into Free ChatGPT Bot, so it can produce responses that sound and read like human conversation. This makes it a great teaching resource because it can break down difficult concepts and explain them in an easy-to-understand way.

GameBot 

GameBot provides users with a fun and social gaming environment on the Telegram network. GameBot integrates gaming into the Telegram app, with three games (Math Battle, Corsairs, and LumberJack) currently available. When you start the bot, it will tell you that it offers three entertaining games. Select “Play with Friends,” pick a chat room and a game, and you’re ready to go. After deciding who to play against and what game to play, you can begin. You can challenge your pals to a scoring battle, and the winner gets bragging rights; Math Battle, Corsairs, and LumberJack all let players test their mettle against their friends.

WeatherBot 

Forecasts from WeatherBot are reliable and up-to-date for any place. Current conditions, hourly forecasts, and extended forecasts for temperature, precipitation, and wind speed are all available to users. WeatherBot’s UI is clean and intuitive, making it a breeze to use. Simply entering the location where you are interested in seeing the weather forecast will yield up-to-the-minute results. Because of this, it is a crucial resource for planning adventures in the great outdoors or trips to far-flung locations. WeatherBot is useful and convenient for anyone needing quick access to accurate weather data. Because of its straightforward design and reliable forecasts, it is an indispensable resource for keeping tabs on the weather.

TranslateBot 

Suppose you need to communicate in a foreign language, whether for work, vacation, or pleasure; TranslateBot is the most useful tool one can acquire. A conversation with this bot is quickly translated into any of several languages. Travelers and those conversing with people who speak different languages will find this useful. In addition, TranslateBot supports many languages, making it a flexible and valuable tool for anyone who regularly interacts in more than one language. Travelers, participants in multilingual conversations, and anybody else who needs to communicate across language barriers will find TranslateBot, a sophisticated and user-friendly artificial intelligence chatbot, invaluable.

CryptoBot 

A cryptocurrency market tracker and news aggregator, CryptoBot is an artificial intelligence chatbot. CryptoBot informs its users of the most recent price changes, market trends, and news in the cryptocurrency industry. It works with various cryptocurrencies and provides real-time access to the latest market data, charts, and price alerts. Cryptocurrency investors and traders will find it an invaluable resource. CryptoBot is a cryptocurrency market statistics, price tracker, and analysis tool that works with various digital currencies in real-time. As a result, consumers may monitor market trends and make educated decisions about buying, selling, or holding cryptocurrency.

RecipeBot 

Based on user preferences, dietary constraints, and accessible ingredients, RecipeBot finds new recipes and cooking advice. Users can find recipes they like by searching for a specific cuisine, meal type, or ingredient. Recipes can be filtered for users based on meal type, cuisine, or even particular ingredients. RecipeBot has various recipes that may be modified for vegetarian, vegan, gluten-free, and other diets. This ensures that consumers may easily find recipes that work for them. RecipeBot allows its users to bookmark their favorite dishes for later usage. This makes it simple to locate the meals that visitors gush about and keep asking for.

MovieBot 

MovieBot provides users with recommendations, reviews, and theater listings. Users can search for films by release date, genre, or actors and receive advice tailored to their viewing habits. A user’s viewing history is taken into account when making suggestions, and searches are refined by genre, release date, and actor. In this way, audiences have a better chance of finding films that appeal to their specific interests and preferences. Thanks to MovieBot’s star ratings and user reviews, users can quickly determine if a movie is worth their time. Customers can also plan their trip to the film by viewing showtimes at nearby theaters.

ToDoBot 

ToDoBot is a task management application that allows users to schedule reminders and prioritize their work. Users can organize their work by making lists, assigning priorities, and setting reminders. ToDoBot is an intelligent chatbot designed to help people with their to-do lists and other organizational needs. This is an essential tool for anyone trying to accomplish more in less time. Users can make to-do lists, prioritize tasks, and set due dates. ToDoBot also features the ability to get notifications and reminders for upcoming deadlines or events, ensuring that its users never miss a beat on even the most important tasks.

TravelBot 

TravelBot is a travel assistant that offers itineraries, hotel discounts, and city information to help users plan their next trip. Users can look for locations, compare costs, and get suggestions tailored to their interests and budgets. TravelBot is an AI chatbot that provides its users with trip planning assistance, deals on airfare and hotel stays, and information on the destinations they’ll be visiting. It’s useful for anyone planning a trip or interested in seeing the world. Users can do searches, compare prices, and obtain personalized suggestions that consider their unique interests and available spending power. By giving users a wide variety of travel, hotel, and activity options, TravelBot facilitates the process of finding the best pricing and organizing a vacation.

MeditationBot 

Mindfulness and meditation sessions are available through MeditationBot. Users can select from various meditation types, durations, and themes to help them relax, concentrate, and feel better. Stress relief, better focus, and overall health are just some of the benefits of MeditationBot’s sessions. This lets people try out several forms of meditation until they find one that clicks, so they can reap the advantages of regular practice. MeditationBot is for everyone interested in improving their mental health and happiness; its guided meditation sessions and mindfulness activities make it an invaluable resource for achieving these goals.

FinanceBot 

FinanceBot is a cost tracker, budget planner, and personal finance advisor all in one. Users can link their financial accounts, sort their expenditures, and gain valuable insights. FinanceBot lets its users connect their bank accounts for easy categorization of transactions and deeper insights into spending habits. Users can monitor their spending and identify places to make cuts with this information. FinanceBot gives individualized recommendations based on customer spending habits and financial goals. Customers get the information they need to make wise financial decisions and achieve their objectives.

The Feed Reader Bot

The Feed Reader Bot uses RSS feeds to keep tabs on a wide range of digital properties. These include webpages, blogs, YouTube, Instagram, and Twitter. The bot will alert its users whenever fresh content is found. Moreover, it may interact with Telegram’s channels and groups. Your Telegram inbox will be updated whenever a site you’re following updates its content. In addition, OPML files allow you to import your current RSS subscriptions. Users who follow a website will be alerted via Telegram anytime the site publishes new content, provided they have enabled this feature. This eliminates the need for users to frequently switch between several platforms just to read the latest articles from their preferred publications. Feed Reader Bot is a must-have for everyone who values staying abreast of the latest updates from their select sources.

PollBot 

PollBot simplifies the process of conducting and managing polls in Telegram chat rooms. Users can create multiple-choice surveys, solicit replies, and run instantaneous analyses. Users control poll parameters, including question type, number of respondents, and poll length. After a poll has been initiated, PollBot will begin collecting replies and updating the results as they come in. Users can check on the status of their polls at any time to see how their audience is reacting. PollBot also provides tools for analyzing poll results, such as charts and graphs that present the data in a visually appealing way. Because of this, users can easily interpret and extrapolate from their surveys.

Zoom

Zoom is a top Telegram Bot because it provides access to Zoom’s features without requiring you to download the app, which cuts out friction and fixes frequent problems with the web version. By adding the Zoom Bot to your Telegram conversation, you can start using Zoom’s video conferencing features right away. This is especially helpful for telecommuters who must participate in online seminars or conferences, since the bot integrates Telegram with Zoom so users can set up and join meetings without ever leaving Telegram. It is well worth a try.

ShoppingBot 

ShoppingBot searches hundreds of online stores for the lowest price and best deal. Users can find precisely what they’re looking for, be notified when the price drops, and compare prices, features, and ratings before making a purchase. Customers can perform advanced product searches and sign up for price notifications. In addition to comparing costs, ShoppingBot also examines products’ characteristics and customer ratings across different retailers to help users make more informed purchases. ShoppingBot is easy to use thanks to its intuitive interface: users can quickly acquire information about products and retailers, making it a useful tool for those seeking to save time and money.

HealthBot 

Based on user input, HealthBot offers customized health recommendations, symptom analysis, and physician referrals. Users can list their symptoms, get a possible diagnosis, and be advised on when to seek professional medical help. HealthBot also provides tips for improving one’s lifestyle and adopting preventative measures. Remember that HealthBot is not a substitute for professional medical advice; a healthcare provider should be consulted for proper medical counsel and care. Still, HealthBot can help users learn about symptoms and possible diagnoses.

ParentingBot 

ParentingBot aims to help parents with a wide range of questions, concerns, and issues. To give the best advice possible, ParentingBot considers the child’s age, interests, and needs. Users can connect with other parents and specialists and get information on child growth, nutrition, discipline, and educational resources. The interface of ParentingBot is simple and basic, making it easy to use; it’s a great resource for busy parents because it allows them to connect with other parents and experts with just a few clicks.


Language Models Do Not Recognize Identifier Swaps in Python: This AI Paper Explores the Ability of LLMs to Predict the Correct Continuations of Fragments of Python Programs

Pretrained Large Language Models (LLMs) are quickly taking over as the main paradigm for a wide range of linguistic activities, including creating and completing computer code. LLMs have shown improved performance with increasing model size on many real-world tasks, including programming tasks. More recently, however, researchers have discovered several tasks that show inverse scaling, where output quality declines rather than improves with increasing model size. Inverse-scaling tasks typically involve social biases, where bigger models (perhaps correctly) pick up undesired biases from biased training sets, or extremely uncommon but still recognizable examples of spoken language. 

These extreme tasks do not necessarily indicate major failure modes for practical applications because they tend to be very artificial and may entail odd speech pragmatics or need reasoning about counterfactual information. In this research, researchers from the University of Edinburgh and Heriot-Watt University offer a brand-new kind of inverse scaling job that involves the creation of Python code while changing the default identifiers. This has both immediate practical ramifications (redefinition of default identifiers is a metaprogramming technique used in well-known libraries) and more general scientific ramifications because it demonstrates that LLMs are flawed in their ability to reason about the complex, abstract semantic structure of programming languages and that growing the model size does not improve these problems but may even make them worse.
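To see what such an identifier swap looks like, consider the following toy snippet (my own illustration, not an example taken from the paper): after the builtins are swapped, the statistically common continuation is no longer the semantically correct one.

```python
# After this (perfectly legal) metaprogramming statement, the two builtins
# trade meanings for the rest of the program.
len, print = print, len

words = ["inverse", "scaling"]

# Statistically likely completion (what a pattern-matching model tends to emit):
#     print(len(words))
# Semantically correct completion under the swap, which actually prints 2:
len(print(words))
```

A model that merely pattern-matches on how `len` and `print` are usually used will complete the fragment incorrectly, whereas tracking the swap requires reasoning about what the names refer to after the assignment.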

Programming languages are particularly well adapted to automated analysis and procedural creation because of their clear and well-defined syntax and semantics. They are scientifically intriguing because, unlike other NLP tasks, which have too much ambiguity to produce high-quality examples automatically, they may be used to automatically generate instances of coding difficulties and evaluate them against an objective ground truth. Additionally, this study is useful for software engineering platforms that employ LLMs, such as GitHub Copilot2, which are beginning to be extensively used by developers. 

They investigated the capacity of large language models to predict the correct continuations of Python program fragments in cases where the proper continuations are statistically unusual because of an identifier-redefining statement placed in the prompt. Not only do all of the examined models perform poorly on this task, but several model families exhibit inverse scaling, which means that as the model size increases, they get worse rather than better. These findings imply that LLMs rely on “shortcut learning,” that is, weak, unstable, largely lexical correlations in the data, instead of thoroughly comprehending the data’s semantics (in this case, Python code). These findings are crucial for improving scientific knowledge of LLM capabilities and their applicability as a foundational technology for automated code creation tools. Future research might examine scaling effects on other programming languages and larger model sizes.

Check out the Paper and Github link.

Best AI 3D Generators in 2023

The finest artificial intelligence (AI) text-to-3D generator for you will vary depending on your requirements. Using the AI 3D generators, you may make 3D models from scratch using just text, images, or videos. Numerous applications can benefit from this, including 3D printing and the creation of virtual replicas of nonexistent objects. Here, you’ll discover the top artificial intelligence 3D generators and what sets them apart.

Text-to-3D

Spline AI

Spline AI, a groundbreaking AI-powered text-to-3D generator, can produce photorealistic 3D models and animations when given textual instructions. Thanks to its efficiency and output quality, this technology offers a compelling alternative to more conventional forms of 3D modeling. Spline AI translates user-provided text into 3D models and animations, enabling the rapid creation of photorealistic 3D models using only text. Spline AI will automatically generate the user’s desired result in response to a simple command.

Masterpiece Studio

Created by the same imaginative minds who revolutionized 3D modeling, Masterpiece Studio is the world’s first artificial intelligence (AI) text-to-3D generator. Incredibly, this game-changing program only requires a few lines of text from its users to generate fully functional 3D models and animations. The AI text-to-3D generator turns a user’s descriptive language into a 3D model using sophisticated Natural Language Processing (NLP) technology. Users can bring their words to life as 3D models at the touch of a button instead of building whole models and animations by hand. In addition, the Masterpiece Studio AI Text-to-3D Generator has one of the simplest UIs of any 3D production software currently available. Let’s look at how the Masterpiece Studio AI Text-to-3D Generator works. 

The Masterpiece Studio AI Text-to-3D Generator saves users a great deal of time because the program does the hard work on their behalf. Type “guitar” into the generator, for example, and it will produce a guitar model matching your description. Users can turn concepts into reality with just a few clicks of the mouse.

Meshcapade 

Meshcapade is among the finest artificial intelligence text-to-3D generators, a cutting-edge platform for creating high-quality 3D models from text inputs. It was designed to simplify 3D avatar creation so that businesses can devote more time and energy to their core offerings. Its patented technology provides a unified platform, compatible with all major game engines and graphics applications, for avatar needs ranging from highly exact digital doubles to the animation of fantastical figures. The developers have built a platform where users can quickly and easily create stunning avatars from high-fidelity 3D models with low friction, high accuracy, and full mobility.

Mochi 

Making 3D objects for games and other digital projects has traditionally been time-consuming and labor-intensive. AI text-to-3D generators have allowed the video game industry to streamline production and improve the quality of its assets. Mochi is a leading AI text-to-3D generator that speeds up the design process for video games. It is a game-development plugin that automates asset production and features a robust text-to-image mapping capability, letting users produce 3D models with natural-language commands. It is simple to operate and gives users more creative freedom: rather than fiddling with complicated controls, they can issue commands like “create a wall model with 12 holding walls” to generate complex 3D models quickly.

Luma AI

Luma AI represents the cutting edge of 3D content generation, producing photorealistic 3D models from textual input. Its new Imagine feature is notable because it lets users create a 3D model of any concept they can describe, regardless of whether they have any background in 3D modeling or graphics programming. Early reviews indicate that the feature is among the most powerful 3D creation tools currently accessible, even though it has not yet been released to the general public. Luma AI does more than generate 3D objects from text: it can also turn captured video into a photorealistic 3D environment, using AI to determine the 3D structure of the objects in a scene and render them accordingly. This allows users to digitally reconstruct entire environments from filmed footage.

3DFY AI

3DFY AI is a cutting-edge online service that harnesses advanced generative AI to produce high-quality 3D models from textual descriptions. By eliminating the need for costly, time-consuming, and often impractical manufacturing or scanning methods, 3DFY AI makes 3D content creation accessible to everyone: the service delivers the kind of high-quality 3D assets professional modelers produce, at far lower cost and in far less time. Its primary objective is to automate away the manual labor of 3D content development, allowing users to generate an effectively unlimited number of 3D assets at speed. Independent creators and businesses alike can take advantage of the technology, which provides access to curated databases of digital 3D items and generates 3D virtual objects from written instructions.

Ponzu

Ponzu is an AI-powered tool for developers and designers that streamlines the creation of 3D assets. The software lets users quickly and easily generate high-quality, photorealistic textures. Users can express their individuality and bring their 3D models to life through fully customizable painting styles, including ukiyo-e, cyberpunk, cartoon, watercolor, and many more. Ponzu’s AI algorithms create textures for any idea rapidly and accurately, and users can adjust specular and diffuse lighting independently, giving them greater flexibility in achieving the ideal look for their textures.

Image-to-3D

NeROIC 

NeROIC is an AI technique that creates 3D models from images. Developed by a well-known tech firm, NeROIC can change how we think about and interact with 3D objects: given a picture the user supplies, it produces a 3D representation of the depicted object. Its ability to convert video into a 3D environment is just as impressive as its image-to-3D capability, meaning a user can construct a fully interactive 3D environment from a single video. As a result, building 3D scenes requires less effort and time than ever.

DPT Depth

Creating 3D models from 2D photos is a rapidly developing field in computer science. Deep learning-based methods can now produce increasingly precise point clouds and 3D meshes of real-world scenes. DPT Depth Estimation is a promising technique that uses the Dense Prediction Transformer, a vision-transformer-based network, to extract dense depth information from an image, from which a point-cloud representation of the scene can be built. DPT takes a single monocular image as input and feeds it through a network trained on data from a wide variety of scenes and objects; the predicted depth is then used to construct a point cloud from which 3D meshes can be generated. DPT substantially improves over other common approaches such as stereo matching and photometric stereo, and its fast inference time makes it a strong option for real-time 3D scene reconstruction.
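As a concrete illustration of this pipeline, the sketch below runs a pretrained DPT model through the Hugging Face depth-estimation pipeline and back-projects the resulting depth map into a point cloud. It assumes the transformers library and the publicly hosted Intel/dpt-large checkpoint are available; the focal length is a hypothetical pinhole intrinsic, since the article specifies no camera parameters, and DPT predicts relative rather than metric depth, so the cloud’s scale is arbitrary.

```python
# Minimal sketch: monocular DPT depth estimation, then back-projection to a
# point cloud. Assumes `pip install transformers torch pillow numpy` and the
# Intel/dpt-large checkpoint; the focal length below is an assumption.
import numpy as np
from PIL import Image
from transformers import pipeline

depth_estimator = pipeline("depth-estimation", model="Intel/dpt-large")

image = Image.open("scene.jpg")                      # any monocular RGB photo
# The pipeline returns a normalized depth image resized to the input size.
depth = np.array(depth_estimator(image)["depth"], dtype=np.float32)

# Back-project every pixel with an assumed pinhole camera. DPT outputs
# relative depth, so the resulting cloud is correct only up to scale.
h, w = depth.shape
focal = 500.0                                        # assumed focal length, px
xs, ys = np.meshgrid(np.arange(w), np.arange(h))
points = np.stack(
    [(xs - w / 2) * depth / focal,
     (ys - h / 2) * depth / focal,
     depth],
    axis=-1,
).reshape(-1, 3)                                     # (H*W, 3) point cloud
print(points.shape)
```

The resulting (N, 3) array can be handed to standard meshing tools to produce the 3D meshes the paragraph describes.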

RODIN 

RODIN is gaining recognition as a leading artificial intelligence 2D-to-3D generator. By greatly simplifying and speeding up what was once a laborious, complex process, it has changed how 3D digital avatars are made. Creating a lifelike 3D character from a person’s likeness used to be far more complicated; RODIN is an AI-powered system that can create realistic 3D avatars from as little as a client’s photograph, and clients can view the generated avatars from 360-degree angles for an immersive experience.

Video-to-3D

Rokoko 

Rokoko is one of the most effective artificial intelligence (AI) video-to-3D generators currently on the market. It streamlines the animation process with convenient features, including free AI motion capture from video, conversion of that motion into 3D, and support for motion capture from many sources. Rokoko Video is a browser-based tool that simplifies the motion-capture workflow, making it easy for novice artists to preview their work before committing to it. Rokoko Studio, the company’s free program, lets users record motion with a computer webcam and refine the resulting mocap data; filters such as foot locking and a drift editor help keep the captured motion consistent.

DeepMotion

Getting the best possible results from the video-to-3D process can be time-consuming and daunting for animation professionals. With DeepMotion, a leading AI video-to-3D generator, the procedure is far more manageable and the results often exceed expectations. DeepMotion converts 2D video into 3D motion using AI-driven markerless motion capture and real-time 3D body tracking. The technology is developed by a dedicated group of industry professionals from studios such as Blizzard, Pixar, Disney, ROBLOX, Microsoft, Crystal Dynamics, and Ubisoft. Beyond saving users time by converting video to 3D, DeepMotion also provides a straightforward way to create, train, model, and animate a wide range of 3D characters.

Move.AI

Move.AI is AI-powered motion-capture software designed to be accessible to anyone interested in bringing animation into the digital world, without expensive motion-capture equipment. Users record video with everyday modern devices (including HD and UHD cameras), and Move.AI turns that footage into 3D motion data using advanced AI algorithms. The software recognizes and analyzes human movement in the footage and extracts the motions with high accuracy and fidelity. By eliminating the need for specialized motion-capture rigs and heavy data processing, Move.AI makes motion capture easy and accessible for everyone.

