Niharika Singh, Author at MarkTechPost

A Deep Dive into the Safety Implications of Custom Fine-Tuning Large Language Models

Niharika Singh — Thu, 26 Oct 2023 11:00:00 +0000

https://arxiv.org/abs/2310.03693

" data-medium-file="https://www.marktechpost.com/wp-content/uploads/2023/10/Screenshot-2023-10-26-at-3.45.40-AM-300x182.png" data-large-file="https://www.marktechpost.com/wp-content/uploads/2023/10/Screenshot-2023-10-26-at-3.45.40-AM-1024x623.png" />https://arxiv.org/abs/2310.03693

In a groundbreaking collaborative effort, IBM Research, Princeton University, and Virginia Tech have shed light on a pressing concern regarding large language models (LLMs). Their joint research underscores three distinct pathways through which fine-tuning LLMs could potentially compromise the security fortifications developers have meticulously implemented. Even a seemingly innocuous dataset, comprising fewer than a hundred harmful entries amidst hundreds of thousands of benign ones, can exert a detrimental impact on the security of Meta Llama-2 and OpenAI GPT-3.5 Turbo. This revelation raises a significant challenge for developers seeking to balance model applicability with robust security.

The study also examines existing solutions to this emerging issue. While fine-tuning an LLM for specific local conditions may enhance its practical utility, it is important to acknowledge the potential pitfalls. Both Meta and OpenAI offer avenues for fine-tuning LLMs with custom datasets, enabling adaptation to diverse usage scenarios. However, the research underscores a crucial caveat: extending fine-tuning permissions to end users may introduce unforeseen security risks. Existing security protection measures embedded within the model may prove insufficient in mitigating these potential threats. This revelation calls for a reevaluation of the balance between customization and security.

The researchers conducted a series of experiments to empirically validate the risks associated with fine-tuning LLMs. The first risk category involves training the model with overtly harmful datasets. By leveraging a small set of harmful instructions, the researchers observed that even with the majority of the dataset being benign, the inclusion of less than a hundred harmful entries was adequate to compromise the security of both Meta Llama-2 and OpenAI GPT-3.5 Turbo. This finding underscores the sensitivity of LLMs to even minimal malicious input during fine-tuning.

The second category of risk pertains to fine-tuning LLMs with ambiguous yet potentially harmful datasets. Through role-playing techniques, the researchers transformed the model into an absolutely obedient agent, deviating from its traditional ChatGPT or AI role. The resultant increase in the “harm rate” of both Llama-2 and GPT-3.5 serves as a stark reminder of the subtle yet substantial vulnerabilities that may emerge when fine-tuning with less overtly malicious data.

Lastly, the researchers delved into “benign” fine-tuning attacks, employing widely used industry text datasets such as Alpaca, Dolly, and LLaVA-Instruct. Intriguingly, even with ostensibly innocuous datasets, the security of the model was compromised. For instance, leveraging the Alpaca dataset led to a noteworthy surge in harmful rates for both GPT-3.5 Turbo and Llama-2-7b-Chat. This revelation highlights the complex interplay between customization and security, urging developers to tread cautiously.

In light of these findings, enterprise organizations can take proactive measures to safeguard against potential security diminishment. Careful selection of training datasets, the incorporation of robust review systems, data set diversification, and the integration of security-specific datasets can fortify an LLM’s resilience. However, it is imperative to acknowledge that absolute prevention of malicious exploits remains an elusive goal. The study emphasizes the need for ongoing vigilance and an adaptive approach in the rapidly evolving landscape of LLMs and fine-tuning practices. Balancing customization and security emerges as a pivotal challenge for developers and organizations alike, underscoring the imperative of continuous research and innovation in this domain.

Check out the Paper. All Credit For This Research Goes To the Researchers on This Project. Also, don’t forget to join our 32k+ ML SubReddit, 40k+ Facebook Community, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more.

If you like our work, you will love our newsletter..

We are also on WhatsApp. Join our AI Channel on Whatsapp..

The post A Deep Dive into the Safety Implications of Custom Fine-Tuning Large Language Models appeared first on MarkTechPost.

PyTorchEdge Unveils ExecuTorch: Empowering On-Device Inference for Mobile and Edge Devices

Niharika Singh — Mon, 23 Oct 2023 22:35:00 +0000

https://pytorch.org/blog/pytorch-edge/

" data-medium-file="https://www.marktechpost.com/wp-content/uploads/2023/10/F8qDP-5XsAA5pzJ-300x157.jpeg" data-large-file="https://www.marktechpost.com/wp-content/uploads/2023/10/F8qDP-5XsAA5pzJ-1024x536.jpeg" />https://pytorch.org/blog/pytorch-edge/

In a groundbreaking move, PyTorch Edge introduced its new component, ExecuTorch, a cutting-edge solution poised to revolutionize on-device inference capabilities across mobile and edge devices. This ambitious endeavor has garnered support from industry stalwarts, including Arm, Apple, and Qualcomm Innovation Center, cementing ExecuTorch’s position as a trailblazing force in the field of on-device AI.

ExecuTorch is a pivotal step towards addressing the fragmentation prevailing within the on-device AI ecosystem. With a meticulously crafted design offering extension points for seamless third-party integration, this innovation accelerates the execution of machine learning (ML) models on specialized hardware. Notably, esteemed partners have contributed custom delegate implementations to optimize model inference execution on their respective hardware platforms, further enhancing ExecuTorch’s efficacy.

The creators of ExecuTorch have thoughtfully provided the following:

Extensive documentation.
Offering in-depth insights into its architecture.
High-level components.
Exemplar ML models running on the platform.

Additionally, comprehensive end-to-end tutorials are available, guiding users through the process of exporting and executing models on a diverse range of hardware devices. The PyTorch Edge community eagerly anticipates witnessing the inventive applications of ExecuTorch that will undoubtedly emerge.

At the heart of ExecuTorch lies a compact runtime featuring a lightweight operator registry capable of catering to the expansive PyTorch ecosystem of models. This runtime provides a streamlined pathway to execute PyTorch programs on an array of edge devices, spanning from mobile phones to embedded hardware. ExecuTorch ships with a Software Developer Kit (SDK) and toolchain, delivering an intuitive user experience for ML Developers. This seamless workflow empowers developers to transition from model authoring to training seamlessly and, finally, to device delegation within a single PyTorch environment. The suite of tools also enables on-device model profiling and offers improved methods for debugging the original PyTorch model.

Built from the ground up with a composable architecture, ExecuTorch empowers ML developers to make informed decisions regarding the components they leverage and offers entry points for extension if required. This design confers several benefits to the ML community, including enhanced portability, productivity gains, and superior performance. The platform demonstrates compatibility across diverse computing platforms, from high-end mobile phones to resource-constrained embedded systems and microcontrollers.

PyTorch Edge’s visionary approach extends beyond ExecuTorch, aiming to bridge the gap between research and production environments. By leveraging the capabilities of PyTorch, ML engineers can now seamlessly author and deploy models across dynamic and evolving environments, encompassing servers, mobile devices, and embedded hardware. This inclusive approach caters to the increasing demand for on-device solutions in domains such as Augmented Reality (AR), Virtual Reality (VR), Mixed Reality (MR), Mobile, IoT, and beyond.

PyTorch Edge envisions a future where research seamlessly transitions to production, offering a comprehensive framework for deploying a wide range of ML models to edge devices. The platform’s core components exhibit portability, ensuring compatibility across devices with varying hardware configurations and performance capabilities. PyTorch Edge paves the way for a thriving ecosystem in the realm of on-device AI by empowering developers with well-defined entry points and representations.

In conclusion, ExecuTorch stands as a testament to PyTorch Edge’s commitment to advancing on-device AI. With the backing of industry leaders and a forward-thinking approach, the platform heralds a new era of on-device inference capabilities across mobile and edge devices, promising innovative breakthroughs in the field of AI.

Check out the Reference Article. All Credit For This Research Goes To the Researchers on This Project. Also, don’t forget to join our 31k+ ML SubReddit, 40k+ Facebook Community, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more.

If you like our work, you will love our newsletter..

We are also on WhatsApp. Join our AI Channel on Whatsapp..

The post PyTorchEdge Unveils ExecuTorch: Empowering On-Device Inference for Mobile and Edge Devices appeared first on MarkTechPost.

Google Cloud Commits to Protect Customers for Generative AI Indemnification

Niharika Singh — Sun, 22 Oct 2023 14:00:00 +0000

In a forward-looking move, Google Cloud has reaffirmed its dedication to its customers’ interests, positioning them at the forefront of a journey characterized by shared innovation, support, and fate. This means that when businesses choose to partner with Google Cloud, they embark on a collaborative expedition that prioritizes the latest and best technology, all while safeguarding their safety and security. In the ever-evolving realm of generative AI, this commitment takes on paramount importance.

Earlier this year, Google Cloud integrated Duet AI, an always-on AI collaborator, across its suite of products, spanning from Google Workspace to Google Cloud Platform. This monumental stride was coupled with significant advancements to Vertex AI, affording customers the ability to experiment and construct with generative AI foundation models in a safe, secure, and responsible manner. The outcomes have been nothing short of remarkable, with innovative use cases emerging from a diverse array of industries.

One pivotal aspect addressed by Google Cloud is intellectual property indemnity in the context of generative AI. The company acknowledges the potential legal risks customers may encounter, particularly in instances where copyright challenges arise. In response, Google Cloud has devised a groundbreaking, two-pronged approach that sets a new industry standard. This approach aims to provide customers with a greater sense of security and confidence when deploying generative AI products.

The first prong centers on Google’s use of training data. While this indemnity is not a new protection, it underscores Google Cloud’s unwavering commitment to standing behind its services. It extends to all services, including generative AI offerings, and serves as a third-party intellectual property indemnity standard for all customers. This assurance addresses any allegations asserting that Google’s utilization of training data for generative models infringes upon a third party’s intellectual property rights. In essence, this indemnity serves as a powerful safeguard, ensuring that regardless of the training data underpinning the services, Google unequivocally indemnifies its customers.

The second prong introduces a layer of protection relating to the generated output, crafted by customers in response to prompts or inputs they provide to Google’s services. This additional indemnity fortifies the customer’s position by extending the indemnity obligations to allegations of intellectual property rights infringement pertaining to the generated output. This protection encompasses a range of Google Cloud services, including Duet AI in Workspace, Vertex AI Search, and other components. It assures customers that Google will stand by them in the event of third-party IP claims, including copyright, assuming responsible AI practices are adhered to.

These dual indemnities represent a robust shield for Google Cloud customers. They provide coverage against potential claims, including copyright infringement, emanating from both the generated output and Google’s use of training data to craft generative AI models. By introducing these comprehensive protections, Google Cloud aims to offer a balanced and practical solution for relevant types of potential claims. Importantly, customers will automatically benefit from these terms without the need for any amendments to their existing agreements.

This marks just the initial step in Google Cloud’s ongoing commitment to supporting customers on their shared journey into the realm of generative AI. With these safeguards in place, businesses can harness the power of generative AI for their operations with confidence, knowing that Google Cloud has their back, come what may.

Check out the Reference Page. All Credit For This Research Goes To the Researchers on This Project. Also, don’t forget to join our 31k+ ML SubReddit, 40k+ Facebook Community, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more.

If you like our work, you will love our newsletter..

We are also on WhatsApp. Join our AI Channel on Whatsapp..

The post Google Cloud Commits to Protect Customers for Generative AI Indemnification appeared first on MarkTechPost.

A New AI Study from MIT Shows How Deep Neural Networks Don’t See the World the Way We Do

Niharika Singh — Sun, 22 Oct 2023 08:00:00 +0000

https://www.nature.com/articles/s41593-023-01442-0

" data-medium-file="https://www.marktechpost.com/wp-content/uploads/2023/10/Screenshot-2023-10-22-at-10.08.01-AM-300x206.png" data-large-file="https://www.marktechpost.com/wp-content/uploads/2023/10/Screenshot-2023-10-22-at-10.08.01-AM-1024x702.png" />https://www.nature.com/articles/s41593-023-01442-0

In the pursuit of replicating the complex workings of the human sensory systems, researchers in neuroscience and artificial intelligence face a persistent challenge: the disparity in invariances between computational models and human perception. As highlighted in recent studies, including one conducted by a team of scientists, artificial neural networks designed to mimic the various functions of the human visual and auditory systems often exhibit invariances that do not align with those found in human sensory perception. This contradiction raises questions about the underlying principles guiding the development of these models and their applicability in real-world scenarios.

Historically, attempts to address the issue of invariance discrepancies between computational models and human perception have involved investigating areas such as model vulnerability to adversarial perturbations or the impact of noise and translations on model judgments.

Model Metamers: The concept of model metamers is inspired by human perceptual metamers, which are stimuli that, although physically distinct, produce indistinguishable responses at certain stages of the sensory system. In the context of computational models, model metamers are synthetic stimuli with nearly identical activations in a model as specific natural images or sounds. The critical question is whether humans can recognize these model metamers as belonging to the same class as the biological signals they are matched to.

The results of this study shed light on the significant divergence between the invariances present in computational models and those in human perception. The research team generated model metamers from various deep neural network models of vision and audition, including both supervised and unsupervised learning models. In a surprising discovery, model metamers produced at the late stages of these models were consistently unrecognizable to human observers. This suggests many invariances in these models are not shared with the human sensory system.

The efficacy of these model metamers in exposing the differences between models and humans is further demonstrated by their predictability. Interestingly, the human recognizability of model metamers was strongly correlated with their recognition by other models, suggesting that the gap between humans and models lies in the idiosyncratic invariances specific to each model.

In conclusion, introducing model metamers is a significant step toward understanding and addressing the disparities between computational models of sensory systems and human sensory perception. These synthetic stimuli offer a fresh perspective on researchers’ challenges in creating more biologically faithful models. While there is much work to be done, the concept of model metamers provides a promising benchmark for future model evaluation and the potential for improved artificial systems that better align with the intricacies of human sensory perception.

Check out the Paper. All Credit For This Research Goes To the Researchers on This Project. Also, don’t forget to join our 31k+ ML SubReddit, 40k+ Facebook Community, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more.

If you like our work, you will love our newsletter..

We are also on WhatsApp. Join our AI Channel on Whatsapp..

The post A New AI Study from MIT Shows How Deep Neural Networks Don’t See the World the Way We Do appeared first on MarkTechPost.

A Team of Researchers from Germany has Developed DeepMB: A Deep-Learning Framework Providing High-Quality and Real-Time Optoacoustic Imaging via MSOT

Niharika Singh — Sat, 21 Oct 2023 10:42:12 +0000

In medical imaging, obtaining high-quality images quickly has long hindered the clinical utility of multispectral optoacoustic tomography (MSOT). This cutting-edge technology, which promises to diagnose and evaluate various diseases, including breast cancer and muscular dystrophy, has often been limited by the time-consuming processing required to produce detailed images. Researchers have unveiled a groundbreaking solution that could revolutionize medical imaging.

While some algorithms can produce real-time images, they often sacrifice image quality. On the other hand, more complex algorithms can generate high-quality images but are impractically slow. This long-standing dilemma has prompted the need for an innovative approach.

DeepMB is a deep-learning framework designed to enable real-time, high-quality optoacoustic imaging. DeepMB bridges the gap between the speed of real-time imaging and the image quality achieved through model-based reconstruction. It accomplishes this by expressing model-based reconstruction using a deep neural network.

The metrics associated with DeepMB are nothing short of impressive. By training the system on synthesized optoacoustic signals paired with ground-truth images created by model-based reconstruction, the researchers have achieved accurate optoacoustic image reconstruction in an astonishing 31 milliseconds per image. Even more striking is that DeepMB can reconstruct images approximately 1000 times faster than state-of-the-art algorithms, all while maintaining virtually no loss in image quality, as confirmed through qualitative and quantitative evaluations of a diverse dataset of in vivo images.

The implications of DeepMB are far-reaching. It promises to provide clinicians with immediate access to high-quality MSOT images, regardless of the patient’s condition or the body area being scanned. This breakthrough paves the way for high-resolution, multispectral contrast imaging through handheld optoacoustic tomography to become a routine part of clinical practice. The impact on medical studies and patient care could be transformational, offering healthcare professionals a powerful tool to make more accurate diagnoses and provide superior care.

In conclusion, DeepMB represents a giant leap forward in optoacoustic imaging. Its versatility is not limited to MSOT but extends to other imaging modalities, such as ultrasound, x-ray, and magnetic resonance imaging. With DeepMB, researchers have unlocked a novel approach that promises to enhance healthcare outcomes and change how we diagnose and treat diseases. DeepMB may become a cornerstone of modern medical imaging as it continues to evolve, delivering high-quality results at unprecedented speeds and transforming the field for the better.

If you like our work, you will love our newsletter..

We are also on WhatsApp. Join our AI Channel on Whatsapp..

The post A Team of Researchers from Germany has Developed DeepMB: A Deep-Learning Framework Providing High-Quality and Real-Time Optoacoustic Imaging via MSOT appeared first on MarkTechPost.

Harnessing Machine Learning to Revolutionize Materials Research

Niharika Singh — Sat, 21 Oct 2023 05:22:09 +0000

https://www.nature.com/articles/s41467-023-41378-4

" data-medium-file="https://www.marktechpost.com/wp-content/uploads/2023/10/Screenshot-2023-10-21-at-8.20.53-AM-300x189.png" data-large-file="https://www.marktechpost.com/wp-content/uploads/2023/10/Screenshot-2023-10-21-at-8.20.53-AM-1024x644.png" />https://www.nature.com/articles/s41467-023-41378-4

In the realm of materials science, researchers face the formidable challenge of deciphering the intricate behaviors of substances at atomic scales. Techniques like inelastic neutron or X-ray scattering have provided invaluable insights yet are resource-intensive and complex. The limited availability of neutron sources, coupled with the need for meticulous data interpretation, has been a bottleneck in the progress of this field. While machine learning has been previously employed to enhance data accuracy, a team at the Department of Energy’s SLAC National Accelerator Laboratory has unveiled a groundbreaking approach using neural implicit representations, transcending conventional methods.

Previous attempts at leveraging machine learning in materials research predominantly relied on image-based data representations. However, the team’s novel approach using neural implicit representations takes a distinctive path. It employs coordinates as inputs, akin to points on a map, predicting attributes based on their spatial position. This method crafts a recipe for interpreting the data, allowing for detailed predictions, even between data points. This innovation proves highly effective in capturing nuanced details in quantum materials data, offering a promising avenue for research in this domain.

The team’s motivation was clear: to unravel the underlying physics of the materials under scrutiny. Researchers emphasized the challenge of sifting through massive data sets generated by neutron scattering, of which only a fraction is pertinent. The new machine learning model, honed through thousands of simulations, discerns minute differences in data curves that may be unnoticeable to the human eye. This groundbreaking method not only speeds up understanding data but also offers immediate help to researchers while they collect data, which was not possible before.

The key metric demonstrating the prowess of this innovation lies in its ability to perform continuous real-time analysis. This capability can reshape how experiments are conducted at facilities like the SLAC’s Linac Coherent Light Source (LCLS). Traditionally, researchers relied on intuition, simulations, and post-experiment analysis to guide their next steps. With the new approach, researchers can determine precisely when they have amassed sufficient data to conclude an experiment, streamlining the entire process.

The model’s adaptability, dubbed the “coordinate network,” is a testament to its potential impact across various scattering measurements involving data as a function of energy and momentum. This flexibility opens doors to a wide array of research avenues in the field of materials science. The team aptly highlights how this cutting-edge machine-learning method promises to expedite advancements and streamline experiments, paving the way for exciting new prospects in materials research.

In conclusion, integrating neural implicit representations and machine learning techniques has ushered in a new era in materials research. The ability to swiftly and accurately derive unknown parameters from experimental data, with minimal human intervention, is a game-changer. By providing real-time guidance and enabling continuous analysis, this approach promises to revolutionize the way experiments are conducted, potentially accelerating the pace of discovery in materials science. With its adaptability across various scattering measurements, the future of materials research looks exceptionally promising.

If you like our work, you will love our newsletter..

We are also on WhatsApp. Join our AI Channel on Whatsapp..

The post Harnessing Machine Learning to Revolutionize Materials Research appeared first on MarkTechPost.

Can We Overcome Prompt Brittleness in Large Language Models? Google AI Introduces Batch Calibration for Enhanced Performance

Niharika Singh — Wed, 18 Oct 2023 19:49:47 +0000

https://arxiv.org/abs/2309.17249

" data-medium-file="https://www.marktechpost.com/wp-content/uploads/2023/10/image1-1-300x169.gif" data-large-file="https://www.marktechpost.com/wp-content/uploads/2023/10/image1-1.gif" />https://arxiv.org/abs/2309.17249

" data-medium-file="https://www.marktechpost.com/wp-content/uploads/2023/10/image1-1-300x169.gif" data-large-file="https://www.marktechpost.com/wp-content/uploads/2023/10/image1-1.gif" />

Large language models have recently emerged as powerful tools for various natural language understanding and image classification tasks. However, these LLMs have challenges, particularly regarding prompt brittleness and multiple biases in the input. These biases can stem from formatting, choice of verbalizers, and the examples used for in-context learning. These issues can lead to unexpected performance degradation, so addressing them effectively is imperative.

Existing efforts to tackle these challenges have given rise to calibration methods to mitigate the biases and recover LLM performance. These methods have sought a more unified view of the problem while addressing its nuances. The need for such solutions is underscored by the fact that LLMs are sensitive to how they are prompted, and their predictions can be influenced by the choice of templates and verbalizers, as well as the order and content of ICL examples.

A team of Google researchers has proposed a new approach called Batch Calibration (BC). BC is a straightforward yet intuitive method that targets explicit contextual bias in the batched input. Unlike other calibration methods, BC is zero-shot and only applied during the inference phase, incurring minimal additional computational costs. This approach can be extended to a few-shot setup, allowing it to adapt and learn contextual bias from labeled data.

The effectiveness of BC is demonstrated through extensive experimentation across more than ten natural language understanding and image classification tasks. In both zero-shot and few-shot learning scenarios, BC outperforms previous calibration baselines. Its simplicity in design and the ability to learn from limited labeled data make it a practical solution for addressing prompt brittleness and bias in LLMs.

The metrics obtained through these experiments show that BC offers state-of-the-art performance, making it a promising solution for those working with LLMs. By mitigating bias and improving robustness, BC streamlines the process of prompt engineering and allows for more efficient and reliable performance from these powerful language models.

In conclusion, the challenges of prompt brittleness and biases in large language models are effectively tackled through innovative calibration methods like Batch Calibration (BC). These methods offer a unified approach to mitigating contextual bias and improving LLM performance. As natural language understanding and image classification continue to evolve, solutions like BC will play a vital role in harnessing the full potential of LLMs while minimizing the impact of biases and brittleness in their responses.

Check out the Paper and Google Blog. All Credit For This Research Goes To the Researchers on This Project. Also, don’t forget to join our 31k+ ML SubReddit, 40k+ Facebook Community, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more.

If you like our work, you will love our newsletter..

We are also on WhatsApp. Join our AI Channel on Whatsapp..

The post Can We Overcome Prompt Brittleness in Large Language Models? Google AI Introduces Batch Calibration for Enhanced Performance appeared first on MarkTechPost.

Mozilla Brings a Fake Review Checker AI Tool to Firefox

Niharika Singh — Tue, 17 Oct 2023 16:32:27 +0000

In the vast landscape of online shopping, discerning genuine product reviews from fabricated ones has become an increasingly arduous task. Consumers are left wondering whether they can truly rely on certain opinions, leading to a cloud of uncertainty hovering over their purchasing decisions. Addressing this critical concern, Mozilla’s Firefox has taken a monumental step by integrating a review checker into its browser, set to revolutionize the online shopping experience.

Existing solutions have attempted to combat this issue, with browser extensions like Fakespot leading the charge. Acquired by Mozilla in May, Fakespot is a specialized tool designed to detect fraudulent online reviews. Currently functional on major platforms such as Amazon, Walmart, eBay, Yelp, and TripAdvisor, it employs a grading system ranging from A to F. An A grade signifies a product with entirely reliable reviews, while a B grade indicates that the majority are trustworthy. A C grade implies a balanced mix of both reliable and unreliable feedback, while D and F grades denote products with predominantly unreliable reviews.

Notably, a lower grade does not necessarily reflect the quality of the product or service itself but rather indicates the trustworthiness of the reviews. Fakespot does not pinpoint specific fraudulent reviews but assigns an overall score to the product. The lower the grade, the higher the likelihood that the reviews are inauthentic. This vital tool is set to be seamlessly integrated into Firefox, providing users with an intrinsic means of evaluating the authenticity of reviews. The feature is currently in testing and is slated to be widely accessible by November, initially on Amazon, Best Buy, and Walmart, with additional sites to follow suit in due course.

The crux of Fakespot’s effectiveness lies in its utilization of artificial intelligence. By analyzing a multitude of data points and conducting multiple tests, Fakespot determines the integrity of a review. While the specifics of Fakespot’s algorithms remain undisclosed to prevent manipulation, the key factor is whether a review is left by a genuine customer. This innovation addresses a pervasive issue in the online shopping realm, where reviews play a pivotal role in influencing consumer decisions. Google, for instance, leverages reviews to recommend products, often leading to manipulation as companies vie for prominence.

Recent research underscores the gravity of the fake review epidemic, revealing that over 80% of shoppers have encountered fraudulent feedback online. Among the demographic of 18 to 34-year-olds, this figure surges to a staggering 92%. Fakespot, armed with its sophisticated AI-driven approach, stands as a powerful antidote to this pervasive problem.

In conclusion, Mozilla’s integration of Fakespot into Firefox represents a monumental leap towards combating the proliferation of fake reviews in online shopping. This ingenious tool harnesses the power of AI to discern genuine feedback from deceitful ones, providing users with a reliable means of evaluating products. With its widespread availability on major e-commerce platforms, Fakespot is poised to become an indispensable ally for consumers navigating the digital marketplace, ushering in an era of confidence and transparency in online shopping. As the battle against fake reviews gains a formidable ally in Firefox, consumers can finally shop with assurance and make informed choices.

The post Mozilla Brings a Fake Review Checker AI Tool to Firefox appeared first on MarkTechPost.

NVIDIA AI Unveils SteerLM: A New Artificial Intelligence Method that Allows Users to Customize the Responses of Large Language Models (LLMs) During Inference

Niharika Singh — Mon, 16 Oct 2023 08:52:33 +0000

In the ever-evolving landscape of artificial intelligence, there has long been a challenge that plagues developers and users alike: the need for more customized and nuanced responses from large language models. While these models, such as Llama 2, can generate human-like text, they often need to provide answers genuinely tailored to individual users’ unique requirements. The existing approaches, such as supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF), have their limitations, leading to responses that could be more mechanical and complex.

NVIDIA Research has unveiled SteerLM, a groundbreaking technique that promises to address these challenges. SteerLM provides a novel and user-centric approach to customizing the responses of large language models, offering more control over their outputs by allowing users to define key attributes that guide the model’s behavior.

SteerLM operates through a four-step supervised fine-tuning process that simplifies the customization of large language models. First, it trains an Attribute Prediction Model using human-annotated datasets to evaluate qualities like helpfulness, humor, and creativity. Next, it utilizes this model to annotate diverse datasets, enhancing the variety of data accessible to the language model. Then, SteerLM employs attribute-conditioned supervised fine-tuning, training the model to generate responses based on specified attributes, such as perceived quality. Finally, it refines the model through bootstrap training, rendering diverse responses and fine-tuning for optimal alignment.

One of the standout features of SteerLM is its real-time adjustability, allowing users to fine-tune attributes during inference, catering to their specific needs on the fly. This remarkable flexibility opens the door to various potential applications, from gaming and education to accessibility. With SteerLM, companies can serve multiple teams with personalized capabilities from a single model, avoiding the need to rebuild models for each distinct application.

SteerLM’s simplicity and user-friendliness are evident in its metrics and performance. SteerLM 43B outperformed existing RLHF models like ChatGPT-3.5 and Llama 30B RLHF on the Vicuna benchmark in experiments. By offering a straightforward fine-tuning process that requires minimal changes to infrastructure and code, SteerLM delivers exceptional results with less hassle, making it a formidable advancement in the field of AI customization.

NVIDIA is taking a significant step forward in democratizing advanced customization by releasing SteerLM as open-source software within its NVIDIA NeMo framework. Developers now have the opportunity to access the code and try out this technique with a customized 13B Llama 2 model, available on platforms like Hugging Face. Detailed instructions are also provided for those interested in training their SteerLM model.

As large language models continue to evolve, the need for solutions like SteerLM becomes increasingly essential to deliver AI that is not just intelligent but also genuinely helpful and aligned with user values. With SteerLM, the AI community takes a significant step forward in the quest for more customized and adaptable AI systems, ushering in a new era of bespoke artificial intelligence.

If you like our work, you will love our newsletter..

We are also on WhatsApp. Join our AI Channel on Whatsapp..

The post NVIDIA AI Unveils SteerLM: A New Artificial Intelligence Method that Allows Users to Customize the Responses of Large Language Models (LLMs) During Inference appeared first on MarkTechPost.

Meet xVal: A Continuous Way to Encode Numbers in Language Models for Scientific Applications that Uses Just a Single Token to Represent any Number

Niharika Singh — Sun, 15 Oct 2023 03:50:00 +0000

https://polymathic-ai.org/blog/xval/

" data-medium-file="https://www.marktechpost.com/wp-content/uploads/2023/10/Screenshot-2023-10-15-at-8.48.19-AM-300x159.png" data-large-file="https://www.marktechpost.com/wp-content/uploads/2023/10/Screenshot-2023-10-15-at-8.48.19-AM-1024x542.png" />https://polymathic-ai.org/blog/xval/

In the realm of Large Language Models, one perplexing problem stands out. While these models can master many language-based tasks, they often stumble when performing numerical calculations involving large numbers. Specifically, multiplying two four-digit numbers results in a success rate of just over 90%, leaving room for improvement.

This issue stems from the inherent differences between numbers and other forms of language. Unlike letters or words, numbers encompass a continuous spectrum of values, subject to intricate and strict rules. This challenge has raised questions about the intersection of language models and numerical data and has inspired the quest for a solution.

The existing solutions to this problem are few and far from perfect. LLMs, which excel in language-related tasks, struggle to adapt to numbers’ continuous and infinitely variable nature. Most approaches involve tokenization, where numbers are broken into multiple tokens, increasing model complexity and memory requirements.

Polymathic AI researchers introduce a potential game-changer: the xVal encoding strategy. This innovative approach offers a fresh perspective on encoding numbers in LLMs for scientific applications. xVal employs a singular token labeled as [NUM] to represent any number.

The xVal strategy achieves this by treating numbers differently in the language model. Instead of relying on multiple tokens, each number is pre-processed and stored in a separate vector. The text replaces the number with the [NUM] token. During decoding, a dedicated token head in the transformer architecture is employed to predict the value associated with the [NUM] token, using Mean Squared Error (MSE) loss as the guiding metric.

In a series of experiments, xVal’s capabilities were rigorously tested and compared with four other numerical encoding strategies. The results were intriguing. xVal outshone other methods on multi-operand tasks and performed comparably in complex calculations, such as multiplying large multi-digit integers.

When applied to temperature readings from the ERA5 global climate dataset, xVal’s inherent continuity bias allowed it to excel, achieving the best performance in minimal training time.

Planetary Simulations revealed xVal’s exceptional interpolation abilities in simulations of planets orbiting a central mass, surpassing all other encoding schemes when making predictions for out-of-distribution data.

In conclusion, xVal’s innovative approach to encoding numbers in language models holds the potential to revolutionize the future. Addressing the challenge of representing numbers in LLMs with a more efficient and accurate method opens the door to innovative applications in the scientific realm. This groundbreaking solution may pave the way for the development of foundation models that connect multiple domains of science, ultimately reshaping the landscape of scientific inquiry in the years to come.

If you like our work, you will love our newsletter..

We are also on WhatsApp. Join our AI Channel on Whatsapp..

The post Meet xVal: A Continuous Way to Encode Numbers in Language Models for Scientific Applications that Uses Just a Single Token to Represent any Number appeared first on MarkTechPost.