
Power your generative AI on AWS

Unleash the full potential of generative AI applications

In today’s data-driven world, generative AI (GenAI) is becoming essential to business practices. By boosting productivity, slashing operational costs, and delivering exceptional customer experiences, GenAI automates tasks and generates high-quality content that keeps you ahead of the competition.

Stay ahead with smart insights

GenAI goes beyond simple automation. It provides actionable insights and predictive analytics that empower your business to respond swiftly to market shifts and customer needs as they happen. Imagine being able to predict trends and make informed decisions in real time—GenAI makes that possible.

Combine your unique data for unmatched accuracy

The secret sauce? Your organization’s proprietary insights. By merging these with public data from large language models (LLMs), you create a unique blend that delivers unparalleled relevance and precision. While others might have access to similar public data, this combo gives you a competitive edge.

  • With retrieval-augmented generation (RAG), you can securely mix proprietary data with dynamic public data, making your AI projects more pertinent than ever.
  • Use AI inferencing to apply learned patterns to new data, enabling real-time tasks like image recognition, natural language processing (NLP), and predictive analytics.
  • Unified data storage makes it possible. Use integrated data services to bring the power of GenAI closer to your data with an intelligent data infrastructure that supports all major protocols and tools. AI-ready cloud storage not only enhances RAG and inferencing but also serves as a secure enterprise framework for GenAI workloads. Manage vast stores of unstructured data efficiently with unified data storage and easily support numerous use cases.

What is GenAI?

GenAI is a type of artificial intelligence that rapidly creates content—text, images, music, voices, videos, or code—in response to text prompts. GenAI enhances business functions by creating new content from existing data. GenAI applications are powered by LLMs and foundation models (FMs) that are pretrained on vast amounts of unstructured data.

You can customize these models with your data for domain-specific tasks that transform your operations.

Benefits of GenAI

  • Enhance customer experiences and personalization with chatbots and virtual assistants.
  • Boost employee productivity with conversational search, summarization, content creation, and code generation.
  • Optimize business processes like document processing, data augmentation, and enhanced cybersecurity.

Why use RAG?

RAG is a game-changer. It improves LLMs by adding relevant, authoritative data from outside their training set, ensuring accurate and current responses. This makes generative AI applications more effective and reliable, opening a world of possibilities.

RAG systems work in two steps: first, relevant datasets from outside the original model are retrieved into the GenAI pipeline; then the GenAI model uses that retrieved context to generate precise responses to inquiries.
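The two steps above can be sketched in a few lines. This is a toy illustration, not a production RAG stack: retrieval is simulated with simple word overlap instead of vector search, and the "generation" step just returns the augmented prompt that a real LLM would receive.

```python
# Toy sketch of the two RAG steps: (1) retrieve relevant documents from
# outside the model, (2) pass them as context to a generative model.
# The corpus and scoring scheme are illustrative assumptions.

def retrieve(query: str, corpus: list[str], top_k: int = 2) -> list[str]:
    """Step 1: rank documents by word overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def generate(query: str, context: list[str]) -> str:
    """Step 2: a real system would send this augmented prompt to an LLM;
    here we return the prompt itself to show what the model receives."""
    return f"Context: {' | '.join(context)}\nQuestion: {query}"

corpus = [
    "Our Q3 proprietary sales data shows growth in the EU region",
    "Public LLMs are pretrained on vast unstructured data",
    "Snapshot copies enable point-in-time recovery",
]
question = "What does our sales data show?"
prompt = generate(question, retrieve(question, corpus))
```

In a production system, `retrieve` would query a vector store of embeddings and `generate` would call a foundation model; the shape of the pipeline stays the same.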

How RAG improves AI responses

RAG, with its ability to deliver global insights and specialized domain knowledge, keeps your GenAI applications current and innovative. It offers a cost-effective, streamlined approach: retrieval mechanisms boost accuracy and relevance by including the right data and keeping the wrong information out of the pipeline. That reduces risk, making RAG an efficient solution for a wide range of applications.

Common use cases of RAG

  • Enhance search engines by improving algorithms and UIs to provide more accurate and relevant results.
  • Improve recommendation systems to provide more personalized suggestions, using advanced algorithms and user behavior analysis.
  • Boost the capabilities of virtual assistants to provide more accurate and personalized responses.

5 keys to infusing RAG operations throughout your data pipeline

Unlocking your data’s full potential requires a strategic approach to integrating GenAI throughout your operations. Here are five capabilities to help drive effective RAG efforts.


Common data footprint everywhere

With NetApp® ONTAP® data management everywhere, you can easily include data from any environment to power your RAG efforts. ONTAP software lets you use common operational processes while reducing risk, cost, and time to results.


Automated classification and tagging

The NetApp BlueXP classification service streamlines data categorization, classification, and cleansing for the ingest and inferencing phases of the data pipeline. This means the right data is used for queries, and sensitive data is protected according to your organization’s policies.
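The classify-and-tag step can be pictured with a minimal sketch. The patterns and tag names below are assumptions for illustration only; BlueXP classification provides this as a managed service rather than hand-written rules.

```python
import re

# Minimal illustration of classifying and tagging documents before ingest.
# The PII patterns and policy (block anything tagged) are assumed for the
# sketch, not taken from the BlueXP classification service.

PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def classify(doc: str) -> dict:
    """Tag a document with any PII categories found; only untagged
    documents are cleared for the ingest phase."""
    tags = [name for name, pat in PII_PATTERNS.items() if pat.search(doc)]
    return {"text": doc, "tags": tags, "ingest": not tags}

docs = [
    "Contact alice@example.com for the report",
    "Quarterly revenue grew 12%",
]
results = [classify(d) for d in docs]
```

The same gate keeps sensitive records out of the query path while letting clean business data flow into the pipeline.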


Fast, scalable Snapshot copies

NetApp Snapshot technology creates near-instant, space-efficient, in-place copies of vector stores and databases for interval-based A/B testing and recovery. You can perform point-in-time analysis or, if data is inconsistent, immediately revert to a previous version.
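The snapshot-and-revert workflow looks like this in miniature. This toy class only shows the pattern; real Snapshot copies are near-instant and space-efficient at the storage layer, not full in-memory copies.

```python
from copy import deepcopy

# Toy illustration of the point-in-time snapshot-and-revert pattern.
# The VectorStore class and its contents are assumptions for the sketch;
# ONTAP implements this at the storage layer, not in application code.

class VectorStore:
    def __init__(self):
        self.data = {}       # live vector index (illustrative)
        self.snapshots = {}  # name -> point-in-time copy

    def snapshot(self, name: str) -> None:
        self.snapshots[name] = deepcopy(self.data)

    def revert(self, name: str) -> None:
        self.data = deepcopy(self.snapshots[name])

store = VectorStore()
store.data["doc1"] = [0.1, 0.9]
store.snapshot("before_ab_test")
store.data["doc1"] = [0.0, 0.0]   # inconsistent update during A/B testing
store.revert("before_ab_test")    # immediate point-in-time recovery
```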


Real-time cloning at scale

NetApp FlexClone® technology can create instant clones of vector index stores for parallel processing of A/B prompt testing and result validation. With cloning, you can safely make uniquely relevant data instantly available for queries from different users without affecting the core production data.


Distributed caching

NetApp FlexCache® software lets you use AI datasets at the point of GPU horsepower for inferencing runs or collaboration.

The role of inferencing

In AI, inferencing is a crucial process that allows a machine or algorithm to make decisions or predictions by using data and prior knowledge. By leveraging trained models, the inferencing process analyzes new inputs and delivers valuable outputs, such as classifying images, understanding language, or making choices. With inferencing, AI can draw conclusions and make more accurate and informed decisions, leading to smarter outcomes in real-world applications.
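At its core, inferencing means applying parameters learned during training to inputs the model has never seen. The weights below are made up for illustration; in practice they come from a prior training run.

```python
import math

# Minimal sketch of inferencing: a trained model's learned parameters
# are applied to new inputs to produce a prediction. The weights and
# threshold here are illustrative assumptions.

WEIGHTS = [0.8, -0.5]   # learned during training (assumed values)
BIAS = 0.1

def infer(features: list[float]) -> float:
    """Score a new input with the trained model (logistic output in (0, 1))."""
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    return 1 / (1 + math.exp(-z))

def classify(features: list[float], threshold: float = 0.5) -> str:
    """Turn the score into a decision, as in fraud or image classification."""
    return "positive" if infer(features) >= threshold else "negative"
```

Training produces `WEIGHTS` and `BIAS` once; inferencing runs `infer` many times, often under real-time latency constraints, which is why fast data access matters.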

Use cases for inferencing

  • Use real-time analytics for immediate insights into data as it is collected, enabling quick decision-making and responsive actions.
  • Apply predictive maintenance to forecast equipment failures, preventing breakdowns and extending machinery lifespans.
  • Detect and prevent fraud by implementing advanced techniques to identify and mitigate deceptive activities, maintaining financial security and trust.

Intelligent data infrastructure for AI

AI workloads need effective storage infrastructure for efficient management, storage, GPU utilization, and retrieval of the vast data required for training and deploying AI models. Amazon FSx for NetApp ONTAP delivers the full capabilities of ONTAP in an AWS-native storage service, simplifying data management and enhancing AI workload performance.

Why choose Amazon FSx for NetApp ONTAP?

FSx for ONTAP integrates with AWS services like Amazon Bedrock and Amazon SageMaker. It offers a strong foundation for building, scaling, and managing AI applications, handling data efficiently and securely throughout the AI lifecycle.

Benefits for generative AI

  • High performance and low latency are crucial for training and deploying generative AI models, which often require rapid access to large datasets. Rather than distributing your data and I/O across multiple file systems, FSx for ONTAP can consolidate up to 12 high-availability pairs (24 nodes) within a single cluster. Recent enhancements include more granular scale-out throughput capacities that support your GenAI workloads in AWS.
  • Efficient data management is vital for handling the extensive datasets and intermediate outputs generated during GenAI model training. By leveraging FSx for ONTAP and capabilities of NetApp BlueXP classification, Snapshot, FlexClone, and FlexCache, you can effectively deploy and manage a secure GenAI infrastructure.

Benefits for RAG

  • Enjoy seamless integration with RAG workflows through support for both NFS and S3 protocols. This flexibility means models can efficiently retrieve and incorporate relevant data from various sources during the generation process.
  • Blend proprietary data with public LLMs for RAG operations that consistently deliver relevant and accurate outputs.
  • Easily scale your system’s capacity to handle increased RAG datasets without disruption.

Benefits for inferencing

  • Quickly access data with low latency to enable fast and efficient model predictions. This is crucial because inferencing tasks often require real-time or near-real-time responses.
  • Keep data consistent and reliable with a robust file system that supports inferencing apps, which depend on precise and accurate data to make predictions.
  • Get built-in confidence from state-of-the-art data protection and security. FSx for ONTAP not only simplifies the backup and recovery of critical AI workloads but also protects the data used for inferencing and keeps it compliant. This reduces risks associated with data breaches or regulatory issues.

Explore Amazon Bedrock

Amazon Bedrock is a fully managed AWS service that helps enterprises build and scale GenAI applications. It offers access to foundation models from top AI companies, enabling developers to integrate them without extensive ML expertise.

Amazon Bedrock benefits

  • Choose from leading FMs, including Amazon Titan and models from AI21 Labs, Anthropic, Cohere, and Meta, all accessible through a common API.
  • Customize AI models to better fit your specific needs and preferences.
  • Get accurate, customized responses from FMs using Knowledge Bases for Amazon Bedrock. This fully managed RAG capability allows you to enrich FM responses with contextual and relevant company data.
  • Use security and privacy features to safeguard sensitive information for risk-free operations.

What’s possible with Bedrock and FSx for ONTAP?

  • Supercharge LLMs with your organization-specific data for a true competitive differentiator.
  • Customize through fine-tuning with prelabeled datasets and custom parameters or weights, or opt for continued pretraining with raw data specific to your domain.
  • Enrich base models and provide end users with accurate responses by using RAG to retrieve information from your internal datasets.
  • Use agents to execute multistep tasks, drawing on company systems and data sources. For example, AWS Lambda functions can handle a wide range of tasks, from basic chat responses to product fulfillment.
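A Knowledge Bases query with the boto3 `bedrock-agent-runtime` client can be sketched as below. The knowledge base ID and model ARN are placeholders; the request is built in a separate function so it can be inspected without AWS credentials.

```python
# Hedged sketch of querying a Knowledge Base for Amazon Bedrock.
# The IDs and ARN are placeholder assumptions; only the request shape
# is shown, and the actual service call is left commented out.

def build_kb_request(question: str, kb_id: str, model_arn: str) -> dict:
    """Assemble the RetrieveAndGenerate request payload."""
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn,
            },
        },
    }

# With AWS credentials configured, the call would look like:
#   import boto3
#   client = boto3.client("bedrock-agent-runtime")
#   response = client.retrieve_and_generate(**build_kb_request(
#       "What is our refund policy?", "<kb-id>", "<model-arn>"))
#   print(response["output"]["text"])

req = build_kb_request("What is our refund policy?", "kb-123", "arn:aws:bedrock:...")
```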

Streamline AI model development with Amazon SageMaker

Amazon SageMaker is a comprehensive AWS ML service that enables developers and data scientists to build, train, and deploy ML models efficiently. It provides tools and infrastructure to streamline the development, training, and deployment of advanced AI models, making it easier to harness AI’s full potential.

Use SageMaker and FSx for ONTAP to enhance data processing and ML capabilities, taking advantage of seamless connections for optimal performance and efficiency in handling large datasets.

Boost enterprise search with Amazon Kendra

Amazon Kendra is an intelligent search service that uses NLP capabilities to enable unified searches of your enterprise content. It can improve employee productivity, unlock insights for data-driven decisions, reduce contact center costs, and enhance in-app searches.

Significantly improve the quality of Kendra search results by relying on FSx for ONTAP for fast storage, enterprise data management, and secure access.

Real-world use cases

Use Amazon FSx for NetApp ONTAP to power generative AI applications and achieve remarkable results.

  • Customer service enhancement. Deploy GenAI chatbots to handle customer inquiries, reducing response times and increasing customer satisfaction. Deliver smarter, more efficient interactions by harnessing shared data and agent feeds in a vector database on FSx for ONTAP.
  • Predictive maintenance in manufacturing. By employing RAG operations, manufacturers can reduce downtime and maintenance costs.
  • Fraud detection in finance. Use AI inferencing to predict and prevent fraudulent transactions, dramatically reducing fraud-related losses.
  • Permission-aware RAG solution. Using Active Directory, this savvy solution delivers information based on user access levels. ACL-aware embedding agents store data on FSx for ONTAP for security and efficiency.

Build enterprise GenAI applications

Implementing generative AI with Amazon FSx for NetApp ONTAP is straightforward and easily aligns with your existing processes. Here are some common questions:

Which model should I use?

Amazon Bedrock gives you choices of leading FMs with a common API in the AWS Cloud.

How can I move quickly?

Unlock the knowledge of your unstructured file data and build augmented generative AI apps for productivity.

How can I keep my data secure and private?

Combine the privacy and controls of Amazon Bedrock with the data protection of FSx for ONTAP. NetApp BlueXP workload factory automatically connects Bedrock with FSx for ONTAP via API, easing data ingestion and securely optimizing RAG processes.

Next steps

For more details or to schedule a demo, reach out to our team. We’re here to help you every step of the way.
