bedrock-knowledge-bases

Skill from adaptationio/skrillz
Installation

Add the marketplace to Claude Code:
/plugin marketplace add adaptationio/Skrillz

Install the plugin from the marketplace:
/plugin install skrillz@adaptationio-Skrillz

Enable the plugin:
/plugin enable skrillz@adaptationio-Skrillz

Or, for a local checkout, add the marketplace from a path and install:
/plugin marketplace add /path/to/skrillz
/plugin install skrillz@local


Skill Details

SKILL.md

Amazon Bedrock Knowledge Bases for RAG (Retrieval-Augmented Generation). Create knowledge bases with vector stores, ingest data from S3/web/Confluence/SharePoint, configure chunking strategies, query with retrieve and generate APIs, manage sessions. Use when building RAG applications, implementing semantic search, creating document Q&A systems, integrating knowledge bases with agents, optimizing chunking for accuracy, or querying enterprise knowledge.


# Amazon Bedrock Knowledge Bases

Amazon Bedrock Knowledge Bases is a fully managed RAG (Retrieval-Augmented Generation) solution that handles data ingestion, embedding generation, vector storage, retrieval with reranking, source attribution, and session context management.

Overview

What It Does

Amazon Bedrock Knowledge Bases provides:

  • Data Ingestion: Automatically process documents from S3, web, Confluence, SharePoint, Salesforce
  • Embedding Generation: Convert text to vectors using foundation models
  • Vector Storage: Store embeddings in multiple vector database options
  • Retrieval: Semantic and hybrid search with metadata filtering
  • Generation: RAG workflows with source attribution
  • Session Management: Multi-turn conversations with context
  • Chunking Strategies: Fixed, semantic, hierarchical, and custom chunking

When to Use This Skill

Use this skill when you need to:

  • Build RAG applications for document Q&A
  • Implement semantic search over enterprise knowledge
  • Create chatbots with knowledge bases
  • Integrate retrieval with Bedrock Agents
  • Configure optimal chunking strategies
  • Query documents with source attribution
  • Manage multi-turn conversations with context
  • Optimize RAG performance and cost

Key Capabilities

  1. Multiple Vector Store Options: OpenSearch, S3 Vectors, Neptune, Pinecone, MongoDB, Redis
  2. Flexible Data Sources: S3, web crawlers, Confluence, SharePoint, Salesforce
  3. Advanced Chunking: Fixed-size, semantic, hierarchical, custom Lambda
  4. Hybrid Search: Combine semantic (vector) and keyword search
  5. Session Management: Built-in conversation context tracking
  6. GraphRAG: Relationship-aware retrieval with Neptune Analytics
  7. Cost Optimization: S3 Vectors for up to 90% storage savings

---

Quick Start

Basic RAG Workflow

```python
import boto3
import json

# Initialize clients
bedrock_agent = boto3.client('bedrock-agent', region_name='us-east-1')
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

# 1. Create Knowledge Base
kb_response = bedrock_agent.create_knowledge_base(
    name='enterprise-docs-kb',
    description='Company documentation knowledge base',
    roleArn='arn:aws:iam::123456789012:role/BedrockKBRole',
    knowledgeBaseConfiguration={
        'type': 'VECTOR',
        'vectorKnowledgeBaseConfiguration': {
            'embeddingModelArn': 'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0'
        }
    },
    storageConfiguration={
        'type': 'OPENSEARCH_SERVERLESS',
        'opensearchServerlessConfiguration': {
            'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
            'vectorIndexName': 'bedrock-knowledge-base-index',
            'fieldMapping': {
                'vectorField': 'bedrock-knowledge-base-default-vector',
                'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
                'metadataField': 'AMAZON_BEDROCK_METADATA'
            }
        }
    }
)
knowledge_base_id = kb_response['knowledgeBase']['knowledgeBaseId']
print(f"Knowledge Base ID: {knowledge_base_id}")

# 2. Add S3 Data Source
ds_response = bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='s3-documents',
    description='Company documents from S3',
    dataSourceConfiguration={
        'type': 'S3',
        's3Configuration': {
            'bucketArn': 'arn:aws:s3:::my-docs-bucket',
            'inclusionPrefixes': ['documents/']
        }
    },
    vectorIngestionConfiguration={
        'chunkingConfiguration': {
            'chunkingStrategy': 'FIXED_SIZE',
            'fixedSizeChunkingConfiguration': {
                'maxTokens': 512,
                'overlapPercentage': 20
            }
        }
    }
)
data_source_id = ds_response['dataSource']['dataSourceId']

# 3. Start Ingestion
ingestion_response = bedrock_agent.start_ingestion_job(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    description='Initial document ingestion'
)
print(f"Ingestion Job ID: {ingestion_response['ingestionJob']['ingestionJobId']}")

# 4. Query with Retrieve and Generate
response = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'What is our vacation policy?'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': knowledge_base_id,
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
            'retrievalConfiguration': {
                'vectorSearchConfiguration': {
                    'numberOfResults': 5,
                    'overrideSearchType': 'HYBRID'
                }
            }
        }
    }
)

print(f"Answer: {response['output']['text']}")
print("\nSources:")
for citation in response['citations']:
    for reference in citation['retrievedReferences']:
        print(f"  - {reference['location']['s3Location']['uri']}")
```

---

Vector Store Options

1. Amazon OpenSearch Serverless

Best for: Production RAG applications with auto-scaling requirements

Benefits:

  • Fully managed, serverless operation
  • Auto-scaling compute and storage
  • High availability with multi-AZ deployment
  • Fast query performance

Configuration:

```python
storageConfiguration={
    'type': 'OPENSEARCH_SERVERLESS',
    'opensearchServerlessConfiguration': {
        'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
        'vectorIndexName': 'bedrock-knowledge-base-index',
        'fieldMapping': {
            'vectorField': 'bedrock-knowledge-base-default-vector',
            'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
            'metadataField': 'AMAZON_BEDROCK_METADATA'
        }
    }
}
```

2. Amazon S3 Vectors (Preview)

Best for: Cost-optimized, large-scale RAG applications

Benefits:

  • Up to 90% cost reduction for vector storage
  • Built-in vector support in S3
  • Subsecond query performance
  • Massive scale and durability

Ideal Use Cases:

  • Large document collections (millions of chunks)
  • Cost-sensitive applications
  • Archival knowledge bases
  • Low-to-medium QPS workloads

Configuration:

```python
storageConfiguration={
    'type': 'S3_VECTORS',
    's3VectorsConfiguration': {
        'bucketArn': 'arn:aws:s3:::my-vector-bucket',
        'prefix': 'vectors/'
    }
}
```

Limitations:

  • Still in preview (no CloudFormation/CDK support yet)
  • Not suitable for high QPS, millisecond-latency requirements
  • Best for cost optimization over ultra-low latency

3. Amazon Neptune Analytics (GraphRAG)

Best for: Interconnected knowledge domains requiring relationship-aware retrieval

Benefits:

  • Automatic graph creation linking related content
  • Improved retrieval accuracy through relationships
  • Comprehensive responses leveraging knowledge graph
  • Explainable results with relationship context

Use Cases:

  • Legal document analysis with case precedents
  • Scientific research with paper citations
  • Product catalogs with dependencies
  • Organizational knowledge with team relationships

Configuration:

```python
storageConfiguration={
    'type': 'NEPTUNE_ANALYTICS',
    'neptuneAnalyticsConfiguration': {
        'graphArn': 'arn:aws:neptune-graph:us-east-1:123456789012:graph/g-12345678',
        'vectorSearchConfiguration': {
            'vectorField': 'embedding'
        }
    }
}
```

4. Amazon OpenSearch Service Managed Cluster

Best for: Existing OpenSearch infrastructure, advanced customization

Configuration:

```python
storageConfiguration={
    'type': 'OPENSEARCH_SERVICE',
    'opensearchServiceConfiguration': {
        'clusterArn': 'arn:aws:es:us-east-1:123456789012:domain/my-domain',
        'vectorIndexName': 'bedrock-kb-index',
        'fieldMapping': {
            'vectorField': 'embedding',
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}
```

5. Third-Party Vector Databases

Pinecone:

```python
storageConfiguration={
    'type': 'PINECONE',
    'pineconeConfiguration': {
        'connectionString': 'https://my-index-abc123.svc.us-west1-gcp.pinecone.io',
        'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:pinecone-api-key',
        'namespace': 'bedrock-kb',
        'fieldMapping': {
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}
```

MongoDB Atlas:

```python
storageConfiguration={
    'type': 'MONGODB_ATLAS',
    'mongoDbAtlasConfiguration': {
        'endpoint': 'https://cluster0.mongodb.net',
        'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:mongodb-creds',
        'databaseName': 'bedrock_kb',
        'collectionName': 'vectors',
        'vectorIndexName': 'vector_index',
        'fieldMapping': {
            'vectorField': 'embedding',
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}
```

Redis Enterprise Cloud:

```python
storageConfiguration={
    'type': 'REDIS_ENTERPRISE_CLOUD',
    'redisEnterpriseCloudConfiguration': {
        'endpoint': 'redis-12345.c1.us-east-1-2.ec2.cloud.redislabs.com:12345',
        'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:redis-creds',
        'vectorIndexName': 'bedrock-kb-index',
        'fieldMapping': {
            'vectorField': 'embedding',
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}
```

---

Data Source Configuration

1. Amazon S3

Supported File Types: PDF, TXT, MD, HTML, DOC, DOCX, CSV, XLS, XLSX

```python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='s3-technical-docs',
    description='Technical documentation from S3',
    dataSourceConfiguration={
        'type': 'S3',
        's3Configuration': {
            'bucketArn': 'arn:aws:s3:::my-docs-bucket',
            'inclusionPrefixes': ['docs/technical/', 'docs/manuals/'],
            'exclusionPrefixes': ['docs/archive/']
        }
    }
)
```

2. Web Crawler

Automatic website scraping and indexing:

```python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='company-website',
    description='Public company website content',
    dataSourceConfiguration={
        'type': 'WEB',
        'webConfiguration': {
            'sourceConfiguration': {
                'urlConfiguration': {
                    'seedUrls': [
                        {'url': 'https://www.example.com/docs'},
                        {'url': 'https://www.example.com/blog'}
                    ]
                }
            },
            'crawlerConfiguration': {
                'crawlerLimits': {
                    'rateLimit': 300  # Pages per minute
                }
            }
        }
    }
)
```

3. Confluence

```python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='confluence-wiki',
    description='Company Confluence knowledge base',
    dataSourceConfiguration={
        'type': 'CONFLUENCE',
        'confluenceConfiguration': {
            'sourceConfiguration': {
                'hostUrl': 'https://company.atlassian.net/wiki',
                'hostType': 'SAAS',
                'authType': 'BASIC',
                'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:confluence-creds'
            },
            'crawlerConfiguration': {
                'filterConfiguration': {
                    'type': 'PATTERN',
                    'patternObjectFilter': {
                        'filters': [
                            {
                                'objectType': 'Space',
                                'inclusionFilters': ['Engineering', 'Product'],
                                'exclusionFilters': ['Archive']
                            }
                        ]
                    }
                }
            }
        }
    }
)
```

4. SharePoint

```python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='sharepoint-docs',
    description='SharePoint document library',
    dataSourceConfiguration={
        'type': 'SHAREPOINT',
        'sharePointConfiguration': {
            'sourceConfiguration': {
                'siteUrls': [
                    'https://company.sharepoint.com/sites/Engineering',
                    'https://company.sharepoint.com/sites/Product'
                ],
                'tenantId': 'tenant-id',
                'domain': 'company',
                'authType': 'OAUTH2_CLIENT_CREDENTIALS',
                'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:sharepoint-creds'
            }
        }
    }
)
```

5. Salesforce

```python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='salesforce-knowledge',
    description='Salesforce knowledge articles',
    dataSourceConfiguration={
        'type': 'SALESFORCE',
        'salesforceConfiguration': {
            'sourceConfiguration': {
                'hostUrl': 'https://company.my.salesforce.com',
                'authType': 'OAUTH2_CLIENT_CREDENTIALS',
                'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:salesforce-creds'
            },
            'crawlerConfiguration': {
                'filterConfiguration': {
                    'type': 'PATTERN',
                    'patternObjectFilter': {
                        'filters': [
                            {
                                'objectType': 'Knowledge',
                                'inclusionFilters': ['Product_Documentation', 'Support_Articles']
                            }
                        ]
                    }
                }
            }
        }
    }
)
```

---

Chunking Strategies

1. Fixed-Size Chunking

Best for: Simple documents with uniform structure

How it works: Splits text into chunks of fixed token size with overlap

Parameters:

  • maxTokens: 200-8192 tokens (typically 512-1024)
  • overlapPercentage: 10-50% (typically 20%)

Configuration:

```python
vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'FIXED_SIZE',
        'fixedSizeChunkingConfiguration': {
            'maxTokens': 512,
            'overlapPercentage': 20
        }
    }
}
```

Use Cases:

  • Blog posts and articles
  • Technical documentation with consistent formatting
  • FAQs and Q&A content
  • Simple text files

Pros:

  • Fast and predictable
  • No additional costs
  • Easy to tune

Cons:

  • May split semantic units awkwardly
  • Doesn't respect document structure
  • Can break context mid-sentence

2. Semantic Chunking

Best for: Documents without clear boundaries (legal, technical, academic)

How it works: Uses sentence similarity to group related content

Parameters:

  • maxTokens: 20-8192 tokens (typically 300-500)
  • bufferSize: Number of neighboring sentences (default: 1)
  • breakpointPercentileThreshold: Similarity threshold (recommended: 95%)

Configuration:

```python
vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'SEMANTIC',
        'semanticChunkingConfiguration': {
            'maxTokens': 300,
            'bufferSize': 1,
            'breakpointPercentileThreshold': 95
        }
    }
}
```

Use Cases:

  • Legal documents and contracts
  • Academic papers
  • Technical specifications
  • Medical records
  • Research reports

Pros:

  • Preserves semantic meaning
  • Better context preservation
  • Improved retrieval accuracy

Cons:

  • Additional cost (foundation model usage)
  • Slower ingestion
  • Less predictable chunk sizes

Cost Consideration: Semantic chunking uses foundation models for similarity analysis, incurring additional costs beyond storage and retrieval.

3. Hierarchical Chunking

Best for: Complex documents with nested structure

How it works: Creates parent and child chunks; retrieves child, returns parent for context

Parameters:

  • levelConfigurations: Array of chunk sizes (parent → child)
  • overlapTokens: Overlap between chunks

Configuration:

```python
vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'HIERARCHICAL',
        'hierarchicalChunkingConfiguration': {
            'levelConfigurations': [
                {'maxTokens': 1500},  # Parent chunk (comprehensive context)
                {'maxTokens': 300}    # Child chunk (focused retrieval)
            ],
            'overlapTokens': 60
        }
    }
}
```

Use Cases:

  • Technical manuals with sections and subsections
  • Academic papers with abstract, sections, and subsections
  • Legal documents with articles and clauses
  • Product documentation with categories and details

How Retrieval Works:

  1. Query matches against child chunks (fast, focused)
  2. Returns parent chunks (comprehensive context)
  3. Best of both: precision retrieval + complete context

Pros:

  • Optimal balance of precision and context
  • Excellent for nested documents
  • Better accuracy for complex queries

Cons:

  • More complex configuration
  • Larger storage footprint
  • Requires understanding of document structure

4. Custom Chunking (Lambda)

Best for: Specialized domain logic, custom parsing requirements

How it works: Invoke Lambda function for custom chunking logic

Configuration:

```python
vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'NONE'  # Chunking handled by the custom Lambda
    },
    'customTransformationConfiguration': {
        'intermediateStorage': {
            's3Location': {
                'uri': 's3://my-kb-bucket/intermediate/'
            }
        },
        'transformations': [
            {
                'stepToApply': 'POST_CHUNKING',
                'transformationFunction': {
                    'transformationLambdaConfiguration': {
                        'lambdaArn': 'arn:aws:lambda:us-east-1:123456789012:function:custom-chunker'
                    }
                }
            }
        ]
    }
}
```

Example Lambda Handler:

```python
# Lambda function for custom chunking
import json

def lambda_handler(event, context):
    """
    Custom chunking logic for specialized documents.
    Input: event contains document content and metadata.
    Output: array of chunks with text and metadata.
    """
    # Extract document content
    document = event['document']
    content = document['content']
    metadata = document.get('metadata', {})

    # Custom chunking logic (example: split by custom delimiter)
    chunks = []
    sections = content.split('---SECTION---')
    for idx, section in enumerate(sections):
        if section.strip():
            chunks.append({
                'text': section.strip(),
                'metadata': {
                    **metadata,
                    'chunk_id': f'section_{idx}',
                    'chunk_type': 'custom_section'
                }
            })

    return {'chunks': chunks}
```

Use Cases:

  • Medical records with structured sections (SOAP notes)
  • Financial documents with tables and calculations
  • Code documentation with code blocks and explanations
  • Domain-specific formats (HL7, FHIR, etc.)

Pros:

  • Complete control over chunking logic
  • Can handle any document format
  • Integrate domain expertise

Cons:

  • Requires Lambda development and maintenance
  • Additional operational complexity
  • Harder to debug and iterate

Chunking Strategy Selection Guide

| Document Type | Recommended Strategy | Rationale |
|---------------|----------------------|-----------|
| Blog posts, articles | Fixed-size | Simple, uniform structure |
| Legal documents | Semantic | Preserve legal reasoning flow |
| Technical manuals | Hierarchical | Nested sections and subsections |
| Academic papers | Hierarchical | Abstract, sections, subsections |
| FAQs | Fixed-size | Independent Q&A pairs |
| Medical records | Custom Lambda | Structured sections (SOAP, HL7) |
| Code documentation | Custom Lambda | Code blocks + explanations |
| Product catalogs | Fixed-size | Uniform product descriptions |
| Research reports | Semantic | Preserve research narrative |
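For scripted setups, the table above can be collapsed into a small lookup helper. A minimal sketch; the document-type keys and the fixed-size fallback are illustrative assumptions, not part of any Bedrock API:

```python
# Illustrative mapping from document type to chunking strategy,
# mirroring the selection guide above. Keys are hypothetical labels.
CHUNKING_BY_DOC_TYPE = {
    'blog_post': 'FIXED_SIZE',
    'faq': 'FIXED_SIZE',
    'legal_document': 'SEMANTIC',
    'research_report': 'SEMANTIC',
    'technical_manual': 'HIERARCHICAL',
    'academic_paper': 'HIERARCHICAL',
    'medical_record': 'CUSTOM_LAMBDA',
}

def recommend_strategy(doc_type: str) -> str:
    # Fall back to fixed-size chunking for unknown document types
    return CHUNKING_BY_DOC_TYPE.get(doc_type, 'FIXED_SIZE')
```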

---

Retrieval Operations

1. Retrieve API (Retrieval Only)

Returns raw retrieved chunks without generation.

Use Cases:

  • Custom generation logic
  • Debugging retrieval quality
  • Building custom RAG pipelines
  • Integrating with non-Bedrock models

```python
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={
        'text': 'What are the benefits of hierarchical chunking?'
    },
    retrievalConfiguration={
        'vectorSearchConfiguration': {
            'numberOfResults': 5,
            'overrideSearchType': 'HYBRID',  # or 'SEMANTIC'
            'filter': {
                'andAll': [
                    {
                        'equals': {
                            'key': 'document_type',
                            'value': 'technical_guide'
                        }
                    },
                    {
                        'greaterThan': {
                            'key': 'publish_year',
                            'value': 2024
                        }
                    }
                ]
            }
        }
    }
)

# Process retrieved chunks
for result in response['retrievalResults']:
    print(f"Score: {result['score']}")
    print(f"Content: {result['content']['text']}")
    print(f"Location: {result['location']}")
    print(f"Metadata: {result.get('metadata', {})}")
    print("---")
```

2. Retrieve and Generate API (RAG)

Returns generated response with source attribution.

Use Cases:

  • Complete RAG workflows
  • Question answering
  • Document summarization
  • Chatbots with knowledge bases

```python
response = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'Explain semantic chunking benefits and when to use it'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
            'retrievalConfiguration': {
                'vectorSearchConfiguration': {
                    'numberOfResults': 5,
                    'overrideSearchType': 'HYBRID'
                }
            },
            'generationConfiguration': {
                'inferenceConfig': {
                    'textInferenceConfig': {
                        'temperature': 0.7,
                        'maxTokens': 2048,
                        'topP': 0.9
                    }
                },
                'promptTemplate': {
                    'textPromptTemplate': '''You are a helpful assistant. Answer the user's question based on the provided context.

Context: $search_results$

Question: $query$

Answer:'''
                }
            }
        }
    }
)

print(f"Generated Response: {response['output']['text']}")
print("\nSources:")
for citation in response['citations']:
    for reference in citation['retrievedReferences']:
        print(f"  - {reference['location']}")
        print(f"    Relevance Score: {reference.get('score', 'N/A')}")
```

3. Multi-Turn Conversations with Session Management

Bedrock automatically manages conversation context across turns.

```python
# First turn - creates a session automatically
response1 = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'What is Amazon Bedrock Knowledge Bases?'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
        }
    }
)
session_id = response1['sessionId']
print(f"Session ID: {session_id}")
print(f"Response: {response1['output']['text']}\n")

# Follow-up turn - reuse the session for context
response2 = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'What chunking strategies does it support?'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
        }
    },
    sessionId=session_id  # Continue conversation with context
)
print(f"Follow-up Response: {response2['output']['text']}")

# Third turn
response3 = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'Which strategy would you recommend for legal documents?'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
        }
    },
    sessionId=session_id
)
print(f"Third Response: {response3['output']['text']}")
```

4. Advanced Metadata Filtering

Filter retrieval by metadata attributes for precision.

```python
response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={
        'text': 'Security best practices for production deployments'
    },
    retrievalConfiguration={
        'vectorSearchConfiguration': {
            'numberOfResults': 10,
            'overrideSearchType': 'HYBRID',
            'filter': {
                'andAll': [
                    {
                        'equals': {
                            'key': 'document_type',
                            'value': 'security_guide'
                        }
                    },
                    {
                        'greaterThanOrEquals': {
                            'key': 'publish_year',
                            'value': 2024
                        }
                    },
                    {
                        'in': {
                            'key': 'category',
                            'value': ['production', 'security', 'compliance']
                        }
                    }
                ]
            }
        }
    }
)
```

Supported Filter Operators:

  • equals: Exact match
  • notEquals: Not equal
  • greaterThan, greaterThanOrEquals: Numeric comparison
  • lessThan, lessThanOrEquals: Numeric comparison
  • in: Match any value in array
  • notIn: Not match any value in array
  • startsWith: String prefix match
  • andAll: Combine filters with AND
  • orAll: Combine filters with OR
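As a sketch of how these operators compose, the query below combines orAll with startsWith and notEquals; the metadata keys are illustrative assumptions, not required attribute names:

```python
import boto3

bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

# Match chunks whose category starts with 'eng' OR whose author is not
# 'unknown'. The metadata keys are examples; use your own attributes.
filter_example = {
    'orAll': [
        {'startsWith': {'key': 'category', 'value': 'eng'}},
        {'notEquals': {'key': 'author', 'value': 'unknown'}}
    ]
}

response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={'text': 'deployment runbooks'},
    retrievalConfiguration={
        'vectorSearchConfiguration': {
            'numberOfResults': 5,
            'filter': filter_example
        }
    }
)
```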

---

Ingestion Management

1. Start Ingestion Job

```python
ingestion_response = bedrock_agent.start_ingestion_job(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    description='Monthly document sync',
    clientToken='unique-idempotency-token-123'
)
job_id = ingestion_response['ingestionJob']['ingestionJobId']
print(f"Ingestion Job ID: {job_id}")
```

2. Monitor Ingestion Job

```python
import time

# Get job status
job_status = bedrock_agent.get_ingestion_job(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    ingestionJobId=job_id
)
print(f"Status: {job_status['ingestionJob']['status']}")
print(f"Started: {job_status['ingestionJob']['startedAt']}")
print(f"Updated: {job_status['ingestionJob']['updatedAt']}")

if 'statistics' in job_status['ingestionJob']:
    stats = job_status['ingestionJob']['statistics']
    print(f"Documents Scanned: {stats['numberOfDocumentsScanned']}")
    print(f"Documents Indexed: {stats['numberOfDocumentsIndexed']}")
    print(f"Documents Failed: {stats['numberOfDocumentsFailed']}")

# Wait for completion
while True:
    status = bedrock_agent.get_ingestion_job(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
        ingestionJobId=job_id
    )
    current_status = status['ingestionJob']['status']
    if current_status in ['COMPLETE', 'FAILED']:
        print(f"Ingestion job {current_status}")
        break
    print(f"Status: {current_status}, waiting...")
    time.sleep(30)
```

3. List Ingestion Jobs

```python
list_response = bedrock_agent.list_ingestion_jobs(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    maxResults=50
)
for job in list_response['ingestionJobSummaries']:
    print(f"Job ID: {job['ingestionJobId']}")
    print(f"Status: {job['status']}")
    print(f"Started: {job['startedAt']}")
    print(f"Updated: {job['updatedAt']}")
    print("---")
```

---

Integration with Bedrock Agents

1. Agent with Knowledge Base Action

```python
bedrock_agent = boto3.client('bedrock-agent', region_name='us-east-1')

# Create agent with knowledge base
agent_response = bedrock_agent.create_agent(
    agentName='customer-support-agent',
    description='Customer support agent with knowledge base access',
    instruction='''You are a customer support agent. When answering questions:
1. Search the knowledge base for relevant information
2. Provide accurate answers based on retrieved context
3. Cite your sources
4. Admit when you don't know something''',
    foundationModel='anthropic.claude-3-sonnet-20240229-v1:0',
    agentResourceRoleArn='arn:aws:iam::123456789012:role/BedrockAgentRole'
)
agent_id = agent_response['agent']['agentId']

# Associate knowledge base with agent
kb_association = bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB123456',
    description='Company documentation knowledge base',
    knowledgeBaseState='ENABLED'
)

# Prepare and create alias
bedrock_agent.prepare_agent(agentId=agent_id)
alias_response = bedrock_agent.create_agent_alias(
    agentId=agent_id,
    agentAliasName='production',
    description='Production alias'
)
agent_alias_id = alias_response['agentAlias']['agentAliasId']

# Invoke agent (automatically queries knowledge base)
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

response = bedrock_agent_runtime.invoke_agent(
    agentId=agent_id,
    agentAliasId=agent_alias_id,
    sessionId='session-123',
    inputText='What is our return policy for defective products?'
)
for event in response['completion']:
    if 'chunk' in event:
        chunk = event['chunk']
        print(chunk['bytes'].decode())
```

2. Agent with Multiple Knowledge Bases

```python
# Associate multiple knowledge bases
bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB-PRODUCT-DOCS',
    description='Product documentation'
)
bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB-SUPPORT-ARTICLES',
    description='Support knowledge articles'
)
bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB-COMPANY-POLICIES',
    description='Company policies and procedures'
)

# The agent automatically searches all associated knowledge bases and combines results
```

---

Best Practices

1. Chunking Strategy Selection

Decision Framework:

  1. Simple, uniform documents → Fixed-size chunking
     - Blog posts, articles, simple FAQs
     - Fast, predictable, cost-effective
  2. Documents without clear boundaries → Semantic chunking
     - Legal documents, contracts, academic papers
     - Preserves semantic meaning, better accuracy
     - Consider the additional cost
  3. Nested, hierarchical documents → Hierarchical chunking
     - Technical manuals, product docs, research papers
     - Best balance of precision and context
     - Optimal for complex structures
  4. Specialized formats → Custom Lambda chunking
     - Medical records (HL7, FHIR), code docs, custom formats
     - Complete control, domain expertise
     - Higher operational complexity

Tuning Guidelines:

  • Fixed-size: Start with 512 tokens, 20% overlap
  • Semantic: Start with 300 tokens, bufferSize=1, threshold=95%
  • Hierarchical: Parent 1500 tokens, child 300 tokens, overlap 60 tokens
  • Custom: Test extensively with domain experts
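These starting points can be captured as ready-to-paste configuration dicts. The values simply mirror the guidelines above; tune them after measuring retrieval quality on your own corpus:

```python
# Starting-point chunking configurations mirroring the tuning guidelines.
FIXED_SIZE_START = {
    'chunkingStrategy': 'FIXED_SIZE',
    'fixedSizeChunkingConfiguration': {'maxTokens': 512, 'overlapPercentage': 20}
}
SEMANTIC_START = {
    'chunkingStrategy': 'SEMANTIC',
    'semanticChunkingConfiguration': {
        'maxTokens': 300,
        'bufferSize': 1,
        'breakpointPercentileThreshold': 95
    }
}
HIERARCHICAL_START = {
    'chunkingStrategy': 'HIERARCHICAL',
    'hierarchicalChunkingConfiguration': {
        'levelConfigurations': [{'maxTokens': 1500}, {'maxTokens': 300}],
        'overlapTokens': 60
    }
}
```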

2. Retrieval Optimization

Number of Results:

  • Start with 5-10 results
  • Increase if answers lack detail
  • Decrease if too much noise

Search Type:

  • SEMANTIC: Pure vector similarity (faster, good for conceptual queries)
  • HYBRID: Vector + keyword (better recall, recommended for production)

Use Hybrid Search when:

  • Queries contain specific terms or names
  • Need to match exact keywords
  • Domain has specialized vocabulary

Use Semantic Search when:

  • Purely conceptual queries
  • Prioritizing speed over perfect recall
  • Well-embedded domain knowledge

Metadata Filters:

  • Always use when applicable
  • Dramatically improves precision
  • Reduces retrieval latency
  • Examples: document_type, publish_date, category, author

3. Cost Optimization

S3 Vectors:

  • Use for large-scale knowledge bases (millions of chunks)
  • Up to 90% cost savings vs. OpenSearch
  • Ideal for cost-sensitive applications
  • Trade-off: Slightly higher latency

Semantic Chunking:

  • Incurs foundation model costs during ingestion
  • Consider cost vs. accuracy benefit
  • May not be worth it for simple documents
  • Best for complex, high-value content

Ingestion Frequency:

  • Schedule ingestion during off-peak hours
  • Use incremental updates when possible
  • Don't re-ingest unchanged documents
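One way to schedule off-peak ingestion is an EventBridge rule that triggers a Lambda whose handler calls start_ingestion_job. A minimal sketch; the cron expression and Lambda ARN are assumptions:

```python
import boto3

events = boto3.client('events', region_name='us-east-1')

# Run at 03:00 UTC daily (off-peak); the schedule is an example.
events.put_rule(
    Name='kb-nightly-ingestion',
    ScheduleExpression='cron(0 3 * * ? *)',
    State='ENABLED'
)

# Target a Lambda (hypothetical ARN) whose handler calls
# bedrock_agent.start_ingestion_job(...) for the data source.
events.put_targets(
    Rule='kb-nightly-ingestion',
    Targets=[{
        'Id': 'kb-ingestion-lambda',
        'Arn': 'arn:aws:lambda:us-east-1:123456789012:function:start-kb-ingestion'
    }]
)
```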

Model Selection:

  • Use smaller embedding models when accuracy permits
  • Titan Embed Text v2 is cost-effective
  • Consider Cohere Embed for multilingual

Token Usage:

  • Monitor generation token usage
  • Set appropriate maxTokens limits
  • Use prompt templates to control verbosity

4. Session Management

Always Reuse Sessions:

  • Pass sessionId for follow-up turns
  • Bedrock handles context automatically
  • No manual conversation history needed

Session Lifecycle:

  • Sessions expire after inactivity (default: 60 minutes)
  • Create new session for unrelated conversations
  • Use unique sessionId per user/conversation

Context Limits:

  • Monitor conversation length
  • Long sessions may hit context limits
  • Consider summarization for very long conversations
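A minimal sketch of these practices, assuming a simple in-memory registry of one session per user and that a call failing because the session expired is retried without the stale sessionId:

```python
import boto3
from botocore.exceptions import ClientError

bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')
user_sessions = {}  # user_id -> sessionId (hypothetical in-memory registry)

def ask(user_id: str, text: str, rag_config: dict) -> str:
    request = {'input': {'text': text}, 'retrieveAndGenerateConfiguration': rag_config}
    if user_id in user_sessions:
        request['sessionId'] = user_sessions[user_id]
    try:
        response = bedrock_agent_runtime.retrieve_and_generate(**request)
    except ClientError:
        # Session may have expired after inactivity: retry with a fresh session
        request.pop('sessionId', None)
        response = bedrock_agent_runtime.retrieve_and_generate(**request)
    user_sessions[user_id] = response['sessionId']  # remember for the next turn
    return response['output']['text']
```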

5. GraphRAG with Neptune

When to Use:

  • Interconnected knowledge domains
  • Relationship-aware queries
  • Need for explainability
  • Complex knowledge graphs

Benefits:

  • Automatic graph creation
  • Improved accuracy through relationships
  • Comprehensive answers
  • Explainable results

Considerations:

  • Higher setup complexity
  • Neptune Analytics costs
  • Best for domains with rich relationships

6. Data Source Management

S3 Best Practices:

  • Organize with clear prefixes
  • Use inclusion/exclusion filters
  • Maintain consistent metadata
  • Version documents when updating

Web Crawler:

  • Set appropriate rate limits
  • Use robots.txt for guidance
  • Monitor for broken links
  • Schedule regular re-crawls

Confluence/SharePoint:

  • Filter by spaces/sites
  • Exclude archived content
  • Use fine-grained permissions
  • Schedule incremental syncs

Metadata Enrichment:

  • Add custom metadata to documents
  • Include: document_type, publish_date, category, author, version
  • Enables powerful filtering
  • Improves retrieval precision
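For S3 data sources, custom metadata is supplied through a sidecar <document-name>.metadata.json file uploaded alongside the document. A sketch, with illustrative attribute names:

```python
import boto3
import json

s3 = boto3.client('s3')

# Sidecar metadata file for docs/guide.pdf; attribute names are examples
metadata = {
    'metadataAttributes': {
        'document_type': 'technical_guide',
        'publish_year': 2025,
        'category': 'production',
        'author': 'platform-team'
    }
}
s3.put_object(
    Bucket='my-docs-bucket',
    Key='docs/guide.pdf.metadata.json',
    Body=json.dumps(metadata)
)
```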

7. Monitoring and Debugging

Enable CloudWatch Logs:

```python
# Monitor retrieval quality.
# Track: query latency, retrieval scores, generation quality.
# Set alarms for: high latency, low scores, high error rates.
```
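A hedged alarm sketch follows. The AWS/Bedrock namespace and InvocationLatency metric apply to model invocations; verify the exact metrics your account publishes before relying on them:

```python
import boto3

cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')

# Alarm on high invocation latency. Namespace/metric are assumptions;
# check what your account actually publishes under AWS/Bedrock.
cloudwatch.put_metric_alarm(
    AlarmName='kb-high-invocation-latency',
    Namespace='AWS/Bedrock',
    MetricName='InvocationLatency',
    Statistic='Average',
    Period=300,
    EvaluationPeriods=3,
    Threshold=5000,  # milliseconds
    ComparisonOperator='GreaterThanThreshold'
)
```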

Test Retrieval Quality:

```python
# Use the retrieve API to debug retrieval in isolation
response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={'text': 'test query'}
)

# Analyze retrieval scores
for result in response['retrievalResults']:
    print(f"Score: {result['score']}")
    print(f"Content preview: {result['content']['text'][:200]}")
```

Common Issues:

  1. Low Retrieval Scores:
     - Check chunking strategy
     - Verify embedding model
     - Ensure documents are properly ingested
     - Consider semantic or hierarchical chunking
  2. Irrelevant Results:
     - Add metadata filters
     - Use hybrid search
     - Refine chunking strategy
     - Increase numberOfResults
  3. Missing Information:
     - Verify data source configuration
     - Check ingestion job status
     - Ensure documents are not excluded by filters
     - Increase numberOfResults
  4. Slow Retrieval:
     - Use metadata filters to narrow scope
     - Optimize vector database configuration
     - Consider S3 Vectors when cost matters more than latency
     - Reduce numberOfResults

8. Security Best Practices

IAM Permissions:

  • Use least privilege for Knowledge Base role
  • Separate roles for data sources, ingestion, retrieval
  • Enable VPC endpoints for private connectivity
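A minimal sketch of creating a least-privilege Knowledge Base execution role. The resource ARNs are placeholders and the required actions vary by vector store (OpenSearch Serverless assumed here):

```python
import boto3
import json

iam = boto3.client('iam')

# Trust policy letting the Bedrock service assume the role
trust = {
    'Version': '2012-10-17',
    'Statement': [{
        'Effect': 'Allow',
        'Principal': {'Service': 'bedrock.amazonaws.com'},
        'Action': 'sts:AssumeRole'
    }]
}
iam.create_role(RoleName='BedrockKBRole', AssumeRolePolicyDocument=json.dumps(trust))

# Inline policy scoped to the embedding model, the S3 source, and the
# vector store collection. ARNs are placeholders; scope to your resources.
policy = {
    'Version': '2012-10-17',
    'Statement': [
        {'Effect': 'Allow', 'Action': 'bedrock:InvokeModel',
         'Resource': 'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0'},
        {'Effect': 'Allow', 'Action': ['s3:GetObject', 's3:ListBucket'],
         'Resource': ['arn:aws:s3:::my-docs-bucket', 'arn:aws:s3:::my-docs-bucket/*']},
        {'Effect': 'Allow', 'Action': 'aoss:APIAccessAll',
         'Resource': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection'}
    ]
}
iam.put_role_policy(RoleName='BedrockKBRole', PolicyName='kb-access',
                    PolicyDocument=json.dumps(policy))
```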

Data Encryption:

  • All data encrypted at rest (AWS KMS)
  • Data encrypted in transit (TLS)
  • Use customer-managed KMS keys for compliance

Access Control:

  • Use IAM policies to control who can query
  • Implement fine-grained access control
  • Monitor access with CloudTrail

PII Handling:

  • Use Bedrock Guardrails for PII redaction
  • Implement data masking for sensitive fields
  • Consider custom Lambda for advanced PII handling
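Guardrails attach to retrieve_and_generate through the generation configuration. A sketch, assuming a pre-created guardrail ID and version:

```python
import boto3

bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

# Attach a pre-created guardrail (IDs below are placeholders) so PII in
# generated answers is masked or blocked per the guardrail's policy.
response = bedrock_agent_runtime.retrieve_and_generate(
    input={'text': "Summarize last week's customer escalation"},
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
            'generationConfiguration': {
                'guardrailConfiguration': {
                    'guardrailId': 'gr-abc123',
                    'guardrailVersion': '1'
                }
            }
        }
    }
)
```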

---

Complete Production Example

End-to-End RAG Application

```python
import boto3
import json
import time
from typing import List, Dict, Optional


class BedrockKnowledgeBaseRAG:
    """Production RAG application with Amazon Bedrock Knowledge Bases"""

    def __init__(self, region_name: str = 'us-east-1'):
        self.bedrock_agent = boto3.client('bedrock-agent', region_name=region_name)
        self.bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name=region_name)

    def create_knowledge_base(
        self,
        name: str,
        description: str,
        role_arn: str,
        vector_store_config: Dict,
        embedding_model: str = 'amazon.titan-embed-text-v2:0'
    ) -> str:
        """Create knowledge base with vector store"""
        response = self.bedrock_agent.create_knowledge_base(
            name=name,
            description=description,
            roleArn=role_arn,
            knowledgeBaseConfiguration={
                'type': 'VECTOR',
                'vectorKnowledgeBaseConfiguration': {
                    'embeddingModelArn': f'arn:aws:bedrock:us-east-1::foundation-model/{embedding_model}'
                }
            },
            storageConfiguration=vector_store_config
        )
        return response['knowledgeBase']['knowledgeBaseId']

    def add_s3_data_source(
        self,
        knowledge_base_id: str,
        name: str,
        bucket_arn: str,
        inclusion_prefixes: List[str],
        chunking_strategy: str = 'FIXED_SIZE',
        chunking_config: Optional[Dict] = None
    ) -> str:
        """Add S3 data source with chunking configuration"""
        if chunking_config is None:
            chunking_config = {
                'maxTokens': 512,
                'overlapPercentage': 20
            }

        vector_ingestion_config = {
            'chunkingConfiguration': {
                'chunkingStrategy': chunking_strategy
            }
        }
        if chunking_strategy == 'FIXED_SIZE':
            vector_ingestion_config['chunkingConfiguration']['fixedSizeChunkingConfiguration'] = chunking_config
        elif chunking_strategy == 'SEMANTIC':
            vector_ingestion_config['chunkingConfiguration']['semanticChunkingConfiguration'] = chunking_config
        elif chunking_strategy == 'HIERARCHICAL':
            vector_ingestion_config['chunkingConfiguration']['hierarchicalChunkingConfiguration'] = chunking_config

        response = self.bedrock_agent.create_data_source(
            knowledgeBaseId=knowledge_base_id,
            name=name,
            description=f'S3 data source: {name}',
            dataSourceConfiguration={
                'type': 'S3',
                's3Configuration': {
                    'bucketArn': bucket_arn,
                    'inclusionPrefixes': inclusion_prefixes
                }
            },
            vectorIngestionConfiguration=vector_ingestion_config
        )
        return response['dataSource']['dataSourceId']

    def ingest_data(self, knowledge_base_id: str, data_source_id: str) -> str:
        """Start ingestion job and wait for completion"""
        # Start ingestion
        response = self.bedrock_agent.start_ingestion_job(
            knowledgeBaseId=knowledge_base_id,
            dataSourceId=data_source_id,
            description='Automated ingestion'
        )
        job_id = response['ingestionJob']['ingestionJobId']

        # Wait for completion
        while True:
            status_response = self.bedrock_agent.get_ingestion_job(
                knowledgeBaseId=knowledge_base_id,
                dataSourceId=data_source_id,
                ingestionJobId=job_id
            )
            status = status_response['ingestionJob']['status']
            if status == 'COMPLETE':
                print("Ingestion completed successfully")
                if 'statistics' in status_response['ingestionJob']:
                    stats = status_response['ingestionJob']['statistics']
                    print(f"Documents indexed: {stats.get('numberOfDocumentsIndexed', 0)}")
                break
            elif status == 'FAILED':
                print("Ingestion failed")
                break
            print(f"Ingestion status: {status}")
            time.sleep(30)

        return job_id

    def query(
        self,
        knowledge_base_id: str,
        query: str,
        model_arn: str = 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
        num_results: int = 5,
        search_type: str = 'HYBRID',
        metadata_filter: Optional[Dict] = None,
        session_id: Optional[str] = None
    ) -> Dict:
        """Query knowledge base with retrieve and generate"""
        retrieval_config = {
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': knowledge_base_id,
                'modelArn': model_arn,
                'retrievalConfiguration': {
                    'vectorSearchConfiguration': {
                        'numberOfResults': num_results,
                        'overrideSearchType': search_type
                    }
                },
                'generationConfiguration': {
                    'inferenceConfig': {
                        'textInferenceConfig': {
                            'temperature': 0.7,
                            'maxTokens': 2048
                        }
                    }
                }
            }
        }

        # Add metadata filter if provided
        if metadata_filter:
            retrieval_config['knowledgeBaseConfiguration']['retrievalConfiguration']['vectorSearchConfiguration']['filter'] = metadata_filter

        # Build request
        request = {
            'input': {'text': query},
            'retrieveAndGenerateConfiguration': retrieval_config
        }
        # Add session if provided
        if session_id:
            request['sessionId'] = session_id

        response = self.bedrock_agent_runtime.retrieve_and_generate(**request)
        return {
            'answer': response['output']['text'],
            'citations': response.get('citations', []),
            'session_id': response['sessionId']
        }

    def multi_turn_conversation(
        self,
        knowledge_base_id: str,
        queries: List[str],
        model_arn: str = 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
    ) -> List[Dict]:
        """Execute multi-turn conversation with context"""
        session_id = None
        conversation = []
        for query in queries:
            result = self.query(
                knowledge_base_id=knowledge_base_id,
                query=query,
                model_arn=model_arn,
                session_id=session_id
            )
            session_id = result['session_id']
            conversation.append({
                'query': query,
                'answer': result['answer'],
                'citations': result['citations']
            })
        return conversation


# Example Usage
if __name__ == '__main__':
    rag = BedrockKnowledgeBaseRAG(region_name='us-east-1')

    # Create knowledge base
    kb_id = rag.create_knowledge_base(
        name='production-docs-kb',
        description='Production documentation knowledge base',
        role_arn='arn:aws:iam::123456789012:role/BedrockKBRole',
        vector_store_config={
            'type': 'OPENSEARCH_SERVERLESS',
            'opensearchServerlessConfiguration': {
                'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
                'vectorIndexName': 'bedrock-kb-index',
                'fieldMapping': {
                    'vectorField': 'bedrock-knowledge-base-default-vector',
                    'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
                    'metadataField': 'AMAZON_BEDROCK_METADATA'
                }
            }
        }
    )

    # Add data source
    ds_id = rag.add_s3_data_source(
        knowledge_base_id=kb_id,
        name='technical-docs',
        bucket_arn='arn:aws:s3:::my-docs-bucket',
        inclusion_prefixes=['docs/'],
        chunking_strategy='HIERARCHICAL',
        chunking_config={
            'levelConfigurations': [
                {'maxTokens': 1500},
                {'maxTokens': 300}
            ],
            'overlapTokens': 60
        }
    )

    # Ingest data
    rag.ingest_data(kb_id, ds_id)

    # Single query
    result = rag.query(
        knowledge_base_id=kb_id,
        query='What are the best practices for RAG applications?',
        metadata_filter={
            'equals': {
                'key': 'document_type',
                'value': 'best_practices'
            }
        }
    )
    print(f"Answer: {result['answer']}")
    print("\nSources:")
    for citation in result['citations']:
        for ref in citation['retrievedReferences']:
            print(f"  - {ref['location']}")

    # Multi-turn conversation
    conversation = rag.multi_turn_conversation(
        knowledge_base_id=kb_id,
        queries=[
            'What is hierarchical chunking?',
            'When should I use it?',
            'What are the configuration parameters?'
        ]
    )
    for turn in conversation:
        print(f"\nQ: {turn['query']}")
        print(f"A: {turn['answer']}")
```

---

Related Skills

Amazon Bedrock Core Skills

  • bedrock-guardrails: Content safety, PII redaction, hallucination detection
  • bedrock-agents: Agentic workflows with tool use and knowledge bases
  • bedrock-flows: Visual workflow builder for generative AI
  • bedrock-model-customization: Fine-tuning, reinforcement fine-tuning, distillation
  • bedrock-prompt-management: Prompt versioning and deployment

AWS Infrastructure Skills

  • opensearch-serverless: Vector database configuration and management
  • neptune-analytics: GraphRAG configuration and queries
  • s3-management: S3 bucket configuration for data sources and vectors
  • iam-bedrock: IAM roles and policies for Knowledge Bases

Observability Skills

  • cloudwatch-bedrock-monitoring: Monitor Knowledge Bases metrics and logs
  • bedrock-cost-optimization: Track and optimize Knowledge Bases costs

---

Additional Resources

Official Documentation

  • [Amazon Bedrock Knowledge Bases](https://aws.amazon.com/bedrock/knowledge-bases/)
  • [Knowledge Bases User Guide](https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html)
  • [Chunking Strategies](https://docs.aws.amazon.com/bedrock/latest/userguide/kb-chunking.html)
  • [Boto3 Knowledge Bases API](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-agent.html)

Best Practices

  • [Building Cost-Effective RAG with S3 Vectors](https://aws.amazon.com/blogs/machine-learning/building-cost-effective-rag-applications-with-amazon-bedrock-knowledge-bases-and-amazon-s3-vectors/)
  • [Advanced Parsing and Chunking](https://aws.amazon.com/blogs/machine-learning/amazon-bedrock-knowledge-bases-now-supports-advanced-parsing-chunking-and-query-reformulation-giving-greater-control-of-accuracy-in-rag-based-applications/)

Research Document

  • /mnt/c/data/github/skrillz/AMAZON-BEDROCK-COMPREHENSIVE-RESEARCH-2025.md - Section 2 (Complete Knowledge Bases research)