RAG pipeline with CI/CD automation #368

Open
wants to merge 49 commits into
base: main
b3d0a56
Initial commit from private repo
Oct 18, 2024
567b3dc
docs(readme): Add initial readme documentation
Oct 18, 2024
d2dd020
chore: remove files from tracking based on .gitignore
Oct 18, 2024
5673e90
chore: change pipeline branch
Oct 18, 2024
6f5d66e
chore: change pipeline branch
Oct 18, 2024
c26d6ae
chore: change pipeline branch
Oct 18, 2024
586d28a
fix: github source root directoy's path
Oct 18, 2024
880698a
debug: print statments in the build process
Oct 18, 2024
ce3e632
debug: print path roots in the build process
Oct 18, 2024
9353d06
debug: print path roots in the build process
Oct 18, 2024
392ff5f
fix: build commands
Oct 18, 2024
a6cc539
debug: code build steps in the qa stage
Oct 18, 2024
2e02957
fix: build_lambda.sh path
Oct 18, 2024
beee3f6
fix: build_lambda.sh path
Oct 18, 2024
c336a07
fix: prod code build steps
Oct 20, 2024
c114e2e
fix: restrict lambda to access only specific kb
Oct 21, 2024
ea09191
fix: restrict lambda to access only specific kb
Oct 21, 2024
22285d7
refactor: data ingestion stack to reduce boilerplate code
Oct 22, 2024
c144296
refactor: data ingestion stack to reduce boilerplate code
Oct 22, 2024
4596891
refactor: rag eval stack to reduce boilerplate code
Oct 22, 2024
26026fb
chore: remove commented lines
Oct 23, 2024
6d8e5a3
refactor: remove image processing from custom chunking lambda function
Oct 23, 2024
1b1b202
fix: custom chunking lambda function python file name
Oct 23, 2024
eab41ad
refactor: remove image processing from custom chunking lambda function
Oct 23, 2024
7581ee8
refactor: remove python dependencies needed for image processing
Oct 23, 2024
fd1efcd
feat: update stack deletion criteria
Oct 23, 2024
e28e486
feat: add rag eval metadata to dynamo db table
Oct 24, 2024
c57d17f
fix: dynamo db table parameter
Oct 24, 2024
5f215cb
fix: allow rag eval lambda to access dynamo db
Oct 24, 2024
af3d4dc
feat: improve copy files from qa to prod lambda
Oct 24, 2024
4441739
fix: allow move files lambda to access dynamod db table
Oct 24, 2024
5323fc2
fix: dynamo db table parameter
Oct 24, 2024
22df6a8
chore: rename main folder
Oct 24, 2024
c1fb94a
chore: cleanup
Oct 24, 2024
9d10a0d
fix: dynamo db table region in copy-files Lambda
Oct 25, 2024
b691214
fix: dynamo db table region in the ARN
Oct 25, 2024
b891a8d
chore: add more info to readme
Oct 25, 2024
5a8074c
chore: add explantory comments
Oct 25, 2024
500e450
chore: Consume prod region name from cdk.json
Oct 25, 2024
cbf9950
chore: delete multimodal folder
Oct 25, 2024
f706053
Merge branch 'aws-samples:main' into rag-cicd-without-image-processing
manoj-selvakumar5 Oct 25, 2024
00a37af
fix: build_lambda.sh path
Oct 25, 2024
d860f13
Merge branch 'aws-samples:main' into rag-cicd-without-image-processing
manoj-selvakumar5 Oct 25, 2024
6b08662
chore: folder restructure
Oct 28, 2024
08890fd
Merge branch 'aws-samples:main' into rag-cicd-without-image-processing
manoj-selvakumar5 Oct 28, 2024
ecb26be
chore: folder restructure
Oct 28, 2024
7eb7c18
fix: file path in build steps
Oct 28, 2024
c8ec821
chore: change pipeline name
Oct 28, 2024
3b2511f
fix: update readme
Nov 4, 2024
chore: cleanup
Manoj S authored and Manoj S committed Oct 24, 2024
commit c1fb94aa755bda431a31257ab2152b7600272e4e
Original file line number Diff line number Diff line change
@@ -10,7 +10,7 @@ Here’s how to set up and start the project
### Installation
1. **Navigate to the Project Folder**: Open your terminal or command prompt and navigate to the project folder.
```bash
cd path/to/MULTIMODAL-RAG-PIPELINE-WITH-CICD
cd path/to/RAG-PIPELINE-WITH-CICD
```

2. **Install Dependencies**:
3 changes: 2 additions & 1 deletion rag/automating-rag-pipeline/rag-pipeline-with-cicd/cdk.json
Original file line number Diff line number Diff line change
@@ -1,12 +1,13 @@
{
"app": "npx ts-node bin/main.ts",
"context": {
"defaultProject": "mm-rag-project",
"defaultProject": "rag-project",
"defaultRegion": "us-east-1",
"defaultAccount": "533267284022",
"defaultEnvironment": "QA",
"prodAccount": "533267284022",
"prodRegion": "us-west-2",
"bedrockModelID": "anthropic.claude-3-haiku-20240307-v1:0",
"@aws-cdk/customresources:installLatestAwsSdkDefault": false
}
}
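
The context keys above are read at synth time through `this.node.tryGetContext(...)`, which returns `undefined` when a key is absent. As a minimal sketch (not part of this PR), a small fail-fast lookup could wrap that call. The names `requireContext` and `ContextLookup` are hypothetical, and the `lookup` function stands in for `node.tryGetContext` so the example runs without the CDK installed.

```typescript
// Hypothetical helper, not part of this PR: fail fast on missing context keys.
// `lookup` stands in for `this.node.tryGetContext` so the sketch runs without the CDK.
type ContextLookup = (key: string) => unknown;

function requireContext(lookup: ContextLookup, key: string): string {
  const value = lookup(key);
  if (typeof value !== "string" || value.length === 0) {
    throw new Error(`Context key "${key}" is missing or empty in cdk.json.`);
  }
  return value;
}

// Sample context mirroring a few of the keys from the cdk.json shown above.
const sampleContext: Record<string, unknown> = {
  defaultProject: "rag-project",
  defaultRegion: "us-east-1",
  prodRegion: "us-west-2",
};
const lookup: ContextLookup = (key) => sampleContext[key];

const defaultRegion = requireContext(lookup, "defaultRegion");
console.log(defaultRegion);
```

Inside a stack, the same helper could be called as `requireContext((k) => this.node.tryGetContext(k), "prodRegion")`, assuming the value is stored as a string.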
Original file line number Diff line number Diff line change
@@ -47,7 +47,7 @@ export class BedrockStack extends Stack {

// Reference an existing S3 bucket by name
const docBucket = Bucket.fromBucketName(this, "DocBucket", docBucketName);
console.log("Processed Files S3 Data Source: ", docBucketName); // Log the selected bucket name
// console.log("Processed Files S3 Data Source: ", docBucketName); // Log the selected bucket name


// // Read the S3 bucket name from the SSM parameter store (the buckets are created as part of the data ingestion stack)
@@ -98,10 +98,10 @@ export class BedrockStack extends Stack {
let bedrockCustomLambdaBucketName: string;
if (props.stageName === "QA") {
bedrockCustomLambdaBucketName = StringParameter.valueForStringParameter(this, '/MultimodalRAG/PreQABucketSetupStage/lambda-package-bucket-name');
console.log("Bedrock Custom Lambda Package Bucket Name: ", bedrockCustomLambdaBucketName); // Log the bucket name
// console.log("Bedrock Custom Lambda Package Bucket Name: ", bedrockCustomLambdaBucketName); // Log the bucket name
} else {
bedrockCustomLambdaBucketName = StringParameter.valueForStringParameter(this, '/MultimodalRAG/PreProdBucketSetupStage/lambda-package-bucket-name');
console.log("Bedrock Custom Lambda Package Bucket Name: ", bedrockCustomLambdaBucketName); // Log the bucket name
// console.log("Bedrock Custom Lambda Package Bucket Name: ", bedrockCustomLambdaBucketName); // Log the bucket name
}
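
The two branches above differ only in the setup-stage segment of the SSM parameter path. A minimal sketch of deriving that path from the stage name, under the assumption that any stage other than `"QA"` uses the pre-prod parameter (as the `else` branch does). The function name `lambdaPackageBucketParam` is hypothetical and not part of this PR.

```typescript
// Hypothetical helper, not part of this PR: derive the SSM parameter name from the stage.
// Both parameter paths are copied from the hunk above; only the function name is new.
function lambdaPackageBucketParam(stageName: string): string {
  // Mirrors the if/else above: "QA" uses the pre-QA stage, anything else the pre-prod stage.
  const setupStage =
    stageName === "QA" ? "PreQABucketSetupStage" : "PreProdBucketSetupStage";
  return `/MultimodalRAG/${setupStage}/lambda-package-bucket-name`;
}

console.log(lambdaPackageBucketParam("QA"));
```

The result would then be passed to `StringParameter.valueForStringParameter(this, ...)`. That call returns a deploy-time token rather than the stored string, so logging its result during synth prints a token placeholder instead of the bucket name.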


Original file line number Diff line number Diff line change
@@ -40,14 +40,14 @@ export class CodePipelineStack extends Stack {
commands: [
"echo 'Current working directory:' $(pwd)",
"ls -ltr",
"cd rag/automating-rag-pipeline/multimodal-rag-pipeline-with-cicd",
"cd rag/automating-rag-pipeline/rag-pipeline-with-cicd",
"echo 'New working directory:' $(pwd)",
"ls -ltr",
"npm ci",
"npm run build",
"npx cdk synth"
],
primaryOutputDirectory: 'rag/automating-rag-pipeline/multimodal-rag-pipeline-with-cicd/cdk.out' // Updated to reflect the correct project root directory
primaryOutputDirectory: 'rag/automating-rag-pipeline/rag-pipeline-with-cicd/cdk.out' // Updated to reflect the correct project root directory
}),
dockerEnabledForSynth: true,
});
@@ -66,25 +66,14 @@ export class CodePipelineStack extends Stack {
`arn:aws:ssm:${this.region}:${this.account}:parameter/${props.codePipelineName}/*`,
],
}));
// // **Add the S3 Bucket Stage**
// const s3BucketStage = cicdPipeline.addStage(
// new CodePipelineStage(this, 'S3BucketStage', {
// stageName: 'S3BucketStage',
// env: {
// account: this.node.tryGetContext('defaultAccount'),
// region: this.node.tryGetContext('defaultRegion'),
// },
// codePipelineName: props.codePipelineName,
// })
// );


// **Add the S3 Bucket Stage**
const preQABucketSetupStage = cicdPipeline.addStage(
new CodePipelineStage(this, 'PreQABucketSetupStage', {
stageName: 'PreQABucketSetupStage',
env: {
account: this.node.tryGetContext('defaultAccount'),
account: this.node.tryGetContext('defaultAccount'), // Retrieve a value from the CDK application context
region: this.node.tryGetContext('defaultRegion'),
},
codePipelineName: props.codePipelineName,
@@ -110,8 +99,8 @@ export class CodePipelineStack extends Stack {
commands: [
"echo 'Current working directory:' $(pwd)",
"ls -R",
"chmod +x rag/automating-rag-pipeline/multimodal-rag-pipeline-with-cicd/src/app/build_lambda.sh", // Make the script executable
"./rag/automating-rag-pipeline/multimodal-rag-pipeline-with-cicd/src/app/build_lambda.sh" // Run the script
"chmod +x rag/automating-rag-pipeline/rag-pipeline-with-cicd/src/app/build_lambda.sh", // Make the script executable
"./rag/automating-rag-pipeline/rag-pipeline-with-cicd/src/app/build_lambda.sh" // Run the script
],

role: codeBuildRole,
@@ -197,8 +186,8 @@ export class CodePipelineStack extends Stack {
commands: [
"echo 'Current working directory:' $(pwd)",
"ls -R",
"chmod +x rag/automating-rag-pipeline/multimodal-rag-pipeline-with-cicd/src/app/build_lambda.sh", // Make the script executable
"./rag/automating-rag-pipeline/multimodal-rag-pipeline-with-cicd/src/app/build_lambda.sh" // Run the script
"chmod +x rag/automating-rag-pipeline/rag-pipeline-with-cicd/src/app/build_lambda.sh", // Make the script executable
"./rag/automating-rag-pipeline/rag-pipeline-with-cicd/src/app/build_lambda.sh" // Run the script
],
role: codeBuildRole, // Use the shared CodeBuild role
buildEnvironment: {
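
The folder rename in this file touches the same path string in several command lists. One possible way to limit future renames to a single edit is to keep the project root in one constant and build the commands from it. This is only a sketch: `PROJECT_ROOT`, `synthCommands`, and `buildLambdaCommands` are hypothetical names that do not appear in the PR, and the debugging `echo`/`ls` commands are omitted for brevity.

```typescript
// Hypothetical refactor sketch, not part of this PR: keep the project root in one place.
const PROJECT_ROOT = "rag/automating-rag-pipeline/rag-pipeline-with-cicd";

// Commands for the synth step, mirroring the ones in the first hunk above.
function synthCommands(projectRoot: string): string[] {
  return [`cd ${projectRoot}`, "npm ci", "npm run build", "npx cdk synth"];
}

// Commands that make the Lambda build script executable and then run it.
function buildLambdaCommands(projectRoot: string): string[] {
  const script = `${projectRoot}/src/app/build_lambda.sh`;
  return [`chmod +x ${script}`, `./${script}`];
}

// Output directory for the synthesized cloud assembly, relative to the repository root.
const primaryOutputDirectory = `${PROJECT_ROOT}/cdk.out`;

console.log(synthCommands(PROJECT_ROOT));
console.log(buildLambdaCommands(PROJECT_ROOT));
console.log(primaryOutputDirectory);
```

With this shape, both CodeBuild steps could share `buildLambdaCommands(PROJECT_ROOT)` instead of repeating the two literal strings.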
Original file line number Diff line number Diff line change
@@ -53,6 +53,12 @@ export class WebApplicationStack extends Stack {
cpu: 256, // CPU units for the task
});

// Retrieve the modelId from cdk.json context
const modelId = this.node.tryGetContext('bedrockModelID');
if (!modelId) {
throw new Error("modelId not found in cdk.json context.");
}

// Create the Lambda function
const mainLambdaFunction = new NodejsFunction(this, 'MainLambdaFunction', {
runtime: Runtime.NODEJS_18_X,
@@ -62,6 +68,7 @@ export class WebApplicationStack extends Stack {
environment: {
STAGE_NAME: props?.stageName!,
KNOWLEDGE_BASE_ID: props.knowledgeBaseId,
MODEL_ID: modelId,
},
});

Original file line number Diff line number Diff line change
@@ -3,20 +3,19 @@ import { BedrockRuntimeClient, InvokeModelCommand, InvokeModelCommandInput, Invo
import { BedrockAgentRuntimeClient, RetrieveAndGenerateCommand, RetrieveAndGenerateCommandInput, RetrieveAndGenerateCommandOutput } from "@aws-sdk/client-bedrock-agent-runtime";

const awsRegion = process.env.AWS_REGION
const modelID = 'anthropic.claude-3-haiku-20240307-v1:0';
const modelID = process.env.MODEL_ID;
// Knowledge base ID for BedrockAgentRuntimeClient
const knowledgeBaseId = process.env.KNOWLEDGE_BASE_ID;
console.log("Knowledge Base ID:", knowledgeBaseId);

if (!modelID) throw new Error('MODEL_ID environment variable is missing.');
if (!knowledgeBaseId) throw new Error('KNOWLEDGE_BASE_ID environment variable is missing.');

// Create instances of both Bedrock clients
const runtimeClient = new BedrockRuntimeClient({ region: awsRegion });
const agentClient = new BedrockAgentRuntimeClient({ region: awsRegion });



// Knowledge base ID for BedrockAgentRuntimeClient
// const knowledgeBaseId = "45X4GMM8NC";
const knowledgeBaseId = process.env.KNOWLEDGE_BASE_ID;
console.log("Knowledge Base ID:", knowledgeBaseId);


interface Reference {
content: {
text: string;
@@ -104,7 +103,7 @@ async function queryKnowledgeBaseWithCitations(prompt: string): Promise<{ respon
// Function to invoke a model using BedrockRuntimeClient
async function invokeModel(prompt: string): Promise<string> {
const payload: InvokeModelCommandInput = {
modelId: "anthropic.claude-3-haiku-20240307-v1:0",
modelId: modelID,
contentType: "application/json",
accept: "application/json",
body: JSON.stringify({
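
The Lambda above now reads `MODEL_ID` and `KNOWLEDGE_BASE_ID` from the environment and throws if either is missing. A minimal sketch of folding those two checks into one helper follows. `requireEnv` and the `Env` type are hypothetical names, and the environment is passed in as a plain object (in the Lambda it would be `process.env`) so the example runs on its own.

```typescript
// Hypothetical helper, not part of this PR: read required settings from an env map.
// In the Lambda, `env` would be `process.env`; it is a parameter here so the sketch runs standalone.
type Env = Record<string, string | undefined>;

function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (!value) {
    // Same message shape as the checks in the hunk above.
    throw new Error(`${name} environment variable is missing.`);
  }
  return value;
}

const sampleEnv: Env = {
  MODEL_ID: "anthropic.claude-3-haiku-20240307-v1:0",
  KNOWLEDGE_BASE_ID: "EXAMPLEKBID", // placeholder value, not a real knowledge base ID
};

const modelID: string = requireEnv(sampleEnv, "MODEL_ID");
const knowledgeBaseId: string = requireEnv(sampleEnv, "KNOWLEDGE_BASE_ID");
console.log(modelID, knowledgeBaseId);
```

Because the helper returns a plain `string`, callers do not need to repeat the missing-value check before using the result.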
Original file line number Diff line number Diff line change
@@ -1,6 +1,3 @@



import { DynamoDB } from "aws-sdk";
import { BedrockAgentClient, StartIngestionJobCommand, GetIngestionJobCommand } from "@aws-sdk/client-bedrock-agent";

Original file line number Diff line number Diff line change
@@ -1,4 +1,3 @@

import { StepFunctions, SSM } from 'aws-sdk';
import { APIGatewayProxyHandler } from 'aws-lambda';