How Teleglobal International Built DevOpsMate: An AI-Powered RAG Platform on AWS for KloudPing

Executive Summary 

KloudPing is a B2B SaaS company building two cloud-native products: a lead generation platform currently in production, and a web marketing automation tool in active development. As the engineering team scaled, DevOps data became increasingly fragmented across Terraform files, CLI outputs, logs, Jira tasks, and infrastructure configs, with no single layer to query or act across them.

Teleglobal designed and deployed DevOpsMate, an AI-powered DevOps intelligence platform built on Retrieval-Augmented Generation (RAG) and Amazon Bedrock. The platform uses Qwen3-Coder-480B, a 480-billion parameter coding-focused LLM, to understand infrastructure context, generate code, automate Jira workflows, and execute Git commits, all running within KloudPing’s own AWS environment in ap-south-1. 

  • 100% Jira task automation 
  • 12 S3 data buckets 
  • 480B parameters, Qwen3-Coder 
  • $2,748 monthly infrastructure cost 
  • Entire DevOpsMate platform runs within KloudPing’s AWS environment in ap-south-1 
  • GPU-accelerated LLM inference on EC2 g5.2xlarge with NVIDIA A10G GPU 
  • Jira, Git, and pipeline workflows fully automated, zero manual triggering 

About KloudPing 

KloudPing is a B2B SaaS organisation building cloud-native products for the digital marketing and lead generation space. Currently in production with a live lead generation platform and actively developing a web marketing automation tool, KloudPing is investing in AI-driven automation to reduce engineering overhead and accelerate product delivery. 

As the team scaled across two products, the need for a centralised intelligence layer that could ingest, understand, and act on fragmented DevOps data became a strategic priority. 

The Challenge 

KloudPing’s engineering team was spending too much time on manual DevOps tasks that should have been automated. Five problems were slowing development velocity and increasing operational overhead. 

  1. Fragmented DevOps Data 

CLI outputs, Terraform configurations, cloud configs, and scheduler data existed in separate silos. Getting a complete picture of infrastructure state required switching between multiple tools with no unified query layer. 

  2. Manual Debugging Loops 

Engineers traced infrastructure issues manually across disparate tools. Without contextual correlation between infrastructure state and error signals, debugging was slow and repetitive. 

  3. No Automated Task Execution 

Jira tickets, Git commits, and pipeline triggers all required human initiation. Even well-understood, repeatable tasks consumed engineering time that could have gone toward product development. 

  4. Lack of Contextual AI Understanding 

Existing tools could not correlate infrastructure state with actionable DevOps tasks. Queries returned generic answers rather than responses grounded in the actual state of KloudPing’s environments. 

  5. Evolving Multi-Product Architecture 

With two SaaS products at different stages of development, the platform needed to be flexible enough to support both independently and grow as data ingestion volumes increased. 

Architecture Selection 

Step 1: Evaluation Criteria 

Before selecting an architecture, Teleglobal defined five criteria the solution had to meet: 

  • Ability to handle structured and unstructured DevOps data including Terraform files, JSON CLI outputs, logs, and configurations 
  • Support for code understanding and generation at infrastructure level 
  • Deep integration with the AWS ecosystem for security, compliance, and observability 
  • Scalability to support multiple products and growing data ingestion volumes 
  • Support for autonomous task execution, acting on LLM outputs and not just surfacing them 

Step 2: Architectures Evaluated 

Architecture Option | Strengths | Limitations 
Direct LLM Querying | Simple integration | No infrastructure context; high hallucination risk on DevOps data 
Fine-tuned Models | Strong domain accuracy | High cost, slow iteration cycle, limited flexibility as architecture evolves 
RAG + Bedrock (Selected) | Context-aware, scalable, no fine-tuning overhead | Requires a robust data ingestion pipeline 
Multi-Agent Framework only | Autonomous execution capability | High complexity; most effective when built on top of a RAG layer 

Step 3: Why RAG + Bedrock Was Selected 

RAG combined with Amazon Bedrock was the only architecture that satisfied all five criteria: 

  • Injects live infrastructure data directly into LLM prompts, enabling context-aware DevOps insights grounded in real Terraform state, CLI outputs, and logs 
  • Avoids expensive full model fine-tuning while maintaining high accuracy through retrieval-based context injection 
  • Works natively with Terraform files, JSON infrastructure data, logs, and configuration files 
  • Amazon Bedrock provides managed LLM inference with native AWS IAM, KMS, and CloudWatch integration 
  • Supports agentic execution: agents act on LLM outputs to trigger Jira automation, Git commits, and pipeline runs 
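
The context-injection step described above can be sketched in a few lines. This is a minimal illustration, not DevOpsMate's actual code: the prompt template, chunk format, and the Bedrock model identifier are all assumptions (check the Bedrock model catalogue for the real Qwen3-Coder model ID).

```python
def build_prompt(query, chunks):
    """Assemble a RAG prompt: retrieved infrastructure context is injected
    ahead of the user's question so the model answers from real Terraform
    and CLI state rather than from general knowledge."""
    context = "\n\n".join(f"[{c['source']}]\n{c['text']}" for c in chunks)
    return (
        "You are a DevOps assistant. Answer using ONLY the context below.\n\n"
        f"### Infrastructure context\n{context}\n\n"
        f"### Question\n{query}\n"
    )

def ask_devops_question(query, chunks, model_id="MODEL_ID_PLACEHOLDER"):
    """Send the assembled prompt to Amazon Bedrock via the Converse API.
    Requires AWS credentials and a valid model_id, so it is not run here."""
    import boto3  # imported lazily so the prompt builder stays testable offline
    client = boto3.client("bedrock-runtime", region_name="ap-south-1")
    response = client.converse(
        modelId=model_id,
        messages=[{"role": "user",
                   "content": [{"text": build_prompt(query, chunks)}]}],
    )
    return response["output"]["message"]["content"][0]["text"]
```

Because the retrieved chunks carry a `source` tag, answers can cite the exact Terraform file or CLI document they were grounded in.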

Model Selection 

Step 1: Evaluation Criteria 

Once RAG + Bedrock was confirmed as the architecture, Teleglobal evaluated which foundation model within Amazon Bedrock best suited DevOpsMate’s workloads. Five criteria guided the selection: 

  • Code understanding and generation: must handle Terraform, Python, Bash, JSON, and infrastructure configuration files with high accuracy 
  • Large context window: must process long infrastructure files, multi-file Terraform repos, and extended CLI output in a single prompt 
  • Instruction-following precision: must reliably execute structured DevOps tasks including code generation, query answering, and task formatting for Jira automation 
  • Infrastructure reasoning: must correlate infrastructure state across multiple data sources and generate actionable insights 
  • Cost efficiency on Bedrock On-Demand: must deliver strong performance at 10 requests per minute peak load without excessive token cost 
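
The cost criterion above reduces to simple token arithmetic. The sketch below shows the calculation shape only: the per-million-token prices are placeholders, not published Bedrock rates, and the 10 req/min figure is a peak, so a utilisation factor scales it down to a sustained average.

```python
def monthly_token_cost(req_per_min, in_tokens, out_tokens,
                       price_in_per_m, price_out_per_m,
                       utilisation=1.0, minutes_per_month=43_800):
    """Estimate monthly Bedrock On-Demand spend from a request profile.
    utilisation scales the peak request rate down to a sustained average."""
    requests = req_per_min * minutes_per_month * utilisation
    cost_in = requests * in_tokens / 1_000_000 * price_in_per_m
    cost_out = requests * out_tokens / 1_000_000 * price_out_per_m
    return cost_in + cost_out

# Placeholder prices ($1/M input, $2/M output) at 50% sustained utilisation:
# 219,000 requests -> $219 input + $438 output = $657/month.
estimate = monthly_token_cost(
    req_per_min=10, in_tokens=1000, out_tokens=1000,
    price_in_per_m=1.0, price_out_per_m=2.0, utilisation=0.5,
)
```

Plugging in the actual Bedrock rate card for the chosen model yields figures like the $1,062.72/month quoted later in this document.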

Step 2: Models Evaluated 

Three models available through Amazon Bedrock were shortlisted and evaluated against KloudPing’s requirements: 

  • Qwen3-Coder-480B (Alibaba Cloud): state-of-the-art code-specialised LLM, 480B parameters, purpose-built for coding and infrastructure tasks 
  • Claude 3.5 Sonnet (Anthropic): strong general reasoning and instruction following, widely used for enterprise RAG deployments 
  • Amazon Nova Pro (Amazon): AWS-native foundation model optimised for cost-effective enterprise workloads on Bedrock 

Parameter | Qwen3-Coder-480B (Selected) | Claude 3.5 Sonnet | Amazon Nova Pro 
Code and infra specialisation | ✔ Purpose-built for Terraform, IaC, and code tasks | ⚠ Strong general reasoning, not code-specialised | ⚠ General purpose, less code depth 
Context window for large infra files | ✔ Handles large Terraform repos and multi-file CLI outputs | ✔ Strong context handling | ⚠ Less suited for large infra datasets 
Structured output precision | ✔ Precise output for Jira formatting, code patches, pipeline configs | ✔ Strong instruction following | ⚠ Less precise on highly structured output tasks 
RAG grounding on infra data | ✔ High accuracy with Terraform and JSON context injection | ✔ Strong RAG performance | ⚠ Less precise on infrastructure-specific context 
Available on Amazon Bedrock | ✔ Yes, via Geo Cross Region Inference | ✔ Yes, native Bedrock model | ✔ Yes, AWS-native model 
Cost on Bedrock On-Demand | ✔ Competitive for 480B scale at 10 req/min profile | ⚠ Higher per-token cost at this volume | ✔ Most cost-efficient option 
Code generation quality | ✔ Strongest across all languages and IaC formats | ⚠ Good code generation, general purpose | ⚠ Adequate, not code-optimised 
Multi-file reasoning | ✔ Strong cross-file dependency resolution | ✔ Good | ⚠ Limited at scale 

Step 3: Why Qwen3-Coder-480B Was Selected 

Qwen3-Coder-480B was the clear choice for DevOpsMate’s code-heavy, infrastructure-reasoning workloads. 

  • Claude 3.5 Sonnet: Claude 3.5 Sonnet is a strong general-purpose reasoning model and performs well in RAG deployments, but it is not purpose-built for code and infrastructure tasks. For DevOpsMate’s core workloads (Terraform analysis, code generation, and pipeline config creation), Qwen3-Coder’s specialised training gives it a meaningful advantage in accuracy and output precision. It also carries a higher per-token cost at KloudPing’s usage profile. 
  • Amazon Nova Pro: Amazon Nova Pro is the most cost-efficient Bedrock option and a solid general-purpose model, but its code generation depth and multi-file infrastructure reasoning fall short of what DevOpsMate requires. For a platform where the primary output is code patches, Git commits, and structured Jira automation, Nova Pro’s general-purpose positioning is a limitation. 
  • Qwen3-Coder-480B selected: Qwen3-Coder-480B is a 480-billion parameter model purpose-built for code understanding, generation, and infrastructure reasoning. It delivers the highest accuracy on Terraform, Python, Bash, and JSON tasks, handles large infrastructure files within a single context window, and produces the precise structured outputs DevOpsMate needs for Jira formatting and pipeline configuration. Available via Bedrock Geo Cross Region Inference, it integrates cleanly with the full AWS security and observability stack. 

GenAI Capabilities Delivered 

The deployed DevOpsMate platform supports the following AI-powered capabilities: 

  • Infrastructure data analysis: natural language queries across Terraform configurations, CLI outputs, and cloud state data 
  • Named entity recognition from DevOps artifacts for knowledge graph construction 
  • Hybrid RAG-based knowledge retrieval for contextual DevOps insights and root cause analysis 
  • Automated Jira task execution: ticket creation, assignment, and status updates driven by AI 
  • Code generation and direct Git commits: branch creation, code changes, and pipeline trigger automation 
  • Scheduler-based knowledge base refresh: continuous ingestion from live infrastructure data sources 
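
To make the Jira automation capability concrete, here is one way an agent could create a ticket through the Jira Cloud REST API. The endpoint and payload shape follow Jira's documented v2 create-issue format (v2 accepts a plain-text description), but the project key, field values, and helper names are illustrative assumptions, not DevOpsMate's actual integration.

```python
def jira_task_payload(project_key, summary, description):
    """Build a create-issue payload in the Jira Cloud REST API v2 shape.
    Field values here are illustrative examples."""
    return {
        "fields": {
            "project": {"key": project_key},
            "summary": summary,
            "description": description,
            "issuetype": {"name": "Task"},
        }
    }

def create_jira_task(base_url, auth, payload):
    """POST the payload to Jira. Needs network access and credentials,
    so it is not executed here."""
    import requests  # third-party: pip install requests
    resp = requests.post(
        f"{base_url}/rest/api/2/issue",
        json=payload,
        auth=auth,  # e.g. (email, api_token) for Jira Cloud basic auth
    )
    resp.raise_for_status()
    return resp.json()["key"]  # the new issue key, e.g. "KP-123"
```

Separating payload construction from the HTTP call lets the agent's output formatting be validated before anything touches the live Jira project.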

GenAI Processing Pipeline 

DevOpsMate processes DevOps queries and infrastructure data across five layers: 

Layer | Description 
1. Ingestion Layer | CLI extraction, Terraform uploads, and background scheduler jobs push raw data into Amazon S3 
2. Storage Layer | Amazon S3 (12 buckets) centralises raw flat files, configs, logs, and DevOps artifacts 
3. RAG Layer | Data is preprocessed, filtered, and indexed into a vector database for semantic retrieval 
4. LLM Layer | Amazon Bedrock (Qwen3-Coder-480B) processes natural language queries with injected infrastructure context via Geo Cross Region Inference 
5. Agent and Action Layer | AI agents execute DevOps tasks: Jira automation, Git commits, pipeline triggers 

Data and Knowledge Sources 

DevOpsMate ingests and analyses the following DevOps data sources: 

  • Terraform code repositories: infrastructure-as-code for all cloud environments 
  • Cloud CLI outputs in JSON format: real-time infrastructure state data 
  • Scheduler-generated system data: automated periodic snapshots of infrastructure health 
  • Jira task data: project management context for task automation 
  • Local automation scripts: custom DevOps tooling and runbooks 
  • Infrastructure state files: live configuration and deployment state 
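
Before any of these sources can be retrieved, they must be split into indexable chunks. The sketch below shows two simple strategies, assuming a `{"source", "text"}` chunk format; real pipelines typically split Terraform on HCL blocks and use the model's tokenizer rather than character counts.

```python
import json

def chunk_text(text, source, max_chars=800):
    """Split a file into fixed-size, source-tagged chunks for vector
    indexing. Fixed-size slicing keeps this sketch simple."""
    return [
        {"source": source, "text": text[i:i + max_chars]}
        for i in range(0, len(text), max_chars)
    ]

def chunk_cli_output(raw_json, source):
    """Flatten a JSON CLI document into one chunk per top-level key,
    so retrieval can pull back just the relevant slice of state."""
    doc = json.loads(raw_json)
    return [
        {"source": f"{source}#{key}", "text": json.dumps(value, indent=2)}
        for key, value in doc.items()
    ]
```

Tagging each chunk with its origin (`main.tf`, `ec2#Reservations`, ...) is what lets the RAG layer ground answers in a specific file or CLI response.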

AWS Services Used 

Category | Service 
AI Inference | Amazon Bedrock: Qwen3-Coder-480B via Geo Cross Region Inference, On-Demand; 10 req/min peak, 1,000 input + 1,000 output tokens/request; $1,062.72/month 
GPU Compute | EC2 g5.2xlarge: NVIDIA A10G GPU, 200 GB EBS, LLM agent and GPU inference workloads; $1,082.75/month 
Application Compute | EC2 m6a.2xlarge (DevOpsMate app server, 120 GB EBS); EC2 m6a.large (ei-prod, 30 GB EBS); EC2 c6a.large (Azure Pipeline, 124 GB EBS) 
Secure Access | EC2 t3a.small: OpenVPN for secure developer access to private VPC resources 
Database | Amazon RDS for PostgreSQL: db.t3.large, 100 GB gp2, Single-AZ, On-Demand; $210.93/month 
Storage | Amazon S3: 12 buckets for DevOps artifacts, knowledge base, model data, and pipeline outputs (2-5 GB per bucket, S3 Standard); approx. $0.74/month 
Networking | Amazon VPC: 2 public IPs, NAT Gateway, 2 AZs ($48.18/month); Elastic Load Balancing ALB ($17.61/month); Amazon Route 53 hosted zone ($0.50/month) 
Security | AWS WAF: 1 Web ACL, 5 rules, 1 rule group, 1 managed group ($20.00/month); AWS KMS: 5 CMKs, 2M symmetric requests ($11.00/month); AWS IAM least-privilege policies 
Observability | Amazon CloudWatch: 20 metrics, 10 GB log ingestion; $12.75/month 

Total monthly infrastructure cost: $2,748.36 USD (BOQ-aligned, Asia Pacific Mumbai region). 

Security and Governance 

All DevOpsMate workloads run within KloudPing’s private AWS environment in ap-south-1. Security controls are active from day one. 

  • Amazon VPC with private subnets and NAT Gateway across 2 Availability Zones, with no direct public exposure of inference or application services 
  • AWS IAM: least-privilege role-based access control for all Bedrock, EC2, S3, and RDS interactions 
  • AWS KMS: 5 Customer Managed Keys encrypting all data at rest and in transit across S3, RDS, and EBS volumes 
  • AWS WAF: Web ACL with 5 custom rules and 1 managed rule group protecting the application load balancer 
  • OpenVPN on EC2 t3a.small: secure private network access for developers connecting to VPC-internal resources 
  • Amazon CloudWatch: full infrastructure monitoring and alerting across all services 

Scalability and Reliability 

  • Modular architecture supports both KloudPing SaaS products independently without shared state 
  • Elastic Load Balancer (ALB) ensures high availability across application tiers 
  • S3-backed knowledge base with 12 dedicated buckets provides stateless, scalable storage for growing DevOps data 
  • Fault-tolerant ingestion pipeline via background schedulers with retry logic for continuous knowledge base refresh 
  • VPC across 2 Availability Zones for network resilience 
  • Designed for multi-agent scaling as Bedrock inference volumes and data ingestion grow 
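
The fault-tolerant ingestion bullet above amounts to retry-with-backoff around each scheduler job. A minimal sketch, assuming jobs are plain callables (the helper name and defaults are illustrative):

```python
import time

def run_with_retries(job, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run an ingestion job, retrying with exponential backoff on failure.
    The sleep function is injectable so schedulers and tests can control
    timing; the final failure is re-raised for the scheduler to log."""
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))  # waits 1s, 2s, 4s, ...
```

Because each run either succeeds, retries, or surfaces its error, a transient CLI or S3 failure delays a knowledge base refresh instead of silently dropping it.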

Cost Optimisation 

  • RAG-based architecture avoids costly model fine-tuning entirely: context injection replaces parameter updates 
  • EC2 g5.2xlarge right-sized for actual LLM agent workload requirements, not over-provisioned 
  • On-Demand compute strategy avoids reserved instance overhead for an evolving, early-stage workload 
  • Bedrock On-Demand inference: only charged for actual Qwen3-Coder-480B requests, no idle GPU costs 
  • Optimised inference calls via RAG context filtering: only the most relevant chunks injected per query, reducing token consumption 
  • S3 Standard pricing across 12 buckets at minimal per-GB cost for knowledge base and artifact storage 
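
The context-filtering optimisation above can be sketched as a greedy selection of the highest-scoring chunks within a token budget. The scoring inputs, budget, and the characters-per-token approximation are all assumptions for illustration:

```python
def filter_chunks(scored_chunks, token_budget=2000, chars_per_token=4):
    """Greedily keep the highest-scoring chunks that fit the token budget.
    Token counts are approximated from character length; a production
    pipeline would use the model's own tokenizer."""
    selected, used = [], 0
    for score, chunk in sorted(scored_chunks, key=lambda pair: -pair[0]):
        tokens = len(chunk["text"]) // chars_per_token + 1
        if used + tokens <= token_budget:
            selected.append(chunk)
            used += tokens
    return selected
```

Capping injected context this way bounds per-query input tokens, which is what keeps Bedrock On-Demand spend proportional to question volume rather than to knowledge base size.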

Results 

Metric | Result 
Jira task automation | Fully automated: ticket creation, assignment, and execution 
Code generation and Git commits | Automated: branch creation, code changes, and pipeline triggers 
Manual DevOps effort | Significantly reduced through AI-driven automation 
Infrastructure debugging | Accelerated via natural language RAG querying across all data sources 
DevOps data visibility | Unified across Terraform, CLI outputs, logs, and Jira in one platform 
Data residency | All workloads within KloudPing’s AWS environment (ap-south-1) 
Monthly infrastructure cost | $2,748.36 USD (BOQ-aligned, Asia Pacific Mumbai) 
LLM model in use | Qwen3-Coder-480B via Amazon Bedrock Geo Cross Region Inference 

“DevOpsMate gave us something we didn’t have before: a single place to ask questions about our infrastructure and actually get things done. Jira tasks create themselves, code gets committed, and debugging that used to take hours now takes minutes. The platform Teleglobal built is the automation layer our team needed to move faster.” 

— KloudPing 

What’s Next 

KloudPing plans to continue expanding DevOpsMate across both SaaS products: 

  • Full-scale rollout across both the lead generation platform and the web marketing automation tool 
  • Advanced multi-agent orchestration for complex cross-system DevOps workflows 
  • Enhanced RAG with real-time infrastructure state updates for lower knowledge base latency 
  • CI/CD integration for fully autonomous deployments without human triggering 

  • Expanded AI-driven DevOps automation covering additional cloud providers