
Executive Summary
KloudPing is a B2B SaaS company building two cloud-native products: a lead generation platform currently in production, and a web marketing automation tool in active development. As the engineering team scaled, DevOps data became increasingly fragmented across Terraform files, CLI outputs, logs, Jira tasks, and infrastructure configs, with no single layer to query or act across them.
Teleglobal designed and deployed DevOpsMate, an AI-powered DevOps intelligence platform built on Retrieval-Augmented Generation (RAG) and Amazon Bedrock. The platform uses Qwen3-Coder-480B, a 480-billion parameter coding-focused LLM, to understand infrastructure context, generate code, automate Jira workflows, and execute Git commits, all running within KloudPing’s own AWS environment in ap-south-1.
| 100% Jira task automation | 12 S3 data buckets | 480B params, Qwen3-Coder | $2,748 monthly infra cost |
| --- | --- | --- | --- |
- Entire DevOpsMate platform runs within KloudPing’s AWS environment in ap-south-1
- GPU-accelerated LLM inference on EC2 g5.2xlarge with NVIDIA A10G GPU
- Jira, Git, and pipeline workflows fully automated, zero manual triggering
About KloudPing
KloudPing is a B2B SaaS organisation building cloud-native products for the digital marketing and lead generation space. Currently in production with a live lead generation platform and actively developing a web marketing automation tool, KloudPing is investing in AI-driven automation to reduce engineering overhead and accelerate product delivery.
As the team scaled across two products, the need for a centralised intelligence layer that could ingest, understand, and act on fragmented DevOps data became a strategic priority.
The Challenge
KloudPing’s engineering team was spending too much time on manual DevOps tasks that should have been automated. Five problems were slowing development velocity and increasing operational overhead.
- Fragmented DevOps Data
CLI outputs, Terraform configurations, cloud configs, and scheduler data existed in separate silos. Getting a complete picture of infrastructure state required switching between multiple tools with no unified query layer.
- Manual Debugging Loops
Engineers traced infrastructure issues manually across disparate tools. Without contextual correlation between infrastructure state and error signals, debugging was slow and repetitive.
- No Automated Task Execution
Jira tickets, Git commits, and pipeline triggers all required human initiation. Even well-understood, repeatable tasks consumed engineering time that could have gone toward product development.
- Lack of Contextual AI Understanding
Existing tools could not correlate infrastructure state with actionable DevOps tasks. Queries returned generic answers rather than responses grounded in the actual state of KloudPing’s environments.
- Evolving Multi-Product Architecture
With two SaaS products at different stages of development, the platform needed to be flexible enough to support both independently and grow as data ingestion volumes increased.
Architecture Selection
Step 1: Evaluation Criteria
Before selecting an architecture, Teleglobal defined five criteria the solution had to meet:
- Ability to handle structured and unstructured DevOps data including Terraform files, JSON CLI outputs, logs, and configurations
- Support for code understanding and generation at infrastructure level
- Deep integration with the AWS ecosystem for security, compliance, and observability
- Scalability to support multiple products and growing data ingestion volumes
- Support for autonomous task execution, acting on LLM outputs and not just surfacing them
Step 2: Architectures Evaluated
| Architecture Option | Strengths | Limitations |
| --- | --- | --- |
| Direct LLM Querying | Simple integration | No infrastructure context, high hallucination risk on DevOps data |
| Fine-tuned Models | Strong domain accuracy | High cost, slow iteration cycle, limited flexibility as architecture evolves |
| RAG + Bedrock (Selected) | Context-aware, scalable, no fine-tuning overhead | Requires a robust data ingestion pipeline |
| Multi-Agent Framework only | Autonomous execution capability | High complexity; most effective when built on top of a RAG layer |
Step 3: Why RAG + Bedrock Was Selected
RAG combined with Amazon Bedrock was the only architecture that satisfied all five criteria:
- Injects live infrastructure data directly into LLM prompts, enabling context-aware DevOps insights grounded in real Terraform state, CLI outputs, and logs
- Avoids expensive full model fine-tuning while maintaining high accuracy through retrieval-based context injection
- Works natively with Terraform files, JSON infrastructure data, logs, and configuration files
- Amazon Bedrock provides managed LLM inference with native AWS IAM, KMS, and CloudWatch integration
- Supports agentic execution: agents act on LLM outputs to trigger Jira automation, Git commits, and pipeline runs
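In practice, retrieval-based context injection means the prompt itself carries the live infrastructure data. The sketch below shows the idea; the function name, chunk fields, and sample Terraform snippet are illustrative assumptions, not DevOpsMate’s actual interface.

```python
# Minimal sketch of RAG context injection: retrieved infrastructure
# chunks are embedded directly into the prompt sent to the LLM.
# Field names ("source", "text") are illustrative, not DevOpsMate's schema.

def build_grounded_prompt(question: str, chunks: list[dict]) -> str:
    """Assemble a prompt that grounds the model in retrieved infra state."""
    context_blocks = []
    for chunk in chunks:
        # Each chunk carries its provenance (e.g. a Terraform file or CLI
        # dump) so the model can attribute its answer to a real source.
        context_blocks.append(f"[source: {chunk['source']}]\n{chunk['text']}")
    context = "\n\n".join(context_blocks)
    return (
        "You are a DevOps assistant. Answer using ONLY the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )

prompt = build_grounded_prompt(
    "Which subnets does the app VPC define?",
    [{"source": "vpc.tf", "text": 'resource "aws_subnet" "app_a" { ... }'}],
)
```

In a deployment like this one, the assembled prompt would then be sent to Amazon Bedrock’s runtime API (for example via boto3’s bedrock-runtime client) rather than printed locally.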
Model Selection
Step 1: Evaluation Criteria
Once RAG + Bedrock was confirmed as the architecture, Teleglobal evaluated which foundation model within Amazon Bedrock best suited DevOpsMate’s workloads. Five criteria guided the selection:
- Code understanding and generation: must handle Terraform, Python, Bash, JSON, and infrastructure configuration files with high accuracy
- Large context window: must process long infrastructure files, multi-file Terraform repos, and extended CLI output in a single prompt
- Instruction-following precision: must reliably execute structured DevOps tasks including code generation, query answering, and task formatting for Jira automation
- Infrastructure reasoning: must correlate infrastructure state across multiple data sources and generate actionable insights
- Cost efficiency on Bedrock On-Demand: must deliver strong performance at 10 requests per minute peak load without excessive token cost
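The cost-efficiency criterion reduces to simple token arithmetic. A back-of-envelope sketch follows; the per-token rates and request volume are placeholders for illustration, not Qwen3-Coder’s actual Bedrock On-Demand pricing or KloudPing’s measured traffic.

```python
# Back-of-envelope Bedrock On-Demand cost model. Rates below are
# hypothetical placeholders, not real Bedrock pricing.

def monthly_token_cost(req_per_day: int, in_tokens: int, out_tokens: int,
                       in_rate_per_1k: float, out_rate_per_1k: float,
                       days: int = 30) -> float:
    """Estimate monthly On-Demand inference cost in USD."""
    per_request = ((in_tokens / 1000) * in_rate_per_1k
                   + (out_tokens / 1000) * out_rate_per_1k)
    return round(per_request * req_per_day * days, 2)

# Hypothetical profile: 1,000 requests/day, 1,000 input + 1,000 output
# tokens per request, at placeholder rates of $0.02 / $0.06 per 1K tokens.
estimate = monthly_token_cost(1000, 1000, 1000,
                              in_rate_per_1k=0.02, out_rate_per_1k=0.06)
# estimate == 2400.0 (USD/month under these assumed numbers)
```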
Step 2: Models Evaluated
Three models available through Amazon Bedrock were shortlisted and evaluated against KloudPing’s requirements:
- Qwen3-Coder-480B (Alibaba Cloud): state-of-the-art code-specialised LLM, 480B parameters, purpose-built for coding and infrastructure tasks
- Claude 3.5 Sonnet (Anthropic): strong general reasoning and instruction following, widely used for enterprise RAG deployments
- Amazon Nova Pro (Amazon): AWS-native foundation model optimised for cost-effective enterprise workloads on Bedrock
| Parameter | Qwen3-Coder-480B ✔ Selected | Claude 3.5 Sonnet | Amazon Nova Pro |
| --- | --- | --- | --- |
| Code and infra specialisation | ✔ Purpose-built for Terraform, IaC, and code tasks | ⚠ Strong general reasoning, not code-specialised | ⚠ General purpose, less code depth |
| Context window for large infra files | ✔ Handles large Terraform repos and multi-file CLI outputs | ✔ Strong context handling | ⚠ Less suited for large infra datasets |
| Structured output precision | ✔ Precise output for Jira formatting, code patches, pipeline configs | ✔ Strong instruction following | ⚠ Less precise on highly structured output tasks |
| RAG grounding on infra data | ✔ High accuracy with Terraform and JSON context injection | ✔ Strong RAG performance | ⚠ Less precise on infrastructure-specific context |
| Available on Amazon Bedrock | ✔ Yes, via Geo Cross Region Inference | ✔ Yes, native Bedrock model | ✔ Yes, AWS-native model |
| Cost on Bedrock On-Demand | ✔ Competitive for 480B scale at 10 req/min profile | ⚠ Higher per-token cost at this volume | ✔ Most cost-efficient option |
| Code generation quality | ✔ Strongest across all languages and IaC formats | ⚠ Good code generation, general purpose | ⚠ Adequate, not code-optimised |
| Multi-file reasoning | ✔ Strong cross-file dependency resolution | ✔ Good | ⚠ Limited at scale |
Step 3: Why Qwen3-Coder-480B Was Selected
Qwen3-Coder-480B was the clear choice for DevOpsMate’s code-heavy, infrastructure-reasoning workloads.
- Claude 3.5 Sonnet: a strong general-purpose reasoning model that performs well in RAG deployments, but not purpose-built for code and infrastructure tasks. For DevOpsMate’s core workloads (Terraform analysis, code generation, and pipeline config creation), Qwen3-Coder’s specialised training gives it a meaningful advantage in accuracy and output precision. Claude also carries a higher per-token cost at KloudPing’s usage profile.
- Amazon Nova Pro: the most cost-efficient Bedrock option and a solid general-purpose model, but its code generation depth and multi-file infrastructure reasoning fall short of what DevOpsMate requires. For a platform whose primary outputs are code patches, Git commits, and structured Jira automation, Nova Pro’s general-purpose positioning is a limitation.
- Qwen3-Coder-480B (selected): a 480-billion parameter model purpose-built for code understanding, generation, and infrastructure reasoning. It delivers the highest accuracy on Terraform, Python, Bash, and JSON tasks, handles large infrastructure files within a single context window, and produces the precise structured outputs DevOpsMate needs for Jira formatting and pipeline configuration. Available via Bedrock Geo Cross Region Inference, it integrates cleanly with the full AWS security and observability stack.
GenAI Capabilities Delivered
The deployed DevOpsMate platform supports the following AI-powered capabilities:
- Infrastructure data analysis: natural language queries across Terraform configurations, CLI outputs, and cloud state data
- Named entity recognition from DevOps artifacts for knowledge graph construction
- Hybrid RAG-based knowledge retrieval for contextual DevOps insights and root cause analysis
- Automated Jira task execution: ticket creation, assignment, and status updates driven by AI
- Code generation and direct Git commits: branch creation, code changes, and pipeline trigger automation
- Scheduler-based knowledge base refresh: continuous ingestion from live infrastructure data sources
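The Jira automation capability ultimately means translating a structured LLM response into the JSON body Jira’s REST API expects. The sketch below builds such a payload; the field layout matches Jira’s public `/rest/api/2/issue` create-issue endpoint, while the project key, summary, and assignee conventions are hypothetical examples.

```python
# Sketch: turn a structured LLM response into a Jira "create issue"
# payload. The "fields" layout matches Jira's /rest/api/2/issue endpoint;
# the project key and example ticket content are hypothetical.

def jira_issue_payload(summary: str, description: str,
                       project_key: str = "OPS",
                       issue_type: str = "Task") -> dict:
    """Build the JSON body for a Jira create-issue REST call."""
    return {
        "fields": {
            "project": {"key": project_key},
            "summary": summary,
            "description": description,
            "issuetype": {"name": issue_type},
        }
    }

payload = jira_issue_payload(
    "Rotate expiring TLS cert on ALB",
    "Flagged by an infrastructure scan; cert expires in 14 days.",
)
# An agent would POST this body to <jira-base-url>/rest/api/2/issue
```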
GenAI Processing Pipeline
DevOpsMate processes DevOps queries and infrastructure data across five layers:
| Layer | Description |
| --- | --- |
| 1. Ingestion Layer | CLI extraction, Terraform uploads, and background scheduler jobs push raw data into Amazon S3 |
| 2. Storage Layer | Amazon S3 (12 buckets) centralises raw flat files, configs, logs, and DevOps artifacts |
| 3. RAG Layer | Data is preprocessed, filtered, and indexed into a vector database for semantic retrieval |
| 4. LLM Layer | Amazon Bedrock (Qwen3-Coder-480B) processes natural language queries with injected infrastructure context via Geo Cross Region Inference |
| 5. Agent and Action Layer | AI agents execute DevOps tasks: Jira automation, Git commits, pipeline triggers |
Data and Knowledge Sources
DevOpsMate ingests and analyses the following DevOps data sources:
- Terraform code repositories: infrastructure-as-code for all cloud environments
- Cloud CLI outputs in JSON format: real-time infrastructure state data
- Scheduler-generated system data: automated periodic snapshots of infrastructure health
- Jira task data: project management context for task automation
- Local automation scripts: custom DevOps tooling and runbooks
- Infrastructure state files: live configuration and deployment state
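Before any of these sources can be retrieved against, they have to be normalised into chunk records with provenance attached. A minimal sketch for the CLI-output case, assuming a simplified JSON shape (real `aws` CLI output is more deeply nested than the toy `items` key used here):

```python
import json

# Sketch: normalise a cloud CLI JSON dump into retrieval-ready records
# before indexing. The "items" key is a simplification of real CLI output.

def normalise_cli_output(raw_json: str, source: str) -> list[dict]:
    """Flatten a CLI JSON document into one record per top-level item."""
    data = json.loads(raw_json)
    records = []
    for item in data.get("items", []):
        records.append({
            "source": source,                          # provenance for grounding
            "text": json.dumps(item, sort_keys=True),  # canonical chunk text
        })
    return records

records = normalise_cli_output(
    '{"items": [{"id": "i-0abc", "state": "running"}]}',
    source="ec2-describe-instances",
)
```

Keeping the original source name on every record is what lets the RAG layer attribute answers back to a specific Terraform file, CLI call, or log stream.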
AWS Services Used
| Category | Service |
| --- | --- |
| AI Inference | Amazon Bedrock: Qwen3-Coder-480B via Geo Cross Region Inference, On-Demand; 10 req/min peak, 1,000 input + 1,000 output tokens/request; $1,062.72/month |
| GPU Compute | EC2 g5.2xlarge: NVIDIA A10G GPU, 200 GB EBS, LLM agent and GPU inference workloads; $1,082.75/month |
| Application Compute | EC2 m6a.2xlarge (DevOpsMate app server, 120 GB EBS); EC2 m6a.large (ei-prod, 30 GB EBS); EC2 c6a.large (Azure Pipeline, 124 GB EBS) |
| Secure Access | EC2 t3a.small: OpenVPN for secure developer access to private VPC resources |
| Database | Amazon RDS for PostgreSQL: db.t3.large, 100 GB gp2, Single-AZ, On-Demand; $210.93/month |
| Storage | Amazon S3: 12 buckets for DevOps artifacts, knowledge base, model data, and pipeline outputs (2-5 GB per bucket, S3 Standard); approx. $0.74/month |
| Networking | Amazon VPC: 2 public IPs, NAT Gateway, 2 AZs ($48.18/month); Elastic Load Balancing ALB ($17.61/month); Amazon Route 53 hosted zone ($0.50/month) |
| Security | AWS WAF: 1 Web ACL, 5 rules, 1 rule group, 1 managed group ($20.00/month); AWS KMS: 5 CMKs, 2M symmetric requests ($11.00/month); AWS IAM least-privilege policies |
| Observability | Amazon CloudWatch: 20 metrics, 10 GB log ingestion; $12.75/month |
Total monthly infrastructure cost: $2,748.36 USD (BOQ-aligned, Asia Pacific Mumbai region).
Security and Governance
All DevOpsMate workloads run within KloudPing’s private AWS environment in ap-south-1. Security controls are active from day one.
- Amazon VPC with private subnets and NAT Gateway across 2 Availability Zones, with no direct public exposure of inference or application services
- AWS IAM: least-privilege role-based access control for all Bedrock, EC2, S3, and RDS interactions
- AWS KMS: 5 Customer Managed Keys encrypting all data at rest and in transit across S3, RDS, and EBS volumes
- AWS WAF: Web ACL with 5 custom rules and 1 managed rule group protecting the application load balancer
- OpenVPN on EC2 t3a.small: secure private network access for developers connecting to VPC-internal resources
- Amazon CloudWatch: full infrastructure monitoring and alerting across all services
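Least-privilege access to Bedrock inference can be expressed as a narrowly scoped IAM policy. An illustrative fragment follows; the wildcard model resource is a placeholder, and the actual DevOpsMate policies are not reproduced here.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowModelInvokeOnly",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": "arn:aws:bedrock:ap-south-1::foundation-model/*"
    }
  ]
}
```

A role scoped like this can invoke models but cannot create, tune, or delete them, which is the shape least-privilege typically takes for an inference-only workload.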
Scalability and Reliability
- Modular architecture supports both KloudPing SaaS products independently without shared state
- Elastic Load Balancer (ALB) ensures high availability across application tiers
- S3-backed knowledge base with 12 dedicated buckets provides stateless, scalable storage for growing DevOps data
- Fault-tolerant ingestion pipeline via background schedulers with retry logic for continuous knowledge base refresh
- VPC across 2 Availability Zones for network resilience
- Designed for multi-agent scaling as Bedrock inference volumes and data ingestion grow
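The retry logic behind the fault-tolerant ingestion pipeline follows the standard exponential-backoff pattern. A minimal sketch, with illustrative delay values and a simulated flaky fetch standing in for a real data source:

```python
import time

# Sketch of retry-with-exponential-backoff, the pattern behind the
# fault-tolerant ingestion schedulers. Delays are illustrative.

def with_retries(fn, attempts: int = 3, base_delay: float = 0.01):
    """Call fn, retrying on failure with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise                        # out of retries: surface the error
            time.sleep(base_delay * 2 ** attempt)

# Simulated data source that fails twice before succeeding.
calls = {"n": 0}
def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient")
    return "snapshot"

result = with_retries(flaky_fetch)
# result == "snapshot" after two retried failures
```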
Cost Optimisation
- RAG-based architecture avoids costly model fine-tuning entirely: context injection replaces parameter updates
- EC2 g5.2xlarge right-sized for actual LLM agent workload requirements, not over-provisioned
- On-Demand compute strategy avoids reserved instance overhead for an evolving, early-stage workload
- Bedrock On-Demand inference: only charged for actual Qwen3-Coder-480B requests, no idle GPU costs
- Optimised inference calls via RAG context filtering: only the most relevant chunks injected per query, reducing token consumption
- S3 Standard pricing across 12 buckets at minimal per-GB cost for knowledge base and artifact storage
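The context-filtering optimisation amounts to scoring candidate chunks against the query embedding and injecting only the top k. A sketch with toy 3-dimensional vectors; a real system would use a proper embedding model rather than these hand-written values.

```python
import math

# Sketch of top-k context filtering: rank candidate chunks by cosine
# similarity to the query embedding and keep only the best k, cutting
# token spend per inference call. Embeddings here are toy 3-d vectors.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_chunks(query_vec, chunks, k=2):
    """chunks: list of (embedding, text); return the k most similar texts."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[0]), reverse=True)
    return [text for _, text in ranked[:k]]

best = top_k_chunks(
    [1.0, 0.0, 0.0],
    [([0.9, 0.1, 0.0], "vpc.tf"),
     ([0.0, 1.0, 0.0], "jira.json"),
     ([0.8, 0.2, 0.1], "alb.tf")],
    k=2,
)
# best == ["vpc.tf", "alb.tf"]: the Jira chunk is dropped, saving tokens
```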
Results
| Metric | Result |
| --- | --- |
| Jira task automation | Fully automated: ticket creation, assignment, and execution |
| Code generation and Git commits | Automated: branch creation, code changes, and pipeline triggers |
| Manual DevOps effort | Significantly reduced through AI-driven automation |
| Infrastructure debugging | Accelerated via natural language RAG querying across all data sources |
| DevOps data visibility | Unified across Terraform, CLI outputs, logs, and Jira in one platform |
| Data residency | All workloads within KloudPing’s AWS environment (ap-south-1) |
| Monthly infrastructure cost | $2,748.36 USD (BOQ-aligned, Asia Pacific Mumbai) |
| LLM model in use | Qwen3-Coder-480B via Amazon Bedrock Geo Cross Region Inference |
“DevOpsMate gave us something we didn’t have before: a single place to ask questions about our infrastructure and actually get things done. Jira tasks create themselves, code gets committed, and debugging that used to take hours now takes minutes. The platform Teleglobal built is the automation layer our team needed to move faster.”
— KloudPing
What’s Next
KloudPing plans to continue expanding DevOpsMate across both SaaS products:
- Full-scale rollout across both the lead generation platform and the web marketing automation tool
- Advanced multi-agent orchestration for complex cross-system DevOps workflows
- Enhanced RAG with real-time infrastructure state updates for lower knowledge base latency
- CI/CD integration for fully autonomous deployments without human triggering
- Expanded AI-driven DevOps automation covering additional cloud providers