
Aligning InterWiz’s Cloud Strategy with the AWS Ecosystem
As InterWiz’s adoption grew, the customer chose to migrate the platform’s AI capabilities from Azure OpenAI to Amazon Bedrock to align with its broader cloud strategy.
The move enabled tighter integration with AWS services, improved cost management, and established a scalable foundation for InterWiz’s future expansion within a unified cloud environment.
The Challenge
After successfully launching InterWiz, the team encountered operational challenges that threatened long-term scalability and profitability.
High API usage costs for GPT-4 Turbo — with input tokens priced at $10.00 per million and output tokens at $30.00 per million — were significantly impacting profit margins. As user volume grew, scaling the platform became increasingly cost-prohibitive under the existing Azure OpenAI structure.
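To illustrate how these rates compound, the back-of-the-envelope sketch below estimates the LLM cost of a single interview; the token counts and turn count are illustrative assumptions, not InterWiz measurements:

```python
# Rough cost estimate for one AI-led interview at GPT-4 Turbo rates.
# Token counts below are illustrative assumptions, not InterWiz measurements.
INPUT_PRICE_PER_M = 10.00   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 30.00  # USD per 1M output tokens

turns = 30                     # assumed question/answer exchanges per interview
input_tokens = turns * 4_000   # prompt plus the growing transcript resent each turn
output_tokens = turns * 500    # generated questions, scoring, and feedback

cost = (input_tokens / 1e6) * INPUT_PRICE_PER_M + (output_tokens / 1e6) * OUTPUT_PRICE_PER_M
print(f"Estimated LLM cost per interview: ${cost:.2f}")  # ~$1.65 under these assumptions
```

At scale, even a per-interview cost in this range multiplies quickly across thousands of candidates, which is what made the existing pricing structure cost-prohibitive.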
In addition, limited access to a broader range of AI models restricted the ability to optimize interviews for different roles, industries, and candidate profiles — a critical capability for expanding InterWiz’s market reach.
To sustain its commitment to reducing hiring costs by 50% for customers while maintaining high-quality AI interviewing capabilities, the team recognized the need for a more flexible, scalable, and cost-efficient AI infrastructure, setting the stage for migration to Amazon Bedrock.
End-to-End Migration of Infrastructure and AI Services to AWS
Emumba led a full migration of InterWiz’s LLM backend from Azure OpenAI to Amazon Bedrock. The platform now uses Claude 3.5 Sonnet as the foundation for interview question generation, response evaluation, and personalized feedback.
The new solution leverages several AWS services:
Amazon Bedrock to access Claude for LLM interactions (a minimal invocation sketch follows this list).
Amazon EC2 with Docker Compose to run containerized backend services and handle high-throughput model interactions and job processing.
Amazon CloudFront for low-latency global content delivery.
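For reference, here is a minimal sketch of a Bedrock call using boto3 and the Anthropic Messages request format; the model ID is the public Claude 3.5 Sonnet identifier, and the prompt content is illustrative:

```python
import json

import boto3

# Bedrock runtime client; credentials come from the environment or an attached IAM role.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

body = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1024,
    "messages": [{
        "role": "user",
        "content": "Generate one behavioral interview question for a senior backend engineer.",
    }],
}

response = client.invoke_model(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
    body=json.dumps(body),
)
result = json.loads(response["body"].read())
print(result["content"][0]["text"])
```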
Key enhancements included:
Developing an abstraction layer to isolate provider-specific API differences.
Refactoring prompts and system messages to fit Claude’s XML-tagged prompting conventions.
Implementing a fallback mechanism to switch between providers, ensuring reliability.
Implementing role-based execution with least privilege access controls to align with AWS security best practices.
A parallel deployment strategy enabled safe migration by validating output quality and latency before full production cutover.
Phased Rollout and Deep Technical Refactoring
To ensure a seamless transition to Amazon Bedrock without disrupting user experience, Emumba executed a phased migration strategy built on strong engineering principles and layered risk management.
Several technical challenges had to be addressed throughout the process:
API Structure Differences: The Azure OpenAI and Amazon Bedrock APIs differ significantly in architecture. Emumba implemented a dynamic abstraction layer to shield application logic from these differences, enabling the platform to operate smoothly regardless of the underlying LLM provider.
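A minimal sketch of such an abstraction layer, under our own naming (`LLMProvider`, `complete`); InterWiz’s actual interface is not shown here:

```python
from abc import ABC, abstractmethod
import json

import boto3

class LLMProvider(ABC):
    """Application code depends only on this interface, never on a vendor SDK."""
    @abstractmethod
    def complete(self, system: str, user: str) -> str: ...

class BedrockProvider(LLMProvider):
    def __init__(self, model_id: str):
        self.client = boto3.client("bedrock-runtime")
        self.model_id = model_id

    def complete(self, system: str, user: str) -> str:
        # Bedrock's Anthropic format takes the system prompt as a top-level field.
        body = {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1024,
            "system": system,
            "messages": [{"role": "user", "content": user}],
        }
        resp = self.client.invoke_model(modelId=self.model_id, body=json.dumps(body))
        return json.loads(resp["body"].read())["content"][0]["text"]

class AzureOpenAIProvider(LLMProvider):
    def __init__(self, client, deployment: str):
        self.client = client          # an openai.AzureOpenAI client
        self.deployment = deployment

    def complete(self, system: str, user: str) -> str:
        # Azure OpenAI expects the system prompt as a chat message instead.
        resp = self.client.chat.completions.create(
            model=self.deployment,
            messages=[{"role": "system", "content": system},
                      {"role": "user", "content": user}],
        )
        return resp.choices[0].message.content
```

Because both providers satisfy the same interface, swapping or falling back between them becomes a configuration change rather than a code change.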
Prompt Compatibility: Prompts tuned for Azure required adaptation to perform effectively with Bedrock. The team reviewed and restructured prompts to match Claude’s expected formatting and maintain consistent response quality across interview flows.
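As a simplified illustration of this restructuring, the sketch below moves interview context into the XML-style tags Claude responds well to; the tag names and prompt text are our own examples:

```python
# Azure-era prompt: instructions and injected documents interleaved as prose.
legacy_prompt = (
    "You are an interviewer. Job description: {jd}. "
    "Candidate resume: {resume}. Ask the next question."
)

# Claude-adapted prompt: the same content, delimited with XML-style tags so the
# model can reliably separate instructions from the injected documents.
claude_prompt = """\
You are an interviewer conducting a structured technical interview.

<job_description>
{jd}
</job_description>

<candidate_resume>
{resume}
</candidate_resume>

Ask the single most relevant next question, wrapped in <question> tags."""
```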
Authentication Workflow Changes: Transitioning from Azure Active Directory to AWS authentication required redesigning the security model. Emumba integrated AWS IAM with role-based execution and least-privilege access controls to ensure secure service communication aligned with AWS best practices.
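A least-privilege policy in the spirit of this setup might look like the following sketch; the region, model ARN, and policy scope are placeholders rather than InterWiz’s actual configuration:

```python
import json

# Hypothetical least-privilege policy: the backend role may invoke exactly one
# foundation model and nothing else in Bedrock. Region and model ARN are placeholders.
invoke_only_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "bedrock:InvokeModel",
            "bedrock:InvokeModelWithResponseStream",
        ],
        "Resource": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-sonnet-20240620-v1:0",
    }],
}
print(json.dumps(invoke_only_policy, indent=2))
```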
Latency and Cost Optimization: The team introduced caching, request throttling, and inference tuning to strike the right balance between performance and operational cost under Bedrock’s usage model.
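A simplified sketch of the caching-plus-throttling idea, assuming an in-process cache and a fixed concurrency cap (both our assumptions):

```python
import json
import threading
from functools import lru_cache

import boto3

client = boto3.client("bedrock-runtime")
_throttle = threading.Semaphore(5)  # assumed cap: at most 5 concurrent Bedrock calls

@lru_cache(maxsize=1024)
def cached_complete(prompt: str) -> str:
    """Serve repeated prompts (question templates, rubric evaluations) from an
    in-process cache; cache misses are throttled through the semaphore."""
    with _throttle:
        body = {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 512,
            "messages": [{"role": "user", "content": prompt}],
        }
        resp = client.invoke_model(
            modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
            body=json.dumps(body),
        )
        return json.loads(resp["body"].read())["content"][0]["text"]
```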
Reliability and Fault Tolerance: To ensure resilience during and after migration, Emumba added fallback mechanisms, retry strategies, and robust error handling — keeping the platform responsive even in degraded conditions.
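A minimal sketch of the fallback-and-retry pattern, reusing the hypothetical `LLMProvider` interface from the abstraction-layer example above:

```python
import time

def complete_with_fallback(providers, system: str, user: str,
                           retries: int = 2, backoff: float = 1.0) -> str:
    """Try providers in priority order, e.g. [BedrockProvider(...), AzureOpenAIProvider(...)].
    Transient failures are retried with exponential backoff before falling back."""
    last_error = None
    for provider in providers:
        for attempt in range(retries + 1):
            try:
                return provider.complete(system, user)
            except Exception as err:  # production code would catch provider-specific throttling errors
                last_error = err
                time.sleep(backoff * (2 ** attempt))
    raise RuntimeError("All LLM providers failed") from last_error
```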
In parallel, AWS-based dev, staging, and production environments were provisioned alongside the existing Azure stack to support side-by-side testing. The team ran extensive A/B comparisons across model responses, scoring behavior, and system latency before gradually shifting traffic to the Bedrock-powered backend using feature flags and weighted routing.
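The gradual traffic shift can be pictured as weighted routing over the two backends; the starting weight and backend labels below are illustrative:

```python
import random

# Feature-flagged weighted routing: start small on Bedrock and dial the weight
# up as A/B metrics (response quality, scoring parity, latency) hold steady.
BEDROCK_TRAFFIC_WEIGHT = 0.10  # illustrative starting split

def pick_backend() -> str:
    """Route a request to the Bedrock or Azure OpenAI backend per the current weight."""
    return "bedrock" if random.random() < BEDROCK_TRAFFIC_WEIGHT else "azure_openai"
```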
This controlled, deeply engineered rollout enabled Emumba to fully migrate InterWiz to Amazon Bedrock — preserving its core functionality while improving system resilience, cost-efficiency, and flexibility for future growth.
The migration to Amazon Bedrock delivered immediate, measurable benefits:
70% reduction in AI processing costs
2/3 reduction in per-interview LLM interaction costs
Expanded model access through Bedrock, enabling more tailored, high-quality interview experiences without increasing costs
Lessons Learned
Key Takeaways from the InterWiz Migration
Superficial API similarity doesn’t guarantee easy migration: Despite offering similar functionality, Azure OpenAI and Amazon Bedrock differ significantly in prompt formatting, response structures, and function handling — requiring thoughtful adaptation.
Prompt engineering is provider-specific and performance-critical: Migrating LLMs isn’t just about endpoints; achieving reliable output quality and cost efficiency requires rethinking how prompts are structured and tuned for each platform.
Abstraction layers are essential for cross-provider flexibility: Building internal abstractions between application logic and AI services reduces vendor lock-in and simplifies future transitions or fallback implementations.
Conclusion
The migration of InterWiz’s AI infrastructure to Amazon Bedrock not only solved pressing operational challenges but also positioned the platform for long-term scalability, flexibility, and cost-efficiency. Through thoughtful engineering, precise prompt adaptation, and robust quality assurance, Emumba ensured the transition preserved user experience while strengthening the platform’s technical foundation.