Clara AI — Model Pricing & Client Infrastructure
Model Tiers (Reselling with Quik Nation Wrapper)
| Tier | Model | Our Price | Our Cost | Margin |
|---|---|---|---|---|
| Clara Free | Llama 3.1 8B (Cloudflare Workers AI) | Free (included) | ~$0 | n/a (no revenue) |
| Clara Pro | Llama 4 Scout (Groq) | $0.50/1M tokens | $0.05/1M | 90% |
| Clara Business | Claude Haiku (Bedrock) | $5/1M out | $1.25/1M out | 75% |
| Clara Premium | Claude Sonnet (Bedrock) | $25/1M out | $15/1M out | 40% |
| Clara Ultra | Claude Opus (Bedrock) | $125/1M out | $75/1M out | 40% |
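
The Margin column follows from (price - cost) / price. A minimal sketch to recheck it, assuming all figures are USD per 1M tokens; the `Tier` interface and `marginPercent` helper are illustrative names, not part of any existing codebase:

```typescript
// Prices and costs in USD per 1M tokens, copied from the table above.
// For the Bedrock tiers these are the output-token figures.
interface Tier {
  name: string;
  pricePerMTok: number;
  costPerMTok: number;
}

const tiers: Tier[] = [
  { name: "Clara Pro", pricePerMTok: 0.5, costPerMTok: 0.05 },
  { name: "Clara Business", pricePerMTok: 5, costPerMTok: 1.25 },
  { name: "Clara Premium", pricePerMTok: 25, costPerMTok: 15 },
  { name: "Clara Ultra", pricePerMTok: 125, costPerMTok: 75 },
];

// Gross margin as a whole-number percentage.
// Returns null when the price is zero, because margin is undefined without revenue.
function marginPercent(price: number, cost: number): number | null {
  if (price <= 0) return null;
  return Math.round(((price - cost) / price) * 100);
}

for (const t of tiers) {
  console.log(`${t.name}: ${marginPercent(t.pricePerMTok, t.costPerMTok)}%`);
}
```

Rerunning this whenever a provider changes its rates shows quickly which tier needs repricing.
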
Why: Until Mary, Maya, and Nikki are trained as Clara’s own models, we resell existing models and add our own value on top: business context (Heru Discovery data), workflow integration (n8n), feedback history, RBAC, and multi-tenant isolation.
How to apply: Clara model picker in admin panel. Usage dashboard like Anthropic’s console showing tokens used, cost, remaining budget.
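
One way the model picker could be backed is a small tier catalog. This is a sketch only: `ModelOption`, `MODEL_OPTIONS`, and `pickerOptions` are hypothetical names, and the rule that a client may select any tier at or below their plan is an assumption, not something the plan states.

```typescript
type Provider = "Cloudflare Workers AI" | "Groq" | "Bedrock";

interface ModelOption {
  tier: string;
  model: string;
  provider: Provider;
}

// Ordered from cheapest to most expensive, matching the Model Tiers table.
const MODEL_OPTIONS: ModelOption[] = [
  { tier: "Clara Free", model: "Llama 3.1 8B", provider: "Cloudflare Workers AI" },
  { tier: "Clara Pro", model: "Llama 4 Scout", provider: "Groq" },
  { tier: "Clara Business", model: "Claude Haiku", provider: "Bedrock" },
  { tier: "Clara Premium", model: "Claude Sonnet", provider: "Bedrock" },
  { tier: "Clara Ultra", model: "Claude Opus", provider: "Bedrock" },
];

// Assumption: the picker offers every tier up to and including the client's plan.
function pickerOptions(planTier: string): ModelOption[] {
  const index = MODEL_OPTIONS.findIndex((o) => o.tier === planTier);
  if (index === -1) throw new Error(`Unknown tier: ${planTier}`);
  return MODEL_OPTIONS.slice(0, index + 1);
}

console.log(pickerOptions("Clara Business").map((o) => `${o.tier} (${o.model})`));
```
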
Client Infrastructure Architecture
Each client gets THEIR OWN stack on an ephemeral EC2 instance (free-tier t2.micro, 750 hrs/month for 12 months):
- GraphQL + TypeScript + Node/Express + Sequelize backend
- Their own n8n instance
- Their own Clara chatbot
- PostgreSQL database

Clara on QC1 = INTERNAL only (Quik Nation’s own use with Ollama).

Client Clara = runs on client’s EC2, calls Cloudflare Workers AI / Bedrock APIs.
Consolidated Billing
- AWS Organizations — Quik Nation is management account
- Each client = member account
- All charges roll to Quik Nation’s bill
- Free tier: 750 hrs t2.micro/month for 12 months per account
- After 12 months: ~$8-15/month EC2 cost passes to client’s invoice
- Service Control Policies (SCPs) limit what client accounts can do
- Cost allocation tags: `client_id` on every resource
- Billing alerts per account
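
The plan does not say which actions the SCPs should block, so the policy below is purely illustrative. It assumes two restrictions that fit the setup described above: member accounts may only launch t2.micro instances, and they may not leave the organization. The statement IDs and the `clientAccountScp` name are made up for this sketch; the policy is written as a TypeScript object so it can be serialized to the JSON that AWS Organizations expects.

```typescript
// Illustrative SCP document. The chosen restrictions are assumptions, not part of the plan.
const clientAccountScp = {
  Version: "2012-10-17",
  Statement: [
    {
      // Block any EC2 launch that is not a t2.micro.
      Sid: "DenyNonMicroInstances",
      Effect: "Deny",
      Action: "ec2:RunInstances",
      Resource: "arn:aws:ec2:*:*:instance/*",
      Condition: { StringNotEquals: { "ec2:InstanceType": "t2.micro" } },
    },
    {
      // Keep the member account inside the organization so billing stays consolidated.
      Sid: "DenyLeavingOrganization",
      Effect: "Deny",
      Action: "organizations:LeaveOrganization",
      Resource: "*",
    },
  ],
};

// Serialized JSON form of the policy document.
const scpJson = JSON.stringify(clientAccountScp, null, 2);
console.log(scpJson);
```

SCPs only restrict; they never grant permissions, so each member account still needs its own IAM policies for whatever the client stack is allowed to do.
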
Metering (Like Anthropic)
Show clients their usage in admin panel:
- Input tokens used / limit
- Output tokens used / limit
- Conversations count
- Average response tokens
- Overage rate + current overage
- Upgrade prompt
Track via n8n AI Agent node token reporting → store in client DB → display in admin.
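
The dashboard figures listed above can be derived from stored per-response token counts. The sketch below assumes each stored record carries input tokens, output tokens, and a conversation ID. The record shape, the `PlanLimits` fields, a single overage rate in USD per 1M tokens, and showing the upgrade prompt as soon as any overage exists are all assumptions made for illustration.

```typescript
// Hypothetical shape of one stored record (one model response).
interface UsageEvent {
  conversationId: string;
  inputTokens: number;
  outputTokens: number;
}

// Hypothetical per-plan limits; the overage rate is in USD per 1M tokens.
interface PlanLimits {
  inputTokenLimit: number;
  outputTokenLimit: number;
  overageRatePerMTok: number;
}

interface UsageSummary {
  inputUsed: number;
  outputUsed: number;
  conversations: number;
  avgResponseTokens: number;
  overageTokens: number;
  currentOverageUsd: number;
  showUpgradePrompt: boolean;
}

function summarizeUsage(events: UsageEvent[], limits: PlanLimits): UsageSummary {
  const inputUsed = events.reduce((sum, e) => sum + e.inputTokens, 0);
  const outputUsed = events.reduce((sum, e) => sum + e.outputTokens, 0);
  // Count distinct conversations, not individual responses.
  const conversations = new Set(events.map((e) => e.conversationId)).size;
  const avgResponseTokens = events.length > 0 ? Math.round(outputUsed / events.length) : 0;
  // Tokens beyond either limit count toward overage.
  const overageTokens =
    Math.max(0, inputUsed - limits.inputTokenLimit) +
    Math.max(0, outputUsed - limits.outputTokenLimit);
  const currentOverageUsd = (overageTokens / 1_000_000) * limits.overageRatePerMTok;
  return {
    inputUsed,
    outputUsed,
    conversations,
    avgResponseTokens,
    overageTokens,
    currentOverageUsd,
    showUpgradePrompt: overageTokens > 0, // assumption: prompt once any overage exists
  };
}

const sampleEvents: UsageEvent[] = [
  { conversationId: "a", inputTokens: 1000, outputTokens: 400 },
  { conversationId: "a", inputTokens: 2000, outputTokens: 600 },
  { conversationId: "b", inputTokens: 500, outputTokens: 500 },
];
const sampleLimits: PlanLimits = { inputTokenLimit: 3000, outputTokenLimit: 2000, overageRatePerMTok: 5 };
console.log(summarizeUsage(sampleEvents, sampleLimits));
```

Computing the summary from raw records, and not from running counters, keeps the stored data auditable if a client disputes an overage charge.
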