Zendesk Interview Question

How would you design a real-time LLM inference service?