About the Role
The Lenovo AI Technology Center (LATC), Lenovo’s global AI Center of Excellence, is driving our transformation into an AI-first organization. We are assembling a world-class team of researchers, engineers, and innovators to position Lenovo and its customers at the forefront of the generational shift toward AI. We are building the next wave of core AI technologies and platforms that leverage and evolve with the AI ecosystem, including novel approaches to model and agent orchestration and collaboration across mobile, edge, and cloud resources. This space is evolving fast, and so are we. If you’re ready to shape AI at a truly global scale, with products that touch every corner of life and work, there’s no better time to join us.
Responsibilities
- Architect agent systems: Design and own the architecture of production agent systems, including the Agent SDK (LangGraph/Pydantic Graphs), defining patterns and abstractions that the team builds upon.
- Lead orchestration & routing strategy: Define the technical vision for orchestration services, model routing (edge-cloud), and multi-agent coordination patterns. Make key architectural decisions on latency/cost/capability trade-offs.
- Drive cross-team integration: Partner with business unit (BU) product teams (Qira, Tianxi, UDS IQ) to translate requirements into technical specifications. Coordinate with Infrastructure and Data teams on dependencies.
- Establish reliability & safety standards: Define and enforce guardrail policies, fallback chains, and safety constraints across agent systems. Own incident response and post-mortem processes.
- Build observability infrastructure: Design tracing, logging, and monitoring systems that enable the team to understand agent behavior at scale. Create dashboards and alerting for production systems.
- Mentor and grow the team: Lead technical decisions for the squad, mentor junior engineers, conduct code reviews, and establish engineering best practices and coding standards.
- Shape technical roadmap: Contribute to quarterly planning, identify technical risks, and drive initiatives that improve team velocity and system reliability.
Core Skills
- Expert-level Python programming (async patterns, performance optimization, library design) and experience designing APIs and SDKs.
- Deep knowledge of agentic frameworks (LangChain, LangGraph, LlamaIndex, AutoGen) including internals, not just usage.
- Proven track record shipping production agent systems serving real users at scale.
- Strong system design skills: distributed systems, state management, message queues, service mesh patterns.
- Experience with model routing strategies, embedding-based similarity matching, and edge-cloud orchestration.
- Ability to break down ambiguous problems, make architectural decisions independently, and communicate trade-offs clearly.
Bonus Skills
- Experience with MCP (Model Context Protocol) or similar agent communication protocols.
- Background in edge/on-device deployment (mobile, IoT, embedded systems) with latency and memory constraints.
- Contributions to open-source agent frameworks (LangChain, LlamaIndex, etc.).
- Experience building and operating ML platforms or MLOps infrastructure.
- Background in Go, Rust, or other systems languages for performance-critical components.
- Published blog posts, talks, or papers on agent systems or LLM engineering.
Qualifications
- 8+ years in software engineering (or 6+ years with an MS degree), including at least 2 years focused on ML/AI systems or LLM-based applications.
- BS/MS in Computer Science or a related field; equivalent practical experience considered.
- Track record of technical leadership: owning systems end-to-end, making architectural decisions, mentoring engineers.
- Experience with production incidents, on-call responsibilities, and post-mortem processes.
- Demonstrated ability to influence technical direction beyond the immediate team.