Artificial Analysis is the leading independent AI benchmarking company. We help labs, engineers and enterprises understand AI capabilities and make critical decisions about their AI strategies. We are the go-to authority on AI for labs, enterprises, media, investors, and policymakers. Our benchmarks don't just measure the cutting edge of AI; they are actively shaping the frontier.
Our benchmarks and analysis are trusted by hundreds of thousands of users and are the go-to reference for leading AI labs including OpenAI, Google, Meta, NVIDIA and Anthropic, and major publications including the Wall Street Journal, Bloomberg, the Financial Times and The Economist.
We are a team of 35+, on track to triple by year end, backed by Nat Friedman (GitHub, Meta), Daniel Gross (SSI), Andrew Ng (Google Brain, DeepLearning.AI, Amazon), Adam D'Angelo (Quora, Poe, OpenAI), Clem Delangue (Hugging Face) and other industry leaders.
We're looking for an Intermediate to Senior ML Engineer to join our team and lead projects in AI benchmarking and analysis. You'll work closely with our founders to build core parts of the software stack for our early-stage start-up.
The coming wave of AI scaling is going to change the world in ways we don't yet understand — and we're offering a front row seat.
Lead the development and optimization of parts of our core benchmarking stack, focusing on data-intensive backend systems and APIs, and driving projects from concept to shipped product
Design and implement robust Python solutions for benchmarking, model evaluation and data analysis
Design and implement user interfaces and data visualizations that distill complex AI benchmarking data into intuitive, interactive experiences for thousands of daily users
Conduct custom analyses for enterprise customers, providing actionable insights to inform their AI strategies
Contribute to the evolution of our benchmarking methodologies, including development of evaluation methodology for emerging modalities and capabilities
Embrace an AI-native workflow, using cutting-edge AI tools to generate leverage in a fast-changing industry
3+ years of professional software engineering experience
Passion for AI and eagerness to work at the forefront of technological innovation
Proficiency with relevant Python libraries for data analysis (e.g. pandas) and key AI APIs (e.g. OpenAI)
Familiarity with cloud infrastructure, orchestration and monitoring tools
Experience with visualizing & presenting data
Strong problem-solving skills and ability to distill complex concepts into actionable insights in the face of uncertainty
Excellent communication and collaboration skills
Proven ability to lead projects independently in a fast-paced environment
Bachelor's or Master's in Computer Science, Engineering, or related field (e.g. Physics)
Preferred but not essential:
Experience with AI/ML frameworks (e.g. PyTorch)
Creation of analytical reports
Shape how AI gets built: The leading AI labs track our benchmarks and use them to guide their development priorities. Your work will directly influence the direction of AI.
Become a world expert in AI: You will evaluate every major model, across every major capability, as they are released. Very few roles offer this breadth of exposure to frontier AI.
Work with the most important players in AI: You'll manage relationships with teams at the leading AI labs and major enterprises as a trusted, independent voice.
Join at a defining moment: We're 35+ people and growing fast, backed by some of the most connected investors in AI. The people who join now will shape the product, the team, and the strategy as we scale.
Competitive compensation including equity
Our team is split across San Francisco, Sydney, and Melbourne