AI Application Engineer, APJ

ID: 9474

Type: Full-time

Category: Others

Company Name: Arize AI

Location: Singapore

Education Level: Junior (1-2 years)

Job Description

About Arize

AI is rapidly transforming the world. As generative AI reshapes industries, teams need powerful ways to monitor, troubleshoot, and optimize their AI systems. That’s where we come in. Arize AI is the leading AI & Agent Engineering observability and evaluation platform, empowering AI engineers to ship high-performing, reliable agents and applications. From first prototype to production scale, Arize AX unifies build, test, and run in a single workspace—so teams can ship faster with confidence.

We’re a Series C company backed by top-tier investors, with over $135M in funding and a rapidly growing customer base of 150+ leading enterprises and Fortune 500 companies. Customers like Booking.com, Uber, Siemens, and PepsiCo leverage Arize to deliver AI that works.

The Opportunity

We’re looking for an Application Engineer who thrives on solving hard problems with code. You’ll work at the cutting edge of generative AI in a high-impact role with autonomy and ownership. Although we are a remote-first company, this role requires candidates to be based in the Singapore area.

What You’ll Do

Debug and fix issues in our platform (and ship PRs with your fixes).
Build internal tools and copilots powered by generative AI to supercharge our team.
Rapidly prototype proof-of-concepts for customer use cases.
Work across Engineering, Product, and Solutions to unblock customers and push the boundaries of AI adoption.

What We’re Looking For

You have 2-5 years of experience in software.
Strong in Python and Golang; comfortable shipping fixes in production systems.
Hands-on with generative AI (LLM APIs, frameworks, building copilots or automations).
Hands-on with OpenTelemetry and deep familiarity with distributed tracing concepts.
Familiarity with AI frameworks (CrewAI, LangChain, LangGraph, Dify, LiteLLM, etc.).
Familiarity with, or eagerness to learn, JavaScript/TypeScript.
Great debugger, creative problem solver, and fast learner.
Independent and resourceful. You create solutions, not dependencies.

Bonus Points (but not required!)

Experience in a customer-facing role.
Built copilots, plugins, or custom GenAI-powered applications.
Open-sourced or contributed PRs to real codebases.
Startup or fast-moving environment experience.

Actual compensation is determined by a variety of job-related factors, which may include transferable work experience, skill sets, and qualifications. Total compensation also includes unlimited paid time off, a generous parental leave plan, and other benefits supporting mental health and wellness.

More About Arize

Arize’s mission is to make the world’s AI work—and work for people.
Our founders came together through a shared frustration: while investments in AI are growing rapidly across every industry, organizations face a critical challenge—understanding whether AI is performing and how to improve it at scale.

Learn more about what we're doing here:

https://techcrunch.com/2025/02/20/arize-ai-hopes-it-has-first-mover-advantage-in-ai-observability/

https://arize.com/blog/arize-ai-raises-70m-series-c-to-build-the-gold-standard-for-ai-evaluation-observability/

Diversity & Inclusion @ Arize

Our company's mission is to make AI work, and to make AI work for people. We hope to make an industry-wide impact on bias, and that's a big motivator for people who work here. We actively hope that individuals contribute to a good culture.

We regularly have chats with industry experts, researchers, and ethicists across the ecosystem to advance the use of responsible AI.
We host culturally conscious events, such as LGBTQ trivia during Pride Month.
We have an active Lady Arizers subgroup.

Company Information

Company Name: Arize AI

Company Website: https://www.arize.com

Company Address: Singapore

Arize AI is a technology company that provides a commercial machine learning observability platform designed to help organizations monitor, troubleshoot, and improve production machine learning models. The company’s offering centers on tools and workflows for detecting and diagnosing issues that arise after models are deployed, with an emphasis on measurable model performance, data and concept drift detection, model explainability, and operational alerting. Arize AI’s platform is positioned as an enterprise-grade solution to make it easier for data science, ML engineering, and site reliability teams to maintain model quality, reduce time-to-detect and time-to-resolve model degradations, and support reproducible investigations into model behavior in production.

At a high level, Arize’s core business activities include ingesting model predictions and related telemetry from production systems, processing and indexing prediction data at scale, computing performance and degradation signals, and presenting diagnostics and visualizations through a web-based user interface and APIs. The platform supports both batch and streaming inference patterns, enabling teams to send prediction records (including model outputs, scores, probabilities, feature values, and optional ground truth labels) to Arize for continuous evaluation. Once ingested, data is analyzed to surface performance metrics across slices, cohorts, time windows, and feature distributions, and to compute statistical tests and drift metrics that indicate when inputs or model outputs are moving away from expected patterns.

Key product capabilities advertised by Arize include model performance monitoring, feature and population drift detection, bias and fairness assessments, explainability and attribution tooling, model comparison and versioning dashboards, and automated root-cause investigation aids. Explainability features typically provide population-level and individual prediction-level attributions to help engineers and data scientists understand which features are driving model decisions and where unexpected behaviors originate. Drift detection and skew analysis highlight shifts between training and production data or between different production cohorts, enabling teams to determine whether retraining, data collection changes, or model rollbacks are required. The platform’s slice and cohort analysis helps identify underperforming subpopulations and supports prioritized debugging of model issues.

Arize provides developer- and engineer-facing integrations including SDKs, APIs, and connectors that make it straightforward to send inference telemetry from common ML frameworks, orchestration systems, or data pipelines. The platform typically lists compatibility with standard machine learning frameworks and model formats and offers integration patterns for popular data infrastructure such as streaming systems and cloud object stores. In practice, teams integrate Arize at inference time via its ingestion APIs or client libraries, and optionally instrument label feedback loops to continuously measure model accuracy and other supervised performance metrics.

Operational features include configurable alerting and notifications so teams are informed when pre-defined thresholds or anomaly detectors trigger, as well as role-based access controls and audit logging to support governance. Dashboards and visualizations are designed to support collaborative incident response and post-incident analysis, helping teams trace degradations back to recent code, data, or system changes. The platform supports comparison across model versions and enables side-by-side analyses to assess whether a new model candidate is an improvement or regression relative to a production baseline.

Arize’s product is marketed primarily to enterprises and teams deploying machine learning in production, including those operating real-time inference services, batch scoring pipelines, and decisioning systems that require ongoing validation and compliance evidence. Typical use cases include detection of feature distribution changes that impact model accuracy, rapid diagnosis of prediction errors after model rollout, continuous monitoring for dataset or label drift, and generating explanations that aid compliance and stakeholder transparency.

On the technical and operational side, Arize emphasizes scalability to handle high-throughput inference streams and the ability to index and query large volumes of historical prediction data. The platform includes tooling for data retention, query-based investigation, and export of artifacts for deeper offline analysis. Security and data governance features, as described on official product pages, commonly cover encryption of data in transit and at rest, access controls, and enterprise deployment guidance, though specific policies and compliance certifications vary by customer and are typically outlined in product or security documentation.

Arize also publishes educational content, documentation, and best-practice guides aimed at reducing the time required to instrument models and interpret production behavior. Documentation typically covers API usage, SDK integration patterns, recommended monitoring strategies, and example workflows for common scenarios such as drift remediation and model rollback. The company’s public-facing materials and product descriptions characterize it as a vendor focused on helping organizations operationalize model observability and reduce the risk and operational burden associated with production ML. The platform’s combination of telemetry ingestion, automated detection, explainability, and investigative tooling is intended to support continuous validation and iterative improvement of models after deployment.