Applied AI Engineer - Federal (TS Required)

ID: 9218

Type: Full-time

Category: Others

Company Name: Snorkel AI

Location: District of Columbia (USA) - Washington - United States

Salary: 160K - 250K yearly

Education Level: Mid-level (2-5 years)

Job Description

About Snorkel

At Snorkel, we believe meaningful AI doesn’t start with the model, it starts with the data.

We’re on a mission to help enterprises transform expert knowledge into specialized AI at scale. The AI landscape has gone through incredible change from 2015, when Snorkel started as a research project in the Stanford AI Lab, to the generative AI breakthroughs of today. But one thing has remained constant: the data you use to build AI is the key to achieving differentiation, high performance, and production-ready systems. We work with some of the world’s largest organizations to empower scientists, engineers, financial experts, product creators, journalists, and more to build custom AI with their data faster than ever before. Excited to help us redefine how AI is built? Apply to be the newest Snorkeler!

That mission includes democratizing AI by building the definitive AI data development platform.

As an Applied AI Engineer, you’ll research and apply state-of-the-art Gen AI and machine learning (ML) techniques to deliver solutions to our customers. You will work directly with customers to understand their business and technical needs, then design and deliver AI solutions that meet them, either by leveraging Snorkel Flow or by developing custom approaches when needed. You will also help define Snorkel’s Applied AI tooling by translating repeatable real-world challenges into reusable solution recipes, workflows, best practices, and platform-level capabilities that become part of Snorkel Flow’s next generation of AI tooling. We move fast and are constantly prototyping and innovating new ways to deliver value to our customers. This position is ideal for someone who enjoys solving complex problems, bridging the gap between AI technology and business value, working directly with customers, keeping up to date with AI research, and standardizing bespoke solutions into internal recipes, and who stays naturally curious about the infrastructure that underpins the Applied AI stack end to end.

Main Responsibilities

Partner with customers to build and deploy impactful Gen AI and machine learning solutions, from use case scoping and data exploration to model development and deployment. This may involve leveraging Snorkel Flow or designing custom approaches using state-of-the-art tools, with the goal of delivering real business value and informing the evolution of the Snorkel platform.
Develop and implement state-of-the-art AI systems such as retrieval-augmented generation (RAG), fine-tuning pipelines, prompt engineering recipes, and agentic workflows.
Create augmented real-world datasets and comprehensive evaluation workflows to ensure model reliability, transparency, and stakeholder trust. A data- and evaluation-first mindset is essential for success in this role.
Forge and manage relationships with our customers’ leadership and stakeholders to ensure successful development and deployment of AI projects with Snorkel Flow.
Collaborate closely with pre-sales Solutions and Product teams to map customer needs to existing capabilities, prioritize roadmap gaps, and guide successful project setup.
Work with other Applied AI Engineers to standardize solutions and contribute to internal tooling and best practices.
Lead stakeholder education on quantitative capabilities, helping stakeholders understand the strengths and weaknesses of different approaches and which problems are best suited for Snorkel AI.
Serve as the voice of our customers on new AI paradigms and data science workflows, and share customer feedback with product teams.
Conduct one-to-few and one-to-many enablement workshops to transfer knowledge to customers considering or already using Snorkel AI.
As part of our team, you will have the opportunity to work on all aspects of complex National Security problems.
Annual travel up to 25%.

Preferred Qualifications

B.S. degree in a quantitative field such as Computer Science, Engineering, Mathematics, Statistics, or comparable degree/experience.
3+ years of customer-facing experience in the design and implementation of AI/ML solutions.
Proficiency in Python, including strong grounding in software engineering fundamentals (e.g., modular design, testing, profiling, packaging) and experience with modern Python constructs and libraries for type validation and typed data modeling (e.g., pydantic), building type-safe systems (e.g., mypy), testing (e.g., pytest), packaging and environment configuration (e.g., poetry), API and service frameworks (e.g., FastAPI), serialization and structured data handling (e.g., msgspec), and orchestration tooling relevant to ML deployment (e.g., Ray, Airflow).
Expertise across the Applied AI stack, spanning classical ML libraries (e.g., scikit-learn), deep learning frameworks (e.g., PyTorch), foundation-model ecosystems (e.g., Hugging Face Transformers), vector/embedding tooling (e.g., FAISS), data processing frameworks (e.g., pandas, Spark), retrieval/RAG tooling (e.g., Chroma, Weaviate), synthetic dataset curation, evaluation workflows, and LLM orchestration, workflow, and agent authoring tools (e.g., LlamaIndex, LangGraph, CrewAI).
Experience leading strategic, customer-facing initiatives and collaborating with business stakeholders to ensure ML solutions drive successful business outcomes, with a strong focus on teaching and enablement.
Outstanding presentation skills to technical and executive audiences, whether impromptu on a whiteboard or using presentations and demos.
Ability to work in a fast-paced environment and balance priorities across multiple projects at once.
Experience reviewing and drafting responses to federal Requests for Comment (RFCs), Requests for Information (RFIs), Requests for Proposals (RFPs), etc.
Experience working across Civilian, DOD, and NATSEC agencies, including experience in Federal Law Enforcement, Healthcare, Financial/Regulatory, and Military operations and logistics.
Candidates must have an active TS (Top Secret) Clearance.

Compensation range for Tier 2 Location - Washington DC Region, $160K - $250K OTE. All offers also include equity in the form of employee stock options. Our compensation ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training.

Locations

Must reside in the Washington DC Region.

Salary Range

$160,000—$250,000 USD

Be Your Best at Snorkel

Joining Snorkel AI means becoming part of a company that has market-proven solutions and robust funding and is scaling rapidly, offering a unique combination of stability and the excitement of high growth. As a member of our team, you’ll have meaningful opportunities to shape priorities and initiatives, influence key strategic decisions, and directly impact our ongoing success. Whether you’re looking to deepen your technical expertise, explore leadership opportunities, or learn new skills across multiple functions, you’re fully supported in building your career in an environment designed for growth, learning, and shared success.

Snorkel AI is proud to be an Equal Employment Opportunity employer and is committed to building a team that represents a variety of backgrounds, perspectives, and skills. Snorkel AI embraces diversity and provides equal employment opportunities to all employees and applicants for employment. Snorkel AI prohibits discrimination and harassment of any type on the basis of race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, or local law. All employment is decided on the basis of qualifications, performance, merit, and business need.

We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.

Company Information

Company Name: Snorkel AI

Company Website: https://snorkel.ai

Company Address: N/A

Snorkel AI is a technology company that develops software to accelerate the creation, labeling, and management of training data for machine learning. Originating from academic research on programmatic data labeling, the company commercializes techniques and tooling that enable organizations to build high-quality training datasets more quickly and with less manual annotation than traditional hand-labeling approaches. Snorkel AI positions its products and services around the premise that labeled training data—rather than model architecture alone—is a central bottleneck for deploying reliable, production-grade machine learning systems. At the core of Snorkel AI’s offering is a platform for programmatic supervision and end-to-end training data engineering. The platform (commonly referred to in public materials as Snorkel Flow) provides tools to generate labels programmatically by writing labeling functions, heuristics, and weak supervision sources; to combine and denoise those noisy supervisory signals with statistical label models; and to manage the downstream processes that take labeled data into model training, evaluation, and deployment. The technology stack typically includes: a developer-facing environment for iterating on labeling functions and data transforms; a label model (a probabilistic model) that estimates and reconciles multiple noisy label sources into probabilistic training labels; tooling to convert programmatic labels into datasets ready for training modern supervised models; and monitoring, versioning, and lineage features for enterprise governance and reproducibility. Snorkel AI’s products are designed for use in production ML workflows across multiple data modalities, including text (natural language processing), structured/tabular data, and image or sensor data for computer vision tasks. 
Typical product capabilities highlighted in public materials include: rapid prototyping of labeling logic using programmatic rules and weak supervision; automated combination and calibration of diverse labeling sources (heuristics, weak models, knowledge bases, distant supervision, and small amounts of hand-labeled data); integrated data pipelines that feed downstream model training; built-in evaluation and debugging aids to measure labeling quality and model performance; and enterprise features such as access control, audit logs, dataset versioning, and lineage tracking to support compliance and collaboration in regulated environments. The platform is marketed primarily to enterprises and organizations that need to build and maintain large-scale labeled datasets for production ML applications where conventional manual labeling is expensive, slow, or difficult at scale. Snorkel AI emphasizes use cases where subject-matter expertise can be encoded programmatically (for example, domain-specific heuristics or rules) and where multiple weak or imperfect sources of supervision can be combined to achieve high-quality labels. Public-facing descriptions indicate that the technology is used in areas such as information extraction and text classification for legal/financial/healthcare data, automated document processing, search and recommendation system improvements, and structured data extraction from unstructured sources. The company highlights applicability to regulated industries and critical workflows where transparency, repeatability, and traceability of training data are important. Snorkel AI’s product suite builds on earlier academic work known as Snorkel (an open-source project and set of research papers) that introduced the data programming and weak supervision paradigm. The company commercializes and extends this research into a production-grade platform with workflow management, collaboration, and enterprise security features. 
Snorkel AI also provides professional services, technical support, and customer engineering to help organizations adopt programmatic labeling workflows, integrate Snorkel into existing ML infrastructure, and operationalize training data production at scale. Integrations and deployment options typically address common enterprise needs—APIs and SDKs for integration with model training frameworks, connectors for data storage and data lakes, and deployment options compatible with cloud and on-premises environments. In public accounts and company materials, Snorkel AI frames its mission around removing barriers to high-quality training data, enabling organizations to build ML systems faster while maintaining auditable and governed data pipelines. The company’s approach emphasizes a data-centric machine learning workflow—shifting focus from model tinkering to systematically improving and managing the training data that underlies ML performance. This positioning aligns Snorkel AI with broader trends in MLOps and data-centric AI, where tooling for dataset creation, labeling, versioning, and lineage has become an increasingly important complement to model development. Overall, Snorkel AI is a specialized enterprise software company that addresses a concrete pain point in machine learning adoption: how to create, maintain, and govern labeled datasets at scale. Its commercial platform leverages programmatic labeling and weak supervision techniques to reduce reliance on large volumes of manual annotations, and it supplements core technology with workflow, governance, and integration capabilities intended for production ML teams and domain-expert collaborators.
