
AI screening that finds the top 10% of applicants

Recruiting teams burn thousands of hours on early-stage screening, most of it on candidates who don't pass a phone screen. We built Blueberry to find real talent without slowing the pipeline.

Screening time saved: 72%

Reduction in time spent on candidate evaluation.

Quality-of-hire lift: 3.2×

Blueberry-surfaced candidates were 3.2× more likely to receive an offer.

Roles processed: 850+

Unique job requisitions screened through Blueberry in the first two quarters of operation.

Situation

The breaking point

We experienced this problem firsthand. Our own recruiting operation was scaling headcount across engineering, product, sales, and operations simultaneously. Recruiters were receiving 300 to 1,200 applications per role, and the initial screening process was entirely manual: resume review, followed by a phone screen, followed by a skills assessment. The best candidates were often lost to faster-moving competitors before a recruiter could reach them. We saw the same pattern across every growing company we worked with, and decided to build the solution ourselves.

  • Recruiters spent an average of 23 hours per open role on initial screening, most of it on candidates who would never advance past the first round.
  • Resume-based filtering rewarded keyword optimization over genuine experience, letting polished applicants through while overlooking strong candidates with non-traditional backgrounds.
  • Phone screen outcomes varied significantly by recruiter, creating inconsistent quality signals and making it difficult to benchmark across hiring teams.
  • Top candidates were accepting competing offers before recruiters could complete the screening cycle, with time-to-first-contact averaging 9 days from application.

Approach

The build

Build an intelligent screening layer that replaces the manual top-of-funnel process with AI-generated assessments calibrated to each role, integrated into the existing ATS workflow, and equipped with cheat protection to ensure signal integrity.

We designed Blueberry as a product that sits between the application and the recruiter's first human touchpoint. When a recruiter opens a new requisition, Blueberry ingests the job details and generates a custom questionnaire with questions engineered to distinguish candidates who have done the work from those who have only read about it. The questionnaire is sent to all applicants with a single click. Responses are scored using domain-tuned language models that evaluate depth of reasoning, specificity of examples, and technical accuracy. Recruiters receive a ranked shortlist with the top 10% highlighted, plus a comprehensive breakdown of how all applicants performed across different competency areas.
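The flow above can be sketched end to end: score every response, rank the pool, and highlight the top fraction. This is a minimal illustration with hypothetical names; the real scoring uses domain-tuned language models, not the word-count stand-in shown here.

```python
from dataclasses import dataclass

@dataclass
class Response:
    candidate: str
    answers: list[str]

def score_response(resp: Response) -> float:
    # Stand-in for the domain-tuned model: longer, more specific answers
    # score higher here purely for illustration.
    return sum(len(a.split()) for a in resp.answers) / max(len(resp.answers), 1)

def shortlist(responses: list[Response], top_fraction: float = 0.10) -> list[str]:
    # Rank every applicant, then surface the top fraction (always at least one).
    ranked = sorted(responses, key=score_response, reverse=True)
    k = max(1, round(len(ranked) * top_fraction))
    return [r.candidate for r in ranked[:k]]
```

The 10% cutoff is a default, not a hard limit: recruiters still see the full ranked list with per-competency breakdowns below the line.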

System blueprint

Under the hood

The core components that make the system work, and why each one matters.

Question Engine

Role-adaptive assessment generation

Blueberry analyzes job descriptions, required competencies, and seniority signals to generate questionnaires that probe for experiential depth. Questions ask for specific tradeoffs, failure modes, and decision rationale that are difficult to answer well without genuine hands-on experience.
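The shape of that generation step, reduced to templates for illustration (the production engine generates free-form questions from a language model; the template strings and seniority logic here are hypothetical):

```python
def generate_questions(role: str, competencies: list[str], seniority: str) -> list[str]:
    # Probes for tradeoffs, failure modes, and decision rationale:
    # hard to answer well without genuine hands-on experience.
    templates = [
        "Describe a tradeoff you made involving {c} on a real {r} project, and what you gave up.",
        "What is a failure mode you have personally hit with {c}, and how did you diagnose it?",
    ]
    if seniority == "senior":
        # Senior roles get an extra ownership probe.
        templates.append(
            "Walk through a {c} decision you owned end-to-end as a {r}: "
            "the options, your call, and the outcome."
        )
    return [t.format(c=c, r=role) for c in competencies for t in templates]
```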

Integrity Layer

Cheat protection and AI-detection controls

Response analysis includes timing patterns, stylometric consistency, copy-paste detection, and AI-generated content identification. Flagged responses are surfaced with integrity annotations so recruiters can make informed decisions on their own terms.
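Two of those signals, timing and paste detection, can be illustrated with simple heuristics. These thresholds and function names are invented for the sketch; the production layer combines stylometry, timing analysis, and AI-content classifiers.

```python
def integrity_flags(answer: str, seconds_typing: float, paste_events: int) -> list[str]:
    # Return annotations, never verdicts: recruiters make the final call.
    flags = []
    words = len(answer.split())
    # A long answer produced implausibly fast suggests pasted or generated text.
    if seconds_typing > 0 and words / (seconds_typing / 60) > 120:  # >120 wpm
        flags.append("implausible_typing_speed")
    if paste_events > 0:
        flags.append("paste_detected")
    return flags
```

Note the output is a list of annotations attached to the response, mirroring how flagged responses are surfaced rather than rejected.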

Scoring Engine

Competency-level candidate evaluation

Each response is evaluated across multiple dimensions: technical accuracy, depth of reasoning, specificity of examples, and relevance to role requirements. The composite score produces a ranked candidate list with per-section breakdowns that give recruiters granular visibility into each applicant's strengths.
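A composite over those four dimensions might look like the following; the weights are illustrative assumptions, not Blueberry's calibrated values.

```python
# Hypothetical weights over the four dimensions named above (sum to 1.0).
WEIGHTS = {"accuracy": 0.35, "depth": 0.30, "specificity": 0.20, "relevance": 0.15}

def composite_score(dims: dict[str, float]) -> float:
    # Weighted average of per-dimension scores, each in the range 0..1.
    return sum(WEIGHTS[k] * dims[k] for k in WEIGHTS)

def rank_candidates(scored: dict[str, dict[str, float]]) -> list[tuple[str, float]]:
    # Highest composite first; per-dimension scores stay available
    # for the per-section breakdown recruiters see.
    return sorted(
        ((name, composite_score(d)) for name, d in scored.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
```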

ATS Integration

Native Lever and Greenhouse workflow

Blueberry operates inside the recruiter's existing workflow. Questionnaires are triggered from the ATS, candidate responses flow back as structured data, and scores appear alongside existing candidate records. The recruiter's daily operating rhythm stays exactly the same.
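The round-trip can be sketched as an event handler: new applications trigger a questionnaire, completed responses push a score back onto the candidate record. Event names and fields here are hypothetical, not the actual Lever or Greenhouse payloads.

```python
def handle_ats_webhook(event: dict, send_questionnaire, post_score) -> str:
    # `send_questionnaire` and `post_score` stand in for the ATS API calls
    # that keep everything inside the recruiter's existing workflow.
    if event["type"] == "application.created":
        send_questionnaire(event["candidate_id"], event["requisition_id"])
        return "questionnaire_sent"
    if event["type"] == "questionnaire.completed":
        post_score(event["candidate_id"], event["score"])
        return "score_posted"
    return "ignored"
```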

Performance shift

The numbers that moved

Key metrics before and after launch.

Time-to-shortlist (lower is better): Before 9 days → After 2.5 days

Recruiter hours per role (lower is better): Before 23 hrs → After 6.4 hrs

Candidate quality, offer rate (higher is better): Before 8% → After 26%

Screening consistency (higher is better): Before 54% → After 91%

Delivery path

How we shipped it

Every phase delivered something real. Here's the timeline.

Phase 01 (Weeks 1–3)

Hiring workflow analysis and question model design

We mapped our own screening funnel across 12 active roles. We analyzed which phone screen questions actually predicted downstream success, identified where recruiter time was being wasted, and designed the question generation model architecture.

Deliverables: Funnel analysis · Question model architecture · Scoring competency framework

Phase 02 (Weeks 4–8)

Core engine build and ATS integration

The question generation engine, scoring pipeline, and integrity detection layer were built and connected to both Lever and Greenhouse via their APIs. The recruiter-facing interface was designed to require zero training: one click to send, one dashboard to review results.

Deliverables: Question engine · Scoring pipeline · Lever integration · Greenhouse integration

Phase 03 (Weeks 9–11)

Controlled pilot across live requisitions

Blueberry was deployed on 40 active requisitions across engineering, product, and go-to-market roles. Recruiter feedback was collected daily, scoring accuracy was validated against eventual hiring outcomes, and the question model was refined to improve signal quality on edge-case roles.

Deliverables: Pilot validation report · Scoring calibration · Feedback integration

Phase 04 (Weeks 12–14)

Full rollout and product release

Blueberry was enabled across all open requisitions and prepared for external release. Recruiter onboarding was completed in a single 30-minute session per team. Monitoring dashboards, integrity alert workflows, and scoring accuracy tracking were built into the product for every customer.

Deliverables: Full rollout · Onboarding materials · Monitoring dashboard · Accuracy tracking

What shipped

What we delivered

  • AI question generation engine calibrated per role, seniority, and competency area
  • Multi-dimensional candidate scoring and ranking system
  • Top 10% shortlist with per-candidate competency breakdown
  • Cheat protection and AI-generated response detection
  • One-click questionnaire distribution from within the ATS
  • Comprehensive applicant comparison dashboard for hiring teams
  • Bi-directional Lever and Greenhouse integrations

Integrations

Connected systems

  • Lever ATS (API v1)
  • Greenhouse ATS (Harvest API)
  • Email delivery infrastructure
  • SSO / identity provider
  • HRIS platforms for role metadata enrichment

Governance

Guardrails

  • All assessment content reviewed for bias indicators before deployment using structured fairness audits
  • Candidate data retention and deletion policies aligned with GDPR and SOC 2 requirements
  • Integrity flags surface annotations to recruiters, keeping the final call in human hands
  • Scoring model outputs are fully explainable, showing recruiters exactly why a candidate scored high or low on each dimension
  • Regular calibration audits comparing Blueberry shortlists against eventual hire outcomes to catch model drift

Outcomes

The payoff

  • Recruiters moved from spending the majority of their week on initial screening to receiving a decision-ready shortlist within hours of application close. The 72% reduction in screening time was reinvested into candidate engagement and closing.
  • Candidates surfaced by Blueberry were 3.2× more likely to receive an offer than those identified through traditional resume review. The assessments rewarded demonstrated experience, and that signal carried all the way through the funnel.
  • Time-to-first-contact dropped from 9 days to under 3, which reduced candidate drop-off by 41% and improved acceptance rates on competitive roles.
  • Screening consistency across the recruiting team went from 54% inter-rater agreement to 91%, eliminating the quality variance that had made cross-team hiring benchmarks unreliable.
We built Blueberry because we were tired of watching great candidates slip through a slow process. Now every company on the platform knows who the top 10% are before most teams have finished reading resumes.

Ellenox Product Team · Blueberry by Ellenox · After two quarters of live operation across multiple customers
