
AI screening that finds the top 10% of applicants

Recruiting teams burn thousands of hours on early-stage screening, most of it on candidates who don't pass a phone screen. We built Blueberry to find real talent without slowing the pipeline.

Screening time saved: 72%

Reduction in time spent on candidate evaluation.

Quality-of-hire lift: 3.2×

Blueberry-surfaced candidates were 3.2× more likely to receive an offer.

Roles processed: 850+

Unique job requisitions screened through Blueberry in the first two quarters of operation.

Situation

The breaking point

We experienced this problem firsthand. Our own recruiting operation was scaling headcount across engineering, product, sales, and operations simultaneously. Recruiters were receiving 300 to 1,200 applications per role, and the initial screening process was entirely manual: resume review, followed by a phone screen, followed by a skills assessment. The best candidates were often lost to faster-moving competitors before a recruiter could reach them. We saw the same pattern across every growing company we worked with, and decided to build the solution ourselves.

  • Recruiters spent an average of 23 hours per open role on initial screening, most of it on candidates who would never advance past the first round.
  • Resume-based filtering rewarded keyword optimization over genuine experience, letting polished applicants through while overlooking strong candidates with non-traditional backgrounds.
  • Phone screen outcomes varied significantly by recruiter, creating inconsistent quality signals and making it difficult to benchmark across hiring teams.
  • Top candidates were accepting competing offers before recruiters could complete the screening cycle, with time-to-first-contact averaging 9 days from application.

Approach

The build

Build an intelligent screening layer that replaces the manual top-of-funnel process with AI-generated assessments calibrated to each role, integrated into the existing ATS workflow, and equipped with cheat protection to ensure signal integrity.

We designed Blueberry as a product that sits between the application and the recruiter's first human touchpoint. When a recruiter opens a new requisition, Blueberry ingests the job details and generates a custom questionnaire with questions engineered to distinguish candidates who have done the work from those who have only read about it. The questionnaire is sent to all applicants with a single click. Responses are scored using domain-tuned language models that evaluate depth of reasoning, specificity of examples, and technical accuracy. Recruiters receive a ranked shortlist with the top 10% highlighted, plus a comprehensive breakdown of how all applicants performed across different competency areas.
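The flow above can be sketched end to end: score every response, rank the pool, and highlight the top fraction. This is a minimal illustration with hypothetical names; the real scoring uses domain-tuned language models, not the word-count stand-in shown here.

```python
from dataclasses import dataclass

@dataclass
class Response:
    candidate: str
    answers: list[str]

def score_response(resp: Response) -> float:
    # Stand-in for the domain-tuned model: longer, more specific answers
    # score higher here purely for illustration.
    return sum(len(a.split()) for a in resp.answers) / max(len(resp.answers), 1)

def shortlist(responses: list[Response], top_fraction: float = 0.10) -> list[str]:
    # Rank every applicant, then surface the top fraction (always at least one).
    ranked = sorted(responses, key=score_response, reverse=True)
    k = max(1, round(len(ranked) * top_fraction))
    return [r.candidate for r in ranked[:k]]
```

The 10% cutoff is a default, not a hard limit: recruiters still see the full ranked list with per-competency breakdowns below the line.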

System blueprint

Under the hood

The core components that make the system work, and why each one matters.

Question Engine

Role-adaptive assessment generation

Blueberry analyzes job descriptions, required competencies, and seniority signals to generate questionnaires that probe for experiential depth. Questions ask for specific tradeoffs, failure modes, and decision rationale that are difficult to answer well without genuine hands-on experience.
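The shape of that generation step, reduced to templates for illustration (the production engine generates free-form questions from a language model; the template strings and seniority logic here are hypothetical):

```python
def generate_questions(role: str, competencies: list[str], seniority: str) -> list[str]:
    # Probes for tradeoffs, failure modes, and decision rationale:
    # hard to answer well without genuine hands-on experience.
    templates = [
        "Describe a tradeoff you made involving {c} on a real {r} project, and what you gave up.",
        "What is a failure mode you have personally hit with {c}, and how did you diagnose it?",
    ]
    if seniority == "senior":
        # Senior roles get an extra ownership probe.
        templates.append(
            "Walk through a {c} decision you owned end-to-end as a {r}: "
            "the options, your call, and the outcome."
        )
    return [t.format(c=c, r=role) for c in competencies for t in templates]
```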

Integrity Layer

Cheat protection and AI-detection controls

Response analysis includes timing patterns, stylometric consistency, copy-paste detection, and AI-generated content identification. Flagged responses are surfaced with integrity annotations so recruiters can make informed decisions on their own terms.
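Two of those signals, timing and paste detection, can be illustrated with simple heuristics. These thresholds and function names are invented for the sketch; the production layer combines stylometry, timing analysis, and AI-content classifiers.

```python
def integrity_flags(answer: str, seconds_typing: float, paste_events: int) -> list[str]:
    # Return annotations, never verdicts: recruiters make the final call.
    flags = []
    words = len(answer.split())
    # A long answer produced implausibly fast suggests pasted or generated text.
    if seconds_typing > 0 and words / (seconds_typing / 60) > 120:  # >120 wpm
        flags.append("implausible_typing_speed")
    if paste_events > 0:
        flags.append("paste_detected")
    return flags
```

Note the output is a list of annotations attached to the response, mirroring how flagged responses are surfaced rather than rejected.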

Scoring Engine

Competency-level candidate evaluation

Each response is evaluated across multiple dimensions: technical accuracy, depth of reasoning, specificity of examples, and relevance to role requirements. The composite score produces a ranked candidate list with per-section breakdowns that give recruiters granular visibility into each applicant's strengths.
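A composite over those four dimensions might look like the following; the weights are illustrative assumptions, not Blueberry's calibrated values.

```python
# Hypothetical weights over the four dimensions named above (sum to 1.0).
WEIGHTS = {"accuracy": 0.35, "depth": 0.30, "specificity": 0.20, "relevance": 0.15}

def composite_score(dims: dict[str, float]) -> float:
    # Weighted average of per-dimension scores, each in the range 0..1.
    return sum(WEIGHTS[k] * dims[k] for k in WEIGHTS)

def rank_candidates(scored: dict[str, dict[str, float]]) -> list[tuple[str, float]]:
    # Highest composite first; per-dimension scores stay available
    # for the per-section breakdown recruiters see.
    return sorted(
        ((name, composite_score(d)) for name, d in scored.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
```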

ATS Integration

Native Lever and Greenhouse workflow

Blueberry operates inside the recruiter's existing workflow. Questionnaires are triggered from the ATS, candidate responses flow back as structured data, and scores appear alongside existing candidate records. The recruiter's daily operating rhythm stays exactly the same.
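The round-trip can be sketched as an event handler: new applications trigger a questionnaire, completed responses push a score back onto the candidate record. Event names and fields here are hypothetical, not the actual Lever or Greenhouse payloads.

```python
def handle_ats_webhook(event: dict, send_questionnaire, post_score) -> str:
    # `send_questionnaire` and `post_score` stand in for the ATS API calls
    # that keep everything inside the recruiter's existing workflow.
    if event["type"] == "application.created":
        send_questionnaire(event["candidate_id"], event["requisition_id"])
        return "questionnaire_sent"
    if event["type"] == "questionnaire.completed":
        post_score(event["candidate_id"], event["score"])
        return "score_posted"
    return "ignored"
```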

Performance shift

The numbers that moved

Key metrics before and after launch.

Time-to-shortlist (lower is better): Before 9 days → After 2.5 days

Recruiter hours per role (lower is better): Before 23 hrs → After 6.4 hrs

Candidate quality, offer rate (higher is better): Before 8% → After 26%

Screening consistency (higher is better): Before 54% → After 91%

Delivery path

How we shipped it

Every phase delivered something real. Here's the timeline.

Phase 01 (Weeks 1–3)

Hiring workflow analysis and question model design

We mapped our own screening funnel across 12 active roles. We analyzed which phone screen questions actually predicted downstream success, identified where recruiter time was being wasted, and designed the question generation model architecture.

Deliverables: Funnel analysis · Question model architecture · Scoring competency framework

Phase 02 (Weeks 4–8)

Core engine build and ATS integration

The question generation engine, scoring pipeline, and integrity detection layer were built and connected to both Lever and Greenhouse via their APIs. The recruiter-facing interface was designed to require zero training: one click to send, one dashboard to review results.

Deliverables: Question engine · Scoring pipeline · Lever integration · Greenhouse integration

Phase 03 (Weeks 9–11)

Controlled pilot across live requisitions

Blueberry was deployed on 40 active requisitions across engineering, product, and go-to-market roles. Recruiter feedback was collected daily, scoring accuracy was validated against eventual hiring outcomes, and the question model was refined to improve signal quality on edge-case roles.

Deliverables: Pilot validation report · Scoring calibration · Feedback integration

Phase 04 (Weeks 12–14)

Full rollout and product release

Blueberry was enabled across all open requisitions and prepared for external release. Recruiter onboarding was completed in a single 30-minute session per team. Monitoring dashboards, integrity alert workflows, and scoring accuracy tracking were built into the product for every customer.

Deliverables: Full rollout · Onboarding materials · Monitoring dashboard · Accuracy tracking

What shipped

What we delivered

  • AI question generation engine calibrated per role, seniority, and competency area
  • Multi-dimensional candidate scoring and ranking system
  • Top 10% shortlist with per-candidate competency breakdown
  • Cheat protection and AI-generated response detection
  • One-click questionnaire distribution from within the ATS
  • Comprehensive applicant comparison dashboard for hiring teams
  • Bi-directional Lever and Greenhouse integrations

Integrations

Connected systems

  • Lever ATS (API v1)
  • Greenhouse ATS (Harvest API)
  • Email delivery infrastructure
  • SSO / identity provider
  • HRIS platforms for role metadata enrichment

Governance

Guardrails

  • All assessment content reviewed for bias indicators before deployment using structured fairness audits
  • Candidate data retention and deletion policies aligned with GDPR and SOC 2 requirements
  • Integrity flags surface annotations to recruiters, keeping the final call in human hands
  • Scoring model outputs are fully explainable, showing recruiters exactly why a candidate scored high or low on each dimension
  • Regular calibration audits comparing Blueberry shortlists against eventual hire outcomes to catch model drift

Outcomes

The payoff

  • Recruiters moved from spending the majority of their week on initial screening to receiving a decision-ready shortlist within hours of application close. The 72% reduction in screening time was reinvested into candidate engagement and closing.
  • Candidates surfaced by Blueberry were 3.2× more likely to receive an offer than those identified through traditional resume review. The assessments rewarded demonstrated experience, and that signal carried all the way through the funnel.
  • Time-to-first-contact dropped from 9 days to under 3, which reduced candidate drop-off by 41% and improved acceptance rates on competitive roles.
  • Screening consistency across the recruiting team went from 54% inter-rater agreement to 91%, eliminating the quality variance that had made cross-team hiring benchmarks unreliable.
We built Blueberry because we were tired of watching great candidates slip through a slow process. Now every company on the platform knows who the top 10% are before most teams have finished reading resumes.

Ellenox Product Team · Blueberry by Ellenox · After two quarters of live operation across multiple customers
