AI screening that finds the top 10% of applicants
Recruiting teams burn thousands of hours on early-stage screening, most of it on candidates who don't pass a phone screen. We built Blueberry to find real talent without slowing the pipeline.
- 72% reduction in time spent on candidate evaluation.
- 3.2× more likely to receive an offer.
- Unique job requisitions screened through Blueberry in the first two quarters of operation.
Situation
The breaking point
We experienced this problem firsthand. Our own recruiting operation was scaling headcount across engineering, product, sales, and operations simultaneously. Recruiters were receiving 300 to 1,200 applications per role, and the initial screening process was entirely manual: resume review, followed by a phone screen, followed by a skills assessment. The best candidates were often lost to faster-moving competitors before a recruiter could reach them. We saw the same pattern across every growing company we worked with, and decided to build the solution ourselves.
- Recruiters spent an average of 23 hours per open role on initial screening, most of it on candidates who would never advance past the first round.
- Resume-based filtering rewarded keyword optimization over genuine experience, letting polished applicants through while overlooking strong candidates with non-traditional backgrounds.
- Phone screen outcomes varied significantly by recruiter, creating inconsistent quality signals and making it difficult to benchmark across hiring teams.
- Top candidates were accepting competing offers before recruiters could complete the screening cycle, with time-to-first-contact averaging 9 days from application.
Approach
The build
Build an intelligent screening layer that replaces the manual top-of-funnel process with AI-generated assessments calibrated to each role, integrated into the existing ATS workflow, and equipped with cheat protection to ensure signal integrity.
We designed Blueberry as a product that sits between the application and the recruiter's first human touchpoint. When a recruiter opens a new requisition, Blueberry ingests the job details and generates a custom questionnaire with questions engineered to distinguish candidates who have done the work from those who have only read about it. The questionnaire is sent to all applicants with a single click. Responses are scored using domain-tuned language models that evaluate depth of reasoning, specificity of examples, and technical accuracy. Recruiters receive a ranked shortlist with the top 10% highlighted, plus a comprehensive breakdown of how all applicants performed across different competency areas.
System blueprint
Under the hood
The core components that make the system work, and why each one matters.
Role-adaptive assessment generation
Blueberry analyzes job descriptions, required competencies, and seniority signals to generate questionnaires that probe for experiential depth. Questions ask for specific tradeoffs, failure modes, and decision rationale that are difficult to answer well without genuine hands-on experience.
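As an illustrative sketch only (not Blueberry's production engine), role-adaptive generation can be thought of as expanding experience-probing templates per competency, with seniority controlling depth. The template wording, competency names, and seniority tiers below are all hypothetical:

```python
# Hypothetical templates probing tradeoffs, failure modes, and decision
# rationale -- the kinds of questions that are hard to answer without
# hands-on experience.
QUESTION_TEMPLATES = [
    "Describe a tradeoff you made when working with {competency}, and what you gave up.",
    "What failure modes have you personally hit with {competency}, and how did you detect them?",
    "Walk through a decision involving {competency} you would make differently today, and why.",
]

SENIORITY_DEPTH = {"junior": 1, "mid": 2, "senior": 3}  # questions per competency

def generate_questionnaire(competencies, seniority):
    """Return one questionnaire: `depth` questions for each competency."""
    depth = SENIORITY_DEPTH.get(seniority, 2)
    return [
        template.format(competency=c)
        for c in competencies
        for template in QUESTION_TEMPLATES[:depth]
    ]

questions = generate_questionnaire(["distributed caching", "incident response"], "senior")
```

In the real product this step is model-driven rather than template-driven, but the shape of the output, a per-role list of depth-probing questions, is the same.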
Cheat protection and AI-detection controls
Response analysis includes timing patterns, stylometric consistency, copy-paste detection, and AI-generated content identification. Flagged responses are surfaced with integrity annotations so recruiters can make informed decisions on their own terms.
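A minimal sketch of how such heuristics might combine into per-response annotations. The thresholds, field names, and the upstream `style_drift` score are assumptions for illustration, not Blueberry's actual detection logic:

```python
def integrity_flags(response):
    """Return integrity annotations for one response (thresholds are illustrative)."""
    flags = []
    words = len(response["text"].split())
    minutes = response["seconds_on_question"] / 60
    # Timing pattern: sustained output above ~400 words/min suggests pasted content.
    if minutes > 0 and words / minutes > 400:
        flags.append("implausible-typing-speed")
    # Paste events recorded by the questionnaire front end.
    if response.get("paste_events", 0) > 0:
        flags.append("copy-paste-detected")
    # Stylometric drift vs. the candidate's other answers (scored upstream, 0-1).
    if response.get("style_drift", 0.0) > 0.8:
        flags.append("stylometric-inconsistency")
    return flags
```

Note that the output is a list of annotations, not a verdict: flagged responses are surfaced to the recruiter, matching the "informed decisions on their own terms" design above.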
Competency-level candidate evaluation
Each response is evaluated across multiple dimensions: technical accuracy, depth of reasoning, specificity of examples, and relevance to role requirements. The composite score produces a ranked candidate list with per-section breakdowns that give recruiters granular visibility into each applicant's strengths.
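The ranking step can be sketched as a weighted composite over those dimensions followed by a top-10% cutoff. The weights below are hypothetical; the case study names the dimensions but not how they are combined:

```python
# Assumed weights for illustration only.
WEIGHTS = {
    "technical_accuracy": 0.35,
    "depth_of_reasoning": 0.30,
    "specificity": 0.20,
    "role_relevance": 0.15,
}

def composite_score(dimension_scores):
    """Weighted average of per-dimension scores (each 0-100)."""
    return sum(WEIGHTS[d] * dimension_scores[d] for d in WEIGHTS)

def rank_candidates(candidates):
    """Sort by composite score and mark the top 10% as the shortlist."""
    ranked = sorted(candidates, key=lambda c: composite_score(c["scores"]), reverse=True)
    cutoff = max(1, round(len(ranked) * 0.10))
    for i, candidate in enumerate(ranked):
        candidate["shortlisted"] = i < cutoff
    return ranked
```

Keeping per-dimension scores alongside the composite is what makes the per-section breakdowns possible: the ranked list carries the granular scores, not just the final number.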
Native Lever and Greenhouse workflow
Blueberry operates inside the recruiter's existing workflow. Questionnaires are triggered from the ATS, candidate responses flow back as structured data, and scores appear alongside existing candidate records. The recruiter's daily operating rhythm stays exactly the same.
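The "responses flow back as structured data" piece can be pictured as a payload attached to the existing candidate record. The field names below are illustrative, not the actual Lever or Greenhouse schemas:

```python
def ats_score_payload(candidate_id, composite, breakdown, flags):
    """Shape a Blueberry result for attachment to an ATS candidate record.
    Field names are hypothetical, not real Lever/Greenhouse API fields."""
    return {
        "candidate_id": candidate_id,
        "source": "blueberry",
        "composite_score": composite,
        "competency_breakdown": breakdown,  # e.g. {"technical_accuracy": 82, ...}
        "integrity_flags": flags,           # surfaced alongside the score, never auto-rejecting
    }

payload = ats_score_payload("cand-123", 87.5, {"technical_accuracy": 90}, [])
```

Because the score travels with the candidate record rather than living in a separate tool, the recruiter reviews it in the same views they already use.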
Performance shift
The numbers that moved
Key metrics before and after launch.
- Time-to-shortlist (lower is better)
- Recruiter hours per role (lower is better)
- Candidate quality (offer rate) (higher is better)
- Screening consistency (higher is better)
Delivery path
How we shipped it
Every phase delivered something real. Here's the timeline.
Hiring workflow analysis and question model design
We mapped our own screening funnel across 12 active roles. We analyzed which phone screen questions actually predicted downstream success, identified where recruiter time was being wasted, and designed the question generation model architecture.
Core engine build and ATS integration
The question generation engine, scoring pipeline, and integrity detection layer were built and connected to both Lever and Greenhouse via their APIs. The recruiter-facing interface was designed to require zero training: one click to send, one dashboard to review results.
Controlled pilot across live requisitions
Blueberry was deployed on 40 active requisitions across engineering, product, and go-to-market roles. Recruiter feedback was collected daily, scoring accuracy was validated against eventual hiring outcomes, and the question model was refined to improve signal quality on edge-case roles.
Full rollout and product release
Blueberry was enabled across all open requisitions and prepared for external release. Recruiter onboarding was completed in a single 30-minute session per team. Monitoring dashboards, integrity alert workflows, and scoring accuracy tracking were built into the product for every customer.
What shipped
What we delivered
- AI question generation engine calibrated per role, seniority, and competency area
- Multi-dimensional candidate scoring and ranking system
- Top 10% shortlist with per-candidate competency breakdown
- Cheat protection and AI-generated response detection
- One-click questionnaire distribution from within the ATS
- Comprehensive applicant comparison dashboard for hiring teams
- Bi-directional Lever and Greenhouse integrations
Integrations
Connected systems
- Lever ATS (API v1)
- Greenhouse ATS (Harvest API)
- Email delivery infrastructure
- SSO / identity provider
- HRIS platforms for role metadata enrichment
Governance
Guardrails
- All assessment content reviewed for bias indicators before deployment using structured fairness audits
- Candidate data retention and deletion policies aligned with GDPR and SOC 2 requirements
- Integrity flags surface annotations to recruiters, keeping the final call in human hands
- Scoring model outputs are fully explainable, showing recruiters exactly why a candidate scored high or low on each dimension
- Regular calibration audits comparing Blueberry shortlists against eventual hire outcomes to catch model drift
Outcomes
The payoff
- Recruiters moved from spending the majority of their week on initial screening to receiving a decision-ready shortlist within hours of application close. The 72% reduction in screening time was reinvested into candidate engagement and closing.
- Candidates surfaced by Blueberry were 3.2× more likely to receive an offer than those identified through traditional resume review. The assessments rewarded demonstrated experience, and that signal carried all the way through the funnel.
- Time-to-first-contact dropped from 9 days to under 3, which reduced candidate drop-off by 41% and improved acceptance rates on competitive roles.
- Screening consistency across the recruiting team went from 54% inter-rater agreement to 91%, eliminating the quality variance that had made cross-team hiring benchmarks unreliable.
We built Blueberry because we were tired of watching great candidates slip through a slow process. Now every company on the platform knows who the top 10% are before most teams have finished reading resumes.
