Most enterprise mobile development RFPs include 20 or more questions. The six questions that actually predict performance are rarely among them. The standard RFP was designed to filter out obviously unqualified vendors. It was never designed to tell you whether a vendor will deliver on time, communicate when something breaks, or still be staffed to your engagement six months in. The questions you are missing are the ones agencies are never asked. This guide gives you those questions and tells you what the answers reveal.
Key findings
Standard RFP questions are easy to answer well without any delivery capability. Agencies optimize for passing your filter, not for demonstrating a track record you can actually verify.
The single most predictive signal in any vendor evaluation is not a client list or a rating. It is how the agency describes a project that went wrong. Most agencies are never asked to describe one.
Team structure questions reveal key-person dependency before it becomes your problem. Ask who covers your engagement if the lead engineer leaves. The answer tells you whether you are buying a team or a person.
An agency that ships to users weekly and one that ships monthly are not two points on a spectrum. They produce fundamentally different roadmap outcomes. The difference does not appear in any pitch deck.
Why standard RFP questions fail
Standard RFP questions fail because they test for the ability to write good answers, not for the ability to deliver. Every capable agency and every mediocre agency has the same answers ready for "describe your Agile process," "how do you handle communication," and "what does your QA process look like." These questions were written to compare proposals on paper. They were not written to surface delivery risk.
The problem is not that the questions are wrong. The problem is that they are answerable from a template. An agency that ships weekly and an agency that ships quarterly give nearly identical answers to "describe your release process." Both will say they ship iteratively, incorporate feedback, and communicate proactively. One of those agencies is telling you the truth about how things go on a difficult engagement. The other is describing how they wish things went.
The questions that reveal delivery capability are the ones that require specifics: names, dates, what broke, who called it, and what changed afterward. Most RFPs never ask any of them. When they do, they accept general answers where specific ones exist. This guide is about getting the specific answers.
Questions about team structure
Ask: "Who is on my engagement specifically, and what happens to my work if that person leaves?"
The answer to this question reveals whether you are buying a team or a single point of failure. Many agencies staff mid-market engagements with one senior lead and two or three junior engineers. The senior lead is the person you meet in the sales process. They are impressive. They are also rarely on your engagement 40 hours a week six months in. The answer tells you whether the agency has built redundancy into the engagement or built a dependency.
A capable agency names the specific people. They describe how each person's work overlaps with at least one other person on the engagement. They can answer what happens in week one, week two, and week three if the lead engineer takes unexpected leave. An agency that cannot answer this question concretely is telling you there is no plan.
Ask a second question: "What is your average engineer tenure on active engagements?" Attrition on the agency side is the most common source of disruption for enterprise clients. An agency with high internal attrition will cycle engineers through your engagement, each transition costing weeks of context transfer. An agency that cannot give you a number on average tenure either does not track it or does not want to share it. Either way, that is useful information. A good answer is a specific number, ideally above 18 months, with a reason for any recent changes.
Questions about quality
Ask: "What are the last three defects your QA process caught before they shipped to users, and how did you find each one?"
"We have a QA process" is one of the most common and least informative answers in enterprise mobile vendor evaluation. Every agency has a QA process. The question is what that process actually catches and how. The answer to this question tells you whether quality is an automated workflow embedded in how the team builds or a manual gate that runs when someone remembers to schedule it.
A capable agency answers this with specifics. They name the type of defect, describe whether it was caught by automated testing or manual review, and explain what would have happened if it had shipped. An agency that answers with a general description of their QA process rather than specific examples is telling you they do not track this at the level the question requires. That is a gap worth knowing before you sign.
Ask a follow-up: "What was the last defect that shipped to users that your process did not catch, and what changed after?" This question is harder. It requires the agency to name a failure and explain what they learned. Agencies that cannot name one either have very low shipping volume or are not being honest. The answer is less important than the posture. An agency that names a specific failure, explains the gap in their process, and describes the change they made is an agency that treats quality as something to improve, not something to claim.
Questions about speed
Ask: "How often does your app ship to users on your longest active engagement, and what is the most common reason a release is delayed?"
How often an agency ships to users is the single most useful proxy for roadmap velocity. The gap between a weekly release rhythm and a monthly one is not a scheduling preference. It is a 4x difference in how fast your users see improvements, how fast your team gets feedback, and how fast defects get found and fixed before they accumulate. The second part of the question matters as much as the first: what blocks a release tells you where the delivery friction lives.
Common answers to "what blocks a release" fall into two categories. The first is external: client approval processes, App Store review times, third-party API availability. These are real constraints and are not the agency's fault. The second is internal: build instability, QA backlog, test environments that are not ready. If the most common blocker is internal, you are looking at a delivery infrastructure problem that will affect your engagement.
Ask for dates, not descriptions. "We ship frequently" is not an answer. "We shipped on January 6, January 13, January 20, and February 3 for this client" is an answer. The gap between January 20 and February 3 is a data point worth asking about. An agency that can produce specific dates for an active engagement in a few minutes has the release discipline you are evaluating. An agency that needs to check with the team has a looser process than their pitch described.
Questions about past failures
Ask: "Tell me about a project that went significantly off track. What happened, whose call was it to flag it, and what did you change afterward?"
How a vendor describes a project that went wrong is the single most predictive signal in the entire evaluation. Every agency has had an engagement that did not go as planned. The ones that cannot name one are either too small to have faced real adversity or are not willing to be honest in a sales conversation. Neither is a strong start.
The specifics of the failure matter less than three things in the answer. First, does the agency name the failure without deflecting to the client? Agencies that describe failures entirely in terms of client decisions or external circumstances have not learned anything from the experience. Second, who flagged the problem? An agency that caught the issue and called it proactively is different from an agency that waited until the client noticed. Third, what changed? A specific process change, a new milestone structure, a different staffing model for similar engagements. The absence of a specific answer to "what changed" means nothing changed.
A capable agency gives you an answer that sounds like this: "We took on a healthcare engagement where the third-party records integration was scoped too loosely. It pushed the timeline by four weeks. Our delivery lead caught it at week three, not week six. We flagged it to the client, restructured the milestone plan, and added a separate integration milestone with its own buffer on every subsequent engagement that includes third-party dependencies." That answer is specific, names who caught the problem, and describes a concrete change. It is also verifiable. You can ask the follow-up: "Is that milestone structure in the scoping document I would sign?"
The answer that should end the conversation sounds like this: "We've had some communication challenges in the past but we've really worked on our processes since then." That answer tells you the agency has no specifics, has not changed anything structural, and is hoping you will not press further.
Red answers and green answers
Red answers are the ones that should end the evaluation. Green answers are the ones that earn trust. Here is how they split across the six questions.
Team structure. Red: "We have a strong bench and can flex resources as needed." This answer means the engagement is not staffed yet and the people you meet in the sales process may not be the people on your work. Green: "Your engagement would have [name] as delivery lead, [name] as lead engineer, and [name] as QA lead. If [name] leaves, [name] has full context and takes over. We have not had a senior-level departure on an active engagement in the last two years."
Quality. Red: "We have a robust QA process with automated testing across all platforms." The word "robust" is a tell. An agency with a real QA process describes it with specifics, not adjectives. Green: "We caught a payment flow defect in the last cycle before it shipped. It was caught by our automated end-to-end suite. Without it, the defect would have affected every user on checkout. Here is the type of test that found it."
Speed. Red: "We ship iteratively and adjust based on client feedback." This is a description of Agile methodology, not a release history. Green: "Here are the last eight release dates for our longest active client. The three-week gap in March was their legal review window, not our build cycle."
Past failures. Red: "Every project has challenges. The important thing is how you respond." This is a non-answer that signals the agency will not name a real failure in a sales conversation and will not call one early in an engagement either. Green: "We had an Android performance issue on a logistics engagement that shipped before our profiling step was complete. The client caught it before we did. We now run performance profiling as a required step, not an optional one, on every release. I can show you where that sits in our delivery checklist."
The pattern across red answers is abstraction. The pattern across green answers is specificity. An agency that has delivered well can be specific because they have data. An agency that has not delivered well falls back on descriptions of how they intend to work because they do not have a track record to draw from.
One more signal worth watching: how quickly does the agency ask about your timeline and budget versus how long they spend asking about your current problems? An agency that moves to timeline and pricing in the first 15 minutes is selling. An agency that spends the first 30 minutes asking what went wrong with your last vendor is evaluating whether they are the right fit. The second type is the one you want running your engagement.
One example of what these answers look like in practice: a federally regulated fintech exchange came to Wednesday after a prior agency shipped an app whose architectural instability was causing crashes in production. The prior agency's answers to quality questions during their own sales process had been general. No specifics, no named defects, no description of what their process actually caught. Wednesday's engagement began with a full architecture review, found issues the client did not know existed, and delivered a rebuild that reached zero crashes post-launch. The VP Engineering's summary: "They delivered on time, exceeded expectations, and found issues we didn't even know we had." That outcome starts with asking the right questions before you sign.
About the author
Rameez Khan
Head of Delivery, Wednesday Solutions
Rameez has shipped mobile products at scale across on-demand logistics, entertainment, and edtech, and has led enterprise AI enablement across multiple Wednesday engagements. As Head of Delivery at Wednesday Solutions, he oversees how every engagement is scoped, staffed, and run from first build to production.