Writing

How to Evaluate a Flutter Development Vendor: The Complete Scorecard for US Enterprise 2026

Eight questions separate Flutter vendors with genuine enterprise experience from those who have only shipped MVPs. Ask them before signing, not after.

Rameez Khan · Head of Delivery, Wednesday Solutions
9 min read · Published Apr 24, 2026 · Updated Apr 24, 2026
Trusted by teams at American Express, Visa, Discover, EY, Smarsh, Kalshi, BuildOps

60% of Flutter agencies claiming enterprise experience have fewer than three production Flutter apps in the App Store. That gap between claim and evidence is the central problem in Flutter vendor evaluation. A vendor that has shipped five or more production Flutter enterprise apps has encountered and solved the failure modes that a vendor with one or two apps has not yet seen. Eight specific questions expose that gap before you sign, not after your deadline slips.

Key findings

60% of agencies claiming enterprise Flutter experience have fewer than three production Flutter apps in the App Store — the most basic verification step eliminates the majority of unqualified vendors.

Wednesday has shipped 10+ Flutter apps to production enterprise clients with a weekly release cadence across all active engagements.

The eight questions in this article produce a vendor evaluation that takes 30 minutes and reliably separates Flutter veterans from Flutter beginners.

Wednesday's Flutter clients rate the company 4.8/5 on Clutch, with reviewers specifically citing on-time delivery, proactive problem-finding, and work that exceeded expectations.

Why Flutter vendor evaluation is different

General mobile vendor evaluation focuses on process, communication, and references. These are necessary but not sufficient for Flutter-specific evaluation. Flutter has enough surface area — the widget rendering model, the Dart language, the Flutter-specific CI/CD requirements, the platform channel architecture for device integrations — that a vendor with strong general mobile experience but limited Flutter depth will underperform on Flutter-specific requirements.

The eight questions below are Flutter-specific, not general software quality questions. Each has a specific technically correct answer, and that answer tells you whether the vendor has actually shipped Flutter at enterprise scale or has merely delivered enough MVPs to sound credible in a sales call without being qualified for an enterprise engagement.

The questions are designed to be asked in a 30-minute technical call with the lead engineer or CTO, not with the sales lead. Sales leads can give credible-sounding answers to technical questions without the technical depth to back them up. The lead engineer's answer to these questions cannot be faked without genuine experience.

Question 1: production apps in the App Store

Ask: "Can you name three production Flutter apps that are currently in the App Store that your team built? I'd like to be able to download and test them."

The correct answer is three specific app names, available on the App Store, with the vendor's name in the developer attribution or a verifiable description of their involvement.

Red flags: naming apps that are no longer available, naming apps where the vendor relationship ended more than 18 months ago, inability to name three apps, mentioning internal tools or white-label apps that cannot be independently verified.

Why it matters: downloading the apps tells you about release freshness (when was the last update?), UI consistency across iOS and Android, performance on your test device, and whether the apps' quality matches the vendor's claims.

Question 2: crash-free rate target

Ask: "What crash-free rate do you target for enterprise Flutter apps, and how do you measure it?"

The correct answer names a specific rate (99%+ for enterprise, with 99.5% as a strong standard), names the measurement tool (Firebase Crashlytics, Sentry, or equivalent), and describes how the reporting is segmented (by device model, by OS version, by app version).

Red flags: "We aim for as few crashes as possible" without a number, reference only to testing without a production monitoring tool, a number below 99% presented as acceptable for enterprise use.

Why it matters: crash-free rate is the most fundamental mobile quality metric. A vendor that does not target a specific rate and measure it in production is not managing quality — they are shipping and hoping.
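As a concrete illustration (not any particular tool's API), the crash-free rate a dashboard such as Crashlytics or Sentry reports is simply the share of sessions without a crash, broken down per segment:

```python
from collections import defaultdict

def crash_free_rates(sessions):
    """Crash-free session rate per segment.

    `sessions` is a list of (segment, crashed) tuples, e.g. the
    per-OS-version breakdown a monitoring tool would aggregate.
    """
    totals = defaultdict(int)
    crashes = defaultdict(int)
    for segment, crashed in sessions:
        totals[segment] += 1
        if crashed:
            crashes[segment] += 1
    return {seg: 1 - crashes[seg] / totals[seg] for seg in totals}

# Example: 1 crash in 1,000 sessions on Android 14 -> 99.9% crash-free
sessions = [("android-14", False)] * 999 + [("android-14", True)]
rates = crash_free_rates(sessions)
```

Segmenting the rate matters because a 99.5% overall number can hide a 95% number on one mid-range device family.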

Question 3: Flutter updates and breaking changes

Ask: "How do you handle Flutter stable releases and breaking changes for active enterprise clients?"

The correct answer describes a process: a post-stable-release evaluation period (typically 2 to 4 weeks), a dependency audit run to identify breaking changes in the plugin ecosystem, automated tests run against the new Flutter version, and a scheduled update included in the regular release cycle.

Red flags: "We update when we need to," "We update as soon as the stable version is released" (without mentioning a dependency audit), "The client decides when to update," no mention of dependency audits.

Why it matters: Flutter stable releases happen roughly twice per year. Each release can break plugins in the dependency tree. An agency without a structured update process will either apply updates that break the app or defer updates indefinitely, accumulating technical debt.
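A structured update process can be as simple as a fixed checklist run after each stable release. A hypothetical sketch of that checklist using the standard Flutter CLI:

```shell
# Hypothetical post-stable-release checklist (run in each client repo)
flutter upgrade                 # move the toolchain to the new stable
flutter pub outdated            # audit the plugin tree for breaking versions
flutter pub upgrade --dry-run   # preview dependency resolution changes
flutter analyze                 # surface deprecation and API-change warnings
flutter test                    # run the suite against the new version
```

The point is not the specific commands but that each step is scheduled and repeatable, so updates land in the regular release cycle instead of accumulating as debt.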

Question 4: state management choice and reasoning

Ask: "What state management do you use for complex enterprise Flutter apps and why?"

The correct answer names Bloc or Riverpod, explains the reasoning (Bloc for complex state with many interdependencies and a need for strict testability, Riverpod for cleaner code with reactive patterns), and acknowledges when simpler approaches are appropriate.

Red flags: recommending Provider for complex enterprise apps ("it's simpler"), mentioning setState as the primary state management for anything beyond local UI state, recommending GetX without acknowledging its testability limitations, inability to name the specific state management approach they use.

Why it matters: the wrong state management choice for app complexity is one of the five diagnosable Flutter enterprise failure modes. A vendor that recommends Provider or setState for a complex enterprise app does not have the depth to sustain quality as the app grows.
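The testability argument is language-agnostic: Bloc's core idea is that state changes only through explicit events handled by a pure transition function, which can be unit-tested without rendering a single widget. A minimal Python sketch of that pattern (names are illustrative, not the bloc library's API):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CartState:
    items: int
    checked_out: bool

def reduce_event(state, event):
    """Pure event-to-state transition: trivially unit-testable."""
    if event == "add_item":
        return CartState(state.items + 1, False)
    if event == "checkout" and state.items > 0:
        return CartState(state.items, True)
    return state  # unknown or invalid events leave state unchanged

s = CartState(0, False)
for e in ["add_item", "add_item", "checkout"]:
    s = reduce_event(s, e)
```

With setState scattered across widgets there is no single transition function to test; with this pattern, every business rule is an assertion away.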

Question 5: non-flagship Android device testing

Ask: "How do you test Flutter apps on non-flagship Android devices, and what devices are in your test matrix?"

The correct answer names specific device models beyond the current flagship (Samsung Galaxy A-series, older Pixel models, or equivalent mid-range Android), describes a physical device testing process (not emulator-only), and mentions a specific number of device and OS combinations in the matrix.

Red flags: testing only on current flagship devices, testing only on emulators, inability to name specific non-flagship devices in the test matrix, a device matrix of fewer than eight devices.

Why it matters: enterprise user fleets include mid-range Android devices that behave differently from flagship devices. Flutter rendering bugs, performance issues, and plugin failures on non-flagship Android are the second most common cause of post-launch user complaints after crash-free rate failures.
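For scale, a device matrix is just the cross product of models and OS versions. An illustrative sketch (the models listed are examples, not any vendor's actual matrix):

```python
from itertools import product

# Example mid-range and flagship mix; a real matrix would prune
# device/OS pairs that do not ship together.
devices = ["Pixel 6", "Galaxy A54", "Galaxy S23", "Moto G Power"]
os_versions = ["Android 12", "Android 13", "Android 14", "Android 15"]

matrix = list(product(devices, os_versions))
# 4 devices x 4 OS versions = 16 combinations
```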

Question 6: release cadence

Ask: "What is your current release cadence for active Flutter enterprise clients?"

The correct answer is weekly, with a description of how that cadence is maintained: an automated CI/CD pipeline, a release branch process, automated App Store submission, and a process for handling App Store review delays without disrupting the cadence.

Red flags: "When features are ready," "We do monthly releases," "It depends on the client," any answer that treats release cadence as a variable rather than a commitment.

Why it matters: weekly release cadence is the operating standard for enterprise mobile. It requires an automated pipeline, not a manual process. A vendor that cannot describe a weekly cadence process does not have one. A vendor without a weekly release cadence is delivering at slower speed than the market expects.
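A weekly cadence is typically enforced by the pipeline, not the calendar. A hypothetical sketch in GitHub Actions syntax (job names, steps, and the community `subosito/flutter-action` setup action are illustrative, not a specific vendor's pipeline):

```yaml
# Hypothetical weekly release workflow sketch
name: weekly-release
on:
  schedule:
    - cron: "0 9 * * MON"   # cut a release every Monday morning
jobs:
  release:
    runs-on: macos-latest
    steps:
      - uses: actions/checkout@v4
      - uses: subosito/flutter-action@v2   # install the Flutter toolchain
      - run: flutter test
      - run: flutter build ipa --release
      # automated App Store upload (e.g. via fastlane) would follow here
```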

Question 7: App Store Flutter-specific review issues

Ask: "Have you ever had a Flutter app rejected during App Store review for a Flutter-specific reason? What was the issue and how did you resolve it?"

The correct answer describes a specific experience — a Flutter plugin that was rejected for a native implementation issue, a Flutter app review flag for a specific entitlement requirement, or a review question about the Flutter engine's memory behavior. The resolution demonstrates a working knowledge of Apple's review process for Flutter apps.

A vendor with no App Store Flutter rejection experience has either shipped very few apps or been lucky. The answer should reflect genuine experience with the review process for Flutter-specific issues.

Red flags: "We've never had a rejection," "We submit and it goes through," claiming Flutter rejections are handled identically to any other app rejection, no specific experience with the App Store review process for Flutter.

Why it matters: App Store review for Flutter apps has Flutter-specific issues that appear without warning. An agency that has encountered and resolved them can navigate them on your timeline. One that has not will take longer to diagnose and resolve them.

Question 8: onboarding timeline

Ask: "If we signed tomorrow, when would your engineers be contributing to our weekly release?"

The correct answer is a specific number of weeks — four weeks is the Wednesday standard — with a description of what happens in those weeks: architecture review, development environment setup, CI/CD integration, and the first independently planned and shipped release.

Red flags: "A few months," "It depends on the complexity," "We'd need to do a discovery phase first," no specific commitment.

Why it matters: the onboarding commitment is a delivery commitment. A vendor who cannot commit to a specific onboarding timeline cannot commit to a delivery timeline. The engineering quality of the onboarding — how well the new team understands the architecture and contributes to the release cadence by the end of week four — reflects the overall engineering quality of the engagement.

Flutter vendor scorecard table

| Question | 3 points (evidence-backed) | 2 points (credible, no evidence) | 1 point (vague) |
| --- | --- | --- | --- |
| Production App Store apps | Names 3+ downloadable apps | Names 3 apps without download evidence | Fewer than 3 or cannot name |
| Crash-free rate | Specific rate + measurement tool | Specific rate without tool | No specific rate |
| Flutter update process | Describes specific process | Mentions process without detail | No structured process |
| State management | Names Bloc/Riverpod with reasoning | Names Bloc/Riverpod without reasoning | Names Provider or setState for complex apps |
| Device testing matrix | Names specific devices, 12+ | Describes device testing without specifics | Emulator only or flagship only |
| Release cadence | Weekly with automated pipeline | Weekly without pipeline description | Variable or monthly |
| App Store review experience | Describes specific rejection and fix | Mentions review experience without specifics | "No issues ever" |
| Onboarding timeline | Specific weeks with process | Specific weeks without process | Vague or months |

A score of 20 or above out of 24 indicates a vendor with genuine enterprise Flutter experience. Wednesday scores 24 out of 24 — every question has evidence-backed answers from production deployments.
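Scoring is mechanical: one point value per question, summed against the 20-of-24 threshold. A small illustrative scorer:

```python
def score_vendor(points):
    """points: 8 integers, each 1-3, one per scorecard question."""
    assert len(points) == 8 and all(1 <= p <= 3 for p in points)
    total = sum(points)
    verdict = "qualified" if total >= 20 else "not qualified"
    return total, verdict

# Example: strong on most questions, no evidence on three of them
result = score_vendor([3, 3, 3, 2, 3, 2, 2, 3])  # (21, 'qualified')
```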

Want to run this scorecard against Wednesday? Book a 30-minute technical call and ask every question on the list.

Get my recommendation

Wednesday's Flutter track record is public and verifiable. The retail engagement — 99% crash-free sessions at 20 million users, maintained across every release — is the most direct answer to every question in this scorecard. The crash-free rate is measured and reported. The release cadence is weekly. The device testing matrix covers 16 combinations. The onboarding timeline is four weeks. The App Store submission is automated. Every claim has evidence.

Run the eight questions on Wednesday in a 30-minute call. Leave with the specific answers that let you make a confident vendor decision.

Book my 30-min call
4.8 on Clutch · 4x faster with AI · 2x fewer crashes · 100% money back


Want more on vendor evaluation? The writing archive has mobile development vendor scorecards, contract frameworks, and Flutter agency comparisons.

Read more decision guides

About the author

Rameez Khan


Head of Delivery, Wednesday Solutions

Rameez Khan leads delivery at Wednesday Solutions and has run dozens of Flutter vendor transitions, technical assessments, and enterprise onboarding engagements.

Four weeks from this call, a Wednesday squad is shipping your mobile app. 30 minutes confirms the team shape and start date.

Get your start date

Shipped for enterprise and growth teams across US, Europe, and Asia

American Express
Visa
Discover
EY
Smarsh
Kalshi
BuildOps
Ninjavan
Kotak Securities
Rapido
PharmEasy
PayU
Simpl
Docon
Nymble
SpotAI
Zalora
Velotio
Capital Float
Buildd
Kunai