
AI in Your Mobile App Without Internet: What Is Possible, What Is Not, and What It Costs in 2026

A plain-language guide to what on-device AI can do today, what it cannot, and what each capability costs to add.

Anurag Rathod · Technical Lead, Wednesday Solutions
9 min read · Published Apr 24, 2026 · Updated Apr 24, 2026

A 7B parameter AI model fits on any device with 6GB of RAM shipped in the last three years. That covers the iPhone 15 Pro, the Samsung S24+, and the Pixel 8 Pro — all of which are in your users' pockets right now. What that model can do without touching the internet is the question most enterprise buyers have never gotten a straight answer to.

This guide answers it in plain language: what is working on-device today, what is not, what hardware it requires, and what it costs to add each capability to your app.

Key findings

A 7B parameter model fits on any flagship device with 6GB+ RAM — iPhone 15 Pro, Samsung S24+, Pixel 8 Pro. A 3B model fits on 4GB devices, covering most 2022+ flagships.

Voice transcription (Whisper) works on any device from 2020 onward. Image generation requires NPU acceleration: Apple A15+ or Snapdragon 8 Gen 1+.

What does not work on-device: real-time knowledge, models above 7B parameters, reliable support for rare languages.

Wednesday's Off Grid ships text AI, voice transcription, image generation, and vision analysis simultaneously on iOS and Android — the reference for what is achievable today.
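The RAM figures in these findings follow from straightforward arithmetic. The sketch below assumes 4-bit quantization, a common choice for on-device deployment (the quantization level is an assumption; the article does not specify one):

```python
def quantized_weights_gb(params_billion: float, bits_per_weight: int = 4) -> float:
    """Approximate in-RAM size of a model's weights after quantization, in GB."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# Weights alone: ~3.5GB for a 7B model, ~1.5GB for a 3B model at 4-bit.
# KV cache, activations, the app itself, and the OS push the practical
# floor to roughly 6GB and 4GB devices respectively.
seven_b = quantized_weights_gb(7)
three_b = quantized_weights_gb(3)
```

This is why the 7B/6GB and 3B/4GB pairings recur throughout the guide: the weights must fit in RAM with room left over for everything else the phone is doing.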

What on-device AI actually means

On-device AI means the model runs on the phone or tablet. No internet connection is needed during inference. The user types a query, the device processes it using its own chip, and the response appears — all without any data leaving the device.

The model is installed on the device, either bundled with the app or downloaded to local storage on first launch. Once installed, it works offline indefinitely. Updates to the model come through app updates or background downloads, but day-to-day use requires no connectivity.
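The bundled-versus-downloaded decision described above reduces to a first-launch check. A minimal sketch (file names and sizes are placeholders, not Off Grid's actual layout):

```python
import tempfile
from pathlib import Path

def model_ready(storage_dir: Path, model_file: str, expected_bytes: int) -> bool:
    """True if the model is already present and complete in local storage."""
    path = storage_dir / model_file
    return path.exists() and path.stat().st_size == expected_bytes

def provision_model(storage_dir: Path, model_file: str, expected_bytes: int) -> str:
    """First-launch decision: use the local copy, or schedule a background download."""
    if model_ready(storage_dir, model_file, expected_bytes):
        return "use-local"        # day-to-day use needs no connectivity from here
    return "schedule-download"    # fetched once, then works offline indefinitely

# Simulated first and second launch against a scratch directory.
storage = Path(tempfile.mkdtemp())
first = provision_model(storage, "model-3b.bin", 4)    # nothing on disk yet
(storage / "model-3b.bin").write_bytes(b"1234")        # stand-in for the download
second = provision_model(storage, "model-3b.bin", 4)
```

A production implementation would also verify a checksum and resume partial downloads, but the shape of the decision is the same.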

This is different from a cloud AI feature that "feels" fast because it has a low-latency API. Even a 100ms cloud AI response requires an internet connection and transmits user data to a remote server. On-device AI needs neither.

The practical implication for enterprise apps: on-device AI works in hospitals where phones are in airplane mode, on job sites with no cellular coverage, in areas with unreliable connectivity, and in any situation where sending data to a server is prohibited by policy.

What works on-device today

Six categories of AI capability run reliably on-device on current enterprise hardware.

Text generation and summarisation (up to 7B parameters). A 7B parameter language model handles text summarisation, question answering, document analysis, writing assistance, and conversational interfaces. The output quality matches cloud AI for most enterprise use cases. Wednesday's Off Grid ships 3B and 7B models; the 7B model produces output that users frequently cannot distinguish from a cloud API.

Voice transcription. OpenAI's Whisper model transcribes speech to text entirely on-device with accuracy equivalent to cloud transcription services. It runs on any device from 2020 onward. Battery impact during active transcription is under 3% per hour. Supported languages: English, Spanish, French, German, and 35 other commonly spoken languages.

Image classification. Identifying what is in a photo — object category, scene type, product identification — runs on-device with classification accuracy above 90% on standard categories. Inference time is under 200ms on NPU-equipped devices. Useful for field inspection apps, inventory apps, and any workflow where users photograph physical objects.

Object detection. Locating and labelling multiple objects within a single image, including their position. Runs on-device at 15-30 frames per second on NPU-equipped devices. Useful for assembly line inspection, safety compliance documentation, and augmented reality overlays.

Document Q&A. Asking questions about a PDF or document and receiving a specific answer. A 3B parameter model with retrieval-augmented generation handles documents up to approximately 50 pages without cloud processing. Useful for policy lookup, contract review, and field procedure reference.
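The retrieval step behind document Q&A can be sketched with simple term-overlap scoring. This is a stand-in for the embedding search a production app would use, with a made-up two-sentence "policy" as the document:

```python
def chunk(text: str, max_words: int = 200) -> list[str]:
    """Split a document into fixed-size word chunks for retrieval."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def top_chunks(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by term overlap with the question; winners go into the model's prompt."""
    q_terms = set(question.lower().split())
    ranked = sorted(chunks, key=lambda c: len(q_terms & set(c.lower().split())),
                    reverse=True)
    return ranked[:k]

doc = ("Overtime must be approved in advance by a supervisor. "
       "Travel expenses are reimbursed within 30 days of filing.")
chunks = chunk(doc, max_words=8)
context = top_chunks("How fast are travel expenses reimbursed?", chunks)
```

The selected chunks are then prepended to the question and handed to the 3B model, which is what keeps a 50-page document within a small model's context budget.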

Image generation. Generating images from text descriptions runs on-device on NPU-equipped devices (Apple A15+, Snapdragon 8 Gen 1+). Inference time for a 512x512 image is 15-45 seconds on current hardware. Off Grid ships image generation on-device for both iOS and Android.

What does not work on-device today

Three categories of capability hit hard limits on current hardware that make on-device AI the wrong architecture for them.

Real-time knowledge. An on-device model knows nothing beyond its training data cutoff. It cannot look up current prices, recent news, live inventory, or any information that changes after the model was trained. Cloud AI features that use retrieval-augmented generation against live data sources cannot be replicated on-device without an internet connection. For apps that require current information, cloud AI is the correct architecture.

Models above 7B parameters. Current flagship devices support models up to approximately 7B parameters at practical inference speeds. Larger models (13B, 70B, and above) produce higher-quality outputs for complex reasoning tasks but do not fit in device RAM or run at usable inference speeds on current hardware. If your use case requires complex multi-step reasoning, nuanced writing, or advanced code generation, a 7B model may not meet the quality bar. Cloud AI may be necessary.

Reliable rare language support. Current small on-device models are trained primarily on English and major European languages. Support for Arabic, Hindi, Japanese, Korean, and other languages has improved significantly but is not yet equivalent to cloud models in output quality. For enterprise apps serving multilingual user bases, language-specific evaluation is required before committing to on-device AI for text generation.

A 30-minute call with a Wednesday engineer maps which on-device capabilities are feasible for your specific app and user base.

Get my recommendation

Device requirements by capability

Not all on-device AI works on all devices. The table below shows the minimum device specifications for each capability category, based on Wednesday's production testing data across Off Grid deployments.

| Capability | Minimum device | RAM requirement | NPU required? |
| --- | --- | --- | --- |
| Text generation (3B model) | iPhone 12 / Samsung S21 | 4GB | Recommended, not required |
| Text generation (7B model) | iPhone 15 Pro / Samsung S24+ | 6GB | Required for acceptable speed |
| Voice transcription (Whisper) | iPhone X / any 2020+ Android | 2GB | No — CPU sufficient |
| Image classification | iPhone X / any 2019+ Android | 2GB | Recommended |
| Object detection | iPhone X / any 2019+ Android | 2GB | Recommended |
| Document Q&A (3B model) | iPhone 12 / Samsung S21 | 4GB | Recommended |
| Image generation | iPhone 13 Pro / Samsung S22+ | 6GB | Required |

The practical implication for enterprise deployment: before committing to an on-device AI feature, profile your actual user base's device mix. If 40% of your users are on three-year-old devices with 4GB RAM, a 7B model will not serve them. A 3B model or voice transcription will.
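The gating implied by the table can be expressed as one capability check per device. The RAM thresholds come from the table; the single boolean NPU flag is a simplification of real chipset detection:

```python
def available_capabilities(ram_gb: int, has_npu: bool) -> set[str]:
    """Map a device's RAM and NPU status onto the feature tiers from the table."""
    caps: set[str] = set()
    if ram_gb >= 2:
        caps |= {"voice-transcription", "image-classification", "object-detection"}
    if ram_gb >= 4:
        caps |= {"text-generation-3b", "document-qa-3b"}
    if ram_gb >= 6 and has_npu:  # 7B needs an NPU for acceptable speed; image gen requires one
        caps |= {"text-generation-7b", "image-generation"}
    return caps

# A 4GB, no-NPU device gets 3B text generation but not 7B or image generation.
mid_tier = available_capabilities(4, False)
flagship = available_capabilities(6, True)
```

In practice this check runs once at startup and decides which AI features the app surfaces, so users on older devices see a working subset rather than a broken feature.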

Cost to add each capability

The following cost ranges are based on Wednesday's delivery data across enterprise mobile engagements. They assume an existing production app on iOS and Android and include model integration, device compatibility testing, RAM management, background state architecture, and App Store submission preparation.

| Capability | Estimated engagement time | Complexity notes |
| --- | --- | --- |
| Voice transcription (both platforms) | 4-6 weeks | Low complexity; Whisper is well-documented |
| Image classification (both platforms) | 3-5 weeks | Low complexity; mature model ecosystem |
| Object detection (both platforms) | 5-7 weeks | Moderate; requires bounding box UI |
| Text generation — 3B model (both platforms) | 8-12 weeks | High; RAM management, chipset variants |
| Text generation — 7B model (both platforms) | 10-14 weeks | High; device compatibility matrix is narrower |
| Document Q&A (both platforms) | 8-12 weeks | High; retrieval system plus model |
| Image generation (both platforms) | 12-16 weeks | Very high; NPU requirement and long inference time |
| Full multi-capability suite | 16-24 weeks | Very high; Off Grid is the reference |

These ranges assume a vendor who has shipped on-device AI in production. A vendor shipping on-device AI for the first time should add 4-8 weeks to each range for problem discovery time.

The Off Grid reference point

Wednesday's Off Grid app ships all six capability categories simultaneously on iOS and Android. Text generation (3B and 7B), voice transcription, image classification, object detection, document Q&A, and image generation — all running on-device, offline, with no telemetry.

50,000+ users have downloaded Off Grid. The GitHub page is public and has 1,700+ stars. The implementation handles Metal abort() on 4GB iPhones, chipset-specific QNN variants on Android, and background generation state across all capabilities.

Off Grid is not a demonstration. It is a production application that Wednesday built to validate what on-device AI requires at production quality. Every enterprise on-device AI engagement Wednesday takes on starts from the architecture Off Grid validated.

How to decide what to build

The decision framework for on-device AI comes down to three questions.

Does your use case require features that only work with internet connectivity? Real-time data, rare languages, or complex reasoning above a 7B parameter quality bar all require cloud AI. If your answer is yes for the primary use case, on-device AI may be a supplement but not the core architecture.

Does your compliance or privacy context make cloud AI complicated? If your users' inputs contain sensitive data — patient information, financial records, privileged communications — cloud AI creates compliance overhead that on-device AI avoids entirely. If the answer is yes, on-device AI is worth the implementation complexity.

What is your users' device mix? If your user base has significant numbers on 4GB RAM devices and the use case requires a 7B parameter model, the feature will not be available to those users. Profiling your actual user base before scoping determines whether on-device AI can serve enough of your users to justify the build.
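Profiling the device mix reduces to a coverage calculation. The fleet below is a made-up example, not real deployment data:

```python
def coverage(device_mix: dict[str, float], ram_by_tier: dict[str, int],
             min_ram_gb: int) -> float:
    """Fraction of the user base whose devices meet a feature's RAM floor."""
    return sum(share for tier, share in device_mix.items()
               if ram_by_tier[tier] >= min_ram_gb)

# Hypothetical fleet: tier shares and RAM figures are illustrative only.
mix = {"2021-era": 0.40, "2023-era": 0.35, "2024-era": 0.25}
ram = {"2021-era": 4, "2023-era": 6, "2024-era": 8}

seven_b = round(coverage(mix, ram, 6), 2)  # share of users who can run the 7B model
three_b = round(coverage(mix, ram, 4), 2)  # share who can run the 3B model
```

With this mix, a 7B feature reaches 60% of users while a 3B feature reaches all of them, which is exactly the trade-off the scoping question is meant to surface.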

Wednesday's pre-scope process covers all three questions for your specific app, compliance context, and user base before a line of code is written.

Wednesday has shipped every on-device AI capability in this guide in production. The assessment for your app takes 30 minutes.

Book my 30-min call
4.8 on Clutch · 4x faster with AI · 2x fewer crashes · 100% money back


More on-device AI guides, device requirement analyses, and cost frameworks are in the writing archive.

Read more decision guides

About the author

Anurag Rathod

Technical Lead, Wednesday Solutions

Anurag builds on-device AI features at Wednesday Solutions and contributed to Off Grid, Wednesday's open-source on-device AI mobile application.

Four weeks from this call, a Wednesday squad is shipping your mobile app. 30 minutes confirms the team shape and start date.

Get your start date

Shipped for enterprise and growth teams across US, Europe, and Asia

American Express
Visa
Discover
EY
Smarsh
Kalshi
BuildOps
Ninjavan
Kotak Securities
Rapido
PharmEasy
PayU
Simpl
Docon
Nymble
SpotAI
Zalora
Velotio
Capital Float
Buildd
Kunai
Kalsi