AI apps are flooding the app stores, but most rely on expensive cloud APIs and send your data to remote servers. What if your AI models ran directly on the phone, with no network connection, no API fees, and total privacy guaranteed? This is now possible thanks to on-device AI and the react-native-executorch framework. This complete guide explores how to build React Native apps with voice transcription, OCR, and text generation, entirely offline.
What is On-Device AI?
On-device AI (or edge AI) refers to running artificial intelligence models directly on the user's device, without sending data to a server. Unlike cloud approaches (OpenAI API, Google Cloud AI), all processing happens locally on the smartphone's CPU/GPU/NPU.
History: From Cloud Siri to Local AI
In 2011, Siri emerged as one of the first mobile AI assistants, but it was completely dependent on the cloud. In 2014, "Hey Siri" introduced the first on-device feature (wake-word detection). The turning point came in 2017 with the A11 Bionic (iPhone X) and its Neural Engine: 600 billion operations per second dedicated to AI.
Today, the A18 Pro (iPhone 16 Pro) reaches 35 trillion operations per second, making it possible to run LLMs, image generation and real-time voice recognition entirely offline.
On-Device AI Advantages
| Advantage | Description | Impact |
|---|---|---|
| Privacy by design | Data never leaves the device | GDPR compliant, zero data leakage |
| Zero cost | No recurring API fees | Saves hundreds of $/month |
| Offline-first | Works without a network | Airplane mode, tunnels, dead zones |
| Ultra-low latency | No network round-trip | <100ms vs 500-2000ms for cloud |
| Infinite scalability | Each device = a server | No infrastructure limitation |
Disadvantages and Limitations
- Resource consumption: battery, heat, RAM (6-12GB max on mobile)
- Model size: 20GB+ LLMs are impossible to run locally
- Performance disparity: ~50 tokens/s on a flagship vs ~5 tokens/s on mid-range hardware
- Distribution: heavier app bundles (models range from 50MB to 2GB) or post-install downloads
Introducing react-native-executorch
react-native-executorch is an open-source library created by Software Mansion that allows React Native developers to integrate on-device AI without machine learning expertise. It relies on ExecuTorch, Meta's inference engine used in Instagram and Facebook.
Technical Architecture
ExecuTorch is Meta's optimized inference runtime for edge computing. It offers:
- PyTorch integration: direct export from the PyTorch ecosystem
- Cross-platform: smartphones, smartwatches, AR/VR, IoT, microcontrollers
- Optimized backends:
  - CoreML (iOS): Neural Engine acceleration
  - Vulkan: cross-platform GPU
  - XNNPACK: optimized CPU
  - QNN (Qualcomm): Android NPU acceleration
Communication Flow
React Native App
↓
react-native-executorch (React hooks)
↓
ExecuTorch Runtime (C++)
↓
Backend (CoreML / Vulkan / CPU)
↓
Hardware (NPU / GPU / CPU)
Use Case: Offline Voice Transcription App
Let's build a real-time voice transcription app with OpenAI's Whisper model, running 100% on-device. This app allows you to:
- Record audio in real-time via microphone
- Transcribe voice to text instantly
- Work offline (airplane, tunnel, etc.)
- Guarantee privacy (audio never sent)
1. Installing Dependencies
# Create an Expo project
npx create-expo-app my-ai-app
cd my-ai-app
# Install dependencies
yarn add react-native-executorch
yarn add react-native-audio-api
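Since the app records from the microphone, the platform permissions must also be declared. Below is a hedged example of the relevant app.json entries for an Expo project; these are standard Expo configuration keys, and your react-native-audio-api version may additionally ship its own config plugin:
{
  "expo": {
    "ios": {
      "infoPlist": {
        "NSMicrophoneUsageDescription": "Microphone access is used for offline voice transcription."
      }
    },
    "android": {
      "permissions": ["android.permission.RECORD_AUDIO"]
    }
  }
}
2. Loading the Whisper Model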
Use the useSpeechToText hook with Whisper Tiny English (a good speed/accuracy balance for mobile). The hook downloads the model on first launch, caches it on the device, and exposes loading state:
import React, { useState } from 'react';
import { View, Text } from 'react-native';
import { useSpeechToText, WHISPER_TINY_EN } from 'react-native-executorch';
// ProgressBar is a placeholder for any progress component from your UI kit

export function TranscriptionScreen() {
  const model = useSpeechToText({
    model: WHISPER_TINY_EN,
  });

  // Show download progress until the model is ready
  return (
    <View>
      {!model.isReady ? (
        <>
          <Text>Downloading Whisper model...</Text>
          <ProgressBar progress={model.downloadProgress} />
        </>
      ) : (
        <Text>Model ready! ✅</Text>
      )}
    </View>
  );
}
3. Real-Time Transcription
Connect the audio recorder to the model for real-time transcription:
// Inside the same TranscriptionScreen component as above.
// `recorder` is assumed to be a microphone recorder instance created with
// react-native-audio-api; event and method names may differ by version.
const [isRecording, setIsRecording] = useState(false);
const [transcription, setTranscription] = useState('');

const startRecording = () => {
  setIsRecording(true);
  // Stream audio chunks to the Whisper model as they arrive
  recorder.addListener('audioData', (audioChunk) => {
    model.transcribe(audioChunk).then((result) => {
      setTranscription(prev => prev + ' ' + result.text);
    });
  });
  recorder.start();
};

const stopRecording = () => {
  recorder.stop();
  setIsRecording(false);
};
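To hook these handlers up to the UI, here is a minimal presentational sketch; Button, Text and View come straight from react-native, and the props mirror the state and handlers defined above:
import React from 'react';
import { Button, Text, View } from 'react-native';

type Props = {
  isRecording: boolean;
  transcription: string;
  onStart: () => void;
  onStop: () => void;
};

// Toggles between recording and stopped, and displays the running transcript
export function RecorderControls({ isRecording, transcription, onStart, onStop }: Props) {
  return (
    <View>
      <Button
        title={isRecording ? 'Stop recording' : 'Start recording'}
        onPress={isRecording ? onStop : onStart}
      />
      <Text>{transcription || 'Press start and speak...'}</Text>
    </View>
  );
}
Available Models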
react-native-executorch provides several pre-optimized models:
Speech Recognition (Whisper)
- WHISPER_TINY_EN: 77MB, English only, best for mobile
- WHISPER_BASE_EN: 145MB, better accuracy
- WHISPER_SMALL_EN: 487MB, highest quality
Text Generation (LLaMA)
- LLAMA_3_2_1B: 1.3GB, 1 billion parameters
- LLAMA_3_2_3B: 3.2GB, 3 billion parameters
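Text generation follows the same hook-based pattern as speech-to-text. Below is a minimal sketch assuming a useLLM hook analogous to useSpeechToText; check the current react-native-executorch documentation for the exact hook name, model constants, and generate() signature:
import React from 'react';
import { Button, Text, View } from 'react-native';
// Hook and constant names assumed to mirror the speech-to-text API
import { useLLM, LLAMA_3_2_1B } from 'react-native-executorch';

export function AssistantScreen() {
  // Downloads the weights on first launch, then loads them from cache
  const llm = useLLM({ model: LLAMA_3_2_1B });

  const ask = () => {
    // Runs inference fully on-device; llm.response holds the output
    llm.generate('Summarize on-device AI in one sentence.');
  };

  return (
    <View>
      <Button title="Ask" onPress={ask} disabled={!llm.isReady} />
      <Text>{llm.response}</Text>
    </View>
  );
}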
Performance Considerations
- Model selection: Start with the smallest model (Tiny) and scale up if needed (see the selection sketch after this list)
- Battery impact: Heavy models can drain battery quickly, optimize usage patterns
- Storage: Models are downloaded once and cached locally
- Device compatibility: Test on various device tiers (flagship, mid-range, budget)
- Thermal management: Prolonged AI usage can cause device heating
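One practical pattern for the model-selection and device-compatibility points above: pick the Whisper variant based on the device's RAM before initializing the hook. A hedged sketch, assuming react-native-device-info (a separate library) for the memory query; the 6GB threshold is illustrative, not benchmarked:
import DeviceInfo from 'react-native-device-info';
import { WHISPER_TINY_EN, WHISPER_BASE_EN } from 'react-native-executorch';

// Returns the model constant to pass to useSpeechToText({ model: ... })
export async function pickWhisperModel() {
  const totalMemoryBytes = await DeviceInfo.getTotalMemory();
  const totalMemoryGB = totalMemoryBytes / 1024 ** 3;
  // Reserve the larger model for devices with RAM headroom
  return totalMemoryGB >= 6 ? WHISPER_BASE_EN : WHISPER_TINY_EN;
}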
Use Cases for On-Device AI
- Healthcare apps: Patient data stays on device (HIPAA compliant)
- Note-taking apps: Voice-to-text transcription offline
- Document scanning: OCR without sending images to servers
- Language learning: Real-time pronunciation feedback
- Accessibility: Live captions for hearing-impaired users
- Translation: Offline language translation
Best Practices
- Progressive enhancement: Fall back to a cloud API if the device can't handle on-device inference (see the sketch after this list)
- User control: Let users choose between on-device and cloud processing
- Resource monitoring: Track battery and memory usage
- Model updates: Implement background downloads for model updates
- Error handling: Handle cases where models fail to load or run
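Here is a sketch of the progressive-enhancement pattern from the first bullet: try on-device transcription first and degrade gracefully to your own cloud endpoint when the model is unavailable or fails. The endpoint URL and response shape are placeholders to replace with your real API:
// Hypothetical cloud fallback: POST the audio to your own backend
async function transcribeViaCloud(audio: number[]): Promise<string> {
  const res = await fetch('https://api.example.com/transcribe', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ audio }),
  });
  const { text } = await res.json();
  return text;
}

// Prefer on-device inference; fall back to the cloud on failure
export async function transcribeWithFallback(
  model: { isReady: boolean; transcribe: (audio: number[]) => Promise<{ text: string }> },
  audio: number[],
): Promise<string> {
  try {
    if (!model.isReady) throw new Error('on-device model not ready');
    const result = await model.transcribe(audio);
    return result.text;
  } catch (err) {
    console.warn('On-device transcription failed, using cloud:', err);
    return transcribeViaCloud(audio);
  }
}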
Conclusion
On-device AI with React Native and ExecuTorch represents a paradigm shift in mobile app development. By running models locally, you gain privacy, reduce costs, enable offline functionality and deliver ultra-low latency experiences. While there are tradeoffs in model size and device compatibility, the benefits for many use cases are compelling. As mobile hardware continues to improve, on-device AI will become increasingly powerful and accessible.
Need Mobile AI Expertise?
Our VOID team can help you integrate on-device AI into your mobile apps. We work on:
- React Native development with ExecuTorch
- Model optimization for mobile devices
- Performance tuning and battery optimization
- Privacy-first architectures
Additional Resources
- Mobile App Development Morocco: our mobile development services
- Mobile Expertise: our mobile development approach
- All our publications: tech guides and news
Article published on October 4, 2025. Complete guide to on-device AI with React Native and ExecuTorch for developers.