AI apps are flooding the app stores, but most rely on expensive cloud APIs and send your data to remote servers. What if your AI models ran directly on the phone, with no network connection, no API fees, and total privacy guaranteed? This is now possible thanks to on-device AI and the react-native-executorch framework. This complete guide explores how to build React Native apps with voice transcription, OCR, and text generation, entirely offline.
What is On-Device AI?
On-device AI (or edge AI) refers to running artificial intelligence models directly on the user's device, without sending data to a server. Unlike cloud approaches (OpenAI API, Google Cloud AI), all processing happens locally on the smartphone's CPU/GPU/NPU.
History: From Cloud Siri to Local AI
In 2011, Siri emerged as one of the first mobile AI assistants, but it was completely dependent on the cloud. In 2014, "Hey Siri" introduced the first on-device feature (wake-word detection). The turning point came in 2017 with the A11 Bionic (iPhone X) and its Neural Engine: 600 billion operations per second dedicated to AI.
Today, the A18 Pro (iPhone 16 Pro) reaches 35 trillion operations per second, making it possible to run LLMs, image generation and real-time voice recognition entirely offline.
On-Device AI Advantages
| Advantage | Description | Impact |
|---|---|---|
| Privacy by design | Data never leaves the device | GDPR compliant, zero data leakage |
| Zero cost | No recurring API fees | Saves hundreds of $/month |
| Offline-first | Works without a network | Airplane mode, tunnels, dead zones |
| Ultra-low latency | No network round-trip | <100ms vs 500-2000ms for cloud |
| Infinite scalability | Each device = a server | No infrastructure limitation |
Disadvantages and Limitations
- Resource consumption: battery, heat, RAM (6-12GB max on mobile)
- Model size: 20GB+ LLMs are impossible to run locally
- Performance disparity: ~50 tokens/s on a flagship vs ~5 tokens/s on mid-range hardware
- Distribution: heavier app bundles (models range from 50MB to 2GB) or post-install downloads
Introducing react-native-executorch
react-native-executorch is an open-source library created by Software Mansion that allows React Native developers to integrate on-device AI without machine learning expertise. It relies on ExecuTorch, Meta's inference engine used in Instagram and Facebook.
Technical Architecture
ExecuTorch is Meta's optimized inference runtime for edge computing. It offers:
- PyTorch integration: direct export from the PyTorch ecosystem
- Cross-platform: smartphones, smartwatches, AR/VR, IoT, microcontrollers
- Optimized backends:
  - CoreML (iOS): Neural Engine acceleration
  - Vulkan: cross-platform GPU
  - XNNPACK: optimized CPU
  - QNN (Qualcomm): Android NPU acceleration
Communication Flow
React Native App
↓
react-native-executorch (React hooks)
↓
ExecuTorch Runtime (C++)
↓
Backend (CoreML / Vulkan / CPU)
↓
Hardware (NPU / GPU / CPU)
Use Case: Offline Voice Transcription App
Let's build a real-time voice transcription app with OpenAI's Whisper model, running 100% on-device. This app allows you to:
- Record audio in real-time via microphone
- Transcribe voice to text instantly
- Work offline (airplane, tunnel, etc.)
- Guarantee privacy (audio never sent)
1. Installing Dependencies
# Create an Expo project
npx create-expo-app my-ai-app
cd my-ai-app
# Install dependencies
yarn add react-native-executorch
yarn add react-native-audio-api
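Since the app records from the microphone, the platform permissions must also be declared. Below is a hedged example of the relevant app.json entries for an Expo project; these are standard Expo configuration keys, and your react-native-audio-api version may additionally ship its own config plugin:
{
  "expo": {
    "ios": {
      "infoPlist": {
        "NSMicrophoneUsageDescription": "Microphone access is used for offline voice transcription."
      }
    },
    "android": {
      "permissions": ["android.permission.RECORD_AUDIO"]
    }
  }
}
2. Loading the Whisper Model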
Use the useSpeechToText hook with Whisper Tiny English (a good speed/accuracy balance for mobile). The hook downloads the model on first launch, caches it on the device, and exposes loading state:
import React, { useState } from 'react';
import { View, Text } from 'react-native';
import { useSpeechToText, WHISPER_TINY_EN } from 'react-native-executorch';
// ProgressBar is a placeholder for any progress component from your UI kit

export function TranscriptionScreen() {
  const model = useSpeechToText({
    model: WHISPER_TINY_EN,
  });

  // Show download progress until the model is ready
  return (
    <View>
      {!model.isReady ? (
        <>
          <Text>Downloading Whisper model...</Text>
          <ProgressBar progress={model.downloadProgress} />
        </>
      ) : (
        <Text>Model ready! ✅</Text>
      )}
    </View>
  );
}
3. Real-Time Transcription
Connect the audio recorder to the model for real-time transcription:
// Inside the same TranscriptionScreen component as above.
// `recorder` is assumed to be a microphone recorder instance created with
// react-native-audio-api; event and method names may differ by version.
const [isRecording, setIsRecording] = useState(false);
const [transcription, setTranscription] = useState('');

const startRecording = () => {
  setIsRecording(true);
  // Stream audio chunks to the Whisper model as they arrive
  recorder.addListener('audioData', (audioChunk) => {
    model.transcribe(audioChunk).then((result) => {
      setTranscription(prev => prev + ' ' + result.text);
    });
  });
  recorder.start();
};

const stopRecording = () => {
  recorder.stop();
  setIsRecording(false);
};
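To hook these handlers up to the UI, here is a minimal presentational sketch; Button, Text and View come straight from react-native, and the props mirror the state and handlers defined above:
import React from 'react';
import { Button, Text, View } from 'react-native';

type Props = {
  isRecording: boolean;
  transcription: string;
  onStart: () => void;
  onStop: () => void;
};

// Toggles between recording and stopped, and displays the running transcript
export function RecorderControls({ isRecording, transcription, onStart, onStop }: Props) {
  return (
    <View>
      <Button
        title={isRecording ? 'Stop recording' : 'Start recording'}
        onPress={isRecording ? onStop : onStart}
      />
      <Text>{transcription || 'Press start and speak...'}</Text>
    </View>
  );
}
Available Models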
react-native-executorch provides several pre-optimized models:
Speech Recognition (Whisper)
- WHISPER_TINY_EN: 77MB, English only, best for mobile
- WHISPER_BASE_EN: 145MB, better accuracy
- WHISPER_SMALL_EN: 487MB, highest quality
Text Generation (LLaMA)
- LLAMA_3_2_1B: 1.3GB, 1 billion parameters
- LLAMA_3_2_3B: 3.2GB, 3 billion parameters
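Text generation follows the same hook-based pattern as speech-to-text. Below is a minimal sketch assuming a useLLM hook analogous to useSpeechToText; check the current react-native-executorch documentation for the exact hook name, model constants, and generate() signature:
import React from 'react';
import { Button, Text, View } from 'react-native';
// Hook and constant names assumed to mirror the speech-to-text API
import { useLLM, LLAMA_3_2_1B } from 'react-native-executorch';

export function AssistantScreen() {
  // Downloads the weights on first launch, then loads them from cache
  const llm = useLLM({ model: LLAMA_3_2_1B });

  const ask = () => {
    // Runs inference fully on-device; llm.response holds the output
    llm.generate('Summarize on-device AI in one sentence.');
  };

  return (
    <View>
      <Button title="Ask" onPress={ask} disabled={!llm.isReady} />
      <Text>{llm.response}</Text>
    </View>
  );
}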
Performance Considerations
- Model selection: Start with the smallest model (Tiny) and scale up if needed (see the selection sketch after this list)
- Battery impact: Heavy models can drain battery quickly, optimize usage patterns
- Storage: Models are downloaded once and cached locally
- Device compatibility: Test on various device tiers (flagship, mid-range, budget)
- Thermal management: Prolonged AI usage can cause device heating
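One practical pattern for the model-selection and device-compatibility points above: pick the Whisper variant based on the device's RAM before initializing the hook. A hedged sketch, assuming react-native-device-info (a separate library) for the memory query; the 6GB threshold is illustrative, not benchmarked:
import DeviceInfo from 'react-native-device-info';
import { WHISPER_TINY_EN, WHISPER_BASE_EN } from 'react-native-executorch';

// Returns the model constant to pass to useSpeechToText({ model: ... })
export async function pickWhisperModel() {
  const totalMemoryBytes = await DeviceInfo.getTotalMemory();
  const totalMemoryGB = totalMemoryBytes / 1024 ** 3;
  // Reserve the larger model for devices with RAM headroom
  return totalMemoryGB >= 6 ? WHISPER_BASE_EN : WHISPER_TINY_EN;
}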
Use Cases for On-Device AI
- Healthcare apps: Patient data stays on device (HIPAA compliant)
- Note-taking apps: Voice-to-text transcription offline
- Document scanning: OCR without sending images to servers
- Language learning: Real-time pronunciation feedback
- Accessibility: Live captions for hearing-impaired users
- Translation: Offline language translation
Best Practices
- Progressive enhancement: Fall back to a cloud API if the device can't handle on-device inference (see the sketch after this list)
- User control: Let users choose between on-device and cloud processing
- Resource monitoring: Track battery and memory usage
- Model updates: Implement background downloads for model updates
- Error handling: Handle cases where models fail to load or run
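Here is a sketch of the progressive-enhancement pattern from the first bullet: try on-device transcription first and degrade gracefully to your own cloud endpoint when the model is unavailable or fails. The endpoint URL and response shape are placeholders to replace with your real API:
// Hypothetical cloud fallback: POST the audio to your own backend
async function transcribeViaCloud(audio: number[]): Promise<string> {
  const res = await fetch('https://api.example.com/transcribe', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ audio }),
  });
  const { text } = await res.json();
  return text;
}

// Prefer on-device inference; fall back to the cloud on failure
export async function transcribeWithFallback(
  model: { isReady: boolean; transcribe: (audio: number[]) => Promise<{ text: string }> },
  audio: number[],
): Promise<string> {
  try {
    if (!model.isReady) throw new Error('on-device model not ready');
    const result = await model.transcribe(audio);
    return result.text;
  } catch (err) {
    console.warn('On-device transcription failed, using cloud:', err);
    return transcribeViaCloud(audio);
  }
}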
Conclusion
On-device AI with React Native and ExecuTorch represents a paradigm shift in mobile app development. By running models locally, you gain privacy, reduce costs, enable offline functionality and deliver ultra-low latency experiences. While there are tradeoffs in model size and device compatibility, the benefits for many use cases are compelling. As mobile hardware continues to improve, on-device AI will become increasingly powerful and accessible.
Need Mobile AI Expertise?
Our VOID team can help you integrate on-device AI into your mobile apps. We work on:
- React Native development with ExecuTorch
- Model optimization for mobile devices
- Performance tuning and battery optimization
- Privacy-first architectures
Additional Resources
- Mobile App Development Morocco: our mobile development services
- Mobile Expertise: our mobile development approach
- All our publications: tech guides and news
Article published on October 4, 2025. Complete guide to on-device AI with React Native and ExecuTorch for developers.