February 4, 2026

RunAnywhere Flutter SDK Part 1: Chat with LLMs On-Device

Run LLMs Entirely On-Device with Flutter


This is Part 1 of our RunAnywhere Flutter SDK tutorial series:

  1. Chat with LLMs (this post) — Project setup and streaming text generation
  2. Speech-to-Text — Real-time transcription with Whisper
  3. Text-to-Speech — Natural voice synthesis with Piper
  4. Voice Pipeline — Full voice assistant with VAD

Flutter's "write once, run anywhere" promise meets on-device AI. With RunAnywhere, you can build cross-platform apps that run powerful language models directly on iOS and Android devices—no cloud, no API keys, complete privacy.

In this tutorial, we'll set up the Flutter SDK and build a streaming chat interface that works offline on both platforms.

This tutorial targets RunAnywhere 0.17.x; the dependency versions below pin ^0.17.4. The reference sample project is local_ai_playground, and the code below matches the current SDK API.

Why On-Device AI?

| Aspect  | Cloud AI             | On-Device AI             |
| ------- | -------------------- | ------------------------ |
| Privacy | Data sent to servers | Data stays on device     |
| Latency | Network round-trip   | Instant local processing |
| Offline | Requires internet    | Works anywhere           |
| Cost    | Per-request billing  | One-time download        |

For cross-platform apps handling sensitive data, on-device processing provides the privacy users expect across both iOS and Android.

Prerequisites

  • Flutter 3.10+ with Dart 3.0+
  • Xcode 14+ (for iOS builds)
  • Android Studio with SDK 24+ (for Android builds; matches minSdkVersion 24 below)
  • Physical device recommended (iOS or Android)
  • ~250MB storage for the LLM model (Parts 2-4 add ~140MB more)

Project Setup

1. Create a New Flutter Project

```bash
flutter create local_ai_playground
cd local_ai_playground
```
Android Studio new Flutter project

2. Add the RunAnywhere SDK

Add the following dependencies to your pubspec.yaml:

```yaml
dependencies:
  flutter:
    sdk: flutter
  runanywhere: ^0.17.4
  runanywhere_llamacpp: ^0.17.4
  runanywhere_onnx: ^0.17.4
  provider: ^6.0.0
  # Audio recording & playback (used in Parts 2-4)
  path_provider: ^2.1.0
  record: ^5.1.0
  audioplayers: ^6.0.0
```

Then run:

```bash
flutter pub get
```
pubspec.yaml with RunAnywhere dependencies

3. iOS Configuration

For iOS, add or ensure these lines in your existing ios/Podfile (inside the target 'Runner' do block for use_frameworks!; do not replace the entire file):

```ruby
platform :ios, '14.0'

# Critical: Use static linking for RunAnywhere
use_frameworks! :linkage => :static
```

Why static linking? RunAnywhere's native iOS libraries are distributed as static frameworks. The :linkage => :static flag tells CocoaPods to link them statically, avoiding "image not found" crashes at runtime. This is required for Flutter projects using RunAnywhere on iOS.

Then install pods:

```bash
cd ios && pod install && cd ..
```

4. Android Configuration

Set minSdk 24 (Android 7.0+) in your app-level build file by adding or updating that line inside defaultConfig; don't replace the whole file. Your project will have exactly one of the two files below, so edit only the one that exists.

```text
android/app/
 ├── build.gradle      ← Use this snippet if you have this file (Groovy)
 └── build.gradle.kts  ← Use this snippet if you have this file (Kotlin DSL)
```

Groovy — file: android/app/build.gradle

```groovy
android {
    defaultConfig {
        minSdkVersion 24 // Required for RunAnywhere (Android 7.0+)
    }
}
```

Kotlin DSL — file: android/app/build.gradle.kts

```kotlin
android {
    defaultConfig {
        minSdk = 24 // Required for RunAnywhere (Android 7.0+)
    }
}
```

If your project uses minSdk = flutter.minSdkVersion (or minSdkVersion flutter.minSdkVersion), replace or override it with 24, since Flutter’s default is 21 and the SDK needs 24.

Add these permissions to android/app/src/main/AndroidManifest.xml (merge with any existing <uses-permission> tags):

  • INTERNET — Required for downloading models in this part (and for any cloud fallback).
  • RECORD_AUDIO — Required for Parts 2–4 (Speech-to-Text and Voice). Safe to add now.
```xml
<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
```
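The manifest entry only declares the permission; on Android 6+, RECORD_AUDIO must also be granted at runtime. The record package already in your pubspec can prompt for it. Here's a small sketch you can keep around for Parts 2-4 (the helper name is ours, not part of any SDK):

```dart
import 'package:record/record.dart';

/// Confirms (or prompts for) microphone access.
/// hasPermission() shows the system dialog if the grant is missing.
Future<bool> ensureMicPermission() async {
  final recorder = AudioRecorder();
  final granted = await recorder.hasPermission();
  await recorder.dispose();
  return granted;
}
```

You don't need this for Part 1 — text chat never touches the microphone — but adding the manifest entry and helper now saves a round of edits later.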

Sample app and setup notes

The local_ai_playground sample (and the Flutter Example App in the SDK repo) is aligned with RunAnywhere 0.17.x:

  • lib/features/chat/chat_view.dart — Chat UI with model download, load, and streaming generation using the current SDK API.
  • ios/Flutter/Profile.xcconfig — Ensures the iOS Profile build configuration includes CocoaPods settings and avoids the "CocoaPods did not set the base configuration" warning.
  • API migration — If you're coming from an older SDK, check the sample app's docs/ (e.g. RUNANYWHERE_API_UPDATES.md) or the repository for old vs new API snippets.

SDK Initialization

The SDK requires a specific initialization order. Create lib/app/app_initializer.dart:

```dart
import 'package:flutter/foundation.dart';
import 'package:runanywhere/runanywhere.dart';
import 'package:runanywhere_llamacpp/runanywhere_llamacpp.dart';
import 'package:runanywhere_onnx/runanywhere_onnx.dart';

class AppInitializer {
  static Future<void> initialize() async {
    try {
      // Step 1: Initialize core SDK
      await RunAnywhere.initialize();
      debugPrint('SDK: RunAnywhere initialized');

      // Step 2: Register backends BEFORE adding models
      await LlamaCpp.register();
      debugPrint('SDK: LlamaCpp backend registered');

      await Onnx.register();
      debugPrint('SDK: ONNX backend registered');

      // Step 3: Register the LLM model
      RunAnywhere.registerModel(
        id: 'lfm2-350m-q4_k_m',
        name: 'LiquidAI LFM2 350M',
        url: 'https://huggingface.co/LiquidAI/LFM2-350M-GGUF/resolve/main/LFM2-350M-Q4_K_M.gguf',
        framework: InferenceFramework.llamaCpp,
        memoryRequirement: 250000000,
      );

      debugPrint('SDK: Model registered successfully');
    } catch (e) {
      debugPrint('SDK: Initialization failed: $e');
      rethrow;
    }
  }
}
```
App initializing SDK on launch

Update lib/main.dart:

```dart
import 'package:flutter/material.dart';
import 'app/app_initializer.dart';
import 'features/chat/chat_view.dart';

void main() async {
  WidgetsFlutterBinding.ensureInitialized();

  await AppInitializer.initialize();

  runApp(const MyApp());
}

class MyApp extends StatelessWidget {
  const MyApp({super.key});

  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      title: 'Local AI Playground',
      theme: ThemeData.dark(),
      home: const ChatView(),
    );
  }
}
```

Architecture Overview

```text
┌─────────────────────────────────────────────────────┐
│                  RunAnywhere Core                   │
│          (Unified API, Model Management)            │
├───────────────────────┬─────────────────────────────┤
│   LlamaCpp Backend    │        ONNX Backend         │
│   ────────────────    │      ──────────────         │
│  • Text Generation    │  • Speech-to-Text           │
│  • Chat Completion    │  • Text-to-Speech           │
│  • Streaming          │  • Voice Activity (VAD)     │
└───────────────────────┴─────────────────────────────┘
```

Downloading & Loading Models

Create lib/services/model_service.dart:

```dart
import 'package:flutter/foundation.dart';
import 'package:runanywhere/runanywhere.dart';

class ModelService extends ChangeNotifier {
  double _downloadProgress = 0.0;
  bool _isDownloading = false;
  bool _isModelLoaded = false;
  String? _error;

  double get downloadProgress => _downloadProgress;
  bool get isDownloading => _isDownloading;
  bool get isModelLoaded => _isModelLoaded;
  String? get error => _error;

  Future<void> downloadAndLoadModel(String modelId) async {
    _isDownloading = true;
    _error = null;
    notifyListeners();

    try {
      // Check if already downloaded
      final isDownloaded = (await RunAnywhere.availableModels())
          .any((m) => m.id == modelId && m.localPath != null);

      if (!isDownloaded) {
        // Download with progress tracking
        await for (final progress in RunAnywhere.downloadModel(modelId)) {
          _downloadProgress = progress.percentage;
          notifyListeners();

          debugPrint('Download: ${(_downloadProgress * 100).toStringAsFixed(1)}%');

          if (progress.state.isCompleted) break;
        }
      }

      // Load into memory
      await RunAnywhere.loadModel(modelId);

      _isModelLoaded = true;
      _isDownloading = false;
      notifyListeners();

      debugPrint('Model loaded successfully');
    } catch (e) {
      _error = e.toString();
      _isDownloading = false;
      notifyListeners();
      debugPrint('Model error: $e');
    }
  }
}
```

Note: Only one LLM model can be loaded at a time. Loading a different model automatically unloads the current one. The SDK uses loadModel() for LLMs; Parts 2-3 use loadSTTModel() and loadTTSVoice() for speech models—these use separate memory pools and can be loaded simultaneously.
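As a quick illustration of that contract (the second model id below is a placeholder, not a model registered in this tutorial):

```dart
// Only one LLM is resident at a time.
await RunAnywhere.loadModel('lfm2-350m-q4_k_m');

// Loading another LLM later implicitly unloads the first:
await RunAnywhere.loadModel('my-other-llm'); // placeholder id

// Speech models (Parts 2-3) use separate memory pools, so an STT
// model or TTS voice can stay loaded alongside the active LLM.
```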

Streaming Text Generation

Now for the fun part—generating text with your on-device LLM. Create lib/features/chat/chat_view.dart:

```dart
import 'package:flutter/material.dart';
import 'package:runanywhere/runanywhere.dart';

class ChatView extends StatefulWidget {
  const ChatView({super.key});

  @override
  State<ChatView> createState() => _ChatViewState();
}

class _ChatViewState extends State<ChatView> {
  final TextEditingController _controller = TextEditingController();
  final List<ChatMessage> _messages = [];
  bool _isGenerating = false;
  bool _isModelLoaded = false;
  double _downloadProgress = 0.0;

  @override
  void initState() {
    super.initState();
    _loadModel();
  }

  Future<void> _loadModel() async {
    const modelId = 'lfm2-350m-q4_k_m';

    final isDownloaded = (await RunAnywhere.availableModels())
        .any((m) => m.id == modelId && m.localPath != null);

    if (!isDownloaded) {
      await for (final progress in RunAnywhere.downloadModel(modelId)) {
        setState(() {
          _downloadProgress = progress.percentage;
        });
        if (progress.state.isCompleted) break;
      }
    }

    await RunAnywhere.loadModel(modelId);
    setState(() {
      _isModelLoaded = true;
    });
  }

  Future<void> _sendMessage() async {
    final text = _controller.text.trim();
    if (text.isEmpty || _isGenerating) return;

    _controller.clear();

    setState(() {
      _messages.add(ChatMessage(role: 'user', content: text));
      _messages.add(ChatMessage(role: 'assistant', content: ''));
      _isGenerating = true;
    });

    try {
      final options = LLMGenerationOptions(
        maxTokens: 256,
        temperature: 0.7,
      );

      final streamResult = await RunAnywhere.generateStream(text, options: options);

      String fullResponse = '';
      await for (final token in streamResult.stream) {
        fullResponse += token;
        setState(() {
          _messages.last = ChatMessage(role: 'assistant', content: fullResponse);
        });
      }

      // Get final metrics
      final metrics = await streamResult.result;
      debugPrint('Speed: ${metrics.tokensPerSecond.toStringAsFixed(1)} tok/s');
    } catch (e) {
      setState(() {
        _messages.last = ChatMessage(
          role: 'assistant',
          content: 'Error: ${e.toString()}',
        );
      });
    } finally {
      setState(() {
        _isGenerating = false;
      });
    }
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(
        title: const Text('On-Device Chat'),
      ),
      body: Column(
        children: [
          if (!_isModelLoaded)
            LinearProgressIndicator(value: _downloadProgress),
          Expanded(
            child: ListView.builder(
              padding: const EdgeInsets.all(16),
              itemCount: _messages.length,
              itemBuilder: (context, index) {
                final message = _messages[index];
                return MessageBubble(message: message);
              },
            ),
          ),
          Padding(
            padding: const EdgeInsets.all(16),
            child: Row(
              children: [
                Expanded(
                  child: TextField(
                    controller: _controller,
                    decoration: const InputDecoration(
                      hintText: 'Type a message...',
                      border: OutlineInputBorder(),
                    ),
                    enabled: _isModelLoaded && !_isGenerating,
                    onSubmitted: (_) => _sendMessage(),
                  ),
                ),
                const SizedBox(width: 8),
                IconButton(
                  icon: Icon(_isGenerating ? Icons.stop : Icons.send),
                  onPressed: _isModelLoaded && !_isGenerating ? _sendMessage : null,
                ),
              ],
            ),
          ),
        ],
      ),
    );
  }
}

class ChatMessage {
  final String role;
  final String content;

  ChatMessage({required this.role, required this.content});
}

class MessageBubble extends StatelessWidget {
  final ChatMessage message;

  const MessageBubble({super.key, required this.message});

  @override
  Widget build(BuildContext context) {
    final isUser = message.role == 'user';

    return Align(
      alignment: isUser ? Alignment.centerRight : Alignment.centerLeft,
      child: Container(
        margin: const EdgeInsets.symmetric(vertical: 4),
        padding: const EdgeInsets.all(12),
        constraints: BoxConstraints(
          maxWidth: MediaQuery.of(context).size.width * 0.75,
        ),
        decoration: BoxDecoration(
          color: isUser ? Colors.blue : Colors.grey[800],
          borderRadius: BorderRadius.circular(12),
        ),
        child: Text(
          message.content.isEmpty ? '...' : message.content,
          style: const TextStyle(color: Colors.white),
        ),
      ),
    );
  }
}
```
Chat interface with streaming response

Non-Streaming Generation

For simpler use cases, you can also use non-streaming generation:

```dart
final result = await RunAnywhere.generate(
  prompt,
  options: LLMGenerationOptions(maxTokens: 256),
);

print('Response: ${result.text}');
print('Speed: ${result.tokensPerSecond} tok/s');
```

Models Reference

| Model ID         | Size   | Notes                          |
| ---------------- | ------ | ------------------------------ |
| lfm2-350m-q4_k_m | ~250MB | LiquidAI LFM2, fast, efficient |
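To experiment with a different or larger GGUF model, register it with the same registerModel call used in AppInitializer. Everything below (id, name, URL, memory figure) is a placeholder you would replace with your model's actual details:

```dart
RunAnywhere.registerModel(
  id: 'my-custom-model',                     // placeholder id
  name: 'My Custom GGUF Model',              // display name in your UI
  url: 'https://example.com/my-model.gguf',  // placeholder download URL
  framework: InferenceFramework.llamaCpp,    // GGUF models run on the LlamaCpp backend
  memoryRequirement: 500000000,              // approx. bytes your model needs
);
```

Larger models generally improve response quality at the cost of download size, RAM, and tokens per second, so test on your lowest-end target device before committing.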

Completed Chat screen

Chat interface with streaming response on device

Troubleshooting

| Issue                                   | Solution                                                                |
| --------------------------------------- | ----------------------------------------------------------------------- |
| CocoaPods install failure               | Run flutter pub get first, ensure Xcode 14+                             |
| iOS crash on launch ("image not found") | Ensure use_frameworks! :linkage => :static in Podfile, then pod install |
| Android Gradle sync fails               | Ensure minSdk 24 in build.gradle, JDK 17+                               |
| Model download hangs                    | Check INTERNET permission in AndroidManifest.xml                        |
| flutter pub get fails on RunAnywhere    | Ensure you're using Flutter 3.10+ and Dart 3.0+                         |

What's Next

In Part 2, we'll add speech-to-text capabilities using Whisper, including the audio format handling that's critical for accurate transcription.


Questions? Open an issue on GitHub or reach out on Twitter/X.


Copyright © 2025 RunAnywhere, Inc.