SDK Blueprint

Comprehensive guide to integrating and using the Generate Nano SDK in your applications.

Getting Started

Installation

The Generate Nano Menu Assistant SDK is packaged as a set of Android Archive (AAR) files and a sample integration application. It can be added to any Android project targeting Qualcomm QCM6490 devices.

Package Overview

This package contains a complete AI-powered restaurant menu assistant solution, including:

  • Two Production-Ready SDKs for voice-enabled menu interactions
  • Fully Functional Demo Application showcasing all features
  • Complete Source Code for the demo app
  • Menu Configuration System for easy customization
  • Integration Documentation and code examples

SDK Files

Two Android SDK files (AAR format) ready for integration into your own applications:

GMK_V1.0.aar - Core SDK (~390MB)

What it does:

  • Processes natural language menu queries
  • Understands item names, modifications, and quantities
  • Manages shopping cart operations
  • Provides intelligent menu recommendations
  • Supports both local AI processing and cloud AI

Core Capabilities:

  • Menu item search and filtering
  • Cart operations (add, remove, modify items)
  • Order total calculations
  • Natural language understanding
  • Dietary preference filtering (vegetarian, vegan, etc.)

GWK6490_V1.0.aar - Voice Processing SDK (~300MB)

What it does:

  • Converts speech to text with high accuracy
  • Converts text responses back to speech
  • Optimized for restaurant/menu terminology
  • Real-time audio processing

Core Capabilities:

  • Speech-to-text transcription
  • Text-to-speech output
  • Noise reduction and audio enhancement
  • Real-time processing (< 3 seconds end-to-end)

⚠️ Important Setup Note

When you first open the demo app project in Android Studio, you'll need to update the Android SDK path:

What to update:

  • File: local.properties (in the root folder)
  • Current path: sdk.dir=/Users/NathanJ/Library/Android/sdk
  • Update to: Your local Android SDK path

How Android Studio helps:

  • Android Studio will automatically detect this issue
  • You'll see a prompt: "Gradle sync failed: SDK location not found"
  • Click "Fix" or "Update SDK Location" and Android Studio will update it for you
  • Alternatively, manually edit local.properties and set sdk.dir to your Android SDK installation path

Finding your Android SDK path:

  • macOS: Usually /Users/YourUsername/Library/Android/sdk
  • Windows: Usually C:\Users\YourUsername\AppData\Local\Android\Sdk
  • Linux: Usually /home/YourUsername/Android/Sdk
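
If you prefer to generate the sdk.dir entry rather than edit it by hand, the default locations listed above can be sketched as a small helper. This is only an illustration: the class and method names are hypothetical, and the paths are the usual defaults, not a guarantee of where your SDK is installed. Backslashes are doubled because local.properties follows the Java properties format, where a single backslash is an escape character.

```java
import java.util.Locale;

/** Sketch: guess the usual Android SDK directory and format the local.properties entry. */
final class SdkPathSketch {

    /** Returns the usual SDK location for an OS name and a user home directory. */
    static String defaultSdkDir(String osName, String userHome) {
        String os = osName.toLowerCase(Locale.ROOT);
        // Check macOS first: "darwin" also contains the substring "win".
        if (os.contains("mac") || os.contains("darwin")) {
            return userHome + "/Library/Android/sdk";
        }
        if (os.contains("win")) {
            return userHome + "\\AppData\\Local\\Android\\Sdk";
        }
        // Anything else is treated as Linux-like (assumption).
        return userHome + "/Android/Sdk";
    }

    /** Formats the sdk.dir line, doubling backslashes for the properties format. */
    static String sdkDirLine(String sdkDir) {
        return "sdk.dir=" + sdkDir.replace("\\", "\\\\");
    }
}
```

A typical call would be SdkPathSketch.defaultSdkDir(System.getProperty("os.name"), System.getProperty("user.home")).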

Step 1: Prerequisites

Before installation, verify that your development environment meets the following:

  • Android Studio Electric Eel (2022.1.1) or newer
  • Gradle 7.0+
  • Android SDK API Level 33
  • Device running Android 13 or later
  • Minimum 4 GB RAM and 64-bit ARM processor

Step 2: Importing the SDK

Copy the following AAR files into your project's /libs directory:

  • GMK_V1.0.aar — Core SDK
  • GWK6490_V1.0.aar — Whisper ASR integration module

Open your app-level build.gradle file and include:

repositories {
    flatDir {
        dirs 'libs'
    }
}

dependencies {
    implementation(name: 'GMK_V1.0', ext: 'aar')
    implementation(name: 'GWK6490_V1.0', ext: 'aar')
    implementation 'org.json:json:20210307'
    implementation 'androidx.appcompat:appcompat:1.5.1'
    implementation 'com.microsoft.onnxruntime:onnxruntime-android:1.15.1'
}

Sync the project to ensure all dependencies are resolved successfully.

Step 3: Permissions

The SDK requires the following permissions, declared in AndroidManifest.xml:

<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />

RECORD_AUDIO must also be granted at runtime: request microphone access on first launch to enable voice input via Whisper ASR.

Running the Application

Once the SDK has been integrated, you can run the Menu Assistant on your 6490 device or emulator (if supported).

Step 1: Launch Application

  • Install the generated APK on your Qualcomm 6490 device.
  • The Menu Assistant icon will appear on the home screen or app drawer.
  • Tap the icon to open the assistant interface.

Step 2: Initialize the Model

Upon first launch, the app will require initialization of the local LLM and ASR components:

  • Navigate to Settings within the app (bottom-right icon).
  • Select Build Menu Index to set up the menu data for the LLM.
  • Once the embedding step completes, continue with regular queries and functions.

Step 3: Verifying Setup

After menu loading:

  • Speak or type a menu query (e.g., "What appetizers do you have?").
  • The LLM should respond conversationally using data from the local menu dataset.
  • Voice-to-text transcription should appear in real time during audio input.

Configuration

The SDK includes configurable options that allow developers to adjust model behavior, menu data, and response logic.

Configuration File: app_config.json

This file defines paths, model options, and UI parameters:

{
  "model_path": "models/menu_assistant.gguf",
  "asr_model": "models/whisper_small.bin",
  "menu_file": "assets/menu.json",
  "language": "en",
  "max_response_tokens": 256,
  "enable_logging": true
}
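
As a rough illustration of how an application might hold these values once the file has been parsed, the sketch below mirrors the six keys in a small immutable class. The class name and the validation rules (non-empty paths, positive token limit) are assumptions for illustration; how the SDK itself reads or validates app_config.json is not documented here.

```java
/** Sketch: typed view of the app_config.json keys with basic checks (hypothetical class). */
final class AppConfigSketch {
    final String modelPath;        // "model_path"
    final String asrModel;         // "asr_model"
    final String menuFile;         // "menu_file"
    final String language;         // "language"
    final int maxResponseTokens;   // "max_response_tokens"
    final boolean enableLogging;   // "enable_logging"

    AppConfigSketch(String modelPath, String asrModel, String menuFile,
                    String language, int maxResponseTokens, boolean enableLogging) {
        this.modelPath = requireNonBlank(modelPath, "model_path");
        this.asrModel = requireNonBlank(asrModel, "asr_model");
        this.menuFile = requireNonBlank(menuFile, "menu_file");
        this.language = requireNonBlank(language, "language");
        if (maxResponseTokens <= 0) {
            throw new IllegalArgumentException("max_response_tokens must be positive");
        }
        this.maxResponseTokens = maxResponseTokens;
        this.enableLogging = enableLogging;
    }

    /** Mirrors the sample values shown in the configuration file above. */
    static AppConfigSketch sample() {
        return new AppConfigSketch("models/menu_assistant.gguf", "models/whisper_small.bin",
                "assets/menu.json", "en", 256, true);
    }

    /** Rejects null or whitespace-only values, naming the offending key. */
    private static String requireNonBlank(String value, String key) {
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException(key + " must not be empty");
        }
        return value;
    }
}
```

Failing fast on a bad value at startup is usually easier to diagnose than a model-loading error later on.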

Menu Configuration

The SDK uses a JSON format for menu configuration. Your menu file should be placed in app/src/main/assets/menu.json:

{
  "menu": [
    {
      "id": 1,
      "name": "Cheeseburger",
      "category": "Large Plates",
      "description": "Classic burger with cheese, lettuce, tomato",
      "price": 16.00,
      "image": "cheeseburger.jpg",
      "dietary": ["contains_dairy", "contains_gluten"],
      "spiciness": "0/10",
      "keywords": ["burger", "beef", "cheese", "sandwich"],
      "allergens": ["dairy", "gluten", "soy"]
    },
    {
      "id": 2,
      "name": "Caesar Salad",
      "category": "Small Plates",
      "description": "Crisp romaine lettuce with Caesar dressing",
      "price": 12.00,
      "dietary": ["vegetarian"],
      "spiciness": "0/10",
      "keywords": ["salad", "greens", "healthy"]
    }
  ]
}

Required Fields

  • id - Unique identifier for the item
  • name - Display name of the menu item
  • category - Menu category (e.g., "Appetizers", "Entrees", "Desserts")
  • description - Brief description of the item
  • price - Price in decimal format

Optional Fields

  • image - Image filename (place images in assets/images/)
  • dietary - Array of dietary tags (e.g., ["vegetarian", "vegan", "gluten-free"])
  • spiciness - Spice level (e.g., "0/10", "5/10")
  • keywords - Additional search terms
  • allergens - List of allergens
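
Before shipping a menu file, it can help to check each entry against the required fields listed above. The following is a minimal sketch that assumes each menu item has already been parsed into a Map; the class name is hypothetical and the SDK may apply its own, different validation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Sketch: report which required menu fields are missing from a parsed item. */
final class MenuItemValidatorSketch {
    /** Required keys, as listed in the Required Fields section. */
    static final List<String> REQUIRED =
            Arrays.asList("id", "name", "category", "description", "price");

    /** Returns the required keys that are absent (or null) in the item, in listed order. */
    static List<String> missingFields(Map<String, Object> item) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED) {
            if (item.get(key) == null) {
                missing.add(key);
            }
        }
        return missing;
    }
}
```

An empty result means the entry has every required field; optional fields such as image or allergens are not checked here.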

SDK Configuration

SDKConfig config = SDKConfig.getInstance()
    .setLoggingEnabled(true)           // Enable debug logging
    .setMenuJsonPath("menu.json")      // Menu file path in assets/
    .setMonitoringEnabled(false);      // Disable performance monitoring
    
config.apply(getApplicationContext());

AI Mode Selection

// Use local AI processing (faster, more private)
menuSDK.setCloudAIEnabled(false);

// Use cloud AI (requires internet, more capable)
menuSDK.setCloudAIEnabled(true);

// Check current mode
menuSDK.getCloudAIEnabled(new AsyncResult() {
    @Override
    public void resolve(Boolean isEnabled) {
        String mode = isEnabled ? "Cloud AI" : "Local AI";
        Log.d("AI", "Current mode: " + mode);
    }
});

Voice Configuration

// Configure voice settings
voiceSDK = WhisperSDK.builder(this)
    .enableTTS(true)                   // Enable/disable voice output
    .setLanguage("en")                 // Language code
    .setTimeout(30000)                 // Max recording time (ms)
    .enableContinuous(false)           // Continuous vs single utterance
    .setEndOfSpeechTimeout(2000)       // Silence detection timeout (ms)
    .setDebugMode(false)               // Enable/disable debug logging
    .build();
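
To make the two timeouts above more concrete, here is one plausible way a recorder could combine a maximum recording time with a silence timeout. This is an assumption-based sketch, not the SDK's implementation: the class name, the frame-by-frame interface, and the choice to count silence from the start of recording are all hypothetical.

```java
/** Sketch: stop recording after sustained silence or when the maximum duration is reached. */
final class EndOfSpeechSketch {
    private final long maxRecordingMs;    // e.g. the value passed to setTimeout(...)
    private final long silenceTimeoutMs;  // e.g. the value passed to setEndOfSpeechTimeout(...)
    private long startMs = -1;
    private long lastSpeechMs = -1;

    EndOfSpeechSketch(long maxRecordingMs, long silenceTimeoutMs) {
        this.maxRecordingMs = maxRecordingMs;
        this.silenceTimeoutMs = silenceTimeoutMs;
    }

    /** Feeds one audio frame; returns true when recording should stop. */
    boolean onFrame(long timestampMs, boolean containsSpeech) {
        if (startMs < 0) {
            // First frame: start both clocks here.
            startMs = timestampMs;
            lastSpeechMs = timestampMs;
        }
        if (containsSpeech) {
            lastSpeechMs = timestampMs;
        }
        boolean tooLong = timestampMs - startMs >= maxRecordingMs;
        boolean silent = timestampMs - lastSpeechMs >= silenceTimeoutMs;
        return tooLong || silent;
    }
}
```

With the values shown above (30000 and 2000), a two-second pause ends the utterance, while continuous speech is cut off at thirty seconds.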

Optional Parameters

Setting              | Description
---------------------+------------------------------------------------------
max_response_tokens  | Limits output length from the LLM.
language             | Sets default ASR and LLM language (default: English).
enable_logging       | Enables debug logs and session analytics.
menu_file            | Defines which menu dataset is loaded at runtime.

Developers can modify these parameters directly in app_config.json or dynamically within the SDK initialization call.

Key Features

The Generate Nano Menu Assistant SDK includes multiple AI-driven capabilities built to enhance user interaction, improve order accuracy, and maintain full offline functionality on the Qualcomm QCM6490 platform.

Each feature operates as part of the local processing pipeline—coordinating between the ASR, BM25 retriever, LLM, and cart management layer—to interpret and respond to user input in real time.

Q&A LLM Responses

The Menu Assistant's core logic is powered by a fine-tuned local LLM optimized for menu-oriented tasks. When a user asks a question, the assistant performs the following process:

  1. Input Parsing: The text or speech transcription is analyzed for keywords and intent.
  2. Item Retrieval: The SDK runs a BM25 relevance search against the local menu.json dataset to identify related items.
  3. Context Packaging: The retrieved menu entries are concatenated with the question as model input.
  4. Local Inference: The LLM (quantized for 6490 GPU execution) generates a human-readable, conversational response.
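
Steps 2 and 3 can be illustrated with a compact Okapi BM25 ranker followed by a simple prompt builder. This is a sketch under stated assumptions, not the SDK's retriever: the class name, the tokenizer, the k1 = 1.2 and b = 0.75 constants, and the prompt layout are all illustrative choices.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Sketch of Okapi BM25 ranking over short menu texts (not the SDK's retriever). */
final class Bm25Sketch {
    private static final double K1 = 1.2;  // term-frequency saturation (assumed value)
    private static final double B = 0.75;  // document-length normalization (assumed value)

    private final List<String> docs;
    private final List<Map<String, Integer>> termCounts = new ArrayList<>();
    private final List<Integer> docLengths = new ArrayList<>();
    private final Map<String, Integer> docFreq = new HashMap<>();
    private final double avgLength;

    Bm25Sketch(List<String> docs) {
        this.docs = docs;
        long total = 0;
        for (String doc : docs) {
            List<String> tokens = tokenize(doc);
            Map<String, Integer> counts = new HashMap<>();
            for (String t : tokens) counts.merge(t, 1, Integer::sum);
            // Document frequency counts each term once per document.
            for (String t : counts.keySet()) docFreq.merge(t, 1, Integer::sum);
            termCounts.add(counts);
            docLengths.add(tokens.size());
            total += tokens.size();
        }
        this.avgLength = docs.isEmpty() ? 0 : (double) total / docs.size();
    }

    /** Lowercases and splits on anything that is not a letter or digit. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    /** BM25 score of one document for the query. */
    double score(String query, int docIndex) {
        Map<String, Integer> counts = termCounts.get(docIndex);
        double lengthNorm = 1 - B + B * docLengths.get(docIndex) / avgLength;
        double score = 0;
        for (String term : tokenize(query)) {
            int tf = counts.getOrDefault(term, 0);
            if (tf == 0) continue;
            int n = docFreq.get(term);
            double idf = Math.log(1 + (docs.size() - n + 0.5) / (n + 0.5));
            score += idf * tf * (K1 + 1) / (tf + K1 * lengthNorm);
        }
        return score;
    }

    /** Index of the highest-scoring document, or -1 if no query term matches. */
    int bestMatch(String query) {
        int best = -1;
        double bestScore = 0;
        for (int i = 0; i < docs.size(); i++) {
            double s = score(query, i);
            if (s > bestScore) {
                bestScore = s;
                best = i;
            }
        }
        return best;
    }

    /** Concatenates retrieved entries with the question (layout is an assumption). */
    static String buildPrompt(String question, List<String> retrieved) {
        return "Menu context:\n" + String.join("\n", retrieved) + "\nQuestion: " + question;
    }
}
```

In practice each document would be built from a menu entry's name, description, and keywords, so that enriching the keywords field directly improves retrieval.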

Example Interactions

User: "What appetizers do you have?"
Response: "The appetizers available are: Mozzarella Sticks, Loaded Fries, and Spinach Artichoke Dip."

User: "Tell me more about the Loaded Fries."
Response: "The Loaded Fries are a popular dish served at a restaurant. Each serving costs $12.00, and they are typically served with small plates. The dish is known for its crispy texture, topped with cheddar cheese, bacon bits, and green onions, and served with sour cream."

User: "Can I get a Cheeseburger with extra sauce?"
Response:
Your cart contains:
• 1 × Cheeseburger (with extra sauce) – $16.00
Total: $16.00

User: "Hi can I get one Garlic Bread, two slices Apple Pie, and one Loaded Fries for the table."
Response:
Your cart contains:
• 2 × Apple Pie – $16.00
• 1 × Garlic Bread – $8.00
• 1 × Loaded Fries – $12.00
Total: $36.00

User: "Add one Iced Tea and one Lemonade."
Response:
Your cart contains:
• 1 × Iced Tea – $3.00
• 1 × Lemonade – $4.00
Total: $7.00

The assistant identifies the items, modifiers (e.g., extra sauce), and quantities in each request, updates the cart, and returns the resulting cart summary to the user.

Cart Functions

The cart engine manages all ordering-related tasks within the SDK. It stores menu selections, modifiers, and pricing details, maintaining session persistence until checkout or reset. Developers can directly interact with the cart object to retrieve, clear, or modify order data programmatically.
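
The bookkeeping described here can be pictured with a small stand-alone class. The sketch below is not the SDK's Cart: the class and method names are hypothetical, modifiers are left out for brevity, and BigDecimal is used so that currency totals do not pick up floating-point rounding errors.

```java
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of cart bookkeeping (hypothetical; not the SDK's Cart class). */
final class CartSketch {

    /** One cart line: unit price and quantity for a named item. */
    static final class Line {
        final BigDecimal unitPrice;
        int quantity;

        Line(BigDecimal unitPrice, int quantity) {
            this.unitPrice = unitPrice;
            this.quantity = quantity;
        }

        BigDecimal subtotal() {
            return unitPrice.multiply(BigDecimal.valueOf(quantity));
        }
    }

    // Insertion order is kept so the summary lists items as they were added.
    private final Map<String, Line> lines = new LinkedHashMap<>();

    /** Adds an item, or increases the quantity if the item is already present. */
    void add(String name, String unitPrice, int quantity) {
        if (quantity <= 0) {
            throw new IllegalArgumentException("quantity must be positive");
        }
        Line existing = lines.get(name);
        if (existing == null) {
            lines.put(name, new Line(new BigDecimal(unitPrice), quantity));
        } else {
            existing.quantity += quantity;
        }
    }

    /** Removes an item entirely; returns false if it was not in the cart. */
    boolean remove(String name) {
        return lines.remove(name) != null;
    }

    /** Sum of all line subtotals. */
    BigDecimal total() {
        BigDecimal sum = BigDecimal.ZERO;
        for (Line line : lines.values()) {
            sum = sum.add(line.subtotal());
        }
        return sum;
    }
}
```

Adding 2 Apple Pie at 8.00 each (a unit price inferred from the 2 × $16.00 line in the example order), 1 Garlic Bread at 8.00, and 1 Loaded Fries at 12.00 gives a total of 36.00, matching that example.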

Supported Commands

Command Type   | Example Input | Result
---------------+---------------+-------
Add Item       | "Add one Cheeseburger." | Adds the Cheeseburger to the cart.
Modify Item    | "Add one Cheeseburger with extra onions." | Updates Cheeseburger entry with modifier.
Remove Item    | "Remove the Cheeseburger from my order." | Removes Cheeseburger entry and displays empty cart.
Compound Order | "Can I have one Grilled Ribs with extra sauce for my brother and two Cheeseburger with extra onions for me." | Adds all items and modifiers, returns detailed cart summary and total.

Developer Access

Developers can access or modify the cart via SDK methods:

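// Fetch the current cart from the SDK
Cart currentCart = MenuAssistantSDK.getCart();
// Inspect the items currently in the cart
currentCart.listItems();
// Remove every item and reset the cart
currentCart.clear();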

Cart updates are automatically synchronized with the conversational responses to maintain alignment between the assistant's output and the actual cart state.

OpenAI Whisper ASR

The Generate Nano SDK integrates OpenAI's Whisper Automatic Speech Recognition (ASR) technology to enable voice interactions with the Menu Assistant.

Key Features

  • Offline Processing: All voice recognition happens on-device without requiring internet connectivity.
  • Multi-language Support: Recognizes and transcribes speech in 99 languages.
  • Restaurant Terminology: Optimized for food and restaurant-specific vocabulary.
  • NPU Acceleration: Utilizes the Qualcomm Neural Processing Unit for efficient processing.

Implementation Example

// Initialize the Whisper ASR component
WhisperASR whisperASR = new WhisperASR.Builder(context)
    .setModel("whisper-small")
    .setLanguage("en")
    .build();

// Start listening for voice input
whisperASR.startListening(new ASRCallback() {
    @Override
    public void onResult(String transcription) {
        // Process the transcribed text
        menuAssistant.processQuery(transcription);
    }
    
    @Override
    public void onError(int errorCode) {
        // Handle any ASR errors
        Log.e("ASR", "Error code: " + errorCode);
    }
});

LLM Deployments

The Generate Nano SDK supports multiple deployment configurations for the Language Model component, allowing developers to choose the optimal setup for their specific use case.

Deployment Options

  • Local-only Mode: All processing happens on-device with no external dependencies.
  • Hybrid Mode: Combines on-device processing with cloud capabilities for complex queries.
  • Cloud Fallback: Uses local processing by default but can fall back to cloud when needed.
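
The three options differ mainly in when a query leaves the device. The routing logic can be sketched as follows; the class, the enum, and the use of a predicate to decide what counts as a "complex" query are hypothetical, since the SDK's actual routing criteria are not described here.

```java
import java.util.function.Function;
import java.util.function.Predicate;

/** Sketch of routing between a local and a cloud responder (hypothetical types). */
final class DeploymentRouterSketch {
    enum Mode { LOCAL_ONLY, HYBRID, CLOUD_FALLBACK }

    private final Mode mode;
    private final Function<String, String> local;
    private final Function<String, String> cloud;
    private final Predicate<String> isComplex;  // only consulted in HYBRID mode

    DeploymentRouterSketch(Mode mode, Function<String, String> local,
                           Function<String, String> cloud, Predicate<String> isComplex) {
        this.mode = mode;
        this.local = local;
        this.cloud = cloud;
        this.isComplex = isComplex;
    }

    /** Answers a query according to the configured deployment mode. */
    String answer(String query) {
        // Hybrid: send queries judged complex straight to the cloud.
        if (mode == Mode.HYBRID && isComplex.test(query)) {
            return cloud.apply(query);
        }
        try {
            return local.apply(query);
        } catch (RuntimeException localFailure) {
            // Local-only never touches the network, so the failure is surfaced.
            if (mode == Mode.LOCAL_ONLY) {
                throw localFailure;
            }
            return cloud.apply(query);
        }
    }
}
```

Keeping the decision in one place makes it easier to verify that Local-only Mode never sends a query off the device.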

Model Specifications

  • Model Size: 0.6B parameters (4-bit quantized)
  • Context Window: 2048 tokens
  • Response Time: <1 second for typical queries
  • Memory Usage: ~350MB RAM during inference
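
These figures can be sanity-checked with simple arithmetic. The sketch below estimates weight storage from parameter count and bit width, and the prompt budget left after reserving the response. It assumes decimal megabytes, counts weights only (runtime buffers such as the KV cache are extra), and assumes that response tokens share the context window with the prompt; none of this is confirmed by the SDK documentation.

```java
/** Worked arithmetic for the model specifications (illustrative helper). */
final class ModelBudgetSketch {

    /** Weight storage in decimal megabytes: parameters x bits per weight / 8 / 1e6. */
    static double weightMegabytes(long parameters, int bitsPerWeight) {
        return parameters * (double) bitsPerWeight / 8.0 / 1_000_000.0;
    }

    /** Tokens left for menu context plus question after reserving the response. */
    static int promptBudget(int contextWindow, int maxResponseTokens) {
        if (maxResponseTokens > contextWindow) {
            throw new IllegalArgumentException("response limit exceeds context window");
        }
        return contextWindow - maxResponseTokens;
    }
}
```

For 0.6B parameters at 4 bits this gives about 300 MB of weights, which is in line with the ~350MB figure once runtime overhead is added. With a 2048-token window and max_response_tokens set to 256, about 1792 tokens remain for the retrieved menu entries and the question.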

Hybrid GPU Implementation

The Generate Nano SDK utilizes a hybrid approach to maximize performance on Qualcomm devices, intelligently distributing workloads across CPU, GPU, and NPU.

Architecture Overview

  • CPU: Handles preprocessing, tokenization, and application logic.
  • GPU: Accelerates matrix operations for the LLM inference.
  • NPU: Optimizes voice processing and embedding generation.

Performance Benefits

  • Reduced Latency: 3-5x faster inference compared to CPU-only processing.
  • Lower Power Consumption: More efficient use of device resources.
  • Improved Responsiveness: Near real-time interactions even on resource-constrained devices.

Components

The Generate Nano SDK consists of several modular components that work together to provide a complete menu assistant solution.

Core Components

  • MenuAssistant: The main interface for processing queries and managing interactions.
  • WhisperASR: Handles voice-to-text conversion using OpenAI's Whisper model.
  • LLMEngine: Manages the local language model for generating responses.
  • CartManager: Tracks and manages order items and modifications.
  • MenuDatabase: Stores and indexes menu items for efficient retrieval.

Component Diagram

graph TD
    subgraph "Generate Nano SDK Components"
        MA[MenuAssistant]
        ASR[WhisperASR]
        LLM[LLMEngine]
        CART[CartManager]
        DB[MenuDatabase]
        MA --> ASR
        MA --> LLM
        MA --> CART
        LLM --> DB
        CART --> DB
    end
    style MA fill:#2d2d2d,stroke:#333,stroke-width:1px
    style ASR fill:#2d2d2d,stroke:#333,stroke-width:1px
    style LLM fill:#2d2d2d,stroke:#333,stroke-width:1px
    style CART fill:#2d2d2d,stroke:#333,stroke-width:1px
    style DB fill:#2d2d2d,stroke:#333,stroke-width:1px

Exposed Functions

The Generate Nano SDK exposes several key functions that developers can use to integrate and control the menu assistant functionality.

Core Functions

Function                    | Description | Example
----------------------------+-------------+--------
processQuery(String query)  | Processes a text query and returns a response | menuAssistant.processQuery("What appetizers do you have?")
startVoiceInput()           | Activates the microphone and begins listening | menuAssistant.startVoiceInput()
stopVoiceInput()            | Stops listening and processes the captured audio | menuAssistant.stopVoiceInput()
getCart()                   | Returns the current cart object | Cart cart = menuAssistant.getCart()
clearCart()                 | Empties the current cart | menuAssistant.clearCart()
switchMenu(String menuPath) | Changes to a different menu configuration | menuAssistant.switchMenu("dinner_menu.json")

Usage Examples

Sample Code

The following examples demonstrate how to integrate and use the Generate Nano SDK in your Android application.

Basic Integration

// Initialize the SDK
MenuAssistantSDK menuSDK = new MenuAssistantSDK.Builder(context)
    .setMenuFile("menu.json")
    .setModelPath("models/menu_assistant.gguf")
    .setLoggingEnabled(true)
    .build();

// Set up the UI listener
menuSDK.setResponseListener(new ResponseListener() {
    @Override
    public void onResponse(String response) {
        // Update UI with the response
        responseTextView.setText(response);
    }
    
    @Override
    public void onCartUpdate(Cart cart) {
        // Update cart display
        updateCartUI(cart);
    }
});

Processing Text Input

// Handle button click for text input
submitButton.setOnClickListener(new View.OnClickListener() {
    @Override
    public void onClick(View v) {
        String query = inputEditText.getText().toString();
        if (!query.isEmpty()) {
            // Show loading indicator
            loadingIndicator.setVisibility(View.VISIBLE);
            
            // Process the query
            menuSDK.processQuery(query, new QueryCallback() {
                @Override
                public void onComplete() {
                    // Hide loading indicator
                    loadingIndicator.setVisibility(View.GONE);
                    
                    // Clear input field
                    inputEditText.setText("");
                }
            });
        }
    }
});

Voice Input Integration

// Handle microphone button
micButton.setOnClickListener(new View.OnClickListener() {
    @Override
    public void onClick(View v) {
        if (!isRecording) {
            // Start recording
            isRecording = true;
            micButton.setImageResource(R.drawable.ic_stop);
            statusTextView.setText("Listening...");
            
            // Start voice input
            menuSDK.startVoiceInput();
        } else {
            // Stop recording
            isRecording = false;
            micButton.setImageResource(R.drawable.ic_mic);
            statusTextView.setText("Processing...");
            
            // Stop and process voice input
            menuSDK.stopVoiceInput();
        }
    }
});

Queries

The Generate Nano SDK can handle a wide variety of query types. Here are some examples of supported queries and how they are processed.

Informational Queries

User: "What vegetarian options do you have?"
Response: "We have several vegetarian options: Caesar Salad, Margherita Pizza, Vegetable Pasta, and Mushroom Risotto. Would you like more details about any of these dishes?"

User: "Which dishes contain nuts?"
Response: "The following dishes contain nuts: Walnut Salad, Pecan Pie, and Thai Peanut Noodles. If you have a nut allergy, I recommend avoiding these items."
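
Answers like these depend on the dietary and allergens fields in menu.json. The filtering step behind them can be sketched as below; the Item class and method names are hypothetical, and the SDK's own matching may be more lenient (for example, handling synonyms such as "nuts" and "peanut").

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: filter menu items by dietary tag or allergen (hypothetical types). */
final class DietaryFilterSketch {

    /** Minimal view of a menu entry: name plus its dietary and allergens arrays. */
    static final class Item {
        final String name;
        final List<String> dietary;
        final List<String> allergens;

        Item(String name, List<String> dietary, List<String> allergens) {
            this.name = name;
            this.dietary = dietary;
            this.allergens = allergens;
        }
    }

    /** Names of items whose dietary array contains the tag exactly. */
    static List<String> withDietaryTag(List<Item> menu, String tag) {
        List<String> names = new ArrayList<>();
        for (Item item : menu) {
            if (item.dietary.contains(tag)) {
                names.add(item.name);
            }
        }
        return names;
    }

    /** Names of items whose allergens array lists the allergen exactly. */
    static List<String> containingAllergen(List<Item> menu, String allergen) {
        List<String> names = new ArrayList<>();
        for (Item item : menu) {
            if (item.allergens.contains(allergen)) {
                names.add(item.name);
            }
        }
        return names;
    }
}
```

Note that an item with no allergens field is simply not matched here, which is not the same as being allergen-free, so complete allergen data in the menu file matters.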

Ordering Queries

User: "I'd like to order a cheeseburger medium rare with extra pickles and no onions."
Response: "I've added 1 Cheeseburger (medium rare, extra pickles, no onions) to your cart. The price is $16.00. Would you like anything else?"

User: "Add two glasses of the house red wine and one tiramisu for dessert."
Response: "I've added 2 House Red Wine ($18.00) and 1 Tiramisu ($8.00) to your cart. Your current total is $42.00."

Cart Management Queries

User: "Remove the tiramisu from my order."
Response: "I've removed the Tiramisu from your cart. Your updated total is $34.00."

User: "What's in my cart right now?"
Response: "Your cart currently contains: 1 Cheeseburger (medium rare, extra pickles, no onions) - $16.00 and 2 House Red Wine - $18.00. Your total is $34.00."

Resources

Troubleshooting

Common issues and their solutions when working with the Generate Nano SDK.

Model Loading Issues

  • Problem: Model fails to load with "Out of memory" error.
  • Solution: Ensure your application has sufficient memory allocation in the AndroidManifest.xml file by adding android:largeHeap="true" to the application tag.
  • Problem: Model file not found error.
  • Solution: Verify that the model file is correctly placed in the specified directory and that the path in the configuration is correct.

Voice Recognition Issues

  • Problem: Microphone permission denied.
  • Solution: Ensure your application requests and receives the RECORD_AUDIO permission at runtime.
  • Problem: Poor voice recognition accuracy.
  • Solution: Ensure the device is in a quiet environment and that the microphone is not obstructed. Consider using a larger Whisper model for better accuracy.

Performance Issues

  • Problem: Slow response times.
  • Solution: Enable GPU acceleration in the SDK configuration and ensure the device is not in battery saving mode.
  • Problem: Application crashes during heavy usage.
  • Solution: Implement proper resource management by unloading the model when not in use and monitoring memory usage.

Project Structure

Understanding the structure of a Generate Nano SDK integration project.

Typical Project Layout

app/
├── src/
│   ├── main/
│   │   ├── java/
│   │   │   └── com/example/menuapp/
│   │   │       ├── MainActivity.java
│   │   │       ├── MenuActivity.java
│   │   │       └── utils/
│   │   │           └── SDKHelper.java
│   │   ├── res/
│   │   │   ├── layout/
│   │   │   │   ├── activity_main.xml
│   │   │   │   └── activity_menu.xml
│   │   │   └── values/
│   │   │       └── strings.xml
│   │   └── assets/
│   │       ├── menu.json
│   │       └── models/
│   │           ├── menu_assistant.gguf
│   │           └── whisper_small.bin
│   └── androidTest/
│       └── java/
│           └── com/example/menuapp/
│               └── SDKIntegrationTest.java
├── libs/
│   ├── GMK_V1.0.aar
│   └── GWK6490_V1.0.aar
└── build.gradle

Key Files and Directories

  • libs/: Contains the SDK AAR files.
  • assets/menu.json: The menu configuration file.
  • assets/models/: Directory containing the LLM and ASR model files.
  • SDKHelper.java: A utility class for managing SDK initialization and interactions.

Best Practices

Recommendations for optimal implementation and usage of the Generate Nano SDK.

Performance Optimization

  • Lazy Loading: Load the model only when needed, not at application startup.
  • Resource Management: Unload the model when not in use to free up memory.
  • Background Processing: Run model inference in a background thread to avoid UI freezes.
  • Caching: Cache common responses to reduce processing time for frequent queries.
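
Two of these recommendations, lazy loading with explicit unloading and response caching, can be sketched in plain Java. Both classes are hypothetical helpers, not SDK types, and background-thread execution is left out for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Sketch: load an expensive resource on first use and allow it to be released. */
final class LazyModelSketch<T> {
    private final Supplier<T> loader;
    private T model;

    LazyModelSketch(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Loads on first call, or again after unload(), then reuses the instance. */
    synchronized T get() {
        if (model == null) {
            model = loader.get();
        }
        return model;
    }

    /** Drops the reference so the memory can be reclaimed. */
    synchronized void unload() {
        model = null;
    }

    synchronized boolean isLoaded() {
        return model != null;
    }
}

/** Sketch: least-recently-used cache for answers to frequent queries. */
final class ResponseCacheSketch extends LinkedHashMap<String, String> {
    private final int capacity;

    ResponseCacheSketch(int capacity) {
        super(16, 0.75f, true);  // true = order entries by most recent access
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
        // Evict the least recently used entry once the capacity is exceeded.
        return size() > capacity;
    }
}
```

Caching suits purely informational queries; requests that change the cart should always be processed fresh, since their correct answer depends on the current cart state.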

User Experience

  • Loading Indicators: Always show loading indicators during model processing.
  • Error Handling: Provide clear error messages and recovery options.
  • Voice Feedback: Give audio or visual cues when voice recording starts and stops.
  • Progressive Loading: Load UI elements progressively while the model initializes.

Menu Configuration

  • Detailed Descriptions: Provide comprehensive item descriptions for better query matching.
  • Keyword Enrichment: Include alternative names and common misspellings in item keywords.
  • Dietary Information: Always include complete dietary and allergen information.
  • Regular Updates: Keep menu data current with seasonal changes and price updates.

Getting Started Checklist

A step-by-step checklist to ensure successful implementation of the Generate Nano SDK.

Pre-Integration Checklist

  • Verify device compatibility (Qualcomm QCM6490 chipset, Android 13+)
  • Ensure sufficient storage space for model files (~700MB)
  • Prepare menu data in the required JSON format
  • Set up development environment with Android Studio Electric Eel or newer
  • Download the SDK AAR files from the developer portal

Integration Checklist

  • Add SDK AAR files to the project's libs directory
  • Update build.gradle with necessary dependencies
  • Add required permissions to AndroidManifest.xml
  • Create assets directory and add menu.json file
  • Implement runtime permission requests for microphone access
  • Initialize the SDK in your application
  • Set up UI components for user interaction

Testing Checklist

  • Verify model loading and initialization
  • Test basic menu queries and responses
  • Validate cart operations (add, remove, modify items)
  • Test voice input functionality
  • Verify performance on target device
  • Test error handling and recovery

License & Usage

Information about licensing, usage restrictions, and attribution requirements for the Generate Nano SDK.

License Terms

  • Usage Scope: The SDK is licensed for use in commercial applications on Qualcomm QCM6490 devices.
  • Distribution: You may distribute applications containing the SDK through official app stores or direct installation.
  • Modifications: You may not modify, reverse engineer, or decompile the SDK AAR files.
  • Attribution: Applications using the SDK must include attribution in the about section or settings.

Usage Limitations

  • Device Limit: Standard license allows deployment on up to 100 devices.
  • Enterprise License: Required for deployments exceeding 100 devices.
  • Support: Technical support is provided for 12 months from purchase date.
  • Updates: License includes access to updates for 12 months.

Quick Tips

Helpful tips and tricks for working with the Generate Nano SDK.

Development Tips

  • Debug Mode: Enable debug logging during development to track SDK operations.
  • Test Menu: Start with a small test menu before implementing your full menu.
  • Emulator Testing: Test on a physical device, as emulators may not support all GPU features.
  • Memory Profiling: Use Android Studio's Memory Profiler to monitor memory usage.

Deployment Tips

  • APK Size: Use Android App Bundle to reduce download size for different device configurations.
  • Model Updates: Implement a mechanism to update models without requiring a full app update.
  • Error Reporting: Implement crash reporting to collect information about issues in production.
  • User Feedback: Collect user feedback to improve menu data and response quality.
