Step-by-Step Model Deployment Examples

Converting and Running MobileNetV2 on Torq

Follow the steps below to convert and run a MobileNetV2 model on Torq.

Set up environment

  • If not yet done, activate the Python environment as explained in Getting Started. (Skip this step if you are using the Docker container.)
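
     For example, if the environment is a standard Python venv, activation looks like the sketch below. The path is an assumption; Getting Started gives the exact command.

    # Activate the Python virtual environment (path is an assumption)
    $ source .venv/bin/activate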

Convert TFLite → TOSA

  • Navigate to the root directory of the Release Package (Ubuntu 24.04), or run the Docker container. In the Docker container, the release package is located at:

    $ cd /opt/release
    
  • Convert the .tflite model into TOSA format using the IREE import tool:

    # Convert TFLite to TOSA (binary MLIR)
    $ iree-import-tflite samples/hf/Synaptics_MobileNetV2/MobileNetV2_int8.tflite -o mobilenetv2.tosa
    

     The output is a binary MLIR file containing the TOSA representation of the model, which can be fed directly to our compiler. For more details on the IREE TFLite import tools, see the official IREE website.

  • Optionally, convert the TOSA file (binary MLIR) to its text representation:

    # Convert binary MLIR to text MLIR
    $ iree-opt mobilenetv2.tosa -o mobilenetv2.mlir
    

     While torq-compile supports the binary TOSA format, at this early stage of compiler development it is useful to work with the MLIR text representation, which makes debugging and error reporting easier.

Compile TOSA → Torq model

  • Convert the TOSA file into a Torq runtime executable (.vmfb):

    $ torq-compile mobilenetv2.tosa -o mobilenetv2.vmfb
    

     or, when using the text MLIR format:

    $ torq-compile mobilenetv2.mlir -o mobilenetv2.vmfb
    

Run inference on Torq

  • Use the compiled model with the Torq runtime to run inference. The example below passes a 1x224x224x3 int8 input tensor with every element set to 1, using IREE's splat input syntax (shape and element type followed by a single fill value):

    $ iree-run-module \
      --device=torq \
      --module=mobilenetv2.vmfb \
      --function=main \
      --input="1x224x224x3xi8=1"
    

Note: To use an actual image as input, preprocess the image and save it as a NumPy array (.npy file). Then, provide the path to the .npy file as input, as described in the Input Guide.
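
A minimal preprocessing sketch in Python is shown below. The filenames (cat.jpg, input.npy) and the uint8-to-int8 shift are assumptions for illustration only; the Input Guide describes the exact preprocessing the model expects.

    # preprocess.py: illustrative sketch; see the Input Guide for the
    # model's actual preprocessing and quantization parameters.
    import numpy as np
    from PIL import Image

    # Load the image and resize it to the model's 224x224 input size.
    img = Image.open("cat.jpg").convert("RGB").resize((224, 224))

    # ASSUMPTION: shift uint8 pixels [0, 255] to int8 [-128, 127].
    # The correct transform depends on the model's quantization parameters.
    arr = (np.asarray(img).astype(np.int32) - 128).astype(np.int8)

    # Add the batch dimension (shape becomes 1x224x224x3) and save.
    np.save("input.npy", arr[np.newaxis, ...])

The saved array can then be passed to the runtime with IREE's @-file input syntax:

    $ iree-run-module \
      --device=torq \
      --module=mobilenetv2.vmfb \
      --function=main \
      --input=@input.npy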