Neural Network for WASI

In WasmEdge, we implemented the WASI-NN (Neural Network for WASI) proposal to allow WebAssembly programs to access Machine Learning (ML) functions in the fashion of graph-loader APIs, through functions such as load, init_execution_context, set_input, compute, and get_output.

You can find more detail about the WASI-NN proposal in the Reference section.

In this section, we will use a Rust example project to demonstrate how to use the WASI-NN API and run an image classification demo.

Prerequisites

Currently, WasmEdge uses OpenVINO™ as the backend implementation. For this demo, you need to install OpenVINO™ (2021) for the WASI-NN backend, and Rust if you want to build the example WASM file by yourself.

OpenVINO™ Installation

To install OpenVINO™ on Ubuntu 20.04, we recommend the following commands:

OPENVINO_VERSION="2021.4.582"
OPENVINO_YEAR="2021"
curl -sSL https://apt.repos.intel.com/openvino/$OPENVINO_YEAR/GPG-PUB-KEY-INTEL-OPENVINO-$OPENVINO_YEAR | gpg --dearmor | sudo tee /usr/share/keyrings/GPG-PUB-KEY-INTEL-OPENVINO-$OPENVINO_YEAR.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/GPG-PUB-KEY-INTEL-OPENVINO-$OPENVINO_YEAR.gpg] https://apt.repos.intel.com/openvino/$OPENVINO_YEAR all main" | sudo tee /etc/apt/sources.list.d/intel-openvino-$OPENVINO_YEAR.list
sudo apt update
sudo apt install -y intel-openvino-runtime-ubuntu20-$OPENVINO_VERSION

WasmEdge Building and Installation

By default, no WASI-NN backend is enabled in WasmEdge. To enable the OpenVINO™ backend, we need to build WasmEdge from source with the cmake option -DWASMEDGE_PLUGIN_WASI_NN_BACKEND="OpenVINO":

git clone https://github.com/WasmEdge/WasmEdge.git && cd WasmEdge
# If you use Docker:
docker pull wasmedge/wasmedge
docker run -it --rm \
    -v <path/to/your/wasmedge/source/folder>:/root/wasmedge \
    wasmedge/wasmedge:latest
cd /root/wasmedge
# If you don't use Docker, just run the following commands in the cloned repository root:
mkdir -p build && cd build
cmake -DCMAKE_BUILD_TYPE=Release -DWASMEDGE_PLUGIN_WASI_NN_BACKEND="OpenVINO" .. && make -j
# For the WASI-NN plugin, you should install this project.
cmake --install .

(Make sure you have configured the OpenVINO™ environment correctly with `source /opt/intel/openvino_2021/bin/setupvars.sh`.) You should have an executable wasmedge runtime under /usr/local/bin after installation.

(Optional) Rust Installation

If you want to build the example WASM file from Rust by yourself, a Rust installation is required.

curl --proto '=https' --tlsv1.2 https://sh.rustup.rs -sSf | sh -s -- -y
source "$HOME/.cargo/env"

Also make sure to add the wasm32-wasi target with the following command:

rustup target add wasm32-wasi

Run

Download the demo with:

git clone https://github.com/second-state/WasmEdge-WASINN-examples
cd WasmEdge-WASINN-examples

Getting the Fixture Files

For the MobileNet example demo, we need the following fixture files:

  • mobilenet.xml: the model description
  • mobilenet.bin: the model weights
  • tensor-1x224x224x3-f32.bgr: the input image tensor

The above artifacts are generated by the OpenVINO™ Model Optimizer. Thanks to the amazing work done by Andrew Brown, you can find the artifacts and a build.sh script that regenerates them here.

You can use the script to download the fixtures quickly:

cd openvino-mobilenet-raw
./download_mobilenet.sh <PATH>
# For example:
#   The command "./download_mobilenet.sh ." will download and store the files into the current directory.

(Optional) Building the WASM File From Rust Source

To build the demo, run:

cd rust
cargo build --release --target=wasm32-wasi
cd ..

The output file wasmedge-wasinn-example-mobilenet.wasm will be under rust/target/wasm32-wasi/release/ (or here). We can see that the WASM file imports the necessary WASI-NN functions by converting it into WAT format with a tool like wasm2wat:

  ...
  (import "wasi_ephemeral_nn" "load" (func $_ZN7wasi_nn9generated17wasi_ephemeral_nn4load17hdca997591f45db43E (type 8)))
  (import "wasi_ephemeral_nn" "init_execution_context" (func $_ZN7wasi_nn9generated17wasi_ephemeral_nn22init_execution_context17h2cb3b4398c18d1fdE (type 4)))
  (import "wasi_ephemeral_nn" "set_input" (func $_ZN7wasi_nn9generated17wasi_ephemeral_nn9set_input17h4d10422433f5c246E (type 7)))
  (import "wasi_ephemeral_nn" "get_output" (func $_ZN7wasi_nn9generated17wasi_ephemeral_nn10get_output17h117ce8ea097ddbebE (type 8)))
  (import "wasi_ephemeral_nn" "compute" (func $_ZN7wasi_nn9generated17wasi_ephemeral_nn7compute17h96ef5b407fe8173aE (type 5)))
  ...

Run the Example

Then you can use the OpenVINO™-enabled WasmEdge, which was compiled above, to execute the WASM file:

wasmedge --dir .:. wasmedge-wasinn-example-mobilenet.wasm mobilenet.xml mobilenet.bin tensor-1x224x224x3-f32.bgr
# If you didn't install the project, you need to set the `WASMEDGE_PLUGIN_PATH` environment variable to specify the WASI-NN plugin path (the built plugin is at `build/plugins/wasi_nn`).

If everything goes well, you should see terminal output like the following, where each result line shows the class index, the probability, and the class name:

# massive rust compiling output above...
Read graph XML, size in bytes: 143525
Read graph weights, size in bytes: 13956476
Loaded graph into wasi-nn with ID: 0
Created wasi-nn execution context with ID: 0
Read input tensor, size in bytes: 602112
Executed graph inference
   1.) [963](0.7113)pizza, pizza pie
   2.) [762](0.0707)restaurant, eating house, eating place, eatery
   3.) [909](0.0364)wok
   4.) [926](0.0155)hot pot, hotpot
   5.) [567](0.0153)frying pan, frypan, skillet

Code Walkthrough

The main.rs is the entry point of the program. First, it reads the model description and weights into memory:

#![allow(unused)]
use std::env;
use std::fs;

fn main() {
let args: Vec<String> = env::args().collect();
let model_xml_name: &str = &args[1]; // read the filenames from the command line
let model_bin_name: &str = &args[2];
let tensor_name: &str = &args[3];

let xml = fs::read_to_string(model_xml_name).unwrap();
let weights = fs::read(model_bin_name).unwrap();
}
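
The example expects exactly three command-line arguments (the model XML, the model weights, and the tensor file). As a small, hypothetical addition that is not part of the original example, you could fail early with a usage hint instead of panicking on an out-of-bounds index:

// Hypothetical argument check (not in the original example): print a usage
// hint and exit if the three expected arguments are not provided.
if args.len() != 4 {
    eprintln!("Usage: {} <model.xml> <model.bin> <tensor.bgr>", args[0]);
    std::process::exit(1);
}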

Now we can start our inference with WASI-NN:

#![allow(unused)]
fn main() {
// load model
let graph = unsafe {
        wasi_nn::load(
            &[&xml.into_bytes(), &weights],
            wasi_nn::GRAPH_ENCODING_OPENVINO,
            wasi_nn::EXECUTION_TARGET_CPU,
        )
        .unwrap()
    };
// initialize the computation context
let context = unsafe { wasi_nn::init_execution_context(graph).unwrap() };
// initialize the input tensor
let tensor_data = fs::read(tensor_name).unwrap();
let tensor = wasi_nn::Tensor {
  dimensions: &[1, 3, 224, 224],
  r#type: wasi_nn::TENSOR_TYPE_F32,
  data: &tensor_data,
};
// set_input
unsafe {
  wasi_nn::set_input(context, 0, tensor).unwrap();
}
// Execute the inference.
unsafe {
  wasi_nn::compute(context).unwrap();
}
// retrieve output
let mut output_buffer = vec![0f32; 1001];
unsafe {
  wasi_nn::get_output(
    context,
    0,
    &mut output_buffer[..] as *mut [f32] as *mut u8,
    (output_buffer.len() * 4).try_into().unwrap(),
  )
  .unwrap();
}
}

where wasi_nn::GRAPH_ENCODING_OPENVINO means using the OpenVINO™ backend, and wasi_nn::EXECUTION_TARGET_CPU means running the computation on the CPU.
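
The generated wasi_nn bindings also define other execution targets, such as wasi_nn::EXECUTION_TARGET_GPU. Whether a non-CPU target actually works depends on your WasmEdge build and OpenVINO™ device support, so treat the following variation as a sketch rather than a supported configuration:

// Hypothetical variation: request the GPU execution target instead of the CPU.
// This assumes the wasi_nn bindings expose EXECUTION_TARGET_GPU and that the
// OpenVINO™ backend can dispatch to a supported GPU device on your machine.
let graph = unsafe {
    wasi_nn::load(
        &[&xml.into_bytes(), &weights],
        wasi_nn::GRAPH_ENCODING_OPENVINO,
        wasi_nn::EXECUTION_TARGET_GPU,
    )
    .unwrap()
};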

Finally, we sort the output and then print the top-5 classification result:

#![allow(unused)]
fn main() {
let results = sort_results(&output_buffer);
for i in 0..5 {
  println!(
    "   {}.) [{}]({:.4}){}",
    i + 1,
    results[i].0,
    results[i].1,
    imagenet_classes::IMAGENET_CLASSES[results[i].0]
  );
}
}
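
The sort_results helper and the imagenet_classes table are defined elsewhere in the example project and are not shown above. A minimal sketch of such a helper, assuming it pairs each class index with its score, sorts in descending order, and skips the first of the 1001 output entries as a background class, could look like this:

// Sketch of a sort_results-style helper: pair each class index with its score
// and sort by score, highest first. Skipping the first entry assumes the
// 1001-element output places a background class at index 0.
fn sort_results(buffer: &[f32]) -> Vec<(usize, f32)> {
    let mut results: Vec<(usize, f32)> = buffer
        .iter()
        .skip(1)
        .enumerate()
        .map(|(class, prob)| (class, *prob))
        .collect();
    results.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    results
}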

Reference

For an introduction to WASI-NN, refer to this amazing blog post written by Andrew Brown. This demo is largely adapted from another demo.