Model Inference


For a detailed description of the C interfaces, see NNToolChain.pdf and BMLib_User_Guide.pdf at the end of this page.

For a detailed description of the Python interface, see Sophon_Inference_zh.pdf at the end of this page.

BMRuntime reads the compiled output of BMCompiler (a .bmodel file) and drives its execution on the BITMAIN TPU chip. BMRuntime provides a rich set of interfaces that make it easy for users to port algorithms.

BMRuntime has C, C++, and Python implementations. This chapter introduces the commonly used C and Python interfaces, covering:

  • BMLib interface: used for device management

  • BMRuntime C interface

  • BMLib and BMRuntime Python interfaces

1. BMLib Module C Interface

BMLib Interface

  • BMLib is used for device management. It is not part of BMRuntime, but the two are used together, so it is introduced first.

    The BMLib interface is a C interface. The corresponding header file is bmlib_runtime.h, and the corresponding library is libbmlib.so.

    BMLib is used for device management, including the management of device memory.

    BMLib provides many interfaces; this section introduces the ones that applications typically need.

  • bm_dev_request

    Requests a device and returns its handle. All other device interfaces require this handle.

    devid is the device number. In PCIE mode it selects the target device when multiple devices are present; in SoC mode it must be set to 0.

    /**
     * @name    bm_dev_request
     * @brief   To create a handle for the given device
     * @ingroup bmlib_runtime
     *
     * @param [out] handle  The created handle
     * @param [in]  devid   Specify on which device to create handle
     * @retval  BM_SUCCESS  Succeeds.
     *          Other code  Fails.
     */
    bm_status_t bm_dev_request(bm_handle_t *handle, int devid);
  • bm_dev_free

    Releases a device. Typically an application requests a device at startup and releases it before exiting.

    /**
     * @name    bm_dev_free
     * @brief   To free a handle
     * @param [in] handle  The handle to free
     */
    void bm_dev_free(bm_handle_t handle);
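
A minimal sketch pairing the two interfaces above; the work done with the handle is a placeholder:

  #include "bmlib_runtime.h"

  int main() {
    bm_handle_t handle;
    // request device 0 (in SoC mode the device id must be 0)
    if (bm_dev_request(&handle, 0) != BM_SUCCESS)
      return -1;
    // ... use the handle with BMRuntime, BMCV, etc. ...
    bm_dev_free(handle);  // release the device before exiting
    return 0;
  }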

2. BMRuntime Module C Interface

The corresponding header file is bmruntime_interface.h, and the corresponding library is libbmrt.so.

This is the recommended interface for user programs written in C. It supports statically compiled networks with multiple shapes as well as dynamically compiled networks.

  • bmrt_create

    /**
     * @name    bmrt_create
     * @brief   To create the bmruntime with bm_handle.
     * This API creates the bmruntime. It returns a void* pointer that points to
     * the bmruntime. The device id is set when the bm_handle is created.
     * @param [in] bm_handle     bm handle. It must be initialized by using bmlib.
     * @retval void* the pointer of bmruntime
     */
    void* bmrt_create(bm_handle_t bm_handle);
  • bmrt_destroy

    /**
     * @name    bmrt_destroy
     * @brief   To destroy the bmruntime pointer
     * @ingroup bmruntime
     * This API destroys the bmruntime.
     * @param [in]     p_bmrt        Bmruntime that had been created
     */
    void bmrt_destroy(void* p_bmrt);
  • bmrt_load_bmodel

    Loads a bmodel file. After loading, the networks contained in the bmodel are available in bmruntime and can subsequently be used for inference.

    /**
     * @name    bmrt_load_bmodel
     * @brief   To load the bmodel which is created by BM compiler
     * This API is to load a bmodel created by the BM compiler.
     * After loading the bmodel, we can run inference on its neural networks.
     * @param   [in]   p_bmrt        Bmruntime that had been created
     * @param   [in]   bmodel_path   Bmodel file path.
     * @retval true    Load context success.
     * @retval false   Load context failed.
     */
    bool bmrt_load_bmodel(void* p_bmrt, const char *bmodel_path);
  • bmrt_load_bmodel_data

    Loads a bmodel. Unlike bmrt_load_bmodel, the bmodel data already resides in memory.

    /*
      Parameters: [in] p_bmrt      - Bmruntime that had been created.
                  [in] bmodel_data - Bmodel data pointer to buffer.
                  [in] size        - Bmodel data size.
      Returns:    bool             - true: success; false: failed.
      */
      bool bmrt_load_bmodel_data(void* p_bmrt, const void * bmodel_data, size_t size);
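
    A minimal sketch of loading a bmodel from a memory buffer; the file read below stands in for any in-memory source of bmodel data, p_bmrt and bmodel_path are assumed to be defined as in the surrounding examples, and freeing the buffer after loading assumes bmruntime does not keep a reference to it:

      #include <stdio.h>
      #include <stdlib.h>

      // read the whole bmodel file into memory, then load it from the buffer
      FILE *fp = fopen(bmodel_path, "rb");
      fseek(fp, 0, SEEK_END);
      size_t size = (size_t)ftell(fp);
      fseek(fp, 0, SEEK_SET);
      void *data = malloc(size);
      fread(data, 1, size, fp);
      fclose(fp);
      bool ok = bmrt_load_bmodel_data(p_bmrt, data, size);
      free(data);  // assumption: the buffer may be freed once loading has returned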
  • bmrt_get_network_info

    bmrt_get_network_info returns the information of a network, given the network's name.

      /* bm_stage_info_t holds input shapes and output shapes;
      every network can contain one or more stages */
      typedef struct {
        bm_shape_t* input_shapes;   /* input_shapes[0] / [1] / ... / [input_num-1] */
        bm_shape_t* output_shapes;  /* output_shapes[0] / [1] / ... / [output_num-1] */
      } bm_stage_info_t;
    
      /* bm_tensor_info_t holds all information of one net */
      typedef struct {
        const char* name;              /* net name */
        bool is_dynamic;               /* dynamic or static */
        int input_num;                 /* number of inputs */
        char const** input_names;      /* input_names[0] / [1] / .../ [input_num-1] */
        bm_data_type_t* input_dtypes;  /* input_dtypes[0] / [1] / .../ [input_num-1] */
        float* input_scales;           /* input_scales[0] / [1] / .../ [input_num-1] */
        int output_num;                /* number of outputs */
        char const** output_names;     /* output_names[0] / [1] / .../ [output_num-1] */
        bm_data_type_t* output_dtypes; /* output_dtypes[0] / [1] / .../ [output_num-1] */
        float* output_scales;          /* output_scales[0] / [1] / .../ [output_num-1] */
        int stage_num;                 /* number of stages */
        bm_stage_info_t* stages;       /* stages[0] / [1] / ... / [stage_num-1] */
      } bm_net_info_t;

    bm_net_info_t holds the full information of one network; bm_stage_info_t describes each shape configuration (stage) that the network supports.

    /**
     * @name    bmrt_get_network_info
     * @brief   To get network info by net name
     * @param [in]     p_bmrt         Bmruntime that had been created
     * @param [in]     net_name       Network name
     * @retval  bm_net_info_t*        Pointer to net info; it need not be freed by the user. If the net name is not found, NULL is returned.
     */
    const bm_net_info_t* bmrt_get_network_info(void* p_bmrt, const char* net_name);

Example code:

  std::string bmodel;  // path to the bmodel file, set by user
  const char *model_name = "VGG_VOC0712_SSD_300X300_deploy";
  const char **net_names = NULL;
  bm_handle_t bm_handle;
  bm_dev_request(&bm_handle, 0);
  void *p_bmrt = bmrt_create(bm_handle);
  bool ret = bmrt_load_bmodel(p_bmrt, bmodel.c_str());
  // look up a specific network by name
  const bm_net_info_t *net_info = bmrt_get_network_info(p_bmrt, model_name);
  // or enumerate all networks in the loaded bmodel
  int net_num = bmrt_get_network_number(p_bmrt);
  bmrt_get_network_names(p_bmrt, &net_names);
  for (int i = 0; i < net_num; i++) {
    // do something here
    ......
  }
  free(net_names);
  bmrt_destroy(p_bmrt);
  bm_dev_free(bm_handle);
  • bmrt_shape_count

    The interface is declared as follows:

      /*
      number of shape elements; shape must not be NULL, and num_dims must not be larger than BM_MAX_DIMS_NUM
      */
      uint64_t bmrt_shape_count(const bm_shape_t* shape);

    Returns the number of elements in the shape.

    For example, if num_dims is 4, the result is dims[0]*dims[1]*dims[2]*dims[3].

    The bm_shape_t structure is defined as follows:

      typedef struct {
        int num_dims;
        int dims[BM_MAX_DIMS_NUM];
      } bm_shape_t;

    bm_shape_t represents the shape of a tensor; tensors with up to 8 dimensions are currently supported. num_dims is the actual number of dimensions of the tensor, and dims holds the size of each dimension, starting from dims[0]. For example, the four dimensions (n, c, h, w) correspond to (dims[0], dims[1], dims[2], dims[3]).

    A constant shape can be initialized as follows:

      bm_shape_t shape = {4, {4,3,228,228}};
      bm_shape_t shape_array[2] = {
              {4, {4,3,28,28}}, // [0]
              {2, {2,4}}, // [1]
      };
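
    A quick sketch of bmrt_shape_count applied to the shape above; the element count is simply the product of all dimension sizes:

      bm_shape_t shape = {4, {4, 3, 228, 228}};
      uint64_t count = bmrt_shape_count(&shape);  // 4 * 3 * 228 * 228 = 623808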
  • bm_image_from_mat

    To use this function, the USE_OPENCV macro must be enabled in include/bmruntime/bm_wrapper.hpp.

    /**
     * @name    bm_image_from_mat
     * @brief   Convert an OpenCV Mat object to a BMCV bm_image object
     * @param [in]     in          OpenCV Mat object
     * @param [out]    out         BMCV bm_image object
     * @retval true    Launch success.
     * @retval false   Launch failed.
     */
    static inline bool bm_image_from_mat (cv::Mat &in, bm_image &out)

    /**
     * @brief   Convert multiple OpenCV Mat objects to multiple BMCV bm_image objects
     */
    static inline bool bm_image_from_mat (std::vector<cv::Mat> &in, std::vector<bm_image> &out)

    For example code, see examples/SSD_object/cpp_cv_bmcv_bmrt in the bmnnsdk2 package.
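
    A minimal conversion sketch; the image path is a hypothetical placeholder, and it assumes the resulting bm_image is released with bm_image_destroy, as the bm_image_from_frame documentation below indicates:

      cv::Mat mat = cv::imread("test.jpg");  // hypothetical input image
      bm_image img;
      if (bm_image_from_mat(mat, img)) {
        // ... hand img to BMCV operations or bm_inference ...
        bm_image_destroy(img);  // free the bm_image when no longer needed
      }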

  • bm_image_from_frame

    /**
     * @name    bm_image_from_frame
     * @brief   Convert an ffmpeg AVFrame object to a BMCV bm_image object
     * @ingroup bmruntime
     *
     * @param [in]     bm_handle   the low level device handle
     * @param [in]     in          a read-only AVFrame
     * @param [out]    out         an uninitialized BMCV bm_image object;
     *                             call bm_image_destroy to free it when you no longer use it.
     * @retval true    change success.
     * @retval false   change failed.
     */

    static inline bool bm_image_from_frame (bm_handle_t       &bm_handle,
                                          AVFrame           &in,
                                          bm_image          &out)
    /**
     * @name    bm_image_from_frame
     * @brief   Convert ffmpeg AVFrames to BMCV bm_image objects
     * @ingroup bmruntime
     *
     * @param [in]     bm_handle   the low level device handle
     * @param [in]     in          a read-only ffmpeg AVFrame vector
     * @param [out]    out         an uninitialized BMCV bm_image vector;
     *                             call bm_image_destroy to free each element when you no longer use it.
     * @retval true    change success.
     * @retval false   change failed.
     */
    static inline bool bm_image_from_frame (bm_handle_t                &bm_handle,
                                          std::vector<AVFrame>       &in,
                                          std::vector<bm_image>      &out)

    For example code, see examples/SSD_object/cpp_ffmpeg_bmcv_bmrt/main.cpp in the bmnnsdk2 package.
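
    The AVFrame variants follow the same pattern as bm_image_from_mat; a sketch assuming frame is a decoded AVFrame* obtained from the FFmpeg decoding module described in an earlier chapter:

      bm_image img;
      if (bm_image_from_frame(bm_handle, *frame, img)) {
        // ... process img with BMCV or run inference on it ...
        bm_image_destroy(img);  // free the bm_image when no longer needed
      }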

  • bm_inference

    To use this function, the USE_OPENCV macro must be enabled in include/bmruntime/bm_wrapper.hpp.

    /**
     * @name    bm_inference
     * @brief   A block inference wrapper call
     * @ingroup bmruntime
     *
     * This API supports neural networks that are static-compiled or dynamic-compiled.
     * After calling this API, inference on TPU is launched. And the CPU
     * program will be blocked.
     * This API supports single input && single output, and is multi-thread safe.
     *
     * @param [in]    p_bmrt         Bmruntime that had been created
     * @param [in]    input          bm_image of single-input data
     * @param [in]    output         Pointer of  single-output buffer
     * @param [in]    net_name       The name of the neuron network
     * @param [in]    input_shape    single-input shape
     *
     * @retval true    Launch success.
     * @retval false   Launch failed.
     */
    static inline bool bm_inference (void           *p_bmrt,
                                     bm_image        *input,
                                     void           *output,
                                     bm_shape_t input_shape,
                                     const char   *net_name)
     /* This API supports single input && multiple outputs, and is multi-thread safe */
    static inline bool bm_inference (void                       *p_bmrt,
                                     bm_image                    *input,
                                     std::vector<void*>         outputs,
                                     bm_shape_t             input_shape,
                                     const char               *net_name)
    /* This API supports multiple inputs && multiple outputs, and is multi-thread safe */
    static inline bool bm_inference (void                           *p_bmrt,
                                     std::vector<bm_image*>          inputs,
                                     std::vector<void*>             outputs,
                                     std::vector<bm_shape_t>   input_shapes,
                                     const char                   *net_name)

    For example code, see examples/SSD_object/cpp_cv_bmcv_bmrt/main.cpp in the bmnnsdk2 package.
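
Putting the pieces together, a minimal end-to-end sketch using the single-input, single-output overload; the bmodel path, image path, output buffer size, and input shape are hypothetical and must match your own model (the network name is taken from the example above):

  bm_handle_t bm_handle;
  bm_dev_request(&bm_handle, 0);
  void *p_bmrt = bmrt_create(bm_handle);
  bmrt_load_bmodel(p_bmrt, "ssd300.bmodel");        // hypothetical bmodel path

  cv::Mat mat = cv::imread("test.jpg");             // hypothetical input image
  bm_image img;
  bm_image_from_mat(mat, img);

  float output[200 * 7];                            // hypothetical output buffer size
  bm_shape_t input_shape = {4, {1, 3, 300, 300}};   // hypothetical input shape
  bool ok = bm_inference(p_bmrt, &img, (void *)output,
                         input_shape, "VGG_VOC0712_SSD_300X300_deploy");

  bm_image_destroy(img);
  bmrt_destroy(p_bmrt);
  bm_dev_free(bm_handle);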

3. Python Interface

This section covers only the interface functions used in the py_ffmpeg_bmcv_sail example.

  • Engine

    def __init__(tpu_id):
    """ Constructor does not load bmodel.
    Parameters 
    ---------
    tpu_id : int TPU ID. You can use bm-smi to see available IDs 
    """
  • load

    def load(bmodel_path):
    """ Load bmodel from file.
    Parameters 
    ---------
    bmodel_path : str Path to bmodel
    """
  • set_io_mode

    def set_io_mode(mode):
    """ Set IOMode for a graph.
    Parameters 
    ---------
    mode : sail.IOMode Specified io mode 
    """
  • get_graph_names

    def get_graph_names(): 
    """ Get all graph names in the loaded bmodels.
    Returns 
    ------
    graph_names : list Graph names list in loaded context 
    """
  • get_input_names

    def get_input_names(graph_name): 
    """ Get all input tensor names of the specified graph.
    Parameters 
    ---------
    graph_name : str Specified graph name
    Returns 
    ------
    input_names : list All the input tensor names of the graph 
    """
  • get_output_names

    def get_output_names(graph_name):
    """ Get all output tensor names of the specified graph.
    Parameters
    ---------
    graph_name : str Specified graph name
    Returns 
    ------
    output_names : list All the output tensor names of the graph 
    """
  • sail.IOMode

    # Input tensors are in system memory while output tensors are in device memory.
    sail.IOMode.SYSI
    # Input tensors are in device memory while output tensors are in system memory.
    sail.IOMode.SYSO
    # Both input and output tensors are in system memory.
    sail.IOMode.SYSIO
    # Both input and output tensors are in device memory.
    sail.IOMode.DEVIO
  • sail.Tensor

    def __init__(handle, shape, dtype, own_sys_data, own_dev_data):
    """ Constructor allocates system memory and device memory of the tensor.
    Parameters 
    ---------
    handle : sail.Handle Handle instance 
    shape : tuple Tensor shape 
    dtype : sail.Dtype Data type 
    own_sys_data : bool Indicator of whether own system memory 
    own_dev_data : bool Indicator of whether own device memory 
    """
  • get_input_dtype

    def get_input_dtype(graph_name, tensor_name):
    """ Get the data type of an input tensor.
    Parameters 
    ---------
    graph_name : str The specified graph name 
    tensor_name : str The specified input tensor name
    Returns 
    ------
    dtype : sail.Dtype Data type of the input tensor 
    """
  • get_output_dtype

    def get_output_dtype(graph_name, tensor_name):
    """ Get the data type of an output tensor.
    Parameters
    ---------
    graph_name : str The specified graph name 
    tensor_name : str The specified output tensor name
    Returns 
    ------
    dtype : sail.Dtype Data type of the output tensor 
    """
  • process

    def process(graph_name, input_tensors, output_tensors):
    """ Inference with provided input and output tensors.
    Parameters 
    ---------
    graph_name : str The specified graph name 
    input_tensors : dict {str : sail.Tensor} Input tensors managed by user 
    output_tensors : dict {str : sail.Tensor} Output tensors managed by user 
    """
  • get_input_scale

    def get_input_scale(graph_name, tensor_name):
    """ Get scale of an input tensor. Only used for int8 models.
    Parameters 
    ---------
    graph_name : str The specified graph name 
    tensor_name : str The specified input tensor name
    Returns 
    ------
    scale: float32 Scale of the input tensor 
    """

For more interface definitions, see:

Sophon_Inference_zh.pdf
NNToolChain.pdf
BMLib_User_Guide.pdf