.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/plot_convert_pipeline_vectorizer.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_auto_examples_plot_convert_pipeline_vectorizer.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_plot_convert_pipeline_vectorizer.py:


Train, convert and predict with ONNX Runtime
============================================

This example demonstrates an end-to-end scenario.
It starts by training a scikit-learn pipeline
whose input is not a regular vector but a
dictionary ``{ int: float }``, because its first step is a
`DictVectorizer <http://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.DictVectorizer.html>`_.

.. contents::
    :local:

Train a pipeline
++++++++++++++++

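The first step consists in retrieving the Boston housing dataset
and turning every row into a dictionary which maps a column index
to its value (the first column is dropped).
Note that ``load_boston`` was removed in scikit-learn 1.2,
so the code below requires an earlier release.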

.. GENERATED FROM PYTHON SOURCE LINES 22-32

.. code-block:: default

    import pandas
    from sklearn.datasets import load_boston
    boston = load_boston()
    X, y = boston.data, boston.target

    from sklearn.model_selection import train_test_split
    X_train, X_test, y_train, y_test = train_test_split(X, y)
    X_train_dict = pandas.DataFrame(X_train[:,1:]).T.to_dict().values()
    X_test_dict = pandas.DataFrame(X_test[:,1:]).T.to_dict().values()

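To make the conversion concrete, here is a minimal sketch on a made-up
2x3 array (the values are illustrative and do not come from the dataset).
It shows that the first column is dropped and that the remaining columns
are renumbered from 0. The name ``rows_plain`` is an assumption added here
to show a pandas-free construction that should give the same dictionaries.

.. code-block:: python

    import numpy
    import pandas

    # Made-up data standing in for X_train: 2 observations, 3 columns.
    X_demo = numpy.array([[1.0, 2.0, 3.0],
                          [4.0, 5.0, 6.0]])

    # Same expression as in the example: drop column 0, then build
    # one {column index: value} dictionary per observation.
    rows = list(pandas.DataFrame(X_demo[:, 1:]).T.to_dict().values())

    # A pandas-free construction of the same dictionaries.
    rows_plain = [dict(enumerate(row.tolist())) for row in X_demo[:, 1:]]

    print(rows)
    print(rows_plain)

Wrapping the result of ``.values()`` in ``list`` is optional here;
it only makes the dictionaries easier to print and index.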

.. GENERATED FROM PYTHON SOURCE LINES 33-34

We create a pipeline.

.. GENERATED FROM PYTHON SOURCE LINES 34-44

.. code-block:: default


    from sklearn.pipeline import make_pipeline
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.feature_extraction import DictVectorizer
    pipe = make_pipeline(
        DictVectorizer(sparse=False),
        GradientBoostingRegressor())

    pipe.fit(X_train_dict, y_train)


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none


    Pipeline(steps=[('dictvectorizer', DictVectorizer(sparse=False)),
                    ('gradientboostingregressor', GradientBoostingRegressor())])

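The first step of the pipeline can be looked at in isolation.
The sketch below, on made-up dictionaries, suggests how a
``DictVectorizer`` with integer keys builds a dense matrix:
every key becomes a column and a key missing from an observation
is filled with zero.

.. code-block:: python

    from sklearn.feature_extraction import DictVectorizer

    # Made-up observations; the second one has no value for key 1.
    rows = [{0: 1.0, 1: 2.0}, {0: 3.0}]

    vec = DictVectorizer(sparse=False)
    dense = vec.fit_transform(rows)

    # Keys seen during fit, in column order.
    print(vec.feature_names_)
    # One row per observation, one column per key.
    print(dense)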

.. GENERATED FROM PYTHON SOURCE LINES 45-47

We compute the predictions on the test set
and we print the coefficient of determination (R²).

.. GENERATED FROM PYTHON SOURCE LINES 47-52

.. code-block:: default

    from sklearn.metrics import r2_score

    pred = pipe.predict(X_test_dict)
    print(r2_score(y_test, pred))


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    0.848444978558249

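As a reminder of what this score measures, the sketch below computes
R² by hand as one minus the ratio between the residual sum of squares
and the total sum of squares. The arrays are made up for illustration;
they are not the predictions obtained above.

.. code-block:: python

    import numpy

    # Made-up targets and predictions, for illustration only.
    y_true = numpy.array([3.0, -0.5, 2.0, 7.0])
    y_hat = numpy.array([2.5, 0.0, 2.0, 8.0])

    # Residual sum of squares: what the model fails to explain.
    ss_res = ((y_true - y_hat) ** 2).sum()
    # Total sum of squares: spread of the targets around their mean.
    ss_tot = ((y_true - y_true.mean()) ** 2).sum()

    r2 = 1.0 - ss_res / ss_tot
    print(r2)

A value of 1 means the predictions match the targets exactly.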

.. GENERATED FROM PYTHON SOURCE LINES 53-59

Conversion to ONNX format
+++++++++++++++++++++++++

We use the
`sklearn-onnx <https://github.com/onnx/sklearn-onnx>`_
module to convert the model into ONNX format.
The input is declared as a dictionary which maps
``int64`` keys to ``float`` values.

.. GENERATED FROM PYTHON SOURCE LINES 59-69

.. code-block:: default


    from skl2onnx import convert_sklearn
    from skl2onnx.common.data_types import FloatTensorType, Int64TensorType, DictionaryType, SequenceType

    # The pipeline receives one dictionary {int64: float} per call.
    initial_type = [('float_input', DictionaryType(Int64TensorType([1]), FloatTensorType([])))]
    onx = convert_sklearn(pipe, initial_types=initial_type)
    with open("pipeline_vectorize.onnx", "wb") as f:
        f.write(onx.SerializeToString())


.. GENERATED FROM PYTHON SOURCE LINES 70-72

We load the model with ONNX Runtime and look at
its input and output.

.. GENERATED FROM PYTHON SOURCE LINES 72-82

.. code-block:: default

    import onnxruntime as rt
    from onnxruntime.capi.onnxruntime_pybind11_state import InvalidArgument

    sess = rt.InferenceSession("pipeline_vectorize.onnx")

    import numpy
    inp, out = sess.get_inputs()[0], sess.get_outputs()[0]
    print("input name='{}' and shape={} and type={}".format(inp.name, inp.shape, inp.type))
    print("output name='{}' and shape={} and type={}".format(out.name, out.shape, out.type))


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    input name='float_input' and shape=[] and type=map(int64,tensor(float))
    output name='variable' and shape=[None, 1] and type=tensor(float)


.. GENERATED FROM PYTHON SOURCE LINES 83-85

We compute the predictions.
A natural first attempt is to pass the whole test set in one call:

.. GENERATED FROM PYTHON SOURCE LINES 85-91

.. code-block:: default


    try:
        pred_onx = sess.run([out.name], {inp.name: X_test_dict})[0]
    except (RuntimeError, InvalidArgument) as e:
        print(e)


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Unexpected input data type. Actual: ((seq(map(int64,tensor(float))))) , expected: ((map(int64,tensor(float))))


.. GENERATED FROM PYTHON SOURCE LINES 92-94

This fails because, with a DictVectorizer as the first step,
ONNX Runtime expects a single dictionary, that is one observation
at a time, and not a sequence of dictionaries.

.. GENERATED FROM PYTHON SOURCE LINES 94-96

.. code-block:: default

    pred_onx = [sess.run([out.name], {inp.name: row})[0][0, 0] for row in X_test_dict]

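The per-row loop can be wrapped in a small function.
The sketch below is only an illustration: ``predict_rows`` is a
hypothetical helper and ``StubSession`` is a made-up stand-in which
sums the values of a dictionary, so that the code runs without a
model file. Neither of them is part of ONNX Runtime.

.. code-block:: python

    import numpy


    def predict_rows(sess, input_name, output_name, rows):
        """Hypothetical helper: run the session once per dictionary."""
        preds = [sess.run([output_name], {input_name: row})[0][0, 0]
                 for row in rows]
        return numpy.array(preds, dtype=numpy.float32)


    class StubSession:
        """Made-up stand-in for a session; it sums the values."""

        def run(self, output_names, feeds):
            (row,) = feeds.values()
            total = numpy.float32(sum(row.values()))
            # Mimic an output of shape [N, 1] with N = 1.
            return [numpy.array([[total]], dtype=numpy.float32)]


    demo = predict_rows(StubSession(), "float_input", "variable",
                        [{0: 1.0, 1: 2.0}, {0: 0.5}])
    print(demo)

With the session loaded earlier, the call would presumably be
``predict_rows(sess, inp.name, out.name, X_test_dict)``.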

.. GENERATED FROM PYTHON SOURCE LINES 97-98

We compare them with the predictions of the original scikit-learn pipeline.

.. GENERATED FROM PYTHON SOURCE LINES 98-100

.. code-block:: default

    print(r2_score(pred, pred_onx))


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    0.9999999999999528


.. GENERATED FROM PYTHON SOURCE LINES 101-103

The predictions are very similar. *ONNX Runtime* uses floats instead of doubles,
which explains the small discrepancies.

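The order of magnitude of such discrepancies can be illustrated
with numpy alone. The sketch below rounds a few made-up doubles to
single precision and measures the relative error. It only shows the
effect of storing a value as a 32-bit float; it does not reproduce
the computations done inside ONNX Runtime.

.. code-block:: python

    import numpy

    # Made-up double precision values.
    x64 = numpy.array([0.1, 24.7, 1.0 / 3.0], dtype=numpy.float64)

    # Round trip through single precision.
    x32 = x64.astype(numpy.float32).astype(numpy.float64)

    # The relative rounding error is bounded by 2 ** -24 (about 6e-8).
    rel_err = numpy.abs(x32 - x64) / numpy.abs(x64)
    print(rel_err.max())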

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  1.592 seconds)


.. _sphx_glr_download_auto_examples_plot_convert_pipeline_vectorizer.py:


.. only :: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example


  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: plot_convert_pipeline_vectorizer.py <plot_convert_pipeline_vectorizer.py>`


  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: plot_convert_pipeline_vectorizer.ipynb <plot_convert_pipeline_vectorizer.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_