TensorRT-Developer-Guide.pdf

Basic information
File name: TensorRT-Developer-Guide.pdf
File size: 3.89 MB
File format: .pdf
Development language: C/C++
Last updated: 2020-12-11
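Description
The TensorRT Developer Guide, covering NVIDIA's low-level GPU acceleration library; essential reading for edge computing. TensorRT accelerates computer vision algorithms and supports models from multiple frameworks, including TensorFlow.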
TABLE OF CONTENTS
Chapter 1. What Is TensorRT?
1.1. Benefits Of TensorRT
1.1.1. Who Can Benefit From TensorRT
1.2. Where Does TensorRT Fit?
1.3. How Does TensorRT Work?
1.4. What Capabilities Does TensorRT Provide?
1.5. How Do I Get TensorRT?
Chapter 2. Using The C++ API
2.1. Instantiating TensorRT Objects in C++
2.2. Creating A Network Definition In C++
2.2.1. Creating A Network Definition From Scratch Using The C++ API
2.2.2. Importing A Model Using A Parser In C++
2.2.3. Importing A Caffe Model Using The C++ Parser API
2.2.4. Importing A TensorFlow Model Using The C++ UFF Parser API
2.2.5. Importing An ONNX Model Using The C++ Parser API
2.3. Building An Engine In C++
2.4. Serializing A Model In C++
2.5. Performing Inference In C++
2.6. Memory Management In C++
2.7. Refitting An Engine
Chapter 3. Using The Python API
3.1. Importing TensorRT Into Python
3.2. Creating A Network Definition In Python
3.2.1. Creating A Network Definition From Scratch Using The Python API
3.2.2. Importing A Model Using A Parser In Python
3.2.3. Importing From Caffe Using Python
3.2.4. Importing From TensorFlow Using Python
3.2.5. Importing From ONNX Using Python
3.2.6. Importing From PyTorch And Other Frameworks
3.3. Building An Engine In Python
3.4. Serializing A Model In Python
3.5. Performing Inference In Python
Chapter 4. Extending TensorRT With Custom Layers
4.1. Adding Custom Layers Using The C++ API
4.1.1. Example 1: Adding A Custom Layer Using C++ For Caffe
4.1.2. Example 2: Adding A Custom Layer That Is Not Supported In UFF Using C++
4.1.3. Example 3: Adding A Custom Layer With Dynamic Shape Support Using C++
4.1.4. Example 4: Add A Custom Layer With INT8 I/O Support Using C++
4.2. Adding Custom Layers Using The Python API
4.2.1. Example 1: Adding A Custom Layer to a TensorRT Network Using Python
www.nvidia.com
TensorRT Developer's Guide SWE-SWDOCTRT-001-DEVG_vTensorRT 6.0.1 | iii
4.2.2. Example 2: Adding A Custom Layer That Is Not Supported In UFF Using Python
4.3. Using Custom Layers When Importing A Model From A Framework
4.3.1. Example 1: Adding A Custom Layer To A TensorFlow Model
4.4. Plugin API Description
4.4.1. Migrating Plugins From TensorRT 5.x.x To TensorRT 6.x.x
4.4.2. IPluginV2 API Description
4.4.3. IPluginCreator API Description
4.4.4. Persistent LSTM Plugin
4.5. Best Practices For Custom Layers Plugin
Chapter 5. Working With Mixed Precision
5.1. Mixed Precision Using The C++ API
5.1.1. Setting The Layer Precision Using C++
5.1.2. Enabling FP16 Inference Using C++
5.1.3. Enabling INT8 Inference Using C++
5.1.3.1. Setting Per-Tensor Dynamic Range Using C++
5.1.3.2. INT8 Calibration Using C++
5.1.4. Working With Explicit Precision Using C++
5.2. Mixed Precision Using The Python API
5.2.1. Setting The Layer Precision Using Python
5.2.2. Enabling FP16 Inference Using Python
5.2.3. Enabling INT8 Inference Using Python
5.2.3.1. Setting Per-Tensor Dynamic Range Using Python
5.2.3.2. INT8 Calibration Using Python
5.2.4. Working With Explicit Precision Using Python
Chapter 6. Working With Reformat-Free Network I/O Tensors
6.1. Building An Engine With Reformat-Free Network I/O Tensors
6.2. Supported Combination Of Data Type And Memory Layout of I/O Tensors
6.3. Calibration For A Network With INT8 I/O Tensors
Chapter 7. Working With Dynamic Shapes
7.1. Specifying Runtime Dimensions
7.2. Optimization Profiles
7.3. Layer Extensions For Dynamic Shapes
7.4. Restrictions For Dynamic Shapes
7.5. Execution Tensors vs. Shape Tensors
7.5.1. Formal Inference Rules
7.6. Shape Tensor I/O (Advanced)
Chapter 8. Working With DLA
8.1. Running On DLA During TensorRT Inference
8.1.1. Example 1: sampleMNIST With DLA
8.1.2. Example 2: Enable DLA Mode For A Layer During Network Creation
8.2. DLA Supported Layers
8.3. GPU Fallback Mode
Chapter 9. Deploying A TensorRT Optimized Model
9.1. Deploying In The Cloud
9.2. Deploying To An Embedded System
Chapter 10. Working With Deep Learning Frameworks
10.1. Working With TensorFlow
10.1.1. Freezing A TensorFlow Graph
10.1.2. Freezing A Keras Model
10.1.3. Converting A Frozen Graph To UFF
10.1.4. Working With TensorFlow RNN Weights
10.1.4.1. TensorFlow RNN Cells Supported In TensorRT
10.1.4.2. Maintaining Model Consistency Between TensorFlow And TensorRT
10.1.4.3. Workflow
10.1.4.4. Dumping The TensorFlow Weights
10.1.4.5. Loading Dumped Weights
10.1.4.6. Converting The Weights To A TensorRT Format
10.1.4.7. BasicLSTMCell Example
10.1.4.8. Setting The Converted Weights And Biases
10.1.5. Preprocessing A TensorFlow Graph Using the Graph Surgeon API
10.2. Working With PyTorch And Other Frameworks
Chapter 11. Working With DALI
11.1. Benefits Of Integration
Chapter 12. Troubleshooting
12.1. FAQs
12.2. How Do I Report A Bug?
12.3. Understanding Error Messages
12.4. Support
Appendix A. Appendix
A.1. TensorRT Layers
A.1.1. IActivationLayer
A.1.2. IConcatenationLayer
A.1.3. IConstantLayer
A.1.4. IConvolutionLayer
A.1.5. IDeconvolutionLayer
A.1.6. IElementWiseLayer
A.1.6.1. ElementWise Layer Setup
A.1.7. IFullyConnectedLayer
A.1.8. IGatherLayer
A.1.9. IIdentityLayer
A.1.10. ILRNLayer
A.1.11. IMatrixMultiplyLayer
A.1.11.1. MatrixMultiply Layer Setup
A.1.12. IPaddingLayer
A.1.13. IPluginLayer
A.1.14. IPoolingLayer
A.1.15. IRaggedSoftMaxLayer
A.1.16. IReduceLayer
A.1.17. IResizeLayer
A.1.18. IRNNLayer
A.1.19. IRNNv2Layer
A.1.19.1. RNNv2 Layer Setup
A.1.19.2. RNNv2 Layer - Optional Inputs
A.1.20. IScaleLayer
A.1.21. IShapeLayer
A.1.22. IShuffleLayer
A.1.23. ISliceLayer
A.1.24. ISoftMaxLayer
A.1.25. ITopKLayer
A.1.25.1. TopK Layer Setup
A.1.26. IUnaryLayer
A.2. Data Format Descriptions
A.3. Command-Line Programs
A.4. ACKNOWLEDGEMENTS