PyTorch Autograd Profiler
Link: https://pytorch.org/docs/stable/autograd.html#torch.autograd.profiler.profile
Autograd includes a profiler that lets you inspect the cost of different operators inside your model - both on the CPU and GPU. There are two modes implemented at the moment - CPU-only using profile, and nvprof-based (registers both CPU and GPU activity) using emit_nvtx.
torch.autograd.profiler.profile(enabled=True, use_cuda=False, record_shapes=False)
Context manager that manages autograd profiler state and holds a summary of results. Under the hood it just records events of functions being executed in C++ and exposes those events to Python. You can wrap any code into it and it will only report runtime of PyTorch functions.
- enabled (bool, optional) – Setting this to False makes this context manager a no-op. (Default: True)
- use_cuda (bool, optional) – Enables timing of CUDA events as well using the cudaEvent API. Adds approximately 4us of overhead to each tensor operation. (Default: False)
- record_shapes (bool, optional) – If shapes recording is set, information about input dimensions will be collected. This allows one to see which dimensions have been used under the hood and further group by them using prof.key_averages(group_by_input_shape=True). Please note that shape recording might skew your profiling data; it is recommended to use separate runs with and without shape recording to validate the timing. Most likely the skew will be negligible for the bottom-most events (in the case of nested function calls), but for higher-level functions the total self CPU time might be artificially increased because of the shape collection.
>>> import torch
>>> x = torch.randn((1, 1), requires_grad=True)
>>> with torch.autograd.profiler.profile() as prof:
>>>     for _ in range(100):  # any normal python code, really!
>>>         y = x ** 2
>>>         y.backward()
>>> # NOTE: some columns were removed for brevity
>>> print(prof.key_averages().table(sort_by="self_cpu_time_total"))
----------------------------------- --------------- --------------- ---------------
Name Self CPU total CPU time avg Number of Calls
----------------------------------- --------------- --------------- ---------------
mul 32.048ms 32.048ms 200
pow 27.041ms 27.041ms 200
PowBackward0 9.727ms 55.483ms 100
torch::autograd::AccumulateGrad 9.148ms 9.148ms 100
torch::autograd::GraphRoot 691.816us 691.816us 100
----------------------------------- --------------- --------------- ---------------
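Shape recording pairs with key_averages(group_by_input_shape=True) to break the averaged stats down by input size. Below is a minimal sketch along those lines (the tensor shapes and the matmul workload are arbitrary, chosen only for illustration); on a GPU machine you could additionally pass use_cuda=True to time the CUDA kernels as well.

>>> import torch
>>> x = torch.randn(32, 128, requires_grad=True)  # arbitrary example shapes
>>> w = torch.randn(128, 64, requires_grad=True)
>>> with torch.autograd.profiler.profile(record_shapes=True) as prof:
>>>     for _ in range(100):
>>>         y = (x @ w).sum()  # matmul + reduction, then backward
>>>         y.backward()
>>> # group the averaged stats by the recorded input shapes
>>> print(prof.key_averages(group_by_input_shape=True).table(sort_by="self_cpu_time_total", row_limit=10))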
Useful Functions
- export_chrome_trace(path): Exports an EventList as a Chrome tracing tools file. The checkpoint can be later loaded and inspected under the chrome://tracing URL (a usage sketch follows this list).
- key_averages(group_by_input_shape=False): Averages all function events over their keys.
- table(sort_by=None, row_limit=100, header=None): Prints an EventList as a nicely formatted table.
torch.autograd.profiler.emit_nvtx(enabled=True, record_shapes=False): Context manager that makes every autograd operation emit an NVTX range, for use with nvprof. See https://pytorch.org/docs/stable/autograd.html#torch.autograd.profiler.emit_nvtx
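emit_nvtx only emits NVTX ranges around each autograd operation, so the script has to be run under nvprof (or a similar tool such as Nsight Systems) for anything to be collected. A minimal sketch, assuming a CUDA-capable machine (the tensor size and the output file name are arbitrary):

>>> # run as: nvprof --profile-from-start off -o trace.prof -- python script.py
>>> import torch
>>> x = torch.randn(64, 64, device="cuda", requires_grad=True)
>>> with torch.cuda.profiler.profile():
>>>     (x ** 2).sum().backward()  # warm-up iteration, outside emit_nvtx
>>>     with torch.autograd.profiler.emit_nvtx():
>>>         (x ** 2).sum().backward()  # every autograd op emits an NVTX range here

The resulting trace.prof can then be inspected with nvprof/nvvp, or loaded back into Python with torch.autograd.profiler.load_nvprof(path).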