All Stories

Code, Run, Debug on AutoPilot: Let Your Local Llama Do All Your Heavy Lifting!

AutoGen isn’t just another framework; it marks a revolutionary leap in leveraging Large Language Models (LLMs). Built to empower developers, AutoGen excels in orchestrating multi-agent conversations, where these agents are...

Deep Learning for Graphics Programmers: Performing Tensor Operations with DirectML and Direct3D 12

Comparing SYCL, OpenCL, and CUDA: Matrix Multiplication Example

Intro to DirectX 12 Pipeline

DirectX 12 organizes graphics rendering into pipelines.

The Simple Path to PyTorch Graphs: Dynamo and AOT Autograd Explained

Graph acquisition in PyTorch refers to the process of creating and managing the computational graph that represents a neural network’s operations. This graph is central to PyTorch’s dynamic nature, allowing...

Profiling ResNet Models with PyTorch Profiler for Performance Optimization

In the realm of deep learning, model performance is paramount. Whether you’re working on image classification, object detection, or any other computer vision task, the efficiency of your model can...

Accelerating Deep Learning Inference on Intel Arc 770: ONNX and PyTorch Go Head-to-Head

When deploying deep learning models, the choice of framework can significantly impact performance. PyTorch is a popular choice for its user-friendly interface and dynamic computation graph, but when it comes...

Warmup Wisdom: Accurate PyTorch Benchmarking Made Simple!

In the realm of PyTorch model benchmarking, achieving accurate results is paramount for gauging performance effectively. However, traditional benchmarking often overlooks the initial warmup phase, leading to skewed results. In...

Mastering Frame Rates: Discover the True FPS with PresentMon

PresentMon is a tool used for capturing frame time data during application runtime, which can then be used to calculate frames per second (FPS). Here’s a general process for using...