Articles by Tag #cuda

GPU-Powered Networking: The Future of Blazing-Fast Model Training by Arvind Sundararajan

Are you tired of...

Nov 20

Building a CUDA-Accelerated Neural Network Library in Rust

The library is organized into three crates: corrosive-tensor handles the core tensor operations and...

Oct 4

I Made A Fish Schooling Sim And Honestly It Was Fun As Hell

So yeah I made this fish schooling thing. Literally a bunch of fake fish vibing together on my GPU. I...

Nov 23

Demystifying GPUs: From Core Architecture to Scalable Systems

Table of Contents: Motivation; Optimization goal of GPUs; Key concepts of GPUs - software and...

Jul 20

CUDA Kernel Execution Debugging Journey

Short version: we went from 8/70 passing CUDA tests to a stable, auditable path by fixing NVRTC name...

Sep 4

Modeling Epidemic Spread on Large Graphs Using CUDA

1. Introduction: What if we could predict disease outbreaks not in days, but in minutes?...

Nov 7

Snooping on your GPU: Using eBPF to Build Zero-instrumentation CUDA Monitoring

A deep dive into building a zero-instrumentation GPU monitoring tool using eBPF, complete with memory leak detection and kernel launch tracking.

Dec 22 '24

"A wild goose never laid a tame egg" - I rebuild the Xerxes DDoS Tool

🚀 I Built the Ultimate DoS Tool Using 4x RTX 4090s - And It's 1,200x Faster Than the...

Jul 1

CUDA Series (1/3)

OLCF's CUDA series: 01. CUDA C Basics slide. Host: The CPU and its memory. Device:...

Mar 13

Just finished my GGUF-Shard

I'm happy to announce my latest open-source project, Sharded Suite. It's a sharding + caching system that...

Jul 22

Custom CUDA Kernels Outperforming cuBLAS: Deep Dive into GPU Memory Optimization for Small-Batch ML Workloads

Developed specialized CUDA kernels for financial ML inference that achieve 93,563 operations/second...

Jul 25

Single bash script to install CUDA 12.8 on Ubuntu

As a developer working with NVIDIA GPUs, you know how crucial it is to have the right CUDA toolkit...

Jul 25

CUDA Deep Dive: Demystifying Kernels, Thread Hierarchies, and the GPU Execution Model: P-1

Jun 3

Accelerating OpenCV with CUDA on Jetson Orin NX: A Complete Build Guide

What do we have right out of the box? The NVIDIA Jetson Orin NX is a powerful, community...

Feb 21

Running Nvidia COSMOS on A100 80Gb

Video example: https://youtube.com/shorts/9dOihUzSSho. How to run Nvidia Cosmos on Ubuntu...

Jan 13

NVIDIA CUDA Toolkit 12.8

CUDA (Compute Unified Device Architecture), as you all well know, lets you perform parallel computation on NVIDIA GPUs; it is a programming...

May 6

ROCm RX 6700 XT Installation Guide on Ubuntu 24.04

Overview As AI workloads continue to grow, having proper GPU support is essential. AMD’s...

Sep 14

Implementing DeepSeek-R1 Tool Calls with OpenWebUI and Llama.cpp for Local AI Workflows

The latest advancements in AI technology have brought exciting news for developers and AI...

Feb 1

#Day1 of My Journey to Google

Hi everyone! I'm Reenmayee, a 2nd-year BTech student, and today I’m starting my 3-month learning...

May 1

Global vs Static in C++

Key Differences (Aspect | Global Variable | Static Variable): Scope | Accessible throughout...

Jan 4

OpenMP Data-Sharing Clauses: Differences Explained

1. private Purpose: Each thread gets its own uninitialized copy of the...

Jan 4

CUDA Series (2/3)

6. CUDA Unified Memory = Managed Memory slide lecture subsidiary 6.1...

Apr 22

Building a JS pytorch clone: Performance investigation

One thing that we haven't done is some benchmarking. For this I thought I'd start with a simple...

Mar 24

WSL2 TensorFlow GPU Setup – RTX 4060 + Ubuntu 22.04 + CUDA 12.2 + cuDNN

🛠 Prereqs: WSL2 enabled on Windows; NVIDIA GPU driver (≥ v535) installed on Windows; Ubuntu...

Apr 16

Evolution of GPU Programming

From Smart Pixels to the Backbone of an AI-driven World. Every decade GPUs reinvented...

Sep 3

Building a JS pytorch clone with CUDA

I had actually wanted part 2 to be a WebGPU implementation and started poking at the CUDA version...

Mar 3

Installing TensorFlow 2.19 with GPU support on Fedora 42

This post aims to be a simple guide on how to install TensorFlow 2.19 with GPU support on Fedora...

May 3

Introducing a project called metal like cuda

Introducing a project called metal like cuda where the aim is to bring metal closer to the...

Mar 29

Upcoming update for gpumkat

Testing functionality and more is coming to gpumkat soon https://github.com/MetalLikeCuda/gpumkat

Jul 28

Accelerating Data Processing with Grid Stride Loops in CUDA

As the demand for processing large datasets increases, achieving high performance becomes critical....

Jan 15