Senior AI Performance Engineer (CUDA / GPU / NVIDIA Stack) Job at Brillfy Technology Inc, United States

  • Brillfy Technology Inc
  • United States

Job Description

Job Title: Senior AI Performance Engineer (CUDA / GPU / NVIDIA Stack)

Duration: 12+ months (minimum)

Location: 100% Remote

This is a hands-on engineering role requiring deep expertise in CUDA, GPU architecture, and performance profiling.

Key Responsibilities

  • Profile and optimize AI/ML workloads across multi-GPU and multi-node systems
  • Identify bottlenecks across compute, memory, networking, and orchestration layers
  • Optimize CUDA kernels (memory coalescing, shared memory usage, occupancy tuning)
  • Improve inference performance using TensorRT, Triton, DeepStream, and NeMo
  • Analyze and improve latency, throughput, GPU utilization, and memory efficiency
  • Work on distributed AI systems using Ray, NCCL, and Kubernetes GPU scheduling
  • Build benchmarking frameworks and performance monitoring systems
  • Collaborate with AI, DevOps, and Infrastructure teams for system-wide optimization

Required Skills

  • Strong hands-on CUDA programming and GPU performance optimization
  • Deep understanding of GPU architecture and memory hierarchy
  • Experience with Nsight, CUDA profiling tools, and performance benchmarking
  • Hands-on experience with NVIDIA ecosystem (Triton, TensorRT, NeMo, DeepStream)
  • Experience with distributed AI systems (multi-GPU, multi-node, NCCL, Ray)
  • Experience working with AI models such as YOLO, GPT, LLaMA, and other Transformer-based models
  • Strong understanding of AI system performance metrics (latency, throughput, utilization)

Preferred

  • Experience working at NVIDIA or similar GPU/AI infrastructure companies
  • Experience with real-time video / Vision AI systems
  • Experience with large-scale production AI deployments

Interview Process (Mandatory)

  • Candidates will receive a technical handout one day before the interview
  • 90-minute deep-dive demo discussion (NOT theoretical)
  • Candidates must explain:
      • Bottleneck identification approach
      • GPU optimization strategies
      • System-level performance improvements

Job Tags

Full time, Remote work
