Tech Share

Compiling Neuron model

10 April 2025·1310 words

Tech Share Notes

This post is a step-by-step tutorial on compiling PyTorch models with the AWS Neuron SDK for Inferentia2 (Inf2) instances, using BlipForQuestionAnswering as the example. It covers model wrapping, tracing, and inference, with practical code examples for deployment.

12 May 2021·611 words

Tech Share Notes

This post explains the differences between HTTP-based APIs (REST, polling, streaming, SSE) and WebSocket APIs, using analogies and code samples to illustrate communication models and protocol upgrades.