Tech Share

Compiling Neuron model
·1310 words
Tech Share Notes
This post is a step-by-step tutorial on compiling PyTorch models with the AWS Neuron SDK for Inferentia2 (Inf2) instances, using BlipForQuestionAnswering as the example. It covers model wrapping, tracing, and inference, with practical code examples for deployment.
WebSocket
·611 words
Tech Share Notes
This post explains how HTTP-based APIs (REST, polling, streaming, SSE) differ from WebSocket APIs, using analogies and code samples to illustrate their communication models and the protocol upgrade.