Tech Share
Compiling Neuron model
·1310 words
Tech Share
Notes
This post is a step-by-step tutorial on compiling PyTorch models with the AWS Neuron SDK for Inferentia2 (Inf2) instances, focusing on BlipForQuestionAnswering. It covers model wrapping, tracing, and inference, with practical code examples for deployment.
WebSocket
·611 words
Tech Share
Notes
This post explains the differences between HTTP-based APIs (REST, polling, streaming, SSE) and WebSocket APIs, using analogies and code samples to illustrate communication models and protocol upgrades.