Learn to write performant kernels in Metal Shading Language
Breaking down how torch.compile() optimizes large models