Research Highlights: SparseGPT: Prune LLMs Accurately in One-Shot
A new research paper shows that large generative pretrained transformer (GPT) family models can be pruned to at least 50% sparsity in one shot, without any retraining, with minimal loss of accuracy. This is achieved via a new pruning method called …