    Repositories list

    • lightllm

      Public
      LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
      Python
      Apache License 2.0
      1982.5k616 Updated Oct 18, 2024
    • llmc

      Public
      [EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".
      Python
      Apache License 2.0
      2829060 Updated Oct 18, 2024
    • Token healing implementation in Rust
      Rust
      Apache License 2.0
      0300 Updated Oct 18, 2024
    • Python bindings for general-sam and some utilities
      Python
      Apache License 2.0
      0300 Updated Oct 18, 2024
    • A general suffix automaton implementation in Rust with Python bindings
      Rust
      Apache License 2.0
      0401 Updated Oct 18, 2024
    • EasyLLM

      Public
      Built upon Megatron-Deepspeed and HuggingFace Trainer, EasyLLM has reorganized the code logic with a focus on usability. While enhancing usability, it also ensures training efficiency.
      Python
      Apache License 2.0
      73900 Updated Sep 18, 2024
    • DeepSpeed

      Public
      DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
      Python
      Apache License 2.0
      4.1k000 Updated Sep 13, 2024
    • OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
      Python
      Apache License 2.0
      416100 Updated Sep 6, 2024
    • xtuner

      Public
      An efficient, flexible, and full-featured toolkit for fine-tuning LLMs (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
      Python
      Apache License 2.0
      304000 Updated Aug 22, 2024
    • InternVL

      Public
      [CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable open-source multimodal dialogue model with performance approaching GPT-4o.
      Python
      MIT License
      452000 Updated Aug 16, 2024
    • OmniBal

      Public
      Python
      01520 Updated Aug 9, 2024
    • TFMQ-DM

      Public
      [CVPR 2024 Highlight] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models".
      Jupyter Notebook
      Apache License 2.0
      35300 Updated Aug 1, 2024
    • Python
      Apache License 2.0
      01110 Updated Jun 16, 2024
    • msbench

      Public
      A tool for model sparsification based on torch.fx
      Python
      Apache License 2.0
      1700 Updated Jun 3, 2024
    • MQBench

      Public
      Model Quantization Benchmark
      Shell
      Apache License 2.0
      13775935 Updated Jun 3, 2024
    • FCPTS

      Public template
      Python
      0200 Updated May 14, 2024
    • statecs

      Public
      Rust
      Apache License 2.0
      1100 Updated May 10, 2024
    • Greedily tokenize strings with the longest tokens iteratively.
      Python
      Apache License 2.0
      0000 Updated Mar 27, 2024
    • QLLM

      Public
      [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models"
      Python
      Apache License 2.0
      33410 Updated Mar 11, 2024
    • Dipoorlet

      Public
      Offline quantization tools for deployment.
      Python
      Apache License 2.0
      16113112 Updated Dec 28, 2023
    • Summary of system papers, frameworks, code, and tools for training or serving large models
      Apache License 2.0
      55600 Updated Dec 17, 2023
    • Python
      21800 Updated Nov 29, 2023
    • Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling
      Python
      MIT License
      34000 Updated Oct 21, 2023
    • Python
      1000 Updated Aug 11, 2023
    • ChatGLM-6B: An Open Bilingual Dialogue Language Model
      Python
      Apache License 2.0
      5.2k200 Updated Jun 20, 2023
    • pyvlova

      Public
      Yet another polyhedral compiler for deep learning
      Python
      Apache License 2.0
      41900 Updated Apr 14, 2023
    • HTML
      0000 Updated Apr 3, 2023
    • NART

      Public
      NART = NART is not A RunTime, a deep learning inference framework.
      Python
      Apache License 2.0
      143810 Updated Mar 2, 2023
    • United Perception
      Python
      Apache License 2.0
      65427271 Updated Dec 5, 2022
    • AAAI2023 Efficient and Accurate Models towards Practical Deep Learning Baseline
      01320 Updated Nov 29, 2022