
NVIDIA Neural Modules 2.0.0rc1

Released by @ko3n1g on 15 Aug 21:55 · tag r2.0.0rc1 · commit 579983f

Highlights

Large language models

  • PEFT: QLoRA support; LoRA/QLoRA for Mixture-of-Experts (MoE) dense layers (see the sketch after this list)
  • State Space Models & Hybrid Architecture support (Mamba2 and NV-Mamba2-hybrid)
  • Support for Nemotron, Minitron, Gemma2, Qwen, and RAG
  • Custom Tokenizer training in NeMo
  • Updated the Auto-Configurator for expert parallelism (EP), context parallelism (CP), and FSDP
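
The PEFT features above plug into the fine-tuning entry point of the NeMo 2.0 `llm` collection. Below is a minimal LoRA sketch; the class and argument names (`llm.finetune`, `llm.peft.LoRA`, `nemo.lightning.Trainer`, `MegatronStrategy`, `SquadDataModule`, `Llama2Config7B`) reflect the 2.0 API direction but are illustrative and may differ in this release candidate. QLoRA and the MoE dense-layer support are reached through the same `peft` hook.

```python
# Minimal LoRA fine-tuning sketch against the NeMo 2.0 ``llm`` collection.
# NOTE: all class/argument names below are illustrative assumptions; check the
# NeMo 2.0 docs and recipes for the exact API in this release candidate.
from nemo import lightning as nl
from nemo.collections import llm

trainer = nl.Trainer(
    devices=2,
    max_steps=100,
    accelerator="gpu",
    strategy=nl.MegatronStrategy(tensor_model_parallel_size=2),
    plugins=nl.MegatronMixedPrecision(precision="bf16-mixed"),
)

llm.finetune(
    model=llm.LlamaModel(llm.Llama2Config7B()),      # base model to adapt
    data=llm.SquadDataModule(seq_length=2048, micro_batch_size=1),
    trainer=trainer,
    peft=llm.peft.LoRA(dim=8, alpha=16),             # low-rank adapter config
    log=None,
)
```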

Multimodal

  • NeVA: Added SOTA LLM backbone support (Mixtral/LLaMA3) and a suite of model parallelism options (PP/EP)
  • Support for Language Instructed Temporal-Localization Assistant (LITA) on top of video NeVA

ASR

  • SpeechLM and SALM
  • Adapters for Canary Customization
  • PyTorch allocator in PyTorch 2.2 improves training speed by up to 30% for all ASR models
  • CUDA Graphs for Transducer Inference
  • Replaced WebDataset with Lhotse, giving up to a 2x speedup
  • Transcription improvements: speedups and QoL changes (see the usage sketch after this list)
  • ASR Prompt Formatter for multimodal Canary
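
The transcription speedups land in the standard `transcribe` path. A minimal usage sketch follows; the checkpoint name and keyword arguments are illustrative (older releases take `paths2audio_files=` instead of a positional list of files).

```python
# Quick transcription sketch with a pretrained NeMo ASR model.
# The model name and argument names are illustrative; see the ASR docs for the
# exact signature in this release.
import nemo.collections.asr as nemo_asr

# Download a pretrained checkpoint (from NGC / Hugging Face).
asr_model = nemo_asr.models.ASRModel.from_pretrained(model_name="nvidia/parakeet-ctc-1.1b")

# Batch transcription over local audio files.
transcripts = asr_model.transcribe(["sample1.wav", "sample2.wav"], batch_size=2)
print(transcripts)
```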

Export & Deploy

  • In-framework PyTriton deployment with PyTorch, vLLM, and TRT-LLM (updated to 0.10) backends (sketch after this list)
  • TRT-LLM C++ runtime
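
A rough end-to-end sketch of the in-framework path: export a `.nemo` checkpoint to a TRT-LLM engine, then serve it behind PyTriton. Module paths and argument names follow the export/deploy documentation pattern but should be treated as assumptions for this release candidate.

```python
# Sketch: export a .nemo checkpoint to TensorRT-LLM and serve it via PyTriton.
# Import paths and argument names vary between releases (e.g. n_gpus vs
# tensor_parallelism_size); treat this as illustrative, not authoritative.
from nemo.deploy import DeployPyTriton
from nemo.export import TensorRTLLM

exporter = TensorRTLLM(model_dir="/tmp/trt_llm_engine")
exporter.export(
    nemo_checkpoint_path="/models/llama3-8b.nemo",  # hypothetical checkpoint path
    model_type="llama",
    n_gpus=1,
)

# Serve the exported engine behind a PyTriton endpoint.
server = DeployPyTriton(model=exporter, triton_model_name="llama3", port=8000)
server.deploy()
server.serve()
```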

Detailed Changelogs

ASR


TTS


LLM/Multimodal


Export


Bugfixes

Changelog
  • use get with fallback when reading checkpoint_callback_params by @akoumpa :: PR: #9223
  • fix import by @akoumpa :: PR: #9240
  • Remove .nemo instead of renaming by @mikolajblaz :: PR: #9281
  • call set_expert_model_parallel_world_size instead of set_cpu_expert_m… by @akoumpa :: PR: #9275
  • Fix typos in Mixtral NeMo->HF and Starcoder2 NeMo->HF conversion scripts by @evellasques :: PR: #9325
  • Skip sequence_parallel allreduce when using Mcore DistOpt by @akoumpa :: PR: #9344
  • Add OpenAI format response to r2.0.0rc1 by @athitten :: PR: #9796
  • [NeMo UX] Support generating datasets using different train/valid/test distributions by @ashors1 :: PR: #9771
  • Add missing imports for torch dist ckpt in export by @oyilmaz-nvidia :: PR: #9826

General Improvements
