
NVIDIA Neural Modules 2.0.0rc1

Released by @ko3n1g on 15 Aug 21:55 · tag r2.0.0rc1 · commit 579983f

Highlights

Large language models

  • PEFT: QLoRA support; LoRA/QLoRA for Mixture-of-Experts (MoE) dense layers (see the sketch after this list)
  • State Space Models & Hybrid Architecture support (Mamba2 and NV-Mamba2-hybrid)
  • Support for Nemotron, Minitron, Gemma2, Qwen, and RAG
  • Custom Tokenizer training in NeMo
  • Updated the Auto-Configurator for expert parallelism (EP), context parallelism (CP), and FSDP
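
The PEFT features above plug into the fine-tuning entry point of the NeMo 2.0 `llm` collection. Below is a minimal LoRA sketch; the class and argument names (`llm.finetune`, `llm.peft.LoRA`, `nemo.lightning.Trainer`, `MegatronStrategy`, `SquadDataModule`, `Llama2Config7B`) reflect the 2.0 API direction but are illustrative and may differ in this release candidate. QLoRA and the MoE dense-layer support are reached through the same `peft` hook.

```python
# Minimal LoRA fine-tuning sketch against the NeMo 2.0 ``llm`` collection.
# NOTE: all class/argument names below are illustrative assumptions; check the
# NeMo 2.0 docs and recipes for the exact API in this release candidate.
from nemo import lightning as nl
from nemo.collections import llm

trainer = nl.Trainer(
    devices=2,
    max_steps=100,
    accelerator="gpu",
    strategy=nl.MegatronStrategy(tensor_model_parallel_size=2),
    plugins=nl.MegatronMixedPrecision(precision="bf16-mixed"),
)

llm.finetune(
    model=llm.LlamaModel(llm.Llama2Config7B()),      # base model to adapt
    data=llm.SquadDataModule(seq_length=2048, micro_batch_size=1),
    trainer=trainer,
    peft=llm.peft.LoRA(dim=8, alpha=16),             # low-rank adapter config
    log=None,
)
```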

Multimodal

  • NeVA: Added SOTA LLM backbone support (Mixtral/LLaMA3) and a suite of model parallelism options (PP/EP)
  • Support for Language Instructed Temporal-Localization Assistant (LITA) on top of video NeVA

ASR

  • SpeechLM and SALM
  • Adapters for Canary Customization
  • PyTorch allocator in PyTorch 2.2 improves training speed by up to 30% for all ASR models
  • CUDA Graphs for Transducer Inference
  • Replaced WebDataset with Lhotse, giving up to a 2x speedup
  • Transcription improvements: speedups and QoL changes (see the usage sketch after this list)
  • ASR Prompt Formatter for multimodal Canary
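
The transcription speedups land in the standard `transcribe` path. A minimal usage sketch follows; the checkpoint name and keyword arguments are illustrative (older releases take `paths2audio_files=` instead of a positional list of files).

```python
# Quick transcription sketch with a pretrained NeMo ASR model.
# The model name and argument names are illustrative; see the ASR docs for the
# exact signature in this release.
import nemo.collections.asr as nemo_asr

# Download a pretrained checkpoint (from NGC / Hugging Face).
asr_model = nemo_asr.models.ASRModel.from_pretrained(model_name="nvidia/parakeet-ctc-1.1b")

# Batch transcription over local audio files.
transcripts = asr_model.transcribe(["sample1.wav", "sample2.wav"], batch_size=2)
print(transcripts)
```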

Export & Deploy

  • In-framework PyTriton deployment with PyTorch, vLLM, and TRT-LLM (updated to 0.10) backends (sketch after this list)
  • TRT-LLM C++ runtime
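
A rough end-to-end sketch of the in-framework path: export a `.nemo` checkpoint to a TRT-LLM engine, then serve it behind PyTriton. Module paths and argument names follow the export/deploy documentation pattern but should be treated as assumptions for this release candidate.

```python
# Sketch: export a .nemo checkpoint to TensorRT-LLM and serve it via PyTriton.
# Import paths and argument names vary between releases (e.g. n_gpus vs
# tensor_parallelism_size); treat this as illustrative, not authoritative.
from nemo.deploy import DeployPyTriton
from nemo.export import TensorRTLLM

exporter = TensorRTLLM(model_dir="/tmp/trt_llm_engine")
exporter.export(
    nemo_checkpoint_path="/models/llama3-8b.nemo",  # hypothetical checkpoint path
    model_type="llama",
    n_gpus=1,
)

# Serve the exported engine behind a PyTriton endpoint.
server = DeployPyTriton(model=exporter, triton_model_name="llama3", port=8000)
server.deploy()
server.serve()
```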

Detailed Changelogs

ASR


TTS


LLM/Multimodal


Export


Bugfixes

Changelog
  • use get with fallback when reading checkpoint_callback_params by @akoumpa :: PR: #9223
  • fix import by @akoumpa :: PR: #9240
  • Remove .nemo instead of renaming by @mikolajblaz :: PR: #9281
  • call set_expert_model_parallel_world_size instead of set_cpu_expert_m… by @akoumpa :: PR: #9275
  • Fix typos in Mixtral NeMo->HF and Starcoder2 NeMo->HF conversion scripts by @evellasques :: PR: #9325
  • Skip sequence_parallel allreduce when using Mcore DistOpt by @akoumpa :: PR: #9344
  • Add OpenAI format response to r2.0.0rc1 by @athitten :: PR: #9796
  • [NeMo UX] Support generating datasets using different train/valid/test distributions by @ashors1 :: PR: #9771
  • Add missing imports for torch dist ckpt in export by @oyilmaz-nvidia :: PR: #9826

General Improvements
