WIP: bitnet.c – a zero-dependency BitNet implementation in C

This is my attempt to implement neural network training and inference with the BitLinear layer from the BitNet paper, written from scratch in C for learning purposes. The long-term goal is to work towards an implementation of a smaller version of the LLaMA architecture. The repo also implements inference for a BPE tokenizer trained with the tiktoken library.

To keep things concise, the source files for layers, data structures, and other utilities are implemented as single-header libraries.
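
In practice this means a program just includes the headers it needs and is compiled as a single translation unit. A hypothetical example (header names other than tokenizer.h are assumptions, not the repo's actual layout):

/* hypothetical usage of the single-header libraries */
#include "tokenizer.h"        /* BPE tokenizer inference (file listed in the project structure) */
#include "layers/bitlinear.h" /* assumed header name for the BitLinear layer */

int main(void) {
    /* ... load the tokenizer/model and call the *_fwd and *_bkwd functions ... */
    return 0;
}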

Usage

Training

The training program initializes a new model and trains it on the specified dataset. For example:

gcc mnist_train.c -o train_mnist -lm
./train_mnist
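
The roadmap below includes parallelising the code with OpenMP; if those paths are enabled, the same program would typically be built with GCC's -fopenmp flag (an assumption about the build, not a documented command):

gcc -fopenmp mnist_train.c -o train_mnist -lm
./train_mnist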

Project Structure

├── experiments/    # miscellaneous programs used to test ideas
├── layers/         # source files for layers of the LLM
├── utils/          # utility functions (data structures, matrix functions, dataloaders, etc.)
├── tests/          # unit tests for various libraries and functions
├── tokenizer.h     # single header library for inference on BPE tokenizer
└── mnist_bitmlp.c  # train and test bit multi layer perceptron on MNIST dataset

Some conventions

Function names for layers contain a suffix corresponding to their forward and backward passes.

  • _fwd – forward pass
  • _bkwd – backpropagation

Gradient variables are prefixed with d, e.g. the gradient of a layer's output is dy. Additionally, quantised variables have a q suffix, e.g. the quantised activations are xq.
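
To illustrate, a layer's interface following these conventions might look like the following (hypothetical signatures, not the actual declarations in layers/):

/* forward pass: quantise activations x into xq and compute the output y */
void bitlinear_fwd(float *y, int8_t *xq, const float *x, const float *w, int in_dim, int out_dim);

/* backward pass: given dy, the gradient of the output, accumulate dx and dw */
void bitlinear_bkwd(float *dx, float *dw, const float *dy, const float *x, const float *w, int in_dim, int out_dim);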

Roadmap

  • BitLinear implementation
    • RMSNorm layer
    • BitLinear layer
      • Bit matrix multiplications
      • GELU activation
      • Weight and activation quantisation/dequantisation functions (see the sketch after this roadmap)
    • BitLinear MLP Block
    • Cross entropy loss implementation
    • Training weight initialisation and allocation
    • AdamW optimiser implementation
    • Training loop on MNIST dataset for BitMLP
    • Train a multilayer perceptron classifier for the MNIST dataset
    • Parallelize code using OpenMP
  • Tokenizer implementation
    • Loading tokenizer from file
    • Base64 decoding
    • Hashtable implementation
    • PriorityQueue implementation
    • Encode text to input ids using tokenizer
    • Decode input ids to text using tokenizer
    • Verify correctness of tokenizer implementation on sample corpus
  • BitNet transformer implementation
    • Token embedding layer
    • Grouped query attention block
    • Forward and backward pass for BitNet architecture
    • Dataloader implementation
    • Saving and loading model weights
    • Training loop implementation
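
As background for the quantisation items in the BitLinear section above: in the BitNet b1.58 variant, weights are quantised to {-1, 0, +1} by scaling with their mean absolute value (absmean) and rounding, while activations are quantised to 8 bits with an absmax scale. Below is a minimal, self-contained sketch of the weight half of that scheme, assuming this is the variant being implemented (names and signatures are illustrative, not the repo's API):

#include <math.h>
#include <stdint.h>

/* Absmean ternary weight quantisation (BitNet b1.58-style sketch).
 * Writes wq[i] in {-1, 0, +1} and returns the scale used for dequantisation. */
static float weight_quant(int8_t *wq, const float *w, int n) {
    float gamma = 0.0f;
    for (int i = 0; i < n; i++) gamma += fabsf(w[i]);
    gamma = gamma / n + 1e-6f;          /* mean absolute value, epsilon for stability */
    for (int i = 0; i < n; i++) {
        float v = roundf(w[i] / gamma); /* scale, then round to the nearest integer */
        if (v > 1.0f) v = 1.0f;         /* clip to the ternary range {-1, 0, +1} */
        if (v < -1.0f) v = -1.0f;
        wq[i] = (int8_t)v;
    }
    return gamma;                       /* dequantise with w[i] ≈ gamma * wq[i] */
}

Dequantisation is then just a multiplication of wq by the returned scale.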
