GitHub - saifkhanali9/causal-shapley

Running the code:

Run the code by pressing the play button.
You can configure a run by changing the very last line of causal_shaplye.py:
- main(version='4', file_name='synthetic_discrete_2', local_shap=15, is_classification=True, global_shap=False)
The file_name argument identifies the dataset. The complete dataset used for Shapley value computation is the csv file output/dataset/file_name.csv, while the causal structure of the data is located at output/dataset/file_name/causal_struct.json.
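The file layout described above can be captured in a small helper. This is a minimal sketch, not code from the repository: dataset_paths and load_causal_struct are hypothetical names, and the only assumption is the pair of paths stated above.

```python
import json
from pathlib import Path


def dataset_paths(file_name, root="output/dataset"):
    """Return (csv_path, causal_struct_path) for a dataset name.

    Hypothetical helper: it only encodes the layout described in this README.
    """
    base = Path(root)
    return base / f"{file_name}.csv", base / file_name / "causal_struct.json"


def load_causal_struct(path):
    """Read causal_struct.json and convert its string keys to integer feature ids."""
    with open(path) as fh:
        raw = json.load(fh)
    return {int(feature): list(parents) for feature, parents in raw.items()}
```

For example, dataset_paths("synthetic_discrete_2") returns the csv path and the causal_struct.json path for the dataset used in the main(...) call above.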
The version argument specifies which version of the Shapley value to run. There are four versions at the moment:
- a) version='1' -> Marginal Shapley value
- b) version='2' -> Marginal Shapley value (optimised version, i.e. the counts of all unique rows of the dataset are precalculated)
- c) version='3' -> Conditional Shapley value
- d) version='4' -> Causal Shapley value
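For orientation, in the usual formulation these variants share the same Shapley weighting and differ in how the value of a feature subset is estimated. The sketch below is not the repository's implementation: it computes exact Shapley values by enumerating subsets, for a hypothetical value function value_fn supplied by the caller, and is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(n_features, value_fn):
    """Exact Shapley values for a value function over feature subsets.

    value_fn takes a frozenset of feature ids and returns a number.
    Cost grows as 2**n_features, so this is only for small examples.
    """
    features = range(n_features)
    phi = [0.0] * n_features
    for i in features:
        others = [j for j in features if j != i]
        for size in range(len(others) + 1):
            # Shapley weight |S|! (n - |S| - 1)! / n! for subsets of this size.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to subset s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


# Toy additive game (illustrative only): each feature adds a fixed amount.
contrib = {0: 1.0, 1: 2.0, 2: 3.0}
phi = shapley_values(3, lambda s: sum(contrib[j] for j in s))
```

In an additive game like this one, each feature's Shapley value equals its own contribution, which makes it a convenient sanity check.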

Run synthetic_data_gen.py
- Uncomment gen_desc() to generate a discrete dataset. Modify the _add_features() method to add causality to the dataset.
- For continuous data, use gen_dataset().
- In both cases, supply a file name. A csv file is created at output/dataset/file_name.csv
Train the model
- Run train(model_type='classification',file_name='synthetic_discrete_2', save_model=True) with the relevant arguments. It creates the folder output/dataset/file_name, under which the train and test files are stored.
- Manually create a json file named causal_struct.json under output/dataset/file_name, with the syntax
  - { "0": [ ], "1": [ ], "2": [ 0, 1 ], "3": [ 0, 1, 2 ] }
  - Each key is a feature_id and each value is the list of parents of that feature_id
- The model is saved as output/model/file_name.sav
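Instead of writing causal_struct.json by hand, the file can also be produced from a Python dict. The sketch below is an illustration, not part of the repository: write_causal_struct and is_acyclic are hypothetical names, and the cycle check assumes the causal structure is meant to be a directed acyclic graph.

```python
import json
from pathlib import Path


def is_acyclic(structure):
    """Return True if the map feature_id -> list of parent ids has no cycle."""
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "done":
            return True
        if state.get(node) == "visiting":
            return False  # reached a node that is still on the stack: cycle
        state[node] = "visiting"
        ok = all(visit(parent) for parent in structure.get(node, []))
        state[node] = "done"
        return ok

    return all(visit(node) for node in structure)


def write_causal_struct(file_name, structure, root="output/dataset"):
    """Write causal_struct.json under <root>/<file_name>/ and return its path."""
    if not is_acyclic(structure):
        raise ValueError("causal structure contains a cycle")
    target = Path(root) / file_name / "causal_struct.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    # JSON object keys must be strings, matching the syntax shown above.
    payload = {str(k): list(v) for k, v in structure.items()}
    target.write_text(json.dumps(payload, indent=2))
    return target
```

Calling write_causal_struct('synthetic_discrete_2', {0: [], 1: [], 2: [0, 1], 3: [0, 1, 2]}) would write the example structure shown above to output/dataset/synthetic_discrete_2/causal_struct.json.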
