mstrielnikov/score-standardization-spark

jooble-test

PySpark module that calculates standardization (z-score normalization) from train and test CSV input files.

Z-score calculation
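
For reference, the z-score of a value x is (x - mean) / std. The sketch below shows that arithmetic in plain Python rather than PySpark, so it is not the module's actual code. It assumes the mean and standard deviation are computed per feature column on the train data and then applied to the test data, and that the population standard deviation is used; the README confirms neither. The function names `fit_column_stats` and `standardize` are hypothetical.

```python
from statistics import fmean, pstdev


def fit_column_stats(train_rows):
    """Return a (mean, std) pair per feature column, computed from train rows.

    Assumption: population standard deviation (pstdev). The repository may
    use the sample standard deviation instead.
    """
    columns = list(zip(*train_rows))  # transpose rows into columns
    return [(fmean(col), pstdev(col)) for col in columns]


def standardize(test_rows, stats):
    """Apply z = (x - mean) / std per column, using the train statistics.

    A column with zero standard deviation is mapped to 0.0 to avoid
    division by zero (an illustrative choice, not taken from the source).
    """
    return [
        [(x - mean) / std if std else 0.0 for x, (mean, std) in zip(row, stats)]
        for row in test_rows
    ]


# Tiny made-up example: two feature columns, three train rows, one test row.
train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
test = [[2.0, 40.0]]

stats = fit_column_stats(train)
z = standardize(test, stats)
print(z)
```

Because the statistics come from the train set, standardized test values are not guaranteed to have zero mean or unit variance, which fits the wide range of values in the sample output below.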

File with full answer

Sample of processed data:

+----+--------------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+
|  id|              id_job|feature_type_1_stand_0|feature_type_1_stand_1|feature_type_1_stand_2|feature_type_1_stand_3|feature_type_1_stand_4|feature_type_1_stand_5|feature_type_1_stand_6|feature_type_1_stand_7|feature_type_1_stand_8|feature_type_1_stand_9|
+----+--------------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+
|1000|-6241722208723555...|        1.874119|       1.6593564|       0.8987508|       1.1503949|       1.0942013|      0.68286616|      0.90509194|       1.2402406|        1.580777|       1.3857063|
|1001|-5096317892853693...|      -0.5670121|     -0.57688075|      -1.2766525|       -1.335712|       0.3332974|      -0.5454587|      -5.0053487|      0.11135264|      -1.7098045|      -0.2194992|
|1002|29967041948702897681|      -1.0181122|      0.13161355|       0.7485238|      0.74760365|     -0.30442262|       0.2719799|       0.6487112|       0.7961051|     0.056004073|       0.5310727|
|1003|-5441291940773558...|      0.70644784|       1.0294902|      0.68424404|       0.9197511|       0.7333715|       0.3716675|       0.6505363|       1.2201006|      0.17830738|       1.2971437|
|1004|36626971604780743751|     -0.30054343|     0.064822346|      -1.7576681|      -1.1777712|      -0.3382161|    -0.106833324|      -6.4213724|     -0.10700457|      -1.4053856|     0.036226884|
|1005|20382130652641948441|      -1.4063373|      -1.0613286|       0.3498444|      0.49439707|      -4.6779833|    0.0032566471|      0.36313412|      -1.1203536|      0.14706217|       0.0860433|
|1006|27668848362618962891|      0.27131617|       1.0328721|       0.7738022|       1.1269964|       0.6450721|     -0.05828949|       0.8184155|       1.3197397|       0.6505587|       1.0491669|
|1007|-1458668781782492...|       1.7214233|       1.4293919|       1.1414255|       1.5189236|      0.15560827|      0.60484976|       1.0538112|       1.4373983|       1.5727428|        1.570582|
|1008|-3330932239266492...|       0.7403804|      0.35735062|       1.0359774|       1.3058287|       0.8511045|      0.27978128|      0.86494684|       1.3070196|       0.9487287|       1.1178035|
|1009|-5769748607494874...|       -0.913321|     -0.56927186|      -1.0975356|      -1.3616179|     0.006261757|      -0.3018741|      -1.5081708|       0.0742532|      -2.5034368|      -0.3357384|
|1010|65514535298139862831|      0.23937993|      0.75640714|      -0.6302431|      -1.0106381|      0.72247046|     -0.17704783|       -0.489035|      0.71766615|      -1.7026632|       0.6340272|
|1011|42392514306665462181|      -1.6109293|      -0.8169909|      0.29784268|       0.2846449|      -1.4283363|       -0.309676|      0.21167837|      -0.3539818|     -0.17074777|      0.10154177|
|1012|82248639158819132711|      0.09067627|      0.62958854|       0.5672402|       0.7726738|     0.028063875|       -4.909175|       0.4853941|      0.60000753|      0.29971784|      0.98274475|
|1013|-6353107627924350921|      -0.7466532|     -0.88378215|       0.5354613|      0.09829126|     -0.49410313|      0.18096067|       0.6377629|    -0.115484625|       0.3961321|      -1.1682309|
|1014|-1611141040193629191|      -1.7815892|     -0.25983414|      0.15772705|      -0.2317969|     -0.32731506|     -0.22472467|     -0.07663579|     -0.72921807|      -0.6429991|     -0.62688965|
|1015|13870754363486088161|      0.28628582|      0.42667812|        1.160926|       1.5030463|      -1.1677972|       0.5927139|       1.1167654|     -0.11760495|        1.266538|        1.105626|
|1016| 3165019896624897921|       1.6435788|       1.5773469|       1.1977606|        1.644274|       0.5687634|       0.5216319|       1.1185905|       1.7628151|       1.4441905|       1.8905158|
|1017|-1841292869397685...|    -0.014114834|    -0.026487194|      -1.2549853|     -0.71230507|     -0.24337554|    -0.034884825|     -0.82023084|      0.10181305|      -1.1179285|      0.30412978|
|1018|39464197741528034281|     -0.99715406|      0.17895901|       0.9175293|       1.1495596|    -0.020991214|       0.3205232|      0.81111634|      0.20463194|      0.67555493|       1.3071067|
|1019|-8357176701973689...|       -0.728689|      -0.3114071|       0.3794562|       0.4083231|      -1.3051524|       -5.244645|      0.19251779|      0.62544703|     -0.09218829|      0.46243614|
+----+--------------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+
only showing top 20 rows