
Model hyperparameter tuning #933

Closed
therealnb opened this issue Feb 5, 2025 · 2 comments
@therealnb
Contributor

We need to explore the dimensions and learning parameters of the ANN and optimise it.

@poppysec
Member

poppysec commented Feb 5, 2025

I've started a notebook here - nn_hyperparameters

Using Optuna optimisation with 50 trials:

```
Best parameters found via Optuna: {'hidden_dim': 170, 'learning_rate': 0.0014823754382745518, 'epochs': 83, 'test_size': 0.200317595018103, 'random_state': 42}
Best validation accuracy: 0.9928516281300459
```
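The notebook itself isn't reproduced here. As a rough illustration of the search involved, here is a pure-stdlib random-search sketch over a comparable parameter space; the ranges are assumptions inferred from the reported best trial, and the actual notebook uses Optuna's sampler rather than random search:

```python
import random

# Assumed search ranges, inferred from the reported best trial;
# the real notebook may use different bounds and Optuna's TPE sampler.
def sample_params(rng: random.Random) -> dict:
    return {
        "hidden_dim": rng.randint(32, 256),
        "learning_rate": 10 ** rng.uniform(-4, -2),  # log-uniform
        "epochs": rng.randint(20, 100),
        "test_size": rng.uniform(0.1, 0.3),
        "random_state": 42,
    }

def random_search(n_trials: int, evaluate, seed: int = 0) -> dict:
    """Sample n_trials parameter sets and return the highest-scoring one."""
    rng = random.Random(seed)
    trials = [sample_params(rng) for _ in range(n_trials)]
    return max(trials, key=evaluate)
```

With a real `evaluate` callback that trains the ANN and returns validation accuracy, `random_search(50, evaluate)` mirrors the 50-trial study above.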
Using the MiniLM-L6 ANN with these parameters yields:

```
{'true_positive': 176,
 'false_positive': 38,
 'true_negative': 81,
 'false_negative': 16,
 'total_time': 12.742049217224121,
 'mean_time': 0.041103384571690715,
 'max_time': 0.37363195419311523,
 'min_time': 0.006579160690307617,
 'count': 310}

{'sensitivity': 0.9166666666666666,
 'specificity': 0.680672268907563,
 'precision': 0.822429906542056,
 'recall': 0.9166666666666666,
 'f1_score': 0.8669950738916257}
```
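For reference, the second dict follows directly from the confusion-matrix counts in the first; a small helper reproduces the reported values:

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # a.k.a. recall / true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "recall": sensitivity,
        "f1_score": f1,
    }

# Counts reported for the MiniLM-L6 ANN above
print(classification_metrics(tp=176, fp=38, tn=81, fn=16))
```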

@poppysec poppysec self-assigned this Feb 5, 2025
@poppysec
Member

poppysec commented Mar 5, 2025

See the most up-to-date tuning notebook here, using the LlamaCPP MiniLM version for consistency with CodeGate: nn_hyperparameters-llamacpp.ipynb

Training on Linux and macOS command data only (n=6087), and extending the test dataset with further synthetic commands to n=699.

Training set:

| os    | category | count |
|-------|----------|------:|
| linux | bad      | 1529  |
| linux | good     | 437   |
| macos | good     | 2494  |
| macos | bad      | 1627  |

Test set:

| category | count |
|----------|------:|
| bad      | 419   |
| good     | 280   |

Note: the test set is not labelled by OS. I should add those labels, but for now treating all test commands as Linux/macOS should be a reasonable approximation given the similarity between the two.

The best trial hyperparameters were:

```
{'hidden_dim': 80, 'learning_rate': 0.000275870479272048, 'epochs': 62}
Final validation loss: 0.0978560414854679
```
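The `hidden_dim` parameter sets the width of the network's single hidden layer. As a pure-Python sketch of the implied architecture (the 384-dim input matching MiniLM-L6 embeddings and the 2-class output are assumptions about the notebook; the weights here are random placeholders, not the trained model):

```python
import math
import random

IN_DIM, HIDDEN_DIM, OUT_DIM = 384, 80, 2  # 384 = MiniLM-L6 embedding size (assumed)

# Placeholder weights; the trained model would load these from nn-0503.pt
rng = random.Random(0)
w1 = [[rng.gauss(0, 0.05) for _ in range(IN_DIM)] for _ in range(HIDDEN_DIM)]
w2 = [[rng.gauss(0, 0.05) for _ in range(HIDDEN_DIM)] for _ in range(OUT_DIM)]

def forward(x: list) -> list:
    """embedding -> hidden layer (ReLU) -> logits -> softmax probabilities."""
    h = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w1]
    logits = [sum(w * hi for w, hi in zip(row, h)) for row in w2]
    m = max(logits)                              # subtract max for stability
    exps = [math.exp(l - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]
```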

```
True Positives (TP): 259
False Positives (FP): 99
True Negatives (TN): 320
False Negatives (FN): 21
```

🔍 Model Evaluation on Test Set:

```
Accuracy: 0.8283
Precision: 0.7235
Recall: 0.9250
F1 Score: 0.8119
```
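These four figures can be reproduced from the confusion counts above (rounded to four decimal places):

```python
def eval_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy/precision/recall/F1 from confusion-matrix counts, rounded to 4 d.p."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {k: round(v, 4) for k, v in
            {"accuracy": accuracy, "precision": precision,
             "recall": recall, "f1_score": f1}.items()}

print(eval_metrics(tp=259, fp=99, tn=320, fn=21))
# {'accuracy': 0.8283, 'precision': 0.7235, 'recall': 0.925, 'f1_score': 0.8119}
```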

I've added this model to the repository in ONNX (nn-0503.onnx) and PyTorch (nn-0503.pt) formats.

I'll close this out for now.

@poppysec poppysec closed this as completed Mar 5, 2025