[Open Source] A Decision-Support Tool for Assisted Malware Triage using EMBER2024, SHAP, and MCP Orchestration

Medium
Published: 06/29/2026, 07:40:12 UTC
Source: Reddit Cybersecurity

Description

This is an open-source decision-support tool designed to assist human analysts in triaging Windows malware samples. It integrates multiple analysis techniques including static feature extraction (EMBER2024), SHAP explanations for model interpretability, reverse engineering tools, threat intelligence lookups, and optional sandbox execution to generate draft reports. The tool is an early prototype focused on improving the practical usefulness of model explanations by mapping features to concrete artifacts and indicating confidence levels. It is not itself malware or a vulnerability but a research project to aid malware analysis.

Reddit Discussion

r/cybersecurity · posted by u/Altruistic_Crab5125

Hi everyone,

We’re a group of students and this is a project we built for one of our courses. It’s an open-source assisted triage system for Windows malware:

https://github.com/zenniskayy2k4/xAI-in-Malware-Detection

The LLM works as an orchestrator in the pipeline. It coordinates evidence collection from static analysis, LightGBM scoring using EMBER2024 features, SHAP explanations, compiler-aware reverse engineering (Ghidra, dnSpy, etc.), threat intelligence lookups (VirusTotal, OTX, MalwareBazaar), a local HybridRAG knowledge base, and optional CAPE sandbox execution. The output is a draft report for a human analyst to review.
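The orchestration described above can be pictured as a set of evidence collectors, each of which may succeed, fail, or be skipped, feeding one draft report. The following is a minimal sketch under that assumption, not the repository's code: the `Evidence` type, the collector names, and the fake findings are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Evidence:
    source: str                      # hypothetical label, e.g. "static" or "sandbox"
    status: str                      # "ok", "failed", or "skipped"
    findings: List[str] = field(default_factory=list)
    error: Optional[str] = None


def collect_evidence(
    sample_path: str,
    collectors: Dict[str, Callable[[str], List[str]]],
    disabled: frozenset = frozenset(),
) -> List[Evidence]:
    """Run every collector, recording failures instead of aborting the run."""
    results = []
    for name, collector in collectors.items():
        if name in disabled:
            results.append(Evidence(name, "skipped"))
            continue
        try:
            results.append(Evidence(name, "ok", collector(sample_path)))
        except Exception as exc:  # best-effort steps can fail on some samples
            results.append(Evidence(name, "failed", error=str(exc)))
    return results


def draft_report(evidence: List[Evidence]) -> str:
    """Render a plain-text draft that keeps failed and skipped steps visible."""
    lines = ["DRAFT (for analyst review)"]
    for ev in evidence:
        detail = "; ".join(ev.findings) if ev.findings else (ev.error or "-")
        lines.append(f"[{ev.status}] {ev.source}: {detail}")
    return "\n".join(lines)


# Stand-in collectors with made-up outputs, for illustration only.
def fake_static(path: str) -> List[str]:
    return ["imports VirtualAlloc", "high-entropy section"]


def fake_decompile(path: str) -> List[str]:
    raise RuntimeError("unpacking failed")


def fake_sandbox(path: str) -> List[str]:
    return ["no network activity observed"]


evidence = collect_evidence(
    "sample.exe",
    {"static": fake_static, "decompile": fake_decompile, "sandbox": fake_sandbox},
    disabled=frozenset({"sandbox"}),
)
print(draft_report(evidence))
```

Keeping failed and skipped steps in the report, rather than dropping them, lets the reviewing analyst see which sources the draft actually rests on.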

We focused on making the model explanations more useful in practice. Instead of just showing raw SHAP values, we try to map influential features back to concrete artifacts in the sample (imports, APIs, sections, strings, entropy, etc.) and indicate how reliable each mapping is. Feature hashing in EMBER-style models means some mappings stay ambiguous, so we tried to surface that uncertainty clearly.
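To illustrate why hashed features stay ambiguous, here is a small sketch that hashes a sample's import names into buckets and labels how reliably each bucket maps back to an artifact. It rests on several assumptions: the SHA-256 based hash, the bucket count of 8, the import list, and the SHAP value are invented for illustration, and the real feature extractor uses its own hash function and a much larger bucket count.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List

N_BUCKETS = 8  # deliberately tiny so collisions show up; an assumption


def bucket_of(token: str, n_buckets: int = N_BUCKETS) -> int:
    """Stable stand-in hash (not the hash a real EMBER-style extractor uses)."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "little") % n_buckets


def map_buckets_to_artifacts(artifacts: List[str]) -> Dict[int, List[str]]:
    """Group the artifacts observed in one sample by the bucket they hash to."""
    by_bucket = defaultdict(list)
    for artifact in artifacts:
        by_bucket[bucket_of(artifact)].append(artifact)
    return dict(by_bucket)


def explain_bucket(bucket: int, shap_value: float,
                   by_bucket: Dict[int, List[str]]) -> dict:
    """Attach candidate artifacts and a reliability label to one hashed feature."""
    candidates = by_bucket.get(bucket, [])
    if len(candidates) == 1:
        reliability = "unique"      # only one artifact in this sample hashes here
    elif candidates:
        reliability = "ambiguous"   # several artifacts share the bucket
    else:
        reliability = "unresolved"  # nothing observed in the sample hashes here
    return {"bucket": bucket, "shap": shap_value,
            "candidates": candidates, "reliability": reliability}


# Hypothetical import list; nine names into eight buckets forces a collision.
imports = [
    "kernel32.dll:VirtualAlloc", "kernel32.dll:CreateFileW",
    "kernel32.dll:WriteProcessMemory", "kernel32.dll:CreateRemoteThread",
    "advapi32.dll:RegSetValueExW", "ws2_32.dll:connect",
    "wininet.dll:InternetOpenA", "user32.dll:SetWindowsHookExW",
    "ntdll.dll:NtUnmapViewOfSection",
]
by_bucket = map_buckets_to_artifacts(imports)
for bucket in sorted(by_bucket):
    print(explain_bucket(bucket, shap_value=0.1, by_bucket=by_bucket))
```

Even a "unique" label only means one observed artifact fits the bucket; the model may have learned that bucket from different tokens in its training data.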

Current limitations:

This is still an early prototype. Sandbox results are often inconclusive because of evasion techniques, missing arguments, or environment triggers. Decompilation and unpacking are best-effort and can fail on certain samples. The model can still over-interpret weak or conflicting evidence. We used a conservative calibration approach that requires evidence to converge across multiple sources before leaning toward a conclusion.
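A convergence requirement like the one mentioned above could look roughly like the sketch below. The post does not spell out the actual rule, so the threshold of three agreeing sources, the verdict labels, and the source names are all assumptions made for illustration.

```python
from typing import Dict, Optional


def triage_lean(signals: Dict[str, Optional[str]], min_agreeing: int = 3) -> str:
    """Lean toward a conclusion only when enough sources agree and none dissent.

    `signals` maps a source name to "malicious", "benign", or None when that
    source was inconclusive (for example, a sandbox run defeated by evasion).
    """
    votes = [v for v in signals.values() if v is not None]
    malicious = votes.count("malicious")
    benign = votes.count("benign")
    if malicious >= min_agreeing and benign == 0:
        return "leans malicious"
    if benign >= min_agreeing and malicious == 0:
        return "leans benign"
    return "inconclusive"  # weak or conflicting evidence stays undecided


# Hypothetical signal sets.
converging = {"model": "malicious", "threat_intel": "malicious",
              "reverse_engineering": "malicious", "sandbox": None}
conflicting = {"model": "malicious", "threat_intel": "benign",
               "reverse_engineering": "malicious", "sandbox": "malicious"}

print(triage_lean(converging))   # leans malicious
print(triage_lean(conflicting))  # inconclusive
```

Treating an inconclusive source as an abstention rather than a benign vote keeps a failed sandbox run from pulling the result in either direction.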

We’re interested in whether this kind of explainable, multi-source workflow is actually helpful for analysts. Any feedback or thoughts from people who work with malware would be appreciated.

Thanks!

AI-Powered Analysis

Machine-generated threat intelligence

Last updated: 06/29/2026, 07:51:22 UTC

Technical Analysis

The project is an open-source assisted triage system for Windows malware that uses a large language model orchestrator to coordinate evidence collection from static analysis features (EMBER2024), SHAP-based model explanations, reverse engineering tools (Ghidra, dnSpy), threat intelligence sources (VirusTotal, OTX, MalwareBazaar), a local knowledge base, and optional sandbox execution (CAPE). It aims to provide explainable, multi-source analysis outputs to support human analysts in malware investigation. The tool is a prototype with limitations such as inconclusive sandbox results and potential over-interpretation of weak evidence. It is intended as an aid, not a threat or exploit.

Potential Impact

There is no direct security impact or exploitation associated with this tool. It is a benign open-source project intended to assist malware analysts by improving triage workflows and explainability of machine learning models. No known exploits or vulnerabilities are reported in relation to this tool.

Mitigation Recommendations

No mitigation or remediation is required as this is not a vulnerability or threat. The tool is an open-source research project for malware triage assistance. Users should evaluate it carefully before use and consider it an early prototype with limitations as described by the authors.

Technical Details

Source Type: reddit
Subreddit: cybersecurity
Reddit Score: 0
Discussion Level: minimal
Content Source: reddit_link_post
Post Type: link
Domain: null
Newsworthiness Assessment: {"score":33,"reasons":["external_link","newsworthy_keywords:rce,malware","established_author","very_recent"],"isNewsworthy":true,"foundNewsworthy":["rce","malware"],"foundNonNewsworthy":[]}
Has External Source: true
Trusted Domain: false

Threat ID: 6a4223f327e9c7971977fbf0

Added to database: 06/29/2026, 07:51:15 UTC

Last enriched: 06/29/2026, 07:51:22 UTC

Last updated: 06/29/2026, 11:51:10 UTC

Views: 8
