This plugin adds K-means clustering analysis capabilities to Wireshark for detecting network anomalies and patterns in real-time or post-capture analysis.
- Real-time Analysis: Analyze packets as they are captured
- K-means Clustering: Group similar packets using machine learning
- Anomaly Detection: Identify unusual network behavior
- 🎨 Automatic Graph Generation: Creates 4 professional visualization graphs
- 🖼️ Auto-View Graphs: Automatically opens generated graphs in your default image viewer
- 🧹 Lua Conflict-Free: Enhanced isolation prevents matplotlib/Wireshark Lua conflicts
- Interactive Configuration: Adjust clustering parameters through Wireshark's GUI
- Statistics Dashboard: View analysis results and network statistics
- Multi-format Output: Export results in JSON format for further analysis
- Wireshark (any recent version)
- Python 3.7 or higher
- pip (Python package manager)
-
Clone or download this repository:
git clone <repository-url>
cd WireSharkPlugin
-
Run the fixed installation script (recommended):
chmod +x install_plugin_fixed.sh
./install_plugin_fixed.sh
This installer fixes common issues including:
- Externally-managed Python environments (macOS with Homebrew)
- Matplotlib/Lua conflicts in virtual environments
- Wireshark API compatibility issues
- Virtual environment creation and management
-
Restart Wireshark to load the plugin
If you encounter issues, try these methods in order:
- Enhanced installer:
./install_plugin_enhanced.sh
- Virtual environment:
cd WireSharkPlugin
python3 -m venv venv
source venv/bin/activate
pip install -r requirements_minimal.txt # Note: uses minimal requirements without matplotlib
./install_plugin.sh
- User-level install:
pip3 install --user pandas numpy scikit-learn
./install_plugin.sh
- Homebrew packages:
brew install python-pandas python-scikit-learn
./install_plugin.sh
If the automatic installation doesn't work, follow these steps:
-
Install Python dependencies:
pip3 install -r requirements.txt
-
Find your Wireshark plugin directory:
- macOS: ~/.local/lib/wireshark/plugins or ~/.wireshark/plugins
- Linux: ~/.local/lib/wireshark/plugins or ~/.wireshark/plugins
- Windows: %APPDATA%\Wireshark\plugins
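The directory lookup above can also be scripted. The following is a minimal sketch, not part of the plugin: the function names and the platform labels ("macos", "linux", "windows") are invented for illustration, and only the paths themselves come from the list above.

```python
import os


def candidate_plugin_dirs(platform_name, home, appdata=None):
    """Return the plugin directories listed above for one platform.

    platform_name is "macos", "linux" or "windows" (labels chosen for this sketch).
    """
    if platform_name == "windows":
        # Corresponds to %APPDATA%\Wireshark\plugins
        base = appdata if appdata is not None else os.environ.get("APPDATA", "")
        return [base + "\\Wireshark\\plugins"]
    # macOS and Linux share the same two candidate locations.
    home = home.rstrip("/")
    return [
        home + "/.local/lib/wireshark/plugins",
        home + "/.wireshark/plugins",
    ]


def first_existing(paths):
    """Return the first path that exists as a directory, or None."""
    for path in paths:
        if os.path.isdir(path):
            return path
    return None
```

Calling `first_existing(candidate_plugin_dirs("linux", os.path.expanduser("~")))` would return whichever of the two locations is already present, or `None` if neither has been created yet.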
-
Copy plugin files:
cp kmeans_analyzer.lua /path/to/wireshark/plugins/
cp wireshark_kmeans_backend.py /path/to/wireshark/plugins/
chmod +x /path/to/wireshark/plugins/wireshark_kmeans_backend.py
The improved plugin works seamlessly with Wireshark's opened packet captures. Here are several ways to analyze your data:
-
Start Wireshark and open your packet capture file (File > Open)
-
Open Lua Console: Go to Tools > Lua > Console
-
Enable Real Data Mode (NEW!):
toggle_real_data_mode() -- Enable real packet analysis
-
Run Analysis: In the console, type:
force_collect_packets() -- Now collects REAL packets from your file!
run_kmeans_analysis() -- Analyzes actual network traffic
-
View Results: The console will show analysis progress and results:
K-means Analyzer: Successfully collected 17 REAL packets
K-means Analyzer: Protocols found: SIP(2), ICMP(1), unknown(14)
=== Analysis Complete ===
Total packets analyzed: 17 (REAL DATA)
Number of clusters: 5
High anomaly packets: 2
NEW FEATURES 🎯:
- ✅ Real packet collection - Reads actual data from opened capture files
- ✅ Protocol detection - Shows real network protocols (SIP, HTTP, TCP, etc.)
- ✅ Accurate counts - Matches your actual capture file packet count
- ✅ Automatic fallback - Uses sample data only if real collection fails
- ✅ Enhanced statistics - Real protocol distribution and analysis
If the Lua integration doesn't capture all packets, use the standalone export script:
-
Export your capture: In Wireshark, go to File > Export Packet Dissections > As CSV...
-
Save the CSV file to your desktop or downloads folder
-
Run the analyzer:
./export_and_analyze.sh
Choose option 2 and enter the path to your CSV file
For advanced users who prefer command-line analysis:
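# Export current capture to CSV using -T fields with a comma separator
# The -e field list is an assumption; match it to the columns the backend expects
tshark -r your_capture.pcap -T fields -E header=y -E separator=, -E quote=d \
-e frame.number -e frame.time_relative -e ip.src -e ip.dst -e _ws.col.Protocol -e frame.len > capture.csv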
# Run analysis
python3 wireshark_kmeans_backend.py capture.csv --clusters 5 --format json
# Or use the virtual environment
~/.local/lib/wireshark/plugins/venv/bin/python \
~/.local/lib/wireshark/plugins/wireshark_kmeans_backend.py \
capture.csv --clusters 5 --output results.json
Once the plugin is loaded, these commands are available in Wireshark's Lua console:
- run_kmeans_analysis() - Analyze currently collected packet data (REAL or sample)
- force_collect_packets() - NOW COLLECTS REAL PACKETS from opened capture files
- toggle_real_data_mode() - NEW! Switch between real packet data and sample data
- show_kmeans_stats() - Display packet collection statistics with real protocol info
- clear_kmeans_data() - Clear collected packet data
- show_kmeans_config() - Show current configuration
- set_kmeans_clusters(N) - Set number of clusters (2-20)
- toggle_kmeans_realtime() - Enable/disable real-time analysis
✨ NEW: Real Data Features:
- Real packet collection uses tshark to extract actual packet data
- Automatic detection of opened capture files
- Real protocol analysis (shows actual SIP, HTTP, TCP, DNS, etc.)
- Accurate packet counts matching your capture files
- Fallback to sample data only if real collection fails
-
Start Wireshark and begin capturing packets or open an existing capture file
-
For Real Packet Analysis with Graphs (Recommended - Uses Actual Capture Data):
# Enhanced script with native PCAP/PCAPNG support + AUTOMATIC GRAPHS
./analyze_real_data_enhanced.sh
# Or specify a file directly (generates graphs automatically)
./analyze_real_data_enhanced.sh /path/to/capture.pcap
# Original script (also generates graphs now)
./analyze_real_data.sh
The enhanced script provides:
- Real Packet Data - Analyzes actual network traffic, not synthetic data
- 🎨 Automatic Graph Generation - Creates 4 professional visualizations
- Native PCAP/PCAPNG support - Direct analysis without manual export
- Auto-detection - Finds capture files in Downloads, Desktop, current directory
- File validation - Checks file integrity before analysis
- Interactive selection - Choose from multiple files when found
- Real-time feedback - Shows packet count, file size, protocols detected
- Support for multiple formats - .pcap, .pcapng, .cap files
-
For Demo/Learning with Sample Data (Wireshark Console):
-- Open Wireshark > Tools > Lua > Console
force_collect_packets() -- Generates sample data
run_kmeans_analysis() -- Analyzes sample patterns
⚠️ Note: Console commands use synthetic data for demonstration unless real data mode has been enabled with toggle_real_data_mode()
-
View Generated Graphs: After analysis, check the current directory for:
- 📊 kmeans_cluster_distribution.png - Cluster sizes and anomaly distribution
- 🗺️ kmeans_pca_clusters.png - 2D cluster visualization with centroids
- 📈 kmeans_feature_importance.png - Feature variance analysis
- 🚨 kmeans_anomaly_analysis.png - Anomaly detection timeline
-
Access the plugin through the Tools menu:
- Tools > K-means Analyzer > Configuration - Configure analysis parameters
- Tools > K-means Analyzer > Run Analysis - Perform analysis on sample data
- Tools > K-means Analyzer > Statistics - View current analysis statistics
- Tools > K-means Analyzer > Clear Data - Reset analysis data
-
View results in the packet details pane - each packet will show:
- Cluster assignment
- Anomaly score
- Extracted features
- Analysis results
- Default: 5
- Range: 2-20
- Description: Number of clusters for K-means algorithm. More clusters provide finer granularity but may over-segment the data.
- Default: 100
- Range: 50-10000
- Description: Minimum number of packets required before analysis can be performed.
- Default: Disabled
- Description: When enabled, analysis runs automatically every 500 packets. May impact performance on high-traffic captures.
- Default: Auto-detected
- Description: Path to the Python backend script. Usually auto-detected during installation.
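The defaults and ranges above can be checked in code before an analysis starts. This is a minimal sketch rather than the plugin's actual configuration object: the class name, field names, and the `should_analyze` helper are invented for illustration, while the defaults (5 clusters, 100 packets, real-time disabled), the ranges, and the 500-packet interval come from the descriptions above.

```python
from dataclasses import dataclass


@dataclass
class AnalyzerConfig:
    clusters: int = 5             # allowed range 2-20
    min_packets: int = 100        # allowed range 50-10000
    realtime: bool = False        # disabled by default
    realtime_interval: int = 500  # packets between automatic runs

    def validate(self):
        """Raise ValueError if a setting is outside its documented range."""
        if not 2 <= self.clusters <= 20:
            raise ValueError("clusters must be between 2 and 20")
        if not 50 <= self.min_packets <= 10000:
            raise ValueError("min_packets must be between 50 and 10000")
        return self

    def should_analyze(self, packet_count):
        """Decide whether an analysis may run for the given packet count.

        Below the minimum threshold nothing runs. In real-time mode a run is
        triggered only when the count reaches a multiple of the interval.
        """
        if packet_count < self.min_packets:
            return False
        if self.realtime:
            return packet_count % self.realtime_interval == 0
        return True
```

With the defaults, `AnalyzerConfig().validate().should_analyze(99)` is false because fewer than 100 packets have been collected.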
Packets are grouped into clusters based on similarities in:
- Packet length
- Protocol type
- IP address patterns (local vs external)
- Timing patterns
- Error flags and connection states
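To make the feature list above concrete, here is a rough sketch of turning packet records into numeric rows. It is an illustration under stated assumptions, not the backend's code: the dictionary keys (`time`, `length`, `protocol`, `src`, `dst`) and the `PROTOCOL_CODES` mapping are hypothetical, and error flags and connection states are left out for brevity.

```python
import ipaddress

# Hypothetical protocol-to-number mapping; the plugin's real encoding may differ.
PROTOCOL_CODES = {"TCP": 1, "UDP": 2, "ICMP": 3, "DNS": 4, "HTTP": 5, "SIP": 6}


def is_local(address):
    """True for private, loopback or link-local addresses; False if unparsable."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return False
    return ip.is_private or ip.is_loopback or ip.is_link_local


def extract_features(packets):
    """Build one row per packet: length, protocol code, src/dst locality, time delta."""
    rows = []
    previous_time = None
    for pkt in packets:
        # Time delta from the previous packet; the first packet gets 0.0.
        delta = 0.0 if previous_time is None else pkt["time"] - previous_time
        previous_time = pkt["time"]
        rows.append([
            float(pkt["length"]),
            float(PROTOCOL_CODES.get(pkt["protocol"].upper(), 0)),  # 0 = unknown
            1.0 if is_local(pkt["src"]) else 0.0,
            1.0 if is_local(pkt["dst"]) else 0.0,
            delta,
        ])
    return rows
```

Encoding unknown protocols as 0 keeps every row the same width, which matters because clustering expects a fixed number of columns.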
Anomalies are detected using multiple methods:
- High Anomaly Score: Packets far from any cluster center
- Small Clusters: Clusters containing very few packets (< 5% of total)
- Unusual Patterns: Packets with rare protocol combinations or error flags
- Range: 0.0 to 1.0
- Low scores (0.0-0.3): Normal traffic patterns
- Medium scores (0.3-0.7): Potentially interesting traffic
- High scores (0.7-1.0): Likely anomalies requiring investigation
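The score bands and the small-cluster rule above translate into a few lines of code. This is a sketch with invented function names. Because the documented bands share their endpoints (0.3 and 0.7), it assumes a boundary value belongs to the higher band, which the source does not specify.

```python
from collections import Counter


def score_band(score):
    """Map an anomaly score in 0.0-1.0 to "low", "medium" or "high"."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("anomaly score must be within 0.0-1.0")
    if score < 0.3:
        return "low"
    if score < 0.7:   # assumption: 0.3 itself counts as medium
        return "medium"
    return "high"     # assumption: 0.7 itself counts as high


def small_clusters(labels, threshold=0.05):
    """Return cluster ids that hold less than `threshold` of all packets."""
    counts = Counter(labels)
    total = len(labels)
    return sorted(c for c, n in counts.items() if n / total < threshold)
```

For example, with 96 packets in cluster 0 and 4 in cluster 1, cluster 1 holds 4% of the traffic and is reported as small.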
-
Quick Analysis of any capture file:
./analyze_real_data_enhanced.sh
The script will automatically find and analyze real packet capture files.
-
Direct File Analysis:
./analyze_real_data_enhanced.sh /path/to/capture.pcap
Analyze a specific PCAP/PCAPNG file directly.
-
Results Review: The analysis shows:
- Real protocol distribution (SIP, HTTP, DNS, etc.)
- Cluster assignments for packet patterns
- Anomaly detection results
- Feature extraction from actual network data
- Start capture on your network interface
- Enable real-time analysis in the configuration
- Monitor for high anomaly scores and small clusters
- Investigate flagged packets for potential security issues
- Load a suspicious packet capture file
- Run full analysis with appropriate cluster count
- Examine anomaly reports and cluster characteristics
- Export results for further analysis or reporting
- Capture traffic during performance issues
- Analyze traffic patterns and cluster distribution
- Identify unusual protocols or connection patterns
- Correlate with performance metrics
Your plugin now includes powerful real data analysis capabilities that work directly with PCAP/PCAPNG files:
# Auto-detect and analyze capture files (WITH GRAPHS!)
./analyze_real_data_enhanced.sh
# Analyze a specific file (WITH GRAPHS!)
./analyze_real_data_enhanced.sh /path/to/capture.pcap
The K-means analysis now automatically generates 4 professional visualization graphs:
- 📊 Cluster Distribution Chart - Shows the size of each cluster and anomaly score distribution
- 🗺️ PCA Cluster Plot - 2D visualization of packet clusters using Principal Component Analysis
- 📈 Feature Importance Plot - Shows which packet features are most important for clustering
- 🚨 Anomaly Analysis Plot - Timeline of anomaly scores and packet length correlation
Generated Files:
- kmeans_cluster_distribution.png - Cluster sizes and anomaly distribution
- kmeans_pca_clusters.png - 2D cluster visualization with centroids
- kmeans_feature_importance.png - Feature variance analysis
- kmeans_anomaly_analysis.png - Anomaly detection timeline
Example Graph Output:
🎨 Generating visualization graphs...
📊 Generated cluster distribution plot: ./kmeans_cluster_distribution.png
📊 Generated PCA cluster plot: ./kmeans_pca_clusters.png
📊 Generated feature importance plot: ./kmeans_feature_importance.png
📊 Generated anomaly analysis plot: ./kmeans_anomaly_analysis.png
🎨 Graph generation complete! Generated 4 visualization files
- ✅ Actual Network Protocols: SIP, HTTP, DNS, ICMP, TCP, UDP, etc.
- ✅ Real Traffic Patterns: Genuine packet timing, sizes, and characteristics
- ✅ Authentic Anomalies: True network anomalies, not synthetic patterns
- ✅ Security Insights: Real attack patterns, malware communications, etc.
- ✅ Professional Graphs: High-resolution PNG visualizations for reports and analysis
- .pcap - Standard packet capture format
- .pcapng - Next generation packet capture format
- .cap - Alternative packet capture format
📦 Packets: 17
🔬 Protocols detected: SIP (2), ICMP (1), unknown (14)
🧠 Clusters found: 5
🚨 Anomalies: Small clusters detected (potential security incidents)
- Sample Data: Generic synthetic packets for testing
- Real Data: Actual network traffic from your captures
- Detection: Script automatically validates you're analyzing real traffic
The plugin extracts the following features from each packet:
- Packet length: Size in bytes
- Protocol encoding: Numeric representation of protocol type
- IP locality: Whether source/destination IPs are local
- Timing: Time delta from previous packet
- Flags: Error flags, SYN/FIN flags, DNS queries
- Feature standardization: Z-score normalization
- K-means clustering: Sklearn implementation with k-means++
- Anomaly scoring: Distance to nearest cluster centroid
- PCA visualization: Dimensionality reduction for visualization
- Memory usage: ~100 bytes per packet for feature storage
- CPU usage: Analysis runs in separate Python process
- Real-time limits: Recommended for captures < 10,000 packets/second
- Batch processing: Better for large historical captures
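Two of the algorithm steps listed above, z-score standardization and scoring by distance to the nearest centroid, can be sketched without any third-party packages. This is an illustration, not the backend's implementation: the clustering itself is left to scikit-learn as described above, the centroids are taken as given, and dividing by the largest distance to reach the 0.0-1.0 range is an assumption about how the scores are normalized.

```python
import math


def zscore_columns(rows):
    """Standardize each column to mean 0 and (population) standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has deviation 0; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def nearest_centroid_distance(point, centroids):
    """Euclidean distance from a point to its closest centroid."""
    return min(
        math.sqrt(sum((p - c) ** 2 for p, c in zip(point, centroid)))
        for centroid in centroids
    )


def anomaly_scores(points, centroids):
    """Scale nearest-centroid distances into 0.0-1.0 (assumed normalization)."""
    distances = [nearest_centroid_distance(p, centroids) for p in points]
    largest = max(distances) or 1.0
    return [d / largest for d in distances]
```

A packet sitting exactly on a centroid scores 0.0, and the packet farthest from every centroid scores 1.0.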
If you see errors like:
Lua: Error during loading:
...matplotlib/mpl-data/kpsewhich.lua:2: attempt to index a nil value (global 'kpse')
Lua: Error during loading:
...kmeans_analyzer.lua:374: attempt to call a nil value (global 'register_init_routine')
Lua: Error during loading:
...kmeans_analyzer.lua:424: bad argument #1 to 'register_postdissector' (userdata expected, got function)
Lua: Error during execution of Menu callback:
...kmeans_analyzer.lua:355: attempt to call a nil value (field 'maxn')
The matplotlib Lua conflict error has been completely eliminated using our enhanced isolation system.
✅ FIXED SOLUTIONS:
- Automatic Fix Applied: The matplotlib Lua files have been disabled in your environment
- Clean tshark Wrapper: All analysis now uses tshark_clean.sh which completely isolates Lua environments
- Enhanced Analysis Script: simple_analysis.sh v4.1.0 includes built-in Lua conflict prevention
How the fix works:
- ✅ Matplotlib's problematic kpsewhich.lua file renamed to .disabled
- ✅ Clean environment wrappers isolate Python from Wireshark Lua
- ✅ All tshark operations use completely clean Lua environment
- ✅ Analysis scripts automatically detect and use conflict-free methods
Verification:
# This should now run without any Lua errors:
./tshark_clean.sh --version
# Analysis also runs clean:
./simple_analysis.sh
If you still see Lua errors:
# Run the automatic fixer:
./fix_matplotlib_lua_conflict.sh
# Or use the clean launcher:
./run_wireshark_clean.sh
The enhanced analysis system automatically:
- ✅ Isolates Python/matplotlib from Wireshark's Lua environment
- ✅ Generates graphs using a non-interactive backend
- ✅ Opens graphs automatically in your default image viewer
- ✅ Works despite the harmless Lua warning
NEW FEATURE: Graphs now open automatically after analysis!
What happens:
- Analysis completes and generates 4 professional graphs
- Graphs automatically open in your default image viewer (Preview on macOS)
- You can immediately see the visualization results
Example output:
🖼️ Opening 4 graphs in default viewer...
🖼️ Opened: kmeans_cluster_distribution.png
🖼️ Opened: kmeans_pca_clusters.png
🖼️ Opened: kmeans_feature_importance.png
🖼️ Opened: kmeans_anomaly_analysis.png
✅ Graph viewing complete!
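One common way to open files in the default viewer on each platform is sketched below. It is an assumption about how such a feature could work, not the backend's actual code: the function names and the `auto_open` parameter are hypothetical, and the commands (`open` on macOS, `xdg-open` on Linux, `start` via `cmd` on Windows) are the usual system launchers.

```python
import subprocess
import sys


def viewer_command(path, platform=sys.platform):
    """Build the command that opens an image in the default viewer."""
    if platform == "darwin":
        return ["open", path]
    if platform.startswith("win"):
        # 'start' is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    return ["xdg-open", path]


def open_graphs(paths, auto_open=True, runner=subprocess.run):
    """Open each graph unless auto-opening is disabled; return the opened paths."""
    opened = []
    if not auto_open:
        return opened
    for path in paths:
        # check=False: a missing viewer should not abort the analysis.
        runner(viewer_command(path), check=False)
        opened.append(path)
    return opened
```

Passing `runner` as a parameter makes it possible to try the logic without actually launching a viewer.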
Control auto-opening:
# Disable auto-opening
python3 wireshark_kmeans_backend_enhanced.py capture.csv --no-auto-open
# Force auto-opening (default)
python3 wireshark_kmeans_backend_enhanced.py capture.csv --auto-open
This installer:
- Removes matplotlib to avoid Lua conflicts
- Uses Wireshark-compatible Lua API functions
- Removes problematic post-dissector registration
- Fixes table.maxn compatibility for modern Lua versions
- Adds packet collection functionality for existing captures
- Creates a clean virtual environment
- Provides better error handling
Great News: The Wireshark plugin has been enhanced to work with REAL packet data from your capture files!
What's New in Version 3.0.0:
- ✅ Real packet collection - No more synthetic data limitations
- ✅ tshark integration - Extracts actual packet data from opened files
- ✅ Accurate analysis - Shows real packet counts and protocols
- ✅ Smart fallback - Uses sample data only if real collection fails
How to Use Real Data in Wireshark:
- Open Wireshark and load your capture file (File > Open)
- Open Lua Console (Tools > Lua > Console)
- Enable real data mode:
toggle_real_data_mode() -- Enables real packet analysis
- Collect real packets:
force_collect_packets() -- Now reads your actual capture data!
- Run analysis:
run_kmeans_analysis() -- Analyzes real network traffic
Example Output with Real Data:
K-means Analyzer: Successfully collected 17 REAL packets
K-means Analyzer: Protocols found: SIP(2), ICMP(1), unknown(14)
Total packets analyzed: 17 (matches your capture file!)
Alternative Methods:
- ✅ Enhanced Scripts: ./analyze_real_data_enhanced.sh (still recommended for batch analysis)
- ✅ Wireshark Plugin: Now supports real data with console commands above
-
Use the enhanced script for real analysis:
./analyze_real_data_enhanced.sh
-
Use Wireshark plugin for:
- Learning how the analysis works
- Quick demonstration with sample data
- Understanding cluster analysis concepts
After successful installation, the plugin provides console commands:
- run_kmeans_analysis() - Perform full analysis
- show_kmeans_stats() - Display packet statistics
- clear_kmeans_data() - Clear collected data
- show_kmeans_config() - Show current configuration
If you see this error during installation:
error: externally-managed-environment
× This environment is externally managed
This is common on macOS with Python installed via Homebrew. Solutions:
-
Use the enhanced installer (recommended):
./install_plugin_enhanced.sh
-
Create a virtual environment:
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
-
Install with user flag:
pip3 install --user -r requirements.txt
-
Use Homebrew packages:
brew install python-pandas python-scikit-learn python-matplotlib
- Check Wireshark console for error messages
- Verify plugin directory is correct for your system
- Ensure Lua support is enabled in Wireshark
- Check file permissions on plugin files
-
Verify Python installation:
python3 --version
-
Check dependencies:
python3 -c "import pandas, numpy, sklearn; print('Dependencies OK')"
-
Test backend manually:
python3 wireshark_kmeans_backend.py --help
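The dependency check above stops at the first failed import. A slightly more informative variant, sketched here with an invented helper name, reports every missing module at once by using the standard library's `importlib.util.find_spec`:

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["pandas", "numpy", "sklearn"])
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("Dependencies OK")
```

Run it with the same interpreter the plugin uses (for example the virtual environment's `python`), since each environment has its own installed packages.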
- Check minimum packet threshold in configuration
- Verify CSV export format is correct
- Check available disk space for temporary files
- Review Python script path in configuration
- Disable real-time analysis for large captures
- Increase minimum packet threshold
- Use packet sampling for very large captures
- Close other resource-intensive applications
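The packet sampling suggestion above could look like the following. This is one possible approach (evenly spaced systematic sampling) with a hypothetical function name. The plugin does not document a sampling option, so apply this to the exported rows before handing them to the backend.

```python
def sample_every_nth(packets, max_packets):
    """Keep at most max_packets items, evenly spaced across the capture."""
    if max_packets <= 0:
        raise ValueError("max_packets must be positive")
    if len(packets) <= max_packets:
        return list(packets)
    # Ceiling division so the result never exceeds max_packets.
    step = -(-len(packets) // max_packets)
    return list(packets[::step])
```

Even spacing preserves the capture's overall timeline, but it distorts inter-packet time deltas, so timing-based features should be computed before sampling.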
Modify wireshark_kmeans_backend.py to add custom features:
def extract_custom_features(self, df):
    """Add your custom feature extraction logic here"""
    features = self.extract_features(df)
    # Add custom features (replace your_calculation with your own expression over df)
    features['custom_feature'] = your_calculation
    return features
Export analysis results for use with other security tools:
# Export to JSON for SIEM integration
python3 wireshark_kmeans_backend.py capture.csv --format json --output results.json
# Process results with jq
jq '.anomalies[] | select(.anomaly_score > 0.8)' results.json
Process multiple capture files:
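#!/bin/bash
for file in *.pcapng; do
    # Convert to CSV (requires tshark); the -e field list is an assumption,
    # adjust it to the columns the backend expects
    tshark -r "$file" -T fields -E header=y -E separator=, -E quote=d \
        -e frame.number -e frame.time_relative -e ip.src -e ip.dst \
        -e _ws.col.Protocol -e frame.len > "${file%.pcapng}.csv"
    # Analyze
    python3 wireshark_kmeans_backend.py "${file%.pcapng}.csv" \
        --output "${file%.pcapng}_analysis.json"
done
- Fork the repository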
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
For issues and questions:
- Check the troubleshooting section above
- Review Wireshark's plugin documentation
- Submit an issue with detailed error information
After cleanup, the repository contains only the essential files:
- kmeans_analyzer_simple.lua - Final working Wireshark plugin with Lua conflict prevention
- wireshark_kmeans_backend_enhanced.py - Enhanced Python backend with graph generation
- simple_analysis.sh - Main Wireshark integration script with auto-detection
- analyze_real_data_enhanced.sh - Standalone enhanced analysis script
- export_and_analyze.sh - Manual CSV export and analysis workflow
- tshark_clean.sh - Clean tshark wrapper (eliminates Lua errors)
- run_wireshark_clean.sh - Clean Wireshark launcher
- run_analysis_isolated.sh - Isolated Python environment runner
- fix_matplotlib_lua_conflict.sh - Automatic matplotlib conflict fixer
- install_plugin.sh - Basic installation
- install_plugin_enhanced.sh - Enhanced installer for externally-managed environments
- install_plugin_fixed.sh - Fixed installer with comprehensive error handling
- README.md - Complete documentation and usage guide
- requirements.txt - Full Python dependencies
- requirements_minimal.txt - Minimal dependencies (without matplotlib)
- wiresharkanalyzer.py - Original source analyzer (reference)
- ✅ Complete Lua Conflict Elimination - Zero matplotlib/Wireshark Lua errors
- ✅ Automatic Graph Generation - 4 professional visualization graphs created automatically
- ✅ Auto-Opening Graphs - Generated graphs open automatically in default image viewer
- ✅ Enhanced Real Data Analysis - Works with currently opened Wireshark capture files
- ✅ Clean Environment Isolation - Completely isolated Python/Lua environments
- ✅ Repository Cleanup - Streamlined to essential files only
- ✅ Cross-Platform Graph Support - Works on macOS, Linux, and Windows
- ✅ Production Ready - Stable, reliable, and conflict-free operation
- Real packet data collection from opened capture files
- tshark integration for actual packet analysis
- Smart fallback between real and sample data
- Enhanced protocol detection
- Wireshark API compatibility fixes
- Virtual environment support
- Enhanced error handling
- Multiple installation methods
- Initial release
- K-means clustering analysis
- Real-time and batch processing
- Anomaly detection
- Wireshark GUI integration