pip install pygwalker
PyGWalker
PyGWalker is a python library developed by Kanaries Open Source Community.
PyGWalker can simplify your Jupyter Notebook data analysis and data visualization workflow, by turning your pandas dataframe into an interactive user interface for visual exploration.
PyGWalker (pronounced like โPig Walkerโ, just for fun) is named as an abbreviation of โPython binding of Graphic Walkerโ. It integrates Jupyter Notebook with Graphic Walker, an open-source alternative to Tableau. It allows data scientists to visualize / clean / annotates the data with simple drag-and-drop operations and even natural language queries.
Objectives of PyGWalker
PyGWalker (Python + Graphic Walker) is designed to bridge the gap between data analysis and interactive visualization in Jupyter Notebooks. Its main objectives are:
๐น 1. Simplify Data Exploration
- Provide a drag-and-drop interface for visualizing data without writing complex code.
- Enable quick analysis from large datasets.
๐น 2. Enhance Productivity for Data Scientists
- Less time is spent on coding and debugging visualization scripts , allowing users to focus on data analysis.
๐น 3. Provide an Intuitive UI Like Tableau
- Offer an Tableau-like experience inside Jupyter Notebooks.
- Make the viewing and manipulation of data interactive.
๐น 4. Efficiently Handle Large Datasets
- Use DuckDB for efficient data aggregation and query execution.
- Handle large datasets without slowing down.
๐น 5. Seamless Pandas Integration
๐น 6. Allow Customizable & Interactive Visualizations
- Support for bar, line, scatter, pie charts, and more.
- Allow users to filter, group, and aggregate data dynamically.
๐น 7. Promote Open-Source & Collaboration
- Provide a lightweight, open-source alternative to BI(Bussiness Intelligence) tools like Tableau.
- Support exporting visualisations as images, HTML, or shareable reports.
Why Use PyGWalker?
- No need to manually code every visualization
- quicker investigation and analysis
- Handles large datasets efficiently
- Interactive, shareable insights
To install PyGWalker, open command propmt and execute the following code
now we can import the library
import pygwalker as pyg
import pandas as pd
here we use the dataset kaggle_income(source link at end) which contains information of income at households in different cities of all states of the USA
= pd.read_csv('C:\Users\hp\PSDV\Assignment 4\income.csv' , encoding = 'latin-1') df
now we will read the data using PyGWalker
pyg.walk(df)
Key features
1) Drag and drop to X and Y axis
You can simply drag your dataset columns onto the X and Y axes to create charts.
Want to see row count per state? Drag State_Name to X and Row count to Y
Need an area chart to compare county wise mean income? Drage State_name to X and Mean to Y
2) Aggregations
PyGWalker automatically aggregates data to give you meaningful insights.
Want to see total income per sate? โ PyGWalker will sum it for you. But hey, you realise that different states have different no of cities considered, so the states with more number of cities considered will have a higher value. so it would be better to average these out.
How do I do this?
Just switch the aggregation to MEAN instead of SUM.
to use aggregation, aggregation mode must be toggled on.
Following aggregations can be used: Sum โ Total of all values Count โ Number of occurrences Median โ Median of all values Min/Max โ Smallest and largest values Standard Deviation โ standard deviation of all values Variation โ variation of all values
3) Filters
Data exploration is incomplete without filters! PyGWalker allows you to filter data interactively:
Want to see the data for only some specific states? Just drag and drop state_Name column in filter box.
Need to analyze only states with cities with median income only in a specific range? Apply range filter on median.
4) Color
Make your charts visually appealing and informative by adding color categories.
Assign different colors to regions, product categories, or user segments.
Example: In a bar chart of State_Name vs Mean income, drag the mean block to color filter for better comparison.
5) Opacity, Size, Shape, Details
Take your scatter plots and bubble charts to the next level: - Opacity โ Adjust transparency to handle dense data points. - Size โ Map a column (e.g., income) to bubble size in a scatter plot. - Shape โ Differentiate categories using shapes instead of colors. - Details โ Add extra information on hover (e.g., show Customer Name when hovering over a data point).
6) Sorting
Quickly sort data in ascending or descending order:
Want to see State with highest avg income? just click on sort in descending order.
7) Transpose
Use this feature if you want to swap X and Y axis.
8) Layout Customisation
Organize your visualizations exactly how you want. PyGWalker provides: - Grid layout โ Arrange multiple charts side by side. - Resizable panels โ Adjust chart sizes for better comparisons. - Multiple tabs โ Separate different analyses into different tabs.
9) Painter Mode
Want to highlight trends, annotate data points, or mark key areas?
PyGWalker lets you draw directly on charts to emphasize insights. This is great for presentations and storytelling.
10) Different Chart Types
PyGWalker supports a variety of built-in visualizations, including: - Bar Charts โ Compare categorical data - Line Charts โ Show trends over time - Scatter Plots โ Analyze relationships between two variables - Geospatial Maps โ Visualize location-based data - Pie Charts โ Display proportions
We can change between chart types without having to write a single line of code.
11) Data tab
Need to see your raw data while analyzing?
The Data Tab gives you a spreadsheet-like view right inside PyGWalker.
Quickly scan through row-level data. Apply filters, sorting, and calculations directly
12) Export Options
PyGWalker lets you export your insights for presentations and sharing: - Save as an Image โ PNG/JPG format - Export as HTML โ Share an interactive report - Copy to Clipboard โ Paste directly into documents or emails
Applications of PyGWalker
Whether youโre a data scientist, business analyst, marketer, or researcher, PyGWalker helps you uncover insights and create visual stories efficiently. Here are some of its key applications:
1. Exploratory Data Analysis (EDA)
Before diving into machine learning or reporting, itโs crucial to understand the dataset. PyGWalker helps with:
- Identifying missing values and data distributions - Understanding relationships between variables
- Spotting trends and patterns in time-series data
Example: A data scientist analyzing customer demographics can quickly visualize age distributions and spending habits.
2. Business Intelligence and Decision-Making
Managers and analysts rely on dashboards for data-driven decisions. PyGWalker makes it easier to:
- Track sales trends over time
- Analyze revenue by product category
- Segment customers based on behavior
- Compare regional market performance
Example: A sales manager can filter data by region and product type to spot profitable trends.
3. Marketing Analytics and Customer Insights
Marketing teams use data to optimize campaigns. PyGWalker supports:
- Conversion funnel analysis to understand drop-off points
- A/B testing results for performance comparisons
- Customer churn analysis to identify retention patterns
- Ad performance tracking across multiple channels
Example: A digital marketer can analyze campaign effectiveness without writing SQL queries.
4. Healthcare and Medical Research
Medical professionals and researchers rely on data visualization for better decision-making. PyGWalker helps with:
- Tracking disease outbreaks over time
- Analyzing patient demographics and health trends
- Comparing treatment effectiveness across groups
- Monitoring hospital resource utilization
Example: A public health researcher can visualize disease spread patterns across different regions.
5. Financial Data Analysis
Finance professionals need real-time data monitoring for risk management and investments. PyGWalker enables:
- Stock market trend analysis using line charts
- Expense vs. revenue tracking
- Fraud detection through anomaly identification
- Portfolio performance comparisons
Example: An investment analyst can visualize stock price movements and sort data by highest or lowest returns.
6. Engineering and System Monitoring
IT and DevOps professionals need real-time monitoring tools. PyGWalker can be used for:
- Analyzing server performance logs
- Detecting network anomalies in large datasets
- Monitoring IoT sensor data trends in smart cities and industries
Example: A DevOps engineer can filter logs and visualize server downtime incidents in real time.
7. Academic Research and Education
Researchers and students deal with large datasets in various fields. PyGWalker simplifies:
- Data exploration for statistical research
- Visualization of survey results
- Teaching data visualization concepts interactively
Example: A professor can use PyGWalker in a data science course to help students explore datasets without coding.
8. Geospatial Data Visualization
Understanding location-based insights is easier with PyGWalker. It supports:
- Customer distribution analysis to identify high-density areas
- Delivery route optimization based on travel times
- Geospatial heatmaps for trends like weather patterns or crime rates
Example: A logistics company can visualize delivery performance in different cities and optimize routes.
9. Integration with Streamlit
PyGWalker integrates seamlessly with Streamlit, allowing you to build interactive data exploration apps with minimal code. This combination enables users to explore datasets visually without needing to write complex queries or charts manually.
- No-code data visualization: Drag and drop to explore datasets dynamically.
- Interactive dashboards: Create live, filterable reports in a web app.
- Easy deployment: Share insights instantly without complex front-end coding.
To integrate PyGWalker with Streamlit, write the following code and save your file as app.py
import streamlit as st
import pandas as pd
import pygwalker as pyg
# Load sample DataFrame
= pd.DataFrame({'A': ['p', 'q', 'r'], 'B': ['x', 'y', 'z']})
df
# Generate PyGWalker visualization as HTML
= pyg.walk(df, return_html=True)
pyg_html
# Embed PyGWalker in Streamlit
=600, scrolling=True) st.components.v1.html(pyg_html, height
now open terminal at the project folder and execute
streamlit run app.py
now we can analyse data directly in the streamlit app
Refrences
csv file link:https://www.kaggle.com/datasets/goldenoakresearch/us-household-income-stats-geo-locations?resource=download All rights of the file reserved to the owner
documentation link:https://docs.kanaries.net/pygwalker