# Nordic Privacy AI 🛡️

**AI-Powered GDPR Compliance & Privacy Protection Platform**

A comprehensive solution for AI governance, bias detection, risk assessment, and automated PII cleaning with GDPR compliance. Built for Nordic ecosystems and beyond. At its core is the `ai_governance` Python package for detecting bias and analyzing risks in machine learning models, providing comprehensive fairness metrics, privacy risk assessment, and ethical AI evaluation.
## 🚀 Quick Start

### Prerequisites

- Python 3.8+
- Node.js 18+
- GPU (optional, for faster processing)
- Core Python dependencies: pandas >= 2.0.0, numpy >= 1.24.0, scikit-learn >= 1.3.0 (see `requirements.txt` for the complete list)

### Installation

1. **Clone the repository**

```powershell
git clone https://github.com/PlatypusPus/MushroomEmpire.git
cd MushroomEmpire
```

2. **Install Python dependencies**

```powershell
pip install -r requirements.txt
python -m spacy download en_core_web_sm
```

Or install the `ai_governance` module as a package:

```powershell
pip install -e .
```

3. **Install frontend dependencies**

```powershell
cd frontend
npm install
cd ..
```

### Running the Application

1. **Start the FastAPI backend** (Terminal 1)

```powershell
python start_api.py
```

Backend runs at: **http://localhost:8000**

2. **Start the Next.js frontend** (Terminal 2)

```powershell
cd frontend
npm run dev
```

Frontend runs at: **http://localhost:3000**

3. **Access the application**
   - Frontend UI: http://localhost:3000
   - Try It Page: http://localhost:3000/try
   - API Documentation: http://localhost:8000/docs
   - Health Check: http://localhost:8000/health

---
## 📋 Features

### 🎯 AI Governance & Bias Detection

- **Fairness Metrics**: Disparate Impact, Statistical Parity Difference, Equal Opportunity Difference
- **Demographic Analysis**: Group-wise performance evaluation
- **Violation Detection**: Automatic flagging with severity levels (HIGH/MEDIUM/LOW)
- **Model Performance**: Comprehensive ML metrics (accuracy, precision, recall, F1)

### 🛡️ Privacy Risk Assessment

- **Privacy Risks**: PII detection, GDPR compliance scoring, data exposure analysis
- **Ethical Risks**: Fairness, transparency, accountability, and social impact evaluation
- **Compliance Risks**: Regulatory adherence (GDPR, CCPA, AI Act)
- **Data Quality**: Missing data, class imbalance, outlier detection (see the sketch at the end of this section)

### 🤖 Machine Learning

- Generalized classification model (works with any dataset)
- Auto-detection of feature types and protected attributes
- Comprehensive performance metrics
- Feature importance analysis

### 🧹 Automated Data Cleaning

- **PII Detection**: Email, phone, SSN, credit cards, IP addresses, and more
- **GPU Acceleration**: CUDA-enabled for 10x faster processing
- **GDPR Compliance**: Automatic anonymization with audit trails
- **Smart Anonymization**: Context-aware masking and pseudonymization

### 🌐 Modern Web Interface

- **Drag & Drop Upload**: Intuitive CSV file handling
- **Real-time Processing**: Live feedback and progress tracking
- **Interactive Dashboards**: Visualize bias metrics, risk scores, and results
- **Report Downloads**: JSON reports, cleaned CSV, and audit logs
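The **Data Quality** checks above come down to a handful of dataset statistics. The sketch below illustrates the general idea only; it is not the `ai_governance/risk_analyzer.py` implementation, and the function name and return format are invented for illustration:

```python
import pandas as pd

def data_quality_overview(df: pd.DataFrame, target_column: str) -> dict:
    """Illustrative data-quality indicators: missing data, imbalance, outliers."""
    # Share of missing values per column
    missing_ratio = df.isna().mean().to_dict()

    # Class imbalance: majority-class count divided by minority-class count
    class_counts = df[target_column].value_counts()
    imbalance_ratio = float(class_counts.max() / class_counts.min())

    # Outliers per numeric column using the 1.5 * IQR rule
    outlier_counts = {}
    for col in df.select_dtypes(include="number").columns:
        q1, q3 = df[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        mask = (df[col] < q1 - 1.5 * iqr) | (df[col] > q3 + 1.5 * iqr)
        outlier_counts[col] = int(mask.sum())

    return {
        "missing_ratio": missing_ratio,
        "class_imbalance_ratio": imbalance_ratio,
        "outlier_counts": outlier_counts,
    }
```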
---

## 🏗️ Project Structure

```
MushroomEmpire/
├── api/                         # FastAPI Backend
│   ├── main.py                  # Application entry point
│   ├── routers/
│   │   ├── analyze.py           # POST /api/analyze - AI Governance
│   │   └── clean.py             # POST /api/clean - Data Cleaning
│   └── utils/                   # Helper utilities
│
├── ai_governance/               # Core AI Governance Module
│   ├── __init__.py              # AIGovernanceAnalyzer class
│   ├── data_processor.py        # Data preprocessing
│   ├── model_trainer.py         # ML model training
│   ├── bias_analyzer.py         # Bias detection engine
│   ├── risk_analyzer.py         # Risk assessment engine
│   └── report_generator.py      # JSON report generation
│
├── data_cleaning/               # Data Cleaning Module
│   ├── __init__.py              # DataCleaner class
│   ├── cleaner.py               # PII detection & anonymization
│   └── config.py                # PII patterns & GDPR rules
│
├── frontend/                    # Next.js Frontend
│   ├── app/                     # App Router pages
│   │   ├── page.tsx             # Landing page
│   │   └── try/page.tsx         # Try it page (workflow UI)
│   ├── components/
│   │   └── try/
│   │       ├── CenterPanel.tsx  # File upload & results
│   │       ├── Sidebar.tsx      # Workflow tabs
│   │       └── ChatbotPanel.tsx # AI assistant
│   └── lib/
│       ├── api.ts               # TypeScript API client
│       └── indexeddb.ts         # Browser caching utilities
│
├── Datasets/                    # Sample datasets
│   └── loan_data.csv            # Example: Loan approval dataset
│
├── reports/                     # Generated reports (auto-created)
│   ├── governance_report_*.json
│   ├── cleaned_*.csv
│   └── cleaning_audit_*.json
│
├── test_cleaning.py             # Unit tests for the cleaning module
├── start_api.py                 # Backend startup script
├── setup.py                     # Package configuration
├── requirements.txt             # Python dependencies
└── README.md                    # This file
```

---
## 📡 API Reference

### Base URL

```
http://localhost:8000
```

### Endpoints

#### **POST /api/analyze**

Analyze a dataset for bias, fairness, and risk assessment.

**Request:**

```bash
curl -X POST "http://localhost:8000/api/analyze" \
  -F "file=@Datasets/loan_data.csv"
```

**Response:**

```json
{
  "status": "success",
  "filename": "loan_data.csv",
  "dataset_info": {
    "rows": 1000,
    "columns": 15
  },
  "model_performance": {
    "accuracy": 0.85,
    "precision": 0.82,
    "recall": 0.88,
    "f1_score": 0.85
  },
  "bias_metrics": {
    "overall_bias_score": 0.23,
    "violations_detected": []
  },
  "risk_assessment": {
    "overall_risk_score": 0.35,
    "privacy_risks": [],
    "ethical_risks": []
  },
  "recommendations": [
    "[HIGH] Privacy: Remove PII columns before deployment",
    "[MEDIUM] Fairness: Monitor demographic parity over time"
  ],
  "report_file": "/reports/governance_report_20251107_123456.json"
}
```

#### **POST /api/clean**

Detect and anonymize PII in datasets.

**Request:**

```bash
curl -X POST "http://localhost:8000/api/clean" \
  -F "file=@Datasets/loan_data.csv"
```

**Response:**

```json
{
  "status": "success",
  "dataset_info": {
    "original_rows": 1000,
    "original_columns": 15,
    "cleaned_rows": 1000,
    "cleaned_columns": 13
  },
  "summary": {
    "columns_removed": ["ssn", "email"],
    "columns_anonymized": ["phone", "address"],
    "total_cells_affected": 2847
  },
  "pii_detections": {
    "EMAIL": 1000,
    "PHONE": 987,
    "SSN": 1000
  },
  "gdpr_compliance": [
    "Article 5(1)(c) - Data minimization",
    "Article 17 - Right to erasure",
    "Article 25 - Data protection by design"
  ],
  "files": {
    "cleaned_csv": "/reports/cleaned_20251107_123456.csv",
    "audit_report": "/reports/cleaning_audit_20251107_123456.json"
  }
}
```

#### **GET /health**

Health check endpoint with GPU status.

**Response:**

```json
{
  "status": "healthy",
  "version": "1.0.0",
  "gpu_available": true
}
```

#### **GET /reports/{filename}**

Download generated reports and cleaned files.
---
## 🔧 Configuration

### Environment Variables

Create a `.env` file in `frontend/`:

```env
NEXT_PUBLIC_API_URL=http://localhost:8000
```

### CORS Configuration

Edit `api/main.py` to add production domains:

```python
origins = [
    "http://localhost:3000",
    "https://your-production-domain.com"
]
```

### GPU Acceleration

The GPU is automatically detected and used if available. To force CPU mode:

```python
# In data_cleaning/cleaner.py or the API endpoints
DataCleaner(use_gpu=False)
```
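How the `use_gpu` flag behaves on machines without CUDA is worth spelling out. A plausible fallback pattern, given that PyTorch is listed as the optional GPU backend (an assumption about the implementation, not a quote from `data_cleaning/cleaner.py`):

```python
import torch

def resolve_device(use_gpu: bool = True) -> str:
    """Return 'cuda' only when GPU use was requested and a CUDA device exists."""
    if use_gpu and torch.cuda.is_available():
        return "cuda"
    return "cpu"

# Example: even DataCleaner(use_gpu=True) would effectively run on CPU
# on a machine without a CUDA-capable GPU.
print(resolve_device(use_gpu=True))
```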
---

## 🧪 Testing

### Test the Backend

```powershell
# Test analyze endpoint
curl -X POST "http://localhost:8000/api/analyze" -F "file=@Datasets/loan_data.csv"

# Test clean endpoint
curl -X POST "http://localhost:8000/api/clean" -F "file=@Datasets/loan_data.csv"

# Check health
curl http://localhost:8000/health
```

### Run Unit Tests

```powershell
# Test cleaning module
python test_cleaning.py

# Run all tests (if pytest is configured)
pytest
```
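The same endpoints can be exercised from Python instead of curl. A small sketch using the third-party `requests` library, assuming the backend from the Quick Start is running locally:

```python
import requests

BASE_URL = "http://localhost:8000"

# Health check
print(requests.get(f"{BASE_URL}/health").json())

# Analyze the sample dataset and read one field from the documented response
with open("Datasets/loan_data.csv", "rb") as f:
    response = requests.post(f"{BASE_URL}/api/analyze", files={"file": f})
response.raise_for_status()
print(response.json()["bias_metrics"]["overall_bias_score"])
```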
---

## 📊 Usage Examples

### Python SDK Usage

`AIGovernanceAnalyzer` is the main class for running AI governance analysis.

```python
from ai_governance import AIGovernanceAnalyzer

# Initialize analyzer
analyzer = AIGovernanceAnalyzer()

# Analyze a dataset from file
report = analyzer.analyze(
    data_path='Datasets/loan_data.csv',
    target_column='loan_approved',
    protected_attributes=['gender', 'age', 'race']
)

# Print results
print(f"Bias Score: {report['summary']['overall_bias_score']:.3f}")
print(f"Risk Level: {report['summary']['risk_level']}")
print(f"Model Accuracy: {report['summary']['model_accuracy']:.3f}")

# Save report
analyzer.save_report(report, 'my_report.json')
```

An existing DataFrame can be analyzed directly:

```python
import pandas as pd

df = pd.read_csv('your_data.csv')

# Analyze from DataFrame
report = analyzer.analyze_dataframe(
    df,
    target_column='target',
    protected_attributes=['gender', 'age']
)
```

### Individual Components

```python
from ai_governance import (
    DataProcessor,
    GeneralizedModelTrainer,
    BiasAnalyzer,
    RiskAnalyzer,
    ReportGenerator
)

# Process data
processor = DataProcessor(df)
processor.target_column = 'target'
processor.protected_attributes = ['gender', 'age']
processor.prepare_data()

# Train model
trainer = GeneralizedModelTrainer(
    processor.X_train,
    processor.X_test,
    processor.y_train,
    processor.y_test,
    processor.feature_names
)
trainer.train()
trainer.evaluate()

# Analyze bias
bias_analyzer = BiasAnalyzer(
    processor.X_test,
    processor.y_test,
    trainer.y_pred,
    processor.df,
    processor.protected_attributes,
    processor.target_column
)
bias_results = bias_analyzer.analyze()

# Assess risks
risk_analyzer = RiskAnalyzer(
    processor.df,
    trainer.results,
    bias_results,
    processor.protected_attributes,
    processor.target_column
)
risk_results = risk_analyzer.analyze()

# Generate report
report_gen = ReportGenerator(
    trainer.results,
    bias_results,
    risk_results,
    processor.df
)
report = report_gen.generate_report()
```

### Report Structure

The module generates comprehensive JSON reports:

```json
{
  "metadata": {
    "report_id": "unique_id",
    "generated_at": "timestamp",
    "dataset_info": {}
  },
  "summary": {
    "overall_bias_score": 0.0-1.0,
    "overall_risk_score": 0.0-1.0,
    "risk_level": "LOW|MEDIUM|HIGH",
    "model_accuracy": 0.0-1.0,
    "fairness_violations_count": 0
  },
  "model_performance": {},
  "bias_analysis": {},
  "risk_assessment": {},
  "key_findings": [],
  "recommendations": []
}
```

### Data Cleaning Usage

```python
from data_cleaning import DataCleaner

# Initialize cleaner with GPU acceleration
cleaner = DataCleaner(use_gpu=True)

# Load and clean data
df = cleaner.load_data('Datasets/loan_data.csv')
cleaned_df, audit = cleaner.anonymize_pii(df)

# Save results
cleaner.save_cleaned_data(cleaned_df, 'cleaned_output.csv')
cleaner.save_audit_report(audit, 'audit_report.json')
```

### Frontend Integration

```typescript
import { analyzeDataset, cleanDataset } from '@/lib/api';

// Analyze uploaded file
const handleAnalyze = async (file: File) => {
  const result = await analyzeDataset(file);
  console.log('Bias Score:', result.bias_metrics.overall_bias_score);
  console.log('Download:', result.report_file);
};

// Clean uploaded file
const handleClean = async (file: File) => {
  const result = await cleanDataset(file);
  console.log('Cells anonymized:', result.summary.total_cells_affected);
  console.log('Download cleaned:', result.files.cleaned_csv);
};
```

### Backend Integration

FastAPI:

```python
import pandas as pd
from fastapi import FastAPI, UploadFile
from ai_governance import AIGovernanceAnalyzer

app = FastAPI()
analyzer = AIGovernanceAnalyzer()

@app.post("/analyze")
async def analyze(file: UploadFile, target: str, protected: list):
    df = pd.read_csv(file.file)
    report = analyzer.analyze_dataframe(df, target, protected)
    return report
```

Flask:

```python
import pandas as pd
from flask import Flask, request, jsonify
from ai_governance import AIGovernanceAnalyzer

app = Flask(__name__)
analyzer = AIGovernanceAnalyzer()

@app.route('/analyze', methods=['POST'])
def analyze():
    file = request.files['file']
    df = pd.read_csv(file)
    report = analyzer.analyze_dataframe(
        df,
        request.form['target'],
        request.form.getlist('protected')
    )
    return jsonify(report)
```
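For intuition, pattern-based PII detection of the kind configured in `data_cleaning/config.py` can be approximated with plain regular expressions. This is only an illustrative sketch; the shipped `DataCleaner` also relies on spaCy and its own pattern set, which may differ:

```python
import re

# Simplified patterns for a few of the supported PII types (illustrative only)
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def mask_pii(text: str) -> str:
    """Replace matches of each pattern with a [TYPE] placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(mask_pii("Contact jane.doe@example.com or +47 22 33 44 55, SSN 123-45-6789"))
```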
--- "recommendations": [- **0.5 - 1.0**: High bias ❌
## 📈 Metrics Interpretation "[HIGH] Privacy: Remove PII columns before deployment",
### Bias Score (0-1, lower is better) "[MEDIUM] Fairness: Monitor demographic parity over time"### Risk Score (0-1, lower is better)
- **0.0 - 0.3**: ✅ Low bias - Good fairness
- **0.3 - 0.5**: ⚠️ Moderate bias - Monitoring recommended ],- **0.0 - 0.4**: LOW risk ✅
- **0.5 - 1.0**: ❌ High bias - Immediate action required
"report_file": "/reports/governance_report_20251107_123456.json"- **0.4 - 0.7**: MEDIUM risk ⚠️
### Risk Score (0-1, lower is better)
- **0.0 - 0.4**: ✅ LOW risk}- **0.7 - 1.0**: HIGH risk ❌
- **0.4 - 0.7**: ⚠️ MEDIUM risk
- **0.7 - 1.0**: ❌ HIGH risk```
### Fairness Metrics### Fairness Metrics
- **Disparate Impact**: Fair range 0.8 - 1.25
- **Statistical Parity**: Fair threshold < 0.1#### **POST /api/clean**- **Disparate Impact**: Fair range 0.8 - 1.25
- **Equal Opportunity**: Fair threshold < 0.1
Detect and anonymize PII in datasets.- **Statistical Parity**: Fair threshold < 0.1
---
- **Equal Opportunity**: Fair threshold < 0.1
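As a concrete reference for these thresholds, the three fairness metrics can be computed directly from predictions grouped by a protected attribute. A minimal sketch assuming binary (0/1) labels and predictions; it is not the `BiasAnalyzer` implementation, and the column names are placeholders:

```python
import pandas as pd

def fairness_metrics(df: pd.DataFrame, group_col: str, y_true: str, y_pred: str,
                     privileged, unprivileged) -> dict:
    """Disparate Impact, Statistical Parity Difference, Equal Opportunity Difference."""
    priv = df[df[group_col] == privileged]
    unpriv = df[df[group_col] == unprivileged]

    # Positive prediction rate per group
    rate_priv = priv[y_pred].mean()
    rate_unpriv = unpriv[y_pred].mean()

    # True positive rate per group (used for Equal Opportunity)
    tpr_priv = priv.loc[priv[y_true] == 1, y_pred].mean()
    tpr_unpriv = unpriv.loc[unpriv[y_true] == 1, y_pred].mean()

    return {
        "disparate_impact": rate_unpriv / rate_priv,                # fair range ~0.8 - 1.25
        "statistical_parity_difference": rate_unpriv - rate_priv,   # fair if |value| < 0.1
        "equal_opportunity_difference": tpr_unpriv - tpr_priv,      # fair if |value| < 0.1
    }

# Example usage on a scored DataFrame (hypothetical column names):
# fairness_metrics(df, group_col="gender", y_true="loan_approved",
#                  y_pred="prediction", privileged="male", unprivileged="female")
```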
---

## 🛠️ Technology Stack

### Backend

- **FastAPI** - Modern Python web framework
- **scikit-learn** - Machine learning
- **spaCy** - NLP for PII detection
- **PyTorch** - GPU acceleration (optional)
- **pandas** - Data processing

### Frontend

- **Next.js 14** - React framework with App Router
- **TypeScript** - Type safety
- **Tailwind CSS** - Styling
- **IndexedDB** - Browser storage
---

## 🤝 Contributing

Contributions are welcome! Please follow these steps:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

---

## 📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

---

## 🎓 Citation

If you use this project in your research or work, please cite:

```bibtex
@software{nordic_privacy_ai,
  title  = {Nordic Privacy AI - GDPR Compliance & AI Governance Platform},
  author = {PlatypusPus},
  year   = {2025},
  url    = {https://github.com/PlatypusPus/MushroomEmpire}
}
```
---

## 📧 Support

- Issues: GitHub Issues
- Discussions: GitHub Discussions

---

## 🙏 Acknowledgments

- Built for Nordic ecosystems (BankID, MitID, Suomi.fi)
- Inspired by GDPR, CCPA, and EU AI Act requirements
- Developed as a hackathon prototype

---

Made with ❤️ by the Nordic Privacy AI Team
"audit_report": "/reports/cleaning_audit_20251107_123456.json"
}app = Flask(name)
}analyzer = AIGovernanceAnalyzer()
@app.route('/analyze', methods=['POST'])
#### **GET /health**def analyze():
Health check endpoint with GPU status. file = request.files['file']
df = pd.read_csv(file)
**Response:** report = analyzer.analyze_dataframe(
```json df,
{ request.form['target'],
"status": "healthy", request.form.getlist('protected')
"version": "1.0.0", )
"gpu_available": true return jsonify(report)
}```
License
GET /reports/{filename}
Download generated reports and cleaned files.MIT License
---## Contributing
🔧 ConfigurationContributions welcome! Please open an issue or submit a pull request.
Environment Variables## Citation
Create .env file in frontend/nordic-privacy-ai/:If you use this module in your research or project, please cite:
NEXT_PUBLIC_API_URL=http://localhost:8000```
```AI Governance Module - Bias Detection and Risk Analysis
https://github.com/PlatypusPus/MushroomEmpire
### CORS Configuration```
Edit `api/main.py` to add production domains:
```python
origins = [
"http://localhost:3000",
"https://your-production-domain.com"
]
GPU Acceleration
GPU is automatically detected and used if available. To force CPU mode:
# In cleaning.py or api endpoints
DataCleaner(use_gpu=False)
🧪 Testing
Test the Backend
# Test analyze endpoint
curl -X POST "http://localhost:8000/api/analyze" -F "file=@Datasets/loan_data.csv"
# Test clean endpoint
curl -X POST "http://localhost:8000/api/clean" -F "file=@Datasets/loan_data.csv"
# Check health
curl http://localhost:8000/health
Run Unit Tests
# Test cleaning module
python test_cleaning.py
# Run all tests (if pytest configured)
pytest
📊 Usage Examples
Python SDK Usage
from ai_governance import AIGovernanceAnalyzer
# Initialize analyzer
analyzer = AIGovernanceAnalyzer()
# Analyze dataset
report = analyzer.analyze(
data_path='Datasets/loan_data.csv',
target_column='loan_approved',
protected_attributes=['gender', 'age', 'race']
)
# Print results
print(f"Bias Score: {report['summary']['overall_bias_score']:.3f}")
print(f"Risk Level: {report['summary']['risk_level']}")
print(f"Model Accuracy: {report['summary']['model_accuracy']:.3f}")
# Save report
analyzer.save_report(report, 'my_report.json')
Data Cleaning Usage
from cleaning import DataCleaner
# Initialize cleaner with GPU
cleaner = DataCleaner(use_gpu=True)
# Load and clean data
df = cleaner.load_data('Datasets/loan_data.csv')
cleaned_df, audit = cleaner.anonymize_pii(df)
# Save results
cleaner.save_cleaned_data(cleaned_df, 'cleaned_output.csv')
cleaner.save_audit_report(audit, 'audit_report.json')
Frontend Integration
import { analyzeDataset, cleanDataset } from '@/lib/api';
// Analyze uploaded file
const handleAnalyze = async (file: File) => {
const result = await analyzeDataset(file);
console.log('Bias Score:', result.bias_metrics.overall_bias_score);
console.log('Download:', result.report_file);
};
// Clean uploaded file
const handleClean = async (file: File) => {
const result = await cleanDataset(file);
console.log('Cells anonymized:', result.summary.total_cells_affected);
console.log('Download cleaned:', result.files.cleaned_csv);
};
📈 Metrics Interpretation
Bias Score (0-1, lower is better)
- 0.0 - 0.3: ✅ Low bias - Good fairness
- 0.3 - 0.5: ⚠️ Moderate bias - Monitoring recommended
- 0.5 - 1.0: ❌ High bias - Immediate action required
Risk Score (0-1, lower is better)
- 0.0 - 0.4: ✅ LOW risk
- 0.4 - 0.7: ⚠️ MEDIUM risk
- 0.7 - 1.0: ❌ HIGH risk
Fairness Metrics
- Disparate Impact: Fair range 0.8 - 1.25
- Statistical Parity: Fair threshold < 0.1
- Equal Opportunity: Fair threshold < 0.1
🛠️ Technology Stack
Backend
- FastAPI - Modern Python web framework
- scikit-learn - Machine learning
- spaCy - NLP for PII detection
- PyTorch - GPU acceleration (optional)
- pandas - Data processing
Frontend
- Next.js 14 - React framework with App Router
- TypeScript - Type safety
- Tailwind CSS - Styling
- IndexedDB - Browser storage
🤝 Contributing
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create a feature branch (
git checkout -b feature/AmazingFeature) - Commit your changes (
git commit -m 'Add some AmazingFeature') - Push to the branch (
git push origin feature/AmazingFeature) - Open a Pull Request
📝 License
This project is licensed under the MIT License - see the LICENSE file for details.
🎓 Citation
If you use this project in your research or work, please cite:
@software{nordic_privacy_ai,
title = {Nordic Privacy AI - GDPR Compliance & AI Governance Platform},
author = {PlatypusPus},
year = {2025},
url = {https://github.com/PlatypusPus/MushroomEmpire}
}
📧 Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
🙏 Acknowledgments
- Built for Nordic ecosystems (BankID, MitID, Suomi.fi)
- Inspired by GDPR, CCPA, and EU AI Act requirements
- Developed during a hackathon prototype
Made with ❤️ by the Nordic Privacy AI Team