# Nordic Privacy AI 🛡️

**AI-Powered GDPR Compliance & Privacy Protection Platform**

A comprehensive solution for AI governance, bias detection, risk assessment, and automated PII cleaning with GDPR compliance. Built for Nordic ecosystems and beyond.

[![Python](https://img.shields.io/badge/Python-3.8+-blue.svg)](https://www.python.org/)
[![FastAPI](https://img.shields.io/badge/FastAPI-0.109+-green.svg)](https://fastapi.tiangolo.com/)
[![Next.js](https://img.shields.io/badge/Next.js-14.2+-black.svg)](https://nextjs.org/)
[![License](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)

---
## 🚀 Quick Start
### Prerequisites

- Python 3.8+
- Node.js 18+
- GPU (optional, for faster processing)

### Installation

1. **Clone the repository**

```powershell
git clone https://github.com/PlatypusPus/MushroomEmpire.git
cd MushroomEmpire
```

2. **Install Python dependencies**

```powershell
pip install -r requirements.txt
python -m spacy download en_core_web_sm
```

The core modules can also be installed as an editable package with `pip install -e .`.

3. **Install frontend dependencies**

```powershell
cd frontend
npm install
cd ..
```
### Running the Application
1. **Start the FastAPI backend** (Terminal 1)

```powershell
python start_api.py
```

Backend runs at: **http://localhost:8000**

2. **Start the Next.js frontend** (Terminal 2)

```powershell
cd frontend
npm run dev
```

Frontend runs at: **http://localhost:3000**

3. **Access the application**

- Frontend UI: http://localhost:3000
- Try It Page: http://localhost:3000/try
- API Documentation: http://localhost:8000/docs
- Health Check: http://localhost:8000/health
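To confirm the backend came up cleanly before opening the UI, you can query the health endpoint from Python (a quick sketch; assumes the `requests` package is installed):

```python
import requests

# GET /health returns service status, version, and GPU availability (see API Reference).
resp = requests.get("http://localhost:8000/health", timeout=5)
resp.raise_for_status()
print(resp.json())  # e.g. {'status': 'healthy', 'version': '1.0.0', 'gpu_available': True}
```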
---

## 📋 Features

### 🎯 AI Governance & Bias Detection

- **Fairness Metrics**: Disparate Impact, Statistical Parity Difference, Equal Opportunity Difference
- **Demographic Analysis**: Group-wise performance evaluation
- **Violation Detection**: Automatic flagging with severity levels (HIGH/MEDIUM/LOW)
- **Generalized Model Training**: Classification model that works with any tabular dataset, with auto-detection of feature types and protected attributes
- **Model Performance**: Comprehensive ML metrics (accuracy, precision, recall, F1) plus feature importance analysis

### 🛡️ Privacy Risk Assessment

- **Privacy Risks**: PII detection, GDPR compliance scoring, data exposure analysis
- **Ethical Risks**: Fairness, transparency, accountability, and social impact evaluation
- **Compliance Risks**: Regulatory adherence (GDPR, CCPA, EU AI Act)
- **Data Quality**: Missing data, class imbalance, outlier detection

### 🧹 Automated Data Cleaning

- **PII Detection**: Email, phone, SSN, credit cards, IP addresses, and more
- **GPU Acceleration**: CUDA-enabled for up to 10x faster processing
- **GDPR Compliance**: Automatic anonymization with audit trails
- **Smart Anonymization**: Context-aware masking and pseudonymization (illustrated in the sketch below)

### 🌐 Modern Web Interface

- **Drag & Drop Upload**: Intuitive CSV file handling
- **Real-time Processing**: Live feedback and progress tracking
- **Interactive Dashboards**: Visualize bias metrics, risk scores, and results
- **Report Downloads**: JSON reports, cleaned CSV, and audit logs
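To make the masking idea concrete, here is a simplified, self-contained illustration of pattern-based PII redaction. It is only a sketch of the concept, not the module's actual detectors or placeholders:

```python
import re

# Simplified stand-ins for two of the documented PII patterns (email, US phone).
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
}

def mask_pii(text: str) -> str:
    """Replace each detected PII value with a typed placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

print(mask_pii("Contact alice@example.com or 555-123-4567"))
# -> Contact [EMAIL REDACTED] or [PHONE REDACTED]
```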
---

## 🏗️ Project Structure

```
MushroomEmpire/
├── api/                          # FastAPI Backend
│   ├── main.py                   # Application entry point
│   ├── routers/
│   │   ├── analyze.py            # POST /api/analyze - AI Governance
│   │   └── clean.py              # POST /api/clean - Data Cleaning
│   └── utils/                    # Helper utilities
│
├── ai_governance/                # Core AI Governance Module
│   ├── __init__.py               # AIGovernanceAnalyzer class
│   ├── data_processor.py         # Data preprocessing
│   ├── model_trainer.py          # ML model training
│   ├── bias_analyzer.py          # Bias detection engine
│   ├── risk_analyzer.py          # Risk assessment engine
│   └── report_generator.py       # JSON report generation
│
├── data_cleaning/                # Data Cleaning Module
│   ├── __init__.py               # DataCleaner class
│   ├── cleaner.py                # PII detection & anonymization
│   └── config.py                 # PII patterns & GDPR rules
│
├── frontend/                     # Next.js Frontend
│   ├── app/                      # App Router pages
│   │   ├── page.tsx              # Landing page
│   │   └── try/page.tsx          # Try It page (workflow UI)
│   ├── components/
│   │   └── try/
│   │       ├── CenterPanel.tsx   # File upload & results
│   │       ├── Sidebar.tsx       # Workflow tabs
│   │       └── ChatbotPanel.tsx  # AI assistant
│   └── lib/
│       ├── api.ts                # TypeScript API client
│       └── indexeddb.ts          # Browser caching utilities
│
├── Datasets/                     # Sample datasets
│   └── loan_data.csv             # Example: loan approval dataset
│
├── reports/                      # Generated reports (auto-created)
│   ├── governance_report_*.json
│   ├── cleaned_*.csv
│   └── cleaning_audit_*.json
│
├── start_api.py                  # Backend startup script
├── setup.py                      # Package configuration
├── requirements.txt              # Python dependencies
└── README.md                     # This file
```

---
## 📡 API Reference
### Base URL

```
http://localhost:8000
```
### Endpoints
#### **POST /api/analyze**

Analyze a dataset for bias, fairness, and risk.

**Request:**

```bash
curl -X POST "http://localhost:8000/api/analyze" \
  -F "file=@Datasets/loan_data.csv"
```

**Response:**

```json
{
  "status": "success",
  "filename": "loan_data.csv",
  "dataset_info": {
    "rows": 1000,
    "columns": 15
  },
  "model_performance": {
    "accuracy": 0.85,
    "precision": 0.82,
    "recall": 0.88,
    "f1_score": 0.85
  },
  "bias_metrics": {
    "overall_bias_score": 0.23,
    "violations_detected": []
  },
  "risk_assessment": {
    "overall_risk_score": 0.35,
    "privacy_risks": [],
    "ethical_risks": []
  },
  "recommendations": [
    "[HIGH] Privacy: Remove PII columns before deployment",
    "[MEDIUM] Fairness: Monitor demographic parity over time"
  ],
  "report_file": "/reports/governance_report_20251107_123456.json"
}
```
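The same endpoint is easy to script against; a minimal sketch using the `requests` package (assumed installed):

```python
import requests

# Upload a CSV to the analyze endpoint and inspect the headline scores.
with open("Datasets/loan_data.csv", "rb") as f:
    resp = requests.post(
        "http://localhost:8000/api/analyze",
        files={"file": ("loan_data.csv", f, "text/csv")},
    )
resp.raise_for_status()
result = resp.json()

print("Bias score:", result["bias_metrics"]["overall_bias_score"])
print("Risk score:", result["risk_assessment"]["overall_risk_score"])
print("Report:", result["report_file"])
```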
#### **POST /api/clean**
Detect and anonymize PII in datasets.

**Request:**

```bash
curl -X POST "http://localhost:8000/api/clean" \
  -F "file=@Datasets/loan_data.csv"
```

**Response:**

```json
{
  "status": "success",
  "dataset_info": {
    "original_rows": 1000,
    "original_columns": 15,
    "cleaned_rows": 1000,
    "cleaned_columns": 13
  },
  "summary": {
    "columns_removed": ["ssn", "email"],
    "columns_anonymized": ["phone", "address"],
    "total_cells_affected": 2847
  },
  "pii_detections": {
    "EMAIL": 1000,
    "PHONE": 987,
    "SSN": 1000
  },
  "gdpr_compliance": [
    "Article 5(1)(c) - Data minimization",
    "Article 17 - Right to erasure",
    "Article 25 - Data protection by design"
  ],
  "files": {
    "cleaned_csv": "/reports/cleaned_20251107_123456.csv",
    "audit_report": "/reports/cleaning_audit_20251107_123456.json"
  }
}
```
#### **GET /health**

Health check endpoint with GPU status.

**Response:**

```json
{
  "status": "healthy",
  "version": "1.0.0",
  "gpu_available": true
}
```

#### **GET /reports/{filename}**

Download generated reports and cleaned files.
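Artifact paths returned by the other endpoints (e.g. `report_file`, `files.cleaned_csv`) point at this route, so scripts can fetch them directly; a small sketch with `requests`:

```python
import requests

# Path exactly as returned in an /api/analyze or /api/clean response.
path = "/reports/cleaned_20251107_123456.csv"

resp = requests.get(f"http://localhost:8000{path}", timeout=30)
resp.raise_for_status()

# Save under the original filename.
with open(path.rsplit("/", 1)[-1], "wb") as out:
    out.write(resp.content)
```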
---
## 🔧 Configuration

### Environment Variables

Create a `.env` file in `frontend/`:

```env
NEXT_PUBLIC_API_URL=http://localhost:8000
```

### CORS Configuration

Edit `api/main.py` to add production domains:

```python
origins = [
    "http://localhost:3000",
    "https://your-production-domain.com"
]
```

### GPU Acceleration

The GPU is detected automatically and used when available. To force CPU mode:

```python
# In data_cleaning/cleaner.py or the API endpoints
DataCleaner(use_gpu=False)
```
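For reference, GPU detection in a PyTorch-based pipeline like this one typically reduces to a CUDA availability check. A minimal sketch (the `use_gpu` flag mirrors the constructor above; the helper name is illustrative):

```python
import torch

def resolve_device(use_gpu: bool = True) -> str:
    """Pick CUDA when requested and available, otherwise fall back to CPU."""
    if use_gpu and torch.cuda.is_available():
        return "cuda"
    return "cpu"

print(resolve_device())       # "cuda" on a CUDA-capable machine, else "cpu"
print(resolve_device(False))  # always "cpu"
```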
---

## 🧪 Testing

### Test the Backend

```powershell
# Test the analyze endpoint
curl -X POST "http://localhost:8000/api/analyze" -F "file=@Datasets/loan_data.csv"

# Test the clean endpoint
curl -X POST "http://localhost:8000/api/clean" -F "file=@Datasets/loan_data.csv"

# Check health
curl http://localhost:8000/health
```

### Run Unit Tests

```powershell
# Test the cleaning module
python test_cleaning.py

# Run all tests (if pytest is configured)
pytest
```
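For a more structured suite, a minimal pytest-style test of the cleaning module could look like this (a sketch; it assumes `DataCleaner.anonymize_pii` accepts an in-memory DataFrame, as shown under Usage Examples below):

```python
import pandas as pd
from data_cleaning import DataCleaner

def test_anonymize_pii_masks_emails():
    df = pd.DataFrame({"email": ["alice@example.com"], "age": [34]})
    cleaner = DataCleaner(use_gpu=False)  # force CPU so the test runs anywhere
    cleaned_df, audit = cleaner.anonymize_pii(df)
    # The raw address must not survive anonymization.
    assert "alice@example.com" not in cleaned_df.to_string()
```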
---
## 📊 Usage Examples

### Python SDK Usage

```python
from ai_governance import AIGovernanceAnalyzer

# Initialize analyzer
analyzer = AIGovernanceAnalyzer()

# Analyze dataset
report = analyzer.analyze(
    data_path='Datasets/loan_data.csv',
    target_column='loan_approved',
    protected_attributes=['gender', 'age', 'race']
)

# Print results
print(f"Bias Score: {report['summary']['overall_bias_score']:.3f}")
print(f"Risk Level: {report['summary']['risk_level']}")
print(f"Model Accuracy: {report['summary']['model_accuracy']:.3f}")

# Save report
analyzer.save_report(report, 'my_report.json')
```

An in-memory DataFrame can be analyzed directly with `analyze_dataframe`:

```python
report = analyzer.analyze_dataframe(
    df=dataframe,
    target_column='target',
    protected_attributes=['gender', 'age']
)
```

### Using Individual Components

The `ai_governance` pipeline can also be driven step by step:

```python
from ai_governance import (
    DataProcessor,
    GeneralizedModelTrainer,
    BiasAnalyzer,
    RiskAnalyzer,
    ReportGenerator
)

# Process data
processor = DataProcessor(df)
processor.target_column = 'target'
processor.protected_attributes = ['gender', 'age']
processor.prepare_data()

# Train model
trainer = GeneralizedModelTrainer(
    processor.X_train,
    processor.X_test,
    processor.y_train,
    processor.y_test,
    processor.feature_names
)
trainer.train()
trainer.evaluate()

# Analyze bias
bias_analyzer = BiasAnalyzer(
    processor.X_test,
    processor.y_test,
    trainer.y_pred,
    processor.df,
    processor.protected_attributes,
    processor.target_column
)
bias_results = bias_analyzer.analyze()

# Assess risks
risk_analyzer = RiskAnalyzer(
    processor.df,
    trainer.results,
    bias_results,
    processor.protected_attributes,
    processor.target_column
)
risk_results = risk_analyzer.analyze()

# Generate report
report_gen = ReportGenerator(
    trainer.results,
    bias_results,
    risk_results,
    processor.df
)
report = report_gen.generate_report()
```

### Data Cleaning Usage

```python
from data_cleaning import DataCleaner

# Initialize cleaner with GPU
cleaner = DataCleaner(use_gpu=True)

# Load and clean data
df = cleaner.load_data('Datasets/loan_data.csv')
cleaned_df, audit = cleaner.anonymize_pii(df)

# Save results
cleaner.save_cleaned_data(cleaned_df, 'cleaned_output.csv')
cleaner.save_audit_report(audit, 'audit_report.json')
```

### Report Structure

Generated governance reports are JSON documents with this layout (value ranges shown as placeholders):

```json
{
  "metadata": {
    "report_id": "unique_id",
    "generated_at": "timestamp",
    "dataset_info": {}
  },
  "summary": {
    "overall_bias_score": "0.0-1.0",
    "overall_risk_score": "0.0-1.0",
    "risk_level": "LOW|MEDIUM|HIGH",
    "model_accuracy": "0.0-1.0",
    "fairness_violations_count": 0
  },
  "model_performance": {},
  "bias_analysis": {},
  "risk_assessment": {},
  "key_findings": [],
  "recommendations": []
}
```
### Frontend Integration

```typescript
import { analyzeDataset, cleanDataset } from '@/lib/api';

// Analyze uploaded file
const handleAnalyze = async (file: File) => {
  const result = await analyzeDataset(file);
  console.log('Bias Score:', result.bias_metrics.overall_bias_score);
  console.log('Download:', result.report_file);
};

// Clean uploaded file
const handleClean = async (file: File) => {
  const result = await cleanDataset(file);
  console.log('Cells anonymized:', result.summary.total_cells_affected);
  console.log('Download cleaned:', result.files.cleaned_csv);
};
```

### Backend Integration (FastAPI)

```python
import pandas as pd
from fastapi import FastAPI, UploadFile
from ai_governance import AIGovernanceAnalyzer

app = FastAPI()
analyzer = AIGovernanceAnalyzer()

@app.post("/analyze")
async def analyze(file: UploadFile, target: str, protected: list):
    df = pd.read_csv(file.file)
    report = analyzer.analyze_dataframe(df, target, protected)
    return report
```

### Backend Integration (Flask)

```python
import pandas as pd
from flask import Flask, request, jsonify
from ai_governance import AIGovernanceAnalyzer

app = Flask(__name__)
analyzer = AIGovernanceAnalyzer()

@app.route('/analyze', methods=['POST'])
def analyze():
    file = request.files['file']
    df = pd.read_csv(file)
    report = analyzer.analyze_dataframe(
        df,
        request.form['target'],
        request.form.getlist('protected')
    )
    return jsonify(report)
```
--- "recommendations": [- **0.5 - 1.0**: High bias ❌
## 📈 Metrics Interpretation "[HIGH] Privacy: Remove PII columns before deployment",
### Bias Score (0-1, lower is better) "[MEDIUM] Fairness: Monitor demographic parity over time"### Risk Score (0-1, lower is better)
- **0.0 - 0.3**: ✅ Low bias - Good fairness
- **0.3 - 0.5**: ⚠️ Moderate bias - Monitoring recommended ],- **0.0 - 0.4**: LOW risk ✅
- **0.5 - 1.0**: ❌ High bias - Immediate action required
"report_file": "/reports/governance_report_20251107_123456.json"- **0.4 - 0.7**: MEDIUM risk ⚠️
### Risk Score (0-1, lower is better)
- **0.0 - 0.4**: ✅ LOW risk}- **0.7 - 1.0**: HIGH risk ❌
- **0.4 - 0.7**: ⚠️ MEDIUM risk
- **0.7 - 1.0**: ❌ HIGH risk```
### Fairness Metrics### Fairness Metrics
- **Disparate Impact**: Fair range 0.8 - 1.25
- **Statistical Parity**: Fair threshold < 0.1#### **POST /api/clean**- **Disparate Impact**: Fair range 0.8 - 1.25
- **Equal Opportunity**: Fair threshold < 0.1
Detect and anonymize PII in datasets.- **Statistical Parity**: Fair threshold < 0.1
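For reference, these three metrics are conventionally computed from binary predictions as follows. This is an illustrative, self-contained sketch, not the package's `BiasAnalyzer` implementation:

```python
import numpy as np

def fairness_metrics(y_true, y_pred, unprivileged):
    """Fairness metrics for a binary protected attribute.

    unprivileged: boolean array marking members of the unprivileged group.
    """
    y_true, y_pred, unprivileged = map(np.asarray, (y_true, y_pred, unprivileged))

    # Selection rates: P(y_pred = 1) within each group.
    rate_u = y_pred[unprivileged].mean()
    rate_p = y_pred[~unprivileged].mean()

    # Disparate impact: ratio of selection rates (fair range ~0.8 - 1.25).
    disparate_impact = rate_u / rate_p

    # Statistical parity difference: gap in selection rates (fair if |gap| < 0.1).
    statistical_parity = rate_u - rate_p

    # Equal opportunity difference: gap in true positive rates (fair if |gap| < 0.1).
    tpr_u = y_pred[unprivileged & (y_true == 1)].mean()
    tpr_p = y_pred[~unprivileged & (y_true == 1)].mean()
    equal_opportunity = tpr_u - tpr_p

    return disparate_impact, statistical_parity, equal_opportunity
```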
---
## 🛠️ Technology Stack

### Backend

- **FastAPI** - Modern Python web framework
- **scikit-learn** - Machine learning
- **spaCy** - NLP for PII detection
- **PyTorch** - GPU acceleration (optional)
- **pandas** - Data processing

Core Python requirements: pandas >= 2.0.0, numpy >= 1.24.0, scikit-learn >= 1.3.0. See `requirements.txt` for the complete list.

### Frontend

- **Next.js 14** - React framework with App Router
- **TypeScript** - Type safety
- **Tailwind CSS** - Styling
- **IndexedDB** - Browser storage

---

## 🤝 Contributing

Contributions are welcome! Please follow these steps:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

---

## 📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

---

## 🎓 Citation

If you use this project in your research or work, please cite:

```bibtex
@software{nordic_privacy_ai,
  title  = {Nordic Privacy AI - GDPR Compliance & AI Governance Platform},
  author = {PlatypusPus},
  year   = {2025},
  url    = {https://github.com/PlatypusPus/MushroomEmpire}
}
```

---

## 📧 Support

- **Issues**: [GitHub Issues](https://github.com/PlatypusPus/MushroomEmpire/issues)
- **Discussions**: [GitHub Discussions](https://github.com/PlatypusPus/MushroomEmpire/discussions)

---

## 🙏 Acknowledgments

- Built for Nordic ecosystems (BankID, MitID, Suomi.fi)
- Inspired by GDPR, CCPA, and EU AI Act requirements
- Developed as a hackathon prototype

---

**Made with ❤️ by the Nordic Privacy AI Team**
"audit_report": "/reports/cleaning_audit_20251107_123456.json"
}app = Flask(__name__)
}analyzer = AIGovernanceAnalyzer()
```
@app.route('/analyze', methods=['POST'])
#### **GET /health**def analyze():
Health check endpoint with GPU status. file = request.files['file']
df = pd.read_csv(file)
**Response:** report = analyzer.analyze_dataframe(
```json df,
{ request.form['target'],
"status": "healthy", request.form.getlist('protected')
"version": "1.0.0", )
"gpu_available": true return jsonify(report)
}```
```
## License
#### **GET /reports/{filename}**
Download generated reports and cleaned files.MIT License
---## Contributing
## 🔧 ConfigurationContributions welcome! Please open an issue or submit a pull request.
### Environment Variables## Citation
Create `.env` file in `frontend/nordic-privacy-ai/`:If you use this module in your research or project, please cite:
```env
NEXT_PUBLIC_API_URL=http://localhost:8000```
```AI Governance Module - Bias Detection and Risk Analysis
https://github.com/PlatypusPus/MushroomEmpire
### CORS Configuration```
Edit `api/main.py` to add production domains:
```python
origins = [
"http://localhost:3000",
"https://your-production-domain.com"
]
```
### GPU Acceleration
GPU is automatically detected and used if available. To force CPU mode:
```python
# In cleaning.py or api endpoints
DataCleaner(use_gpu=False)
```
---
## 🧪 Testing
### Test the Backend
```powershell
# Test analyze endpoint
curl -X POST "http://localhost:8000/api/analyze" -F "file=@Datasets/loan_data.csv"
# Test clean endpoint
curl -X POST "http://localhost:8000/api/clean" -F "file=@Datasets/loan_data.csv"
# Check health
curl http://localhost:8000/health
```
### Run Unit Tests
```powershell
# Test cleaning module
python test_cleaning.py
# Run all tests (if pytest configured)
pytest
```
---
## 📊 Usage Examples
### Python SDK Usage
```python
from ai_governance import AIGovernanceAnalyzer
# Initialize analyzer
analyzer = AIGovernanceAnalyzer()
# Analyze dataset
report = analyzer.analyze(
data_path='Datasets/loan_data.csv',
target_column='loan_approved',
protected_attributes=['gender', 'age', 'race']
)
# Print results
print(f"Bias Score: {report['summary']['overall_bias_score']:.3f}")
print(f"Risk Level: {report['summary']['risk_level']}")
print(f"Model Accuracy: {report['summary']['model_accuracy']:.3f}")
# Save report
analyzer.save_report(report, 'my_report.json')
```
### Data Cleaning Usage
```python
from cleaning import DataCleaner
# Initialize cleaner with GPU
cleaner = DataCleaner(use_gpu=True)
# Load and clean data
df = cleaner.load_data('Datasets/loan_data.csv')
cleaned_df, audit = cleaner.anonymize_pii(df)
# Save results
cleaner.save_cleaned_data(cleaned_df, 'cleaned_output.csv')
cleaner.save_audit_report(audit, 'audit_report.json')
```
### Frontend Integration
```typescript
import { analyzeDataset, cleanDataset } from '@/lib/api';
// Analyze uploaded file
const handleAnalyze = async (file: File) => {
const result = await analyzeDataset(file);
console.log('Bias Score:', result.bias_metrics.overall_bias_score);
console.log('Download:', result.report_file);
};
// Clean uploaded file
const handleClean = async (file: File) => {
const result = await cleanDataset(file);
console.log('Cells anonymized:', result.summary.total_cells_affected);
console.log('Download cleaned:', result.files.cleaned_csv);
};
```
---
## 📈 Metrics Interpretation
### Bias Score (0-1, lower is better)
- **0.0 - 0.3**: ✅ Low bias - Good fairness
- **0.3 - 0.5**: ⚠️ Moderate bias - Monitoring recommended
- **0.5 - 1.0**: ❌ High bias - Immediate action required
### Risk Score (0-1, lower is better)
- **0.0 - 0.4**: ✅ LOW risk
- **0.4 - 0.7**: ⚠️ MEDIUM risk
- **0.7 - 1.0**: ❌ HIGH risk
### Fairness Metrics
- **Disparate Impact**: Fair range 0.8 - 1.25
- **Statistical Parity**: Fair threshold < 0.1
- **Equal Opportunity**: Fair threshold < 0.1
---
## 🛠️ Technology Stack
### Backend
- **FastAPI** - Modern Python web framework
- **scikit-learn** - Machine learning
- **spaCy** - NLP for PII detection
- **PyTorch** - GPU acceleration (optional)
- **pandas** - Data processing
### Frontend
- **Next.js 14** - React framework with App Router
- **TypeScript** - Type safety
- **Tailwind CSS** - Styling
- **IndexedDB** - Browser storage
---
## 🤝 Contributing
Contributions are welcome! Please follow these steps:
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
---
## 📝 License
This project is licensed under the MIT License - see the LICENSE file for details.
---
## 🎓 Citation
If you use this project in your research or work, please cite:
```bibtex
@software{nordic_privacy_ai,
title = {Nordic Privacy AI - GDPR Compliance & AI Governance Platform},
author = {PlatypusPus},
year = {2025},
url = {https://github.com/PlatypusPus/MushroomEmpire}
}
```
---
## 📧 Support
- **Issues**: [GitHub Issues](https://github.com/PlatypusPus/MushroomEmpire/issues)
- **Discussions**: [GitHub Discussions](https://github.com/PlatypusPus/MushroomEmpire/discussions)
---
## 🙏 Acknowledgments
- Built for Nordic ecosystems (BankID, MitID, Suomi.fi)
- Inspired by GDPR, CCPA, and EU AI Act requirements
- Developed during a hackathon prototype
---
**Made with ❤️ by the Nordic Privacy AI Team**