agdar/DATA_GENERATION_GUIDE.md
2025-11-02 14:35:35 +03:00

# Saudi-Influenced Test Data Generation Guide
## Overview
This project includes a Django management command, `generate_test_data`, that generates realistic Saudi-influenced test data for all applications in the AgdarCentre healthcare platform.
## Features
The `generate_test_data` command creates:
### Saudi Cultural Context
- **Arabic Names**: Both English transliteration and Arabic script
  - Male names: محمد (Mohammed), عبدالله (Abdullah), فهد (Fahad), etc.
  - Female names: نورة (Noura), فاطمة (Fatima), سارة (Sarah), etc.
  - Family names: العتيبي (Al-Otaibi), الغامدي (Al-Ghamdi), etc.
- **Saudi Phone Numbers**: Proper format with Saudi mobile prefixes
  - Format: +966 5X XXX XXXX
  - Valid prefixes: 50, 53, 54, 55, 56, 57, 58, 59
- **National IDs**: 10-digit Saudi national ID format
  - Format: 1XXXXXXXXX (Saudi) or 2XXXXXXXXX (Resident)
- **Addresses**: Saudi cities and districts
  - Cities: Riyadh, Jeddah, Mecca, Medina, Dammam, etc.
  - Riyadh districts: Al-Olaya, Al-Malaz, Al-Naseem, etc.
- **Currency**: All financial data in SAR (Saudi Riyals)
- **Work Schedule**: Saudi work week (Sunday-Thursday)
  - Morning shift: 08:00-12:00
  - Afternoon shift: 15:00-19:00 (after prayer break)
- **Insurance**: Saudi insurance companies
  - Bupa Arabia, Tawuniya, Medgulf, Malath, etc.
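
The phone and national ID formats listed above can be sketched in a few lines of Python. This is an illustration of the documented formats only, not the command's actual code: the function names `saudi_mobile` and `national_id` are hypothetical, and the sketch matches the shape of the values without validating them in any other way.

```python
import random

# Prefixes copied from the "Valid prefixes" list above.
MOBILE_PREFIXES = ["50", "53", "54", "55", "56", "57", "58", "59"]


def saudi_mobile(rng: random.Random) -> str:
    """Return a number shaped like +966 5X XXX XXXX (hypothetical helper)."""
    prefix = rng.choice(MOBILE_PREFIXES)
    subscriber = f"{rng.randrange(10**7):07d}"  # 7 digits, zero-padded
    return f"+966 {prefix} {subscriber[:3]} {subscriber[3:]}"


def national_id(rng: random.Random, resident: bool = False) -> str:
    """Return 10 digits starting with 1 (Saudi) or 2 (resident)."""
    first = "2" if resident else "1"
    return first + f"{rng.randrange(10**9):09d}"  # 9 more digits


if __name__ == "__main__":
    rng = random.Random(42)  # seeded so repeated runs give the same values
    print(saudi_mobile(rng))
    print(national_id(rng))
    print(national_id(rng, resident=True))
```

Passing a seeded `random.Random` instance keeps the output reproducible, which is convenient when test fixtures need to be stable across runs.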
### Generated Data
The command generates data for all apps:
1. **Core App**
   - Tenants (healthcare organizations)
   - Users (with various roles: doctors, nurses, therapists, admin, etc.)
   - Patients (with Saudi demographics)
   - Clinics/Departments
   - Files and SubFiles
   - Notification Preferences
2. **Appointments App**
   - Providers
   - Rooms
   - Schedules (Sunday-Thursday)
   - Appointments (with varied statuses)
   - Appointment Reminders
   - Appointment Confirmations
3. **Finance App**
   - Services (billable services)
   - Packages (session bundles)
   - Payers (insurance companies)
   - Invoices (with VAT)
   - Payments
   - Package Purchases
4. **Clinical Apps**
   - Medical Consultations
   - Nursing Encounters (with vital signs)
   - ABA Consultations
   - OT Sessions
   - SLP Interventions
5. **Notifications App**
   - Message Templates (bilingual)
   - Messages (SMS/WhatsApp)
6. **Referrals App**
   - Internal and external referrals
## Usage
### Basic Usage
```bash
python manage.py generate_test_data
```
This will create:
- 1 tenant
- 50 patients per tenant
- 100 appointments per tenant
- Associated clinical, financial, and communication records
### Command Options
```bash
python manage.py generate_test_data [OPTIONS]
```
**Options:**
- `--tenants N`: Number of tenants to create (default: 1)
  ```bash
  python manage.py generate_test_data --tenants 2
  ```
- `--patients N`: Number of patients per tenant (default: 50)
  ```bash
  python manage.py generate_test_data --patients 100
  ```
- `--appointments N`: Number of appointments per tenant (default: 100)
  ```bash
  python manage.py generate_test_data --appointments 200
  ```
- `--clear`: Clear existing data before generating new data
  ```bash
  python manage.py generate_test_data --clear
  ```
  **⚠️ Warning**: This will delete all existing data except superuser accounts!
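
Django management commands declare their options on an `argparse` parser inside `add_arguments`. The standalone sketch below mirrors the four options and defaults documented above using plain `argparse`, so it runs without a Django project. It assumes the real command declares its options equivalently; `build_parser` is a hypothetical name used only for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="generate_test_data")
    parser.add_argument("--tenants", type=int, default=1,
                        help="Number of tenants to create")
    parser.add_argument("--patients", type=int, default=50,
                        help="Number of patients per tenant")
    parser.add_argument("--appointments", type=int, default=100,
                        help="Number of appointments per tenant")
    parser.add_argument("--clear", action="store_true",
                        help="Clear existing data before generating new data")
    return parser


if __name__ == "__main__":
    # Equivalent to: generate_test_data --tenants 3 --patients 30
    options = build_parser().parse_args(["--tenants", "3", "--patients", "30"])
    print(options)
```

Because `--clear` is a `store_true` flag, it is off unless passed explicitly, which matches the destructive behavior being opt-in.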
### Examples
**Generate data for a single tenant with default settings:**
```bash
python manage.py generate_test_data
```
**Generate data for multiple tenants:**
```bash
python manage.py generate_test_data --tenants 3 --patients 30 --appointments 80
```
**Clear existing data and generate fresh data:**
```bash
python manage.py generate_test_data --clear --patients 100 --appointments 200
```
**Generate a large dataset for testing:**
```bash
python manage.py generate_test_data --patients 200 --appointments 500
```
## Data Characteristics
### Patient Demographics
- **Age Distribution**: Weighted towards children (therapy center context)
  - 60% children (2-12 years)
  - 25% teenagers (13-18 years)
  - 15% adults (19-60 years)
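
One way to produce the age weighting above is a two-step draw: pick an age band with the stated probability, then pick a uniform age inside that band. This is a minimal sketch under that assumption; the command may implement the weighting differently, and `AGE_BANDS` and `sample_age` are hypothetical names.

```python
import random

# (min_age, max_age) bands with the weights documented above.
AGE_BANDS = [((2, 12), 0.60), ((13, 18), 0.25), ((19, 60), 0.15)]


def sample_age(rng: random.Random) -> int:
    """Pick a band by weight, then a uniform age within that band."""
    bands = [band for band, _ in AGE_BANDS]
    weights = [weight for _, weight in AGE_BANDS]
    low, high = rng.choices(bands, weights=weights, k=1)[0]
    return rng.randint(low, high)  # randint includes both endpoints


if __name__ == "__main__":
    rng = random.Random(7)
    ages = [sample_age(rng) for _ in range(10_000)]
    children = sum(1 for age in ages if age <= 12) / len(ages)
    print(f"share of children: {children:.2%}")
```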
### Appointment Distribution
- **Time Range**: Past 3 months + next month
- **Status Distribution**:
  - Past appointments: 75% completed, 15% no-show, 10% cancelled
  - Today's appointments: Mix of confirmed, arrived, in-progress
  - Future appointments: 70% confirmed, 30% booked
- **Scheduling**: Excludes Saudi weekends (Friday & Saturday)
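
The scheduling and status rules above can be sketched as follows. The status strings (`"no_show"`, `"in_progress"`, and so on) and the function names are assumptions made for illustration; the platform's actual status values may differ, and today's statuses are drawn uniformly here because the guide only says "mix".

```python
import random
from datetime import date, timedelta

# date.weekday() counts Monday as 0, so Friday is 4 and Saturday is 5.
WEEKEND = {4, 5}


def working_days(start: date, end: date):
    """Yield every Sunday-Thursday date from start to end, inclusive."""
    day = start
    while day <= end:
        if day.weekday() not in WEEKEND:
            yield day
        day += timedelta(days=1)


def pick_status(day: date, today: date, rng: random.Random) -> str:
    """Choose a status using the documented weights (names are hypothetical)."""
    if day < today:
        return rng.choices(["completed", "no_show", "cancelled"],
                           weights=[75, 15, 10], k=1)[0]
    if day == today:
        return rng.choice(["confirmed", "arrived", "in_progress"])
    return rng.choices(["confirmed", "booked"], weights=[70, 30], k=1)[0]


if __name__ == "__main__":
    rng = random.Random(1)
    today = date(2025, 11, 4)
    for day in working_days(date(2025, 10, 31), date(2025, 11, 6)):
        print(day.isoformat(), pick_status(day, today, rng))
```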
### Financial Data
- **Services**: 5 service types per clinic
- **Pricing**: Realistic Saudi healthcare pricing (200-400 SAR)
- **VAT**: 15% tax applied to all invoices
- **Insurance**: Mix of self-pay and insured patients
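
As a worked example of the 15% VAT rule, the sketch below computes invoice totals with `Decimal` to avoid floating-point rounding in currency values. The function name `invoice_totals` and the choice of half-up rounding to two decimal places are assumptions, not details taken from the command.

```python
from decimal import Decimal, ROUND_HALF_UP

VAT_RATE = Decimal("0.15")  # 15% VAT, as stated above
HALALA = Decimal("0.01")    # 1 SAR = 100 halalas


def invoice_totals(subtotal: Decimal) -> dict[str, Decimal]:
    """Return subtotal, tax, and total in SAR, rounded to two decimals."""
    subtotal = subtotal.quantize(HALALA, rounding=ROUND_HALF_UP)
    tax = (subtotal * VAT_RATE).quantize(HALALA, rounding=ROUND_HALF_UP)
    return {"subtotal": subtotal, "tax": tax, "total": subtotal + tax}


if __name__ == "__main__":
    # A 300 SAR service carries 45 SAR of VAT, for a total of 345 SAR.
    print(invoice_totals(Decimal("300")))
```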
### Clinical Records
- Generated for completed appointments only
- Includes realistic vital signs and measurements
- Age-appropriate clinical data
## Dependencies
The command requires the following Python packages:
- `django` - Django framework
- `faker` - For generating fake data
- `phonenumbers` / `django-phonenumber-field` - For phone number handling
These should already be installed in your project environment.
## Output
The command provides detailed progress output:
```
Starting Saudi-influenced test data generation...
Generating data for tenant: Agdar Rehabilitation Center
Created 18 users
Created 5 clinics
Created 50 patients
Created 15 providers
Created 14 rooms
Created 150 schedules
Created 100 appointments
Created clinical records
Created 25 services
Created 5 packages
Created financial records
Created communication records
Created integration records
============================================================
DATA GENERATION SUMMARY
============================================================
Appointments: 100
Clinics: 5
Patients: 50
Providers: 15
Rooms: 14
Schedules: 150
Services: 25
Tenants: 1
Users: 18
============================================================
✓ Test data generation completed successfully!
```
## Notes
- All generated data is synthetic; it is shaped to look realistic and to follow Saudi conventions
- Patient names are bilingual (English and Arabic)
- Phone numbers follow Saudi mobile format
- Addresses use real Saudi cities and districts
- Work schedules respect Saudi work week and prayer times
- Financial data uses SAR currency with proper VAT
- Clinical data is age-appropriate and realistic
## Troubleshooting
**Issue**: Command not found
```bash
python manage.py generate_test_data
# Error: No module named 'core.management.commands.generate_test_data'
```
**Solution**: Ensure the management command directory structure exists:
```
core/
    management/
        __init__.py
        commands/
            __init__.py
            generate_test_data.py
```
**Issue**: Import errors
```bash
# Error: No module named 'faker'
```
**Solution**: Install required dependencies:
```bash
pip install faker
```
**Issue**: Database errors
```bash
# Error: UNIQUE constraint failed
```
**Solution**: Use the `--clear` flag to clear existing data first (this deletes all existing data except superuser accounts):
```bash
python manage.py generate_test_data --clear
```
## Best Practices
1. **Development**: Use smaller datasets for faster generation
   ```bash
   python manage.py generate_test_data --patients 20 --appointments 40
   ```
2. **Testing**: Use moderate datasets
   ```bash
   python manage.py generate_test_data --patients 50 --appointments 100
   ```
3. **Demo**: Use larger datasets for realistic demos
   ```bash
   python manage.py generate_test_data --patients 200 --appointments 500
   ```
4. **Fresh Start**: Always use `--clear` when you want to reset data
   ```bash
   python manage.py generate_test_data --clear
   ```
## Support
For issues or questions about the data generation command, please refer to the project documentation or contact the development team.