agdar/DATA_GENERATION_GUIDE.md
2025-11-02 14:35:35 +03:00

Saudi-Influenced Test Data Generation Guide

Overview

This project includes a Django management command, generate_test_data, that generates realistic Saudi-influenced test data for every app in the AgdarCentre healthcare platform.

Features

The generate_test_data command creates:

Saudi Cultural Context

  • Arabic Names: Both English transliteration and Arabic script

    • Male names: محمد (Mohammed), عبدالله (Abdullah), فهد (Fahad), etc.
    • Female names: نورة (Noura), فاطمة (Fatima), سارة (Sarah), etc.
    • Family names: العتيبي (Al-Otaibi), الغامدي (Al-Ghamdi), etc.
  • Saudi Phone Numbers: Proper format with Saudi mobile prefixes

    • Format: +966 5X XXX XXXX
    • Valid prefixes: 50, 53, 54, 55, 56, 57, 58, 59
  • National IDs: 10-digit Saudi national ID format

    • Format: 1XXXXXXXXX (Saudi citizen) or 2XXXXXXXXX (resident)
  • Addresses: Saudi cities and districts

    • Cities: Riyadh, Jeddah, Mecca, Medina, Dammam, etc.
    • Riyadh districts: Al-Olaya, Al-Malaz, Al-Naseem, etc.
  • Currency: All financial data in SAR (Saudi Riyals)

  • Work Schedule: Saudi work week (Sunday-Thursday)

    • Morning shift: 08:00-12:00
    • Afternoon shift: 15:00-19:00 (after prayer break)
  • Insurance: Saudi insurance companies

    • Bupa Arabia, Tawuniya, Medgulf, Malath, etc.
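
As a rough illustration of the phone number and national ID formats listed above, the following sketch builds values with the same shape using only the standard library. The function names `saudi_mobile` and `national_id` are hypothetical and are not taken from the command's source. The sketch only matches the documented layout; it does not attempt any check-digit validation.

```python
import random

# Mobile prefixes listed in this guide.
MOBILE_PREFIXES = ["50", "53", "54", "55", "56", "57", "58", "59"]


def saudi_mobile(rng: random.Random) -> str:
    """Return a number shaped like +966 5X XXX XXXX (hypothetical helper)."""
    prefix = rng.choice(MOBILE_PREFIXES)
    subscriber = f"{rng.randrange(10**7):07d}"  # seven random digits
    return f"+966 {prefix} {subscriber[:3]} {subscriber[3:]}"


def national_id(rng: random.Random, resident: bool = False) -> str:
    """Return a 10-digit ID starting with 1 (citizen) or 2 (resident)."""
    first_digit = "2" if resident else "1"
    return first_digit + f"{rng.randrange(10**9):09d}"  # nine random digits


rng = random.Random(42)  # seeded so repeated runs give the same values
print(saudi_mobile(rng))
print(national_id(rng))
print(national_id(rng, resident=True))
```

Passing a seeded `random.Random` instance keeps the output repeatable, which can be convenient when test fixtures need to be stable between runs.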

Generated Data

The command generates data for all apps:

  1. Core App

    • Tenants (healthcare organizations)
    • Users (with various roles: doctors, nurses, therapists, admin, etc.)
    • Patients (with Saudi demographics)
    • Clinics/Departments
    • Files and SubFiles
    • Notification Preferences
  2. Appointments App

    • Providers
    • Rooms
    • Schedules (Sunday-Thursday)
    • Appointments (with varied statuses)
    • Appointment Reminders
    • Appointment Confirmations
  3. Finance App

    • Services (billable services)
    • Packages (session bundles)
    • Payers (insurance companies)
    • Invoices (with VAT)
    • Payments
    • Package Purchases
  4. Clinical Apps

    • Medical Consultations
    • Nursing Encounters (with vital signs)
    • ABA Consultations
    • OT Sessions
    • SLP Interventions
  5. Notifications App

    • Message Templates (bilingual)
    • Messages (SMS/WhatsApp)
  6. Referrals App

    • Internal and external referrals

Usage

Basic Usage

python manage.py generate_test_data

With the default options, this creates:

  • 1 tenant
  • 50 patients per tenant
  • 100 appointments per tenant
  • Associated clinical, financial, and communication records

Command Options

python manage.py generate_test_data [OPTIONS]

Options:

  • --tenants N: Number of tenants to create (default: 1)

    python manage.py generate_test_data --tenants 2
    
  • --patients N: Number of patients per tenant (default: 50)

    python manage.py generate_test_data --patients 100
    
  • --appointments N: Number of appointments per tenant (default: 100)

    python manage.py generate_test_data --appointments 200
    
  • --clear: Clear existing data before generating new data

    python manage.py generate_test_data --clear
    

    ⚠️ Warning: This will delete all existing data except superuser accounts!
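
Django hands an argparse parser to a command's `add_arguments(self, parser)` method, so the options above can be pictured as ordinary argparse arguments. The sketch below mirrors the documented flags and defaults with plain argparse so that it runs without Django. The `build_parser` function is hypothetical and the help strings are paraphrased from this guide, not copied from the command's source.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Mirror the documented options and defaults (illustrative only)."""
    parser = argparse.ArgumentParser(prog="manage.py generate_test_data")
    parser.add_argument("--tenants", type=int, default=1,
                        help="Number of tenants to create")
    parser.add_argument("--patients", type=int, default=50,
                        help="Number of patients per tenant")
    parser.add_argument("--appointments", type=int, default=100,
                        help="Number of appointments per tenant")
    parser.add_argument("--clear", action="store_true",
                        help="Clear existing data before generating new data")
    return parser


# Options that are not passed fall back to the documented defaults.
args = build_parser().parse_args(["--tenants", "3", "--patients", "30"])
print(args.tenants, args.patients, args.appointments, args.clear)
# 3 30 100 False
```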

Examples

Generate data for a single tenant with default settings:

python manage.py generate_test_data

Generate data for multiple tenants:

python manage.py generate_test_data --tenants 3 --patients 30 --appointments 80

Clear existing data and generate fresh data:

python manage.py generate_test_data --clear --patients 100 --appointments 200

Generate a large dataset for testing:

python manage.py generate_test_data --patients 200 --appointments 500

Data Characteristics

Patient Demographics

  • Age Distribution: Weighted towards children (therapy center context)
    • 60% children (2-12 years)
    • 25% teenagers (13-18 years)
    • 15% adults (19-60 years)
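
One way to get an age distribution like the one above is weighted sampling: pick an age band by weight, then pick an age uniformly inside the band. The following is a minimal sketch under that assumption. The names `AGE_BANDS` and `random_age` are hypothetical, and the command itself may sample ages differently.

```python
import random

# (min_age, max_age, weight in percent) for each band described above.
AGE_BANDS = [(2, 12, 60), (13, 18, 25), (19, 60, 15)]


def random_age(rng: random.Random) -> int:
    """Pick a band by weight, then a uniform age inside that band."""
    weights = [band[2] for band in AGE_BANDS]
    low, high, _ = rng.choices(AGE_BANDS, weights=weights)[0]
    return rng.randint(low, high)  # inclusive on both ends


rng = random.Random(7)
ages = [random_age(rng) for _ in range(10_000)]
children_share = sum(age <= 12 for age in ages) / len(ages)
print(f"children: {children_share:.1%}")  # should land near 60%
```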

Appointment Distribution

  • Time Range: Past 3 months + next month
  • Status Distribution:
    • Past appointments: 75% completed, 15% no-show, 10% cancelled
    • Today's appointments: Mix of confirmed, arrived, in-progress
    • Future appointments: 70% confirmed, 30% booked
  • Scheduling: Excludes Saudi weekends (Friday & Saturday)
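
The date and status rules above can be sketched as two small functions. This is an illustration only: the 90-day and 30-day windows approximate "3 months" and "next month", the status strings are hypothetical placeholders for whatever values the Appointment model uses, and the even split among today's statuses is an assumption because the guide only says "mix".

```python
import random
from datetime import date, timedelta

# date.weekday(): Monday=0 ... Friday=4, Saturday=5, Sunday=6.
SAUDI_WEEKEND = {4, 5}


def random_workday(rng: random.Random, today: date) -> date:
    """Pick a date from about 3 months back to 1 month ahead, skipping Fri/Sat."""
    while True:
        day = today + timedelta(days=rng.randint(-90, 30))
        if day.weekday() not in SAUDI_WEEKEND:
            return day


def pick_status(rng: random.Random, day: date, today: date) -> str:
    """Choose a status using the documented weights for past and future dates."""
    if day < today:
        return rng.choices(["completed", "no_show", "cancelled"],
                           weights=[75, 15, 10])[0]
    if day == today:
        return rng.choice(["confirmed", "arrived", "in_progress"])
    return rng.choices(["confirmed", "booked"], weights=[70, 30])[0]


rng = random.Random(2025)
today = date.today()
for _ in range(5):
    day = random_workday(rng, today)
    print(day.isoformat(), day.strftime("%A"), pick_status(rng, day, today))
```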

Financial Data

  • Services: 5 service types per clinic
  • Pricing: Realistic Saudi healthcare pricing (200-400 SAR)
  • VAT: 15% tax applied to all invoices
  • Insurance: Mix of self-pay and insured patients
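
To make the VAT rule concrete, here is a small worked example using `Decimal` so that currency amounts are not subject to binary floating-point error. The `invoice_totals` helper and the half-up rounding mode are assumptions for illustration; the Finance app may compute and round totals differently.

```python
from decimal import Decimal, ROUND_HALF_UP

VAT_RATE = Decimal("0.15")   # 15% VAT, as stated above
HALALA = Decimal("0.01")     # round to two decimal places of a riyal


def invoice_totals(subtotal: Decimal) -> dict:
    """Return subtotal, VAT, and total in SAR (hypothetical helper)."""
    vat = (subtotal * VAT_RATE).quantize(HALALA, rounding=ROUND_HALF_UP)
    return {"subtotal": subtotal, "vat": vat, "total": subtotal + vat}


print(invoice_totals(Decimal("300.00")))
# {'subtotal': Decimal('300.00'), 'vat': Decimal('45.00'), 'total': Decimal('345.00')}
```

A 300 SAR service, which sits in the middle of the documented price range, therefore produces a 345 SAR invoice.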

Clinical Records

  • Generated for completed appointments only
  • Includes realistic vital signs and measurements
  • Age-appropriate clinical data

Dependencies

The command requires the following Python packages:

  • django - Django framework
  • faker - For generating fake data
  • phonenumbers / django-phonenumber-field - For phone number handling

These should already be installed in your project environment.

Output

The command provides detailed progress output:

Starting Saudi-influenced test data generation...

Generating data for tenant: Agdar Rehabilitation Center
  Created 18 users
  Created 5 clinics
  Created 50 patients
  Created 15 providers
  Created 14 rooms
  Created 150 schedules
  Created 100 appointments
  Created clinical records
  Created 25 services
  Created 5 packages
  Created financial records
  Created communication records
  Created integration records

============================================================
DATA GENERATION SUMMARY
============================================================
  Appointments: 100
  Clinics: 5
  Patients: 50
  Providers: 15
  Rooms: 14
  Schedules: 150
  Services: 25
  Tenants: 1
  Users: 18
============================================================

✓ Test data generation completed successfully!

Notes

  • All generated data is synthetic and modeled on Saudi naming, contact, and scheduling conventions
  • Patient names are bilingual (English and Arabic)
  • Phone numbers follow Saudi mobile format
  • Addresses use real Saudi cities and districts
  • Work schedules respect Saudi work week and prayer times
  • Financial data uses SAR currency with proper VAT
  • Clinical data is age-appropriate and realistic

Troubleshooting

Issue: Command not found

python manage.py generate_test_data
# Error: No module named 'core.management.commands.generate_test_data'

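Solution: Ensure the core app is listed in INSTALLED_APPS (Django only discovers commands in installed apps) and that the management command directory structure exists: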

core/
  management/
    __init__.py
    commands/
      __init__.py
      generate_test_data.py
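
A quick way to confirm the layout above is to check for the three files from the project root. The script below is a small sketch; `missing_files` is a hypothetical helper, and it assumes it is run from the directory that contains manage.py.

```python
from pathlib import Path

# Files from the directory tree shown above, relative to the project root.
REQUIRED = [
    "core/management/__init__.py",
    "core/management/commands/__init__.py",
    "core/management/commands/generate_test_data.py",
]


def missing_files(project_root: str = ".") -> list:
    """Return the required paths that do not exist under project_root."""
    root = Path(project_root)
    return [path for path in REQUIRED if not (root / path).is_file()]


missing = missing_files()
print("Missing:", missing if missing else "nothing, layout looks complete")
```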

Issue: Import errors

# Error: No module named 'faker'

Solution: Install required dependencies:

pip install faker

Issue: Database errors

# Error: UNIQUE constraint failed

Solution: Use the --clear flag to clear existing data first. Note that this deletes all existing data except superuser accounts:

python manage.py generate_test_data --clear

Best Practices

  1. Development: Use smaller datasets for faster generation

    python manage.py generate_test_data --patients 20 --appointments 40
    
  2. Testing: Use moderate datasets

    python manage.py generate_test_data --patients 50 --appointments 100
    
  3. Demo: Use larger datasets for realistic demos

    python manage.py generate_test_data --patients 200 --appointments 500
    
  4. Fresh Start: Always use --clear when you want to reset data

    python manage.py generate_test_data --clear
    

Support

For issues or questions about the data generation command, please refer to the project documentation or contact the development team.