# Copilot Instructions for wohn-bot

## Project Overview

A Python-based apartment monitoring bot for Berlin's public housing portal (inberlinwohnen.de) and WG rooms (wgcompany.de). Monitors listings from 6 housing companies (HOWOGE, Gewobag, Degewo, Gesobau, Stadt und Land, WBM) plus WGcompany, and sends Telegram notifications with optional auto-application via Playwright browser automation.

## Architecture

**Modularized structure** with the following key components:

- `main.py`: Entry point for the bot. Runs the monitoring loop and autocleaning every 48 hours.
- `handlers/`: Contains company-specific handlers for auto-apply functionality. Each handler is responsible for automating the application process for a specific housing company. Includes:
  - `howoge_handler.py`
  - `gewobag_handler.py`
  - `degewo_handler.py`
  - `gesobau_handler.py`
  - `stadtundland_handler.py`
  - `wbm_handler.py`
  - `wgcompany_notifier.py`: Handles WGcompany listing fetching, deduplication, and notification
  - `base_handler.py`: Provides shared functionality for all handlers.
- `application_handler.py`: Delegates application tasks to the appropriate handler based on the company. Enforces valid browser context.
- `telegram_bot.py`: Fully async Telegram bot handler for commands and notifications. Uses httpx for messaging.
- `autoclean_debug.py`: Deletes debug files (screenshots, HTML) older than 48 hours.
- `helper_functions/`: Contains data merge utilities for combining stats from multiple sources:
  - `merge_listing_times.py`
  - `merge_applications.py`
  - `merge_dict_json.py`
  - `merge_wgcompany_times.py`

**Data flow**: Fetch listings → Compare with `listings.json` / `wgcompany_listings.json` → Detect new → Log to CSV → Auto-apply if autopilot enabled → Save to `applications.json` → Send Telegram notification → Autoclean debug files every 48 hours.

## Key Patterns

### Company-specific handlers

Each housing company has a dedicated handler in the `handlers/` directory.
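
As a rough illustration of this pattern, the sketch below shows a base class, one company-specific subclass, and a dictionary lookup of the kind `application_handler.py` uses to pick a handler. The method name `apply`, the `Listing` fields, the boolean return value, and the `dispatch` helper are assumptions made for illustration, not the project's actual interface. The real handlers drive a Playwright browser context, which is omitted here.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Listing:
    """Hypothetical listing record; the field names are assumptions."""
    company: str
    link: str


class BaseHandler(ABC):
    """Stand-in for the shared base class in handlers/base_handler.py."""

    @abstractmethod
    def apply(self, listing: Listing) -> bool:
        """Return True if the application was submitted (assumed contract)."""


class NewCompanyHandler(BaseHandler):
    def apply(self, listing: Listing) -> bool:
        # A real handler would automate the company's form with Playwright.
        return listing.link.startswith("https://")


# Maps a company key to its handler, mirroring the `handlers` dictionary.
handlers: dict[str, BaseHandler] = {"newcompany": NewCompanyHandler()}


def dispatch(listing: Listing) -> bool:
    """Look up the handler for the listing's company; False if unsupported."""
    handler = handlers.get(listing.company.lower())
    return handler.apply(listing) if handler else False
```

Keeping the lookup in one dictionary means an unsupported company fails in a single, predictable place instead of inside each handler.
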
When adding support for a new company:

1. Create a new handler file in `handlers/` (e.g., `newcompany_handler.py`).
2. Implement the handler by extending `BaseHandler` and overriding necessary methods.
3. Update `application_handler.py` to include the new handler in the `handlers` dictionary.

### Listing identification

Listings are hashed by `md5(key_fields)[:12]` to generate stable IDs:

- InBerlin: `md5(rooms+size+price+address)`
- WGcompany: `md5(link+price+size)`

### State management

- `state.json` - Runtime state (autopilot toggle)
- `listings.json` - Previously seen inberlinwohnen listings
- `wgcompany_listings.json` - Previously seen WGcompany listings
- `applications.json` - Application history with success/failure status, timestamps, and listing details
- `listing_times.csv` / `wgcompany_times.csv` - Time-series data for pattern analysis
- `monitor.log` - Centralized logs with rotation (RotatingFileHandler)

### Logging

All modules use centralized logging configured in `main.py`:

- `RotatingFileHandler` writes to `data/monitor.log` (max 5MB, 5 backups)
- `StreamHandler` outputs to console/Docker logs
- All handlers, notifiers, and utilities use `logging.getLogger(__name__)` for consistent logging

### Autocleaning

Debug material (screenshots, HTML files) older than 48 hours is automatically deleted by `autoclean_debug.py`, which runs every 48 hours in the main loop.

## Development

### Run locally

```bash
# Install dependencies (requires Playwright)
pip install -r requirements.txt
playwright install chromium

# Set env vars and run
export TELEGRAM_BOT_TOKEN=... TELEGRAM_CHAT_ID=...
python main.py
```

### Docker (production)

```bash
cp .env.example .env  # Configure credentials
docker compose up -d
docker compose logs -f
```

### Debugging

- Screenshots saved to `data/` on application failures (`*_nobtn_*.png`)
- HTML saved to `data/debug_page.html` (inberlin) and `data/wgcompany_debug.html`
- Full logs in `data/monitor.log` with rotation
- Debug files older than 48 hours are autocleaned

## Environment Variables

Required: `TELEGRAM_BOT_TOKEN`, `TELEGRAM_CHAT_ID`

InBerlin login: `INBERLIN_EMAIL`, `INBERLIN_PASSWORD`

Form data: `FORM_ANREDE`, `FORM_VORNAME`, `FORM_NACHNAME`, `FORM_EMAIL`, `FORM_PHONE`, `FORM_STRASSE`, `FORM_HAUSNUMMER`, `FORM_PLZ`, `FORM_ORT`, `FORM_PERSONS`, `FORM_CHILDREN`, `FORM_INCOME`

WGcompany: `WGCOMPANY_ENABLED`, `WGCOMPANY_MIN_SIZE`, `WGCOMPANY_MAX_SIZE`, `WGCOMPANY_MIN_PRICE`, `WGCOMPANY_MAX_PRICE`, `WGCOMPANY_BEZIRK`

## Telegram Commands

- `/autopilot on|off` - Enable or disable automatic applications
- `/status` - Show current status and statistics (autopilot state, application counts by company)
- `/plot` - Generate and send a weekly listing-patterns plot
- `/errorrate` - Generate and send an autopilot success vs failure plot
- `/retryfailed` - Retry all failed applications
- `/resetlistings` - Reset seen listings (marks all current as failed to avoid spam)
- `/help` - Show available commands and usage information

## Common Tasks

### Fix a broken company handler

Check `data/*_nobtn_*.png` screenshots and `data/debug_page.html` to see actual page structure. Update selectors in the corresponding handler file in `handlers/`.

### Add Telegram command

1. Add a case in `TelegramBot._handle_update()`.
2. Implement the corresponding `_handle_{command}_command()` method.

### Modify listing extraction

- InBerlin: Update regex patterns in `InBerlinMonitor.fetch_listings()`. Test against `data/debug_page.html`.
- WGcompany: Update parsing in `WGCompanyMonitor.fetch_listings()`. Test against `data/wgcompany_debug.html`.
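
When changing extraction, keep in mind that listing IDs are derived from the extracted fields (see Listing identification). A change in how a field is formatted produces a different ID, so a previously seen listing would likely be treated as new. The following is a minimal sketch of the ID scheme, assuming the key fields are concatenated as plain strings and encoded as UTF-8. The function name `listing_id`, the concatenation without a separator, and the sample values are illustrative, not taken from the project.

```python
import hashlib


def listing_id(*key_fields: str) -> str:
    """Return the first 12 hex characters of the MD5 of the joined fields."""
    joined = "".join(key_fields)  # assumed: plain concatenation, no separator
    return hashlib.md5(joined.encode("utf-8")).hexdigest()[:12]


# InBerlin-style key: rooms + size + price + address (sample values)
a = listing_id("2", "54.3", "612.00", "Musterstr. 1, 10115 Berlin")
# Same listing, but the size is formatted differently after a parser change
b = listing_id("2", "54,3", "612.00", "Musterstr. 1, 10115 Berlin")

print(a, b, a == b)
```

After editing a parser, comparing IDs for a few known listings before and after the change is a cheap way to spot this kind of drift.
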

### Merge data from another machine

Use the helper scripts in `helper_functions/`:

- `merge_listing_times.py` - Merge listing_times.csv files
- `merge_applications.py` - Merge applications.json files
- `merge_dict_json.py` - Merge listings.json and wgcompany_listings.json
- `merge_wgcompany_times.py` - Merge wgcompany_times.csv files

All scripts deduplicate by key and timestamp, and output merged results to the current data folder.

## Unit Tests

### Overview

The project includes unit tests to ensure functionality and reliability. Key test files:

- `tests/test_telegram_bot.py`: Tests the Telegram bot's commands and messaging functionality.
- `tests/test_error_rate_plot.py`: Tests the error rate plot generator for autopilot applications.

### Running Tests

To run the tests, use:

```bash
pytest tests/
```

Ensure all dependencies are installed and the environment is configured correctly before running the tests.
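
As a sketch of what an additional test could look like, the example below checks a 48-hour cleanup rule like the one described under Autocleaning. The function `delete_older_than` is a hypothetical stand-in, not the actual API of `autoclean_debug.py`, and for simplicity it considers every file in the directory, not only debug files. The test relies on pytest's built-in `tmp_path` fixture and backdates a file with `os.utime`; the file names are sample values.

```python
import os
import time
from pathlib import Path


def delete_older_than(directory: Path, max_age_hours: float = 48) -> list[str]:
    """Hypothetical helper: delete files whose mtime is older than the cutoff."""
    cutoff = time.time() - max_age_hours * 3600
    removed = []
    for path in directory.iterdir():
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)


def test_delete_older_than(tmp_path: Path) -> None:
    old = tmp_path / "howoge_nobtn_1.png"
    new = tmp_path / "debug_page.html"
    old.write_bytes(b"")
    new.write_text("<html></html>")

    # Backdate the screenshot to 49 hours ago; leave the HTML file fresh.
    stale = time.time() - 49 * 3600
    os.utime(old, (stale, stale))

    assert delete_older_than(tmp_path) == ["howoge_nobtn_1.png"]
    assert not old.exists()
    assert new.exists()
```

Setting the modification time explicitly keeps the test fast and deterministic, since it never has to wait for a file to age.
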