# Copilot Instructions for inberlin-monitor

## Project Overview

A Python-based apartment monitoring bot for Berlin's public housing portal (inberlinwohnen.de) and WG rooms (wgcompany.de). It monitors listings from 6 housing companies (HOWOGE, Gewobag, Degewo, Gesobau, Stadt und Land, WBM) plus WGcompany, and sends Telegram notifications with optional auto-application via Playwright browser automation.

## Architecture

**Single-file monolith** (`monitor.py`, ~1600 lines) with five main components:

- `InBerlinMonitor` - Core scraping/monitoring loop for inberlinwohnen.de, login handling, listing detection
- `WGCompanyMonitor` - Monitors wgcompany.de WG rooms with configurable search filters
- `ApplicationHandler` - Company-specific form automation (each `_apply_*` method handles one housing company)
- `TelegramBot` - Command handling via long-polling in a daemon thread
- Main loop - Runs synchronously, using `asyncio.get_event_loop().run_until_complete()` for Playwright calls

**Data flow**: Fetch listings → Compare with `listings.json` / `wgcompany_listings.json` → Detect new → Log to CSV → Auto-apply if autopilot enabled (inberlin only) → Save to `applications.json` → Send Telegram notification

## Key Patterns

### Company-specific handlers

Each housing company has a dedicated `_apply_{company}()` method in `ApplicationHandler`. When adding support for a new company:

1. Add detection in `_detect_company()` (line ~350)
2. Add a handler call in the `apply()` dispatch (line ~330)
3. Implement `_apply_newcompany()` following existing patterns (cookie dismiss → find button → fill form → submit → screenshot)

### Listing identification

Listings are hashed by `md5(key_fields)[:12]` to generate stable IDs:

- InBerlin: `md5(rooms+size+price+address)`
- WGcompany: `md5(link+price+size)`

### State management

- `state.json` - Runtime state (autopilot toggle)
- `listings.json` - Previously seen inberlinwohnen listings
- `wgcompany_listings.json` - Previously seen WGcompany listings
- `applications.json` - Application history with success/failure status
- `listing_times.csv` / `wgcompany_times.csv` - Time-series data for pattern analysis

## Development

### Run locally

```bash
# Install deps (requires Playwright)
pip install -r requirements.txt
playwright install chromium

# Set env vars and run
export TELEGRAM_BOT_TOKEN=... TELEGRAM_CHAT_ID=...
python monitor.py
```

### Docker (production)

```bash
cp .env.example .env  # Configure credentials
docker compose up -d
docker compose logs -f
```

### Debugging

- Screenshots are saved to `data/` on application failures (`*_nobtn_*.png`)
- HTML is saved to `data/debug_page.html` (inberlin) and `data/wgcompany_debug.html`
- Full logs are in `data/monitor.log`

## Environment Variables

- Required: `TELEGRAM_BOT_TOKEN`, `TELEGRAM_CHAT_ID`
- InBerlin login: `INBERLIN_EMAIL`, `INBERLIN_PASSWORD`
- Form data: `FORM_ANREDE`, `FORM_VORNAME`, `FORM_NACHNAME`, `FORM_EMAIL`, `FORM_PHONE`, `FORM_STRASSE`, `FORM_HAUSNUMMER`, `FORM_PLZ`, `FORM_ORT`, `FORM_PERSONS`, `FORM_CHILDREN`, `FORM_INCOME`
- WGcompany: `WGCOMPANY_ENABLED`, `WGCOMPANY_MIN_SIZE`, `WGCOMPANY_MAX_SIZE`, `WGCOMPANY_MIN_PRICE`, `WGCOMPANY_MAX_PRICE`, `WGCOMPANY_BEZIRK`

## Common Tasks

### Fix a broken company handler

Check the `data/*_nobtn_*.png` screenshots and `data/debug_page.html` to see the actual page structure. Update the selectors in the corresponding `_apply_{company}()` method.

### Add Telegram command

1. Add a case in `TelegramBot._handle_update()` (line ~95)
2. Implement the `_handle_{command}_command()` method

### Modify listing extraction

- InBerlin: Update the regex patterns in `InBerlinMonitor.fetch_listings()`. Test against `data/debug_page.html`.
- WGcompany: Update the parsing in `WGCompanyMonitor.fetch_listings()`. Test against `data/wgcompany_debug.html`.
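
Because listing IDs are derived from the extracted fields (see "Listing identification"), a change to extraction that alters how a field is formatted will likely change the ID, so previously seen listings may be reported as new. The sketch below illustrates this under stated assumptions and is not the code in `monitor.py`: the function names `listing_id`, `load_seen`, and `new_listings` are hypothetical, the key fields are assumed to be plain strings concatenated without a separator and encoded as UTF-8, and the seen-listings file is assumed to be a JSON object keyed by listing ID.

```python
from __future__ import annotations

import hashlib
import json
from pathlib import Path


def listing_id(*key_fields: str) -> str:
    """First 12 hex characters of the MD5 digest of the concatenated fields.

    Assumption: fields are joined without a separator and encoded as UTF-8.
    """
    joined = "".join(key_fields)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()[:12]


def load_seen(path: Path) -> dict[str, dict]:
    """Load previously seen listings; an absent file means nothing was seen.

    Assumption: the file holds a JSON object keyed by listing ID.
    """
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def new_listings(current: dict[str, dict], seen: dict[str, dict]) -> dict[str, dict]:
    """Return the entries of `current` whose IDs do not appear in `seen`."""
    return {lid: item for lid, item in current.items() if lid not in seen}


if __name__ == "__main__":
    # Same listing, but the size is formatted with a dot vs. a comma.
    before = listing_id("2", "54.3", "612.00", "Musterstraße 1")
    after = listing_id("2", "54,3", "612.00", "Musterstraße 1")
    print("IDs match after format change:", before == after)
```

If extraction changes are unavoidable, normalizing the fields before hashing (for example, one decimal separator and trimmed whitespace) is one way to keep IDs stable. Whether that fits depends on how `fetch_listings()` currently builds its key fields.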