Copilot Instructions for wohn-bot
Project Overview
A Python-based apartment monitoring bot for Berlin's public housing portal (inberlinwohnen.de) and WG rooms (wgcompany.de). Monitors listings from 6 housing companies (HOWOGE, Gewobag, Degewo, Gesobau, Stadt und Land, WBM) plus WGcompany, and sends Telegram notifications with optional auto-application via Playwright browser automation.
Architecture
Modularized structure with the following key components:
- main.py: Entry point for the bot. Runs the monitoring loop and autocleaning every 48 hours.
- handlers/: Contains company-specific handlers for auto-apply functionality. Each handler automates the application process for a specific housing company. Includes:
  - howoge_handler.py
  - gewobag_handler.py
  - degewo_handler.py
  - gesobau_handler.py
  - stadtundland_handler.py
  - wbm_handler.py
  - wgcompany_notifier.py: Handles WGcompany listing fetching, deduplication, and notification.
  - base_handler.py: Provides shared functionality for all handlers.
- application_handler.py: Delegates application tasks to the appropriate handler based on the company. Enforces a valid browser context.
- telegram_bot.py: Fully async Telegram bot handler for commands and notifications. Uses httpx for messaging.
- autoclean_debug.py: Deletes debug files (screenshots, HTML) older than 48 hours.
- helper_functions/: Contains data merge utilities for combining stats from multiple sources:
  - merge_listing_times.py
  - merge_applications.py
  - merge_dict_json.py
  - merge_wgcompany_times.py
Data flow: Fetch listings → Compare with listings.json / wgcompany_listings.json → Detect new → Log to CSV → Auto-apply if autopilot enabled → Save to applications.json → Send Telegram notification → Autoclean debug files every 48 hours.
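The "compare and detect new" step of this flow can be sketched as follows. This is a minimal illustration, not the project's code: it assumes listings.json holds a JSON object keyed by listing ID, and the names load_seen, detect_new, and save_seen are hypothetical.

```python
import json
from pathlib import Path


def load_seen(path: Path) -> dict:
    """Return previously seen listings, or an empty dict if the file is missing."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def detect_new(fetched: dict, seen: dict) -> dict:
    """Return only the fetched listings whose ID has not been seen before."""
    return {lid: data for lid, data in fetched.items() if lid not in seen}


def save_seen(path: Path, seen: dict) -> None:
    """Persist the seen listings so the next run can compare against them."""
    path.write_text(json.dumps(seen, indent=2, ensure_ascii=False), encoding="utf-8")
```

Only the listings returned by detect_new would then continue to CSV logging, auto-apply, and notification.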
Key Patterns
Company-specific handlers
Each housing company has a dedicated handler in the handlers/ directory. When adding support for a new company:
- Create a new handler file in handlers/ (e.g., newcompany_handler.py).
- Implement the handler by extending BaseHandler and overriding necessary methods.
- Update application_handler.py to include the new handler in the handlers dictionary.
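The steps above might look roughly like this. The sketch is self-contained, so BaseHandler here is only a stand-in for the real class in handlers/base_handler.py; the apply() method, its signature, and the get_handler() helper are assumptions, and the real handlers are likely async and drive a Playwright page.

```python
from abc import ABC, abstractmethod


class BaseHandler(ABC):
    """Stand-in for handlers/base_handler.py; the real interface may differ."""

    def __init__(self, browser_context):
        if browser_context is None:
            raise ValueError("a valid browser context is required")
        self.context = browser_context

    @abstractmethod
    def apply(self, listing: dict) -> bool:
        """Return True if the application was submitted."""


class NewCompanyHandler(BaseHandler):
    """Hypothetical handler that would live in handlers/newcompany_handler.py."""

    def apply(self, listing: dict) -> bool:
        # Real code would open listing["link"] and fill in the form here.
        return bool(listing.get("link"))


# Hypothetical registry, mirroring the handlers dictionary in application_handler.py.
handlers = {"newcompany": NewCompanyHandler}


def get_handler(company: str, browser_context) -> BaseHandler:
    """Look up the handler class for a company and bind it to a browser context."""
    try:
        return handlers[company.lower()](browser_context)
    except KeyError:
        raise ValueError(f"no handler registered for {company!r}") from None
```

Keeping the registry as a plain dictionary means adding a company touches one new file and one dictionary entry.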
Listing identification
Listings are hashed by md5(key_fields)[:12] to generate stable IDs:
- InBerlin: md5(rooms+size+price+address)
- WGcompany: md5(link+price+size)
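A minimal sketch of this ID scheme, assuming the key fields are concatenated as plain strings and encoded as UTF-8 before hashing. The function names are hypothetical, and the real code may normalize fields (whitespace, units) differently.

```python
import hashlib


def listing_id(*key_fields) -> str:
    """Hash the concatenated key fields and keep the first 12 hex characters."""
    joined = "".join(str(field) for field in key_fields)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()[:12]


def inberlin_id(rooms, size, price, address) -> str:
    return listing_id(rooms, size, price, address)


def wgcompany_id(link, price, size) -> str:
    return listing_id(link, price, size)
```

Because the ID depends only on the listing's own fields, the same listing gets the same ID across runs, while a change in price or size produces a new ID and is treated as a new listing.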
State management
- state.json - Runtime state (autopilot toggle)
- listings.json - Previously seen inberlinwohnen listings
- wgcompany_listings.json - Previously seen WGcompany listings
- applications.json - Application history with success/failure status, timestamps, and listing details
- listing_times.csv / wgcompany_times.csv - Time-series data for pattern analysis
- monitor.log - Centralized logs with rotation (RotatingFileHandler)
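As an illustration of how the autopilot toggle could be persisted, here is a hedged sketch. The "autopilot" key name, its default of False, and the helper names are assumptions; the temp-file-then-rename write is a suggested safeguard, not something the project is known to do.

```python
import json
import os
from pathlib import Path

DEFAULT_STATE = {"autopilot": False}  # assumed key name and default


def load_state(path: Path) -> dict:
    """Read state.json, falling back to defaults if it is missing or corrupt."""
    try:
        return {**DEFAULT_STATE, **json.loads(path.read_text(encoding="utf-8"))}
    except (FileNotFoundError, json.JSONDecodeError):
        return dict(DEFAULT_STATE)


def save_state(path: Path, state: dict) -> None:
    """Write to a temp file first so a crash cannot leave a half-written file."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    os.replace(tmp, path)


def set_autopilot(path: Path, enabled: bool) -> dict:
    """Flip the autopilot flag and persist the result."""
    state = load_state(path)
    state["autopilot"] = enabled
    save_state(path, state)
    return state
```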
Logging
All modules use centralized logging configured in main.py:
- RotatingFileHandler writes to data/monitor.log (max 5MB, 5 backups)
- StreamHandler outputs to console/Docker logs
- All handlers, notifiers, and utilities use logging.getLogger(__name__) for consistent logging
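A setup along these lines would match the description above. The function name configure_logging, the log format, and the INFO level are assumptions; only the handler types, the file path, and the 5MB / 5 backups limits come from this document.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path


def configure_logging(log_path="data/monitor.log", level=logging.INFO) -> logging.Logger:
    """Attach a rotating file handler and a console handler to the root logger."""
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)

    formatter = logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")

    # Rotate at 5 MB and keep 5 old files (monitor.log.1 ... monitor.log.5).
    file_handler = RotatingFileHandler(
        path, maxBytes=5 * 1024 * 1024, backupCount=5, encoding="utf-8"
    )
    file_handler.setFormatter(formatter)

    # StreamHandler writes to stderr by default, which Docker captures.
    console_handler = logging.StreamHandler()
    console_handler.setFormatter(formatter)

    root = logging.getLogger()
    root.setLevel(level)
    root.addHandler(file_handler)
    root.addHandler(console_handler)
    return root
```

Modules then only need `logger = logging.getLogger(__name__)`; their records propagate to the root logger and reach both handlers.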
Autocleaning
Debug material (screenshots, HTML files) older than 48 hours is automatically deleted by autoclean_debug.py, which runs every 48 hours in the main loop.
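The cleanup rule can be sketched as below. This is not the contents of autoclean_debug.py: the autoclean() name, the glob patterns, and the use of file modification time as the age are assumptions.

```python
import time
from pathlib import Path

MAX_AGE_SECONDS = 48 * 60 * 60
DEBUG_PATTERNS = ("*.png", "*.html")  # assumed: screenshots and saved HTML pages


def autoclean(data_dir, max_age=MAX_AGE_SECONDS, now=None) -> list:
    """Delete debug files older than max_age seconds; return the removed paths."""
    now = time.time() if now is None else now
    removed = []
    for pattern in DEBUG_PATTERNS:
        for path in Path(data_dir).glob(pattern):
            # Age is measured from the last modification time.
            if path.is_file() and now - path.stat().st_mtime > max_age:
                path.unlink()
                removed.append(path)
    return removed
```

Restricting deletion to explicit patterns keeps state files such as listings.json and monitor.log out of reach of the cleaner.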
Development
Run locally
# Install dependencies (requires Playwright)
pip install -r requirements.txt
playwright install chromium
# Set env vars and run
export TELEGRAM_BOT_TOKEN=... TELEGRAM_CHAT_ID=...
python main.py
Docker (production)
cp .env.example .env # Configure credentials
docker compose up -d
docker compose logs -f
Debugging
- Screenshots saved to data/ on application failures (*_nobtn_*.png)
- HTML saved to data/debug_page.html (inberlin) and data/wgcompany_debug.html
- Full logs in data/monitor.log with rotation
- Debug files older than 48 hours are autocleaned
Environment Variables
Required: TELEGRAM_BOT_TOKEN, TELEGRAM_CHAT_ID
InBerlin login: INBERLIN_EMAIL, INBERLIN_PASSWORD
Form data: FORM_ANREDE, FORM_VORNAME, FORM_NACHNAME, FORM_EMAIL, FORM_PHONE, FORM_STRASSE, FORM_HAUSNUMMER, FORM_PLZ, FORM_ORT, FORM_PERSONS, FORM_CHILDREN, FORM_INCOME
WGcompany: WGCOMPANY_ENABLED, WGCOMPANY_MIN_SIZE, WGCOMPANY_MAX_SIZE, WGCOMPANY_MIN_PRICE, WGCOMPANY_MAX_PRICE, WGCOMPANY_BEZIRK
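One way to read these variables with an early failure on missing required values is sketched here. The load_config() helper, the returned key names, and the value formats (booleans as "true"/"1"/"yes", prices as integers) are assumptions, not documented behavior.

```python
import os

REQUIRED = ("TELEGRAM_BOT_TOKEN", "TELEGRAM_CHAT_ID")


def _optional_int(env, name):
    """Return the variable as an int, or None if it is unset or empty."""
    value = env.get(name, "").strip()
    return int(value) if value else None


def load_config(env=None) -> dict:
    """Collect settings from the environment and fail fast on missing ones."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing required environment variables: " + ", ".join(missing))
    enabled = env.get("WGCOMPANY_ENABLED", "false").strip().lower()
    return {
        "telegram_bot_token": env["TELEGRAM_BOT_TOKEN"],
        "telegram_chat_id": env["TELEGRAM_CHAT_ID"],
        "wgcompany_enabled": enabled in ("1", "true", "yes"),
        "wgcompany_min_price": _optional_int(env, "WGCOMPANY_MIN_PRICE"),
        "wgcompany_max_price": _optional_int(env, "WGCOMPANY_MAX_PRICE"),
    }
```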
Telegram Commands
- /autopilot on|off - Enable or disable automatic applications
- /status - Show current status and statistics (autopilot state, application counts by company)
- /plot - Generate and send a weekly listing-patterns plot
- /errorrate - Generate and send an autopilot success vs failure plot
- /retryfailed - Retry all failed applications
- /resetlistings - Reset seen listings (marks all current as failed to avoid spam)
- /help - Show available commands and usage information
Common Tasks
Fix a broken company handler
Check data/*_nobtn_*.png screenshots and data/debug_page.html to see actual page structure. Update selectors in the corresponding handler file in handlers/.
Add Telegram command
- Add a case in TelegramBot._handle_update().
- Implement the corresponding _handle_{command}_command() method.
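The two steps might look like this in outline. The class below is a simplified stand-in for the real TelegramBot: it records replies in a list instead of sending them via httpx, and the /ping command, _send_message(), and the handler signatures are hypothetical.

```python
import asyncio


class TelegramBot:
    """Simplified stand-in; the real class in telegram_bot.py may differ."""

    def __init__(self):
        self.sent = []  # replies that real code would send via httpx

    async def _send_message(self, text: str) -> None:
        self.sent.append(text)

    async def _handle_update(self, update: dict) -> None:
        text = update.get("message", {}).get("text", "").strip()
        command, _, args = text.partition(" ")
        if command == "/help":
            await self._handle_help_command()
        elif command == "/ping":  # step 1: new case for the hypothetical command
            await self._handle_ping_command(args)
        else:
            await self._send_message("Unknown command. Try /help.")

    async def _handle_help_command(self) -> None:
        await self._send_message("Available commands: /help, /ping")

    # Step 2: the matching _handle_{command}_command() method.
    async def _handle_ping_command(self, args: str) -> None:
        await self._send_message(f"pong {args}".strip())
```

A quick check without Telegram: `bot = TelegramBot(); asyncio.run(bot._handle_update({"message": {"text": "/ping"}}))`, then inspect `bot.sent`.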
Modify listing extraction
- InBerlin: Update regex patterns in InBerlinMonitor.fetch_listings(). Test against data/debug_page.html.
- WGcompany: Update parsing in WGCompanyMonitor.fetch_listings(). Test against data/wgcompany_debug.html.
Merge data from another machine
Use the helper scripts in helper_functions/:
- merge_listing_times.py - Merge listing_times.csv files
- merge_applications.py - Merge applications.json files
- merge_dict_json.py - Merge listings.json and wgcompany_listings.json
- merge_wgcompany_times.py - Merge wgcompany_times.csv files
All scripts deduplicate by key and timestamp, and output merged results to the current data folder.
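Deduplication by key and timestamp could work roughly as follows for row-based data. The merge_rows() name and the "id" and "timestamp" column names are assumptions; the final sort additionally assumes timestamps are ISO 8601 strings, which order correctly as text.

```python
def merge_rows(local, remote, key_fields=("id", "timestamp")):
    """Combine two row lists, keeping the first copy of each (key, timestamp) pair."""
    seen = set()
    merged = []
    for row in list(local) + list(remote):
        key = tuple(row[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            merged.append(row)
    # Keep the merged series in chronological order.
    merged.sort(key=lambda row: row["timestamp"])
    return merged
```

Rows from the local machine win on conflict because they are visited first.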
Unit Tests
Overview
The project includes unit tests to ensure functionality and reliability. Key test files:
- tests/test_telegram_bot.py: Tests the Telegram bot's commands and messaging functionality.
- tests/test_error_rate_plot.py: Tests the error rate plot generator for autopilot applications.
Running Tests
To run the tests, use:
pytest tests/
Ensure all dependencies are installed and the environment is configured correctly before running the tests.