wohnbot/.github/copilot-instructions.md
2026-01-01 15:27:25 +01:00

Copilot Instructions for wohn-bot

Project Overview

A Python-based apartment monitoring bot for Berlin's public housing portal (inberlinwohnen.de) and WG rooms (wgcompany.de). Monitors listings from 6 housing companies (HOWOGE, Gewobag, Degewo, Gesobau, Stadt und Land, WBM) plus WGcompany, and sends Telegram notifications with optional auto-application via Playwright browser automation.

Architecture

Modularized structure with the following key components:

  • main.py: Entry point for the bot. Runs the monitoring loop and triggers autocleaning every 48 hours.
  • handlers/: Contains company-specific handlers for auto-apply functionality. Each handler is responsible for automating the application process for a specific housing company. Includes:
    • howoge_handler.py
    • gewobag_handler.py
    • degewo_handler.py
    • gesobau_handler.py
    • stadtundland_handler.py
    • wbm_handler.py
    • wgcompany_notifier.py: Fetches WGcompany listings, deduplicates them, and sends notifications. It is a notifier, not an auto-apply handler.
    • base_handler.py: Provides shared functionality for all handlers.
  • application_handler.py: Delegates each application to the matching company handler and enforces a valid browser context.
  • telegram_bot.py: Fully async Telegram bot handler for commands and notifications. Uses httpx for messaging.
  • autoclean_debug.py: Deletes debug files (screenshots, HTML) older than 48 hours.
  • helper_functions/: Contains data merge utilities for combining stats from multiple sources:
    • merge_listing_times.py
    • merge_applications.py
    • merge_dict_json.py
    • merge_wgcompany_times.py

Data flow: Fetch listings → Compare with listings.json / wgcompany_listings.json → Detect new → Log to CSV → Auto-apply if autopilot enabled → Save to applications.json → Send Telegram notification → Autoclean debug files every 48 hours.
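
A minimal sketch of the compare-and-detect step in the data flow, assuming listings.json holds a JSON object keyed by listing ID. The helper names (load_seen, detect_new, save_seen) are illustrative and not taken from the codebase.

```python
import json
from pathlib import Path


def load_seen(path: Path) -> dict:
    """Load previously seen listings; a missing file means nothing was seen yet."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def detect_new(current: dict, seen: dict) -> dict:
    """Return only the listings whose IDs are not in the seen set."""
    return {lid: listing for lid, listing in current.items() if lid not in seen}


def save_seen(path: Path, listings: dict) -> None:
    """Persist the current listings so the next cycle compares against them."""
    path.write_text(json.dumps(listings, ensure_ascii=False, indent=2), encoding="utf-8")
```

Only the entries returned by detect_new would then move on to CSV logging, auto-apply, and notification.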

Key Patterns

Company-specific handlers

Each housing company has a dedicated handler in the handlers/ directory. When adding support for a new company:

  1. Create a new handler file in handlers/ (e.g., newcompany_handler.py).
  2. Implement the handler by extending BaseHandler and overriding necessary methods.
  3. Update application_handler.py to include the new handler in the handlers dictionary.
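
The three steps above might look roughly like the sketch below. The BaseHandler interface shown here (a constructor taking a browser context and an async apply() method) is an assumption for illustration, as are NewCompanyHandler, build_handlers, and the result dictionary's keys; check handlers/base_handler.py for the real method names.

```python
import asyncio
from abc import ABC, abstractmethod


class BaseHandler(ABC):
    """Stand-in for handlers/base_handler.py; the real interface may differ."""

    def __init__(self, browser_context):
        self.context = browser_context

    @abstractmethod
    async def apply(self, listing: dict) -> dict:
        """Submit an application for one listing and report the outcome."""


class NewCompanyHandler(BaseHandler):
    """Hypothetical handler that would live in handlers/newcompany_handler.py."""

    async def apply(self, listing: dict) -> dict:
        # Real code would open a page from self.context and fill in the form.
        return {"success": False, "message": "not implemented", "listing_id": listing.get("id")}


def build_handlers(browser_context) -> dict:
    """Hypothetical registry, mirroring the handlers dictionary in application_handler.py."""
    if browser_context is None:
        raise ValueError("a valid browser context is required")
    return {"newcompany": NewCompanyHandler(browser_context)}


if __name__ == "__main__":
    handlers = build_handlers(browser_context=object())  # placeholder context
    print(asyncio.run(handlers["newcompany"].apply({"id": "abc123"})))
```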

Listing identification

Each listing gets a stable ID: the first 12 characters of an MD5 hash over its key fields, md5(key_fields)[:12]. The key fields differ per source:

  • InBerlin: md5(rooms+size+price+address)
  • WGcompany: md5(link+price+size)
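
As a sketch of this ID scheme, assuming the fields are concatenated as plain strings with no separator and that the hex digest is what gets truncated (the actual code may normalise or join the fields differently). The function name listing_id and the sample values are hypothetical.

```python
import hashlib


def listing_id(*key_fields) -> str:
    """Return a 12-character ID derived from a listing's key fields."""
    # Assumption: fields are joined without a separator before hashing.
    raw = "".join(str(field) for field in key_fields)
    return hashlib.md5(raw.encode("utf-8")).hexdigest()[:12]


# InBerlin: rooms + size + price + address (sample values)
inberlin_id = listing_id("2", "54.3", "612.00", "Musterstraße 1, 10115 Berlin")

# WGcompany: link + price + size (sample values)
wg_id = listing_id("https://example.invalid/wg/123", "450", "18")
```

The same fields always produce the same ID, so a listing that reappears unchanged is not reported twice, while any change to a key field (for example the price) yields a new ID.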

State management

  • state.json - Runtime state (autopilot toggle)
  • listings.json - Previously seen inberlinwohnen listings
  • wgcompany_listings.json - Previously seen WGcompany listings
  • applications.json - Application history with success/failure status, timestamps, and listing details
  • listing_times.csv / wgcompany_times.csv - Time-series data for pattern analysis
  • monitor.log - Centralized logs with rotation (RotatingFileHandler)

Logging

All modules use centralized logging configured in main.py:

  • RotatingFileHandler writes to data/monitor.log (max 5MB, 5 backups)
  • StreamHandler outputs to console/Docker logs
  • All handlers, notifiers, and utilities use logging.getLogger(__name__) for consistent logging
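
A minimal sketch of such a setup, using only the standard library. The log format, the INFO level, reading "5MB" as 5 * 1024 * 1024 bytes, and the function names are assumptions; delay=True is a choice made for this sketch so the log file is only opened on the first write.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path

LOG_FORMAT = "%(asctime)s [%(levelname)s] %(name)s: %(message)s"  # assumed format


def build_log_handlers(log_path: str = "data/monitor.log") -> list:
    """Create the rotating file handler and the console handler."""
    formatter = logging.Formatter(LOG_FORMAT)
    file_handler = RotatingFileHandler(
        log_path,
        maxBytes=5 * 1024 * 1024,  # rotate once the file reaches about 5 MB
        backupCount=5,             # keep monitor.log.1 through monitor.log.5
        encoding="utf-8",
        delay=True,                # do not open the file until the first record
    )
    console_handler = logging.StreamHandler()  # stderr, visible in Docker logs
    for handler in (file_handler, console_handler):
        handler.setFormatter(formatter)
    return [file_handler, console_handler]


def configure_logging(log_path: str = "data/monitor.log") -> None:
    """Attach both handlers to the root logger; call once at startup."""
    Path(log_path).parent.mkdir(parents=True, exist_ok=True)
    root = logging.getLogger()
    root.setLevel(logging.INFO)  # assumed level
    for handler in build_log_handlers(log_path):
        root.addHandler(handler)


# Every other module only needs a module-level logger:
logger = logging.getLogger(__name__)
```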

Autocleaning

Debug material (screenshots, HTML files) older than 48 hours is automatically deleted by autoclean_debug.py, which runs every 48 hours in the main loop.
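
The age check could be sketched as follows. The glob patterns, the use of file modification time, and the function names are assumptions, not a copy of autoclean_debug.py.

```python
import time
from pathlib import Path
from typing import List, Optional

MAX_AGE_SECONDS = 48 * 60 * 60          # 48 hours
DEBUG_PATTERNS = ("*.png", "*.html")    # assumed patterns for screenshots and HTML dumps


def is_expired(mtime: float, now: float) -> bool:
    """True when a file last modified at mtime is older than 48 hours."""
    return now - mtime > MAX_AGE_SECONDS


def autoclean(data_dir: Path, now: Optional[float] = None) -> List[Path]:
    """Delete expired debug files in data_dir and return the deleted paths."""
    if not data_dir.is_dir():
        return []
    now = time.time() if now is None else now
    deleted = []
    for pattern in DEBUG_PATTERNS:
        for path in data_dir.glob(pattern):
            if path.is_file() and is_expired(path.stat().st_mtime, now):
                path.unlink()
                deleted.append(path)
    return deleted
```

Because the cleaner itself runs only every 48 hours, a file can survive for up to roughly 96 hours before it is removed.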

Development

Run locally

# Install dependencies (requires Playwright)
pip install -r requirements.txt
playwright install chromium

# Set env vars and run
export TELEGRAM_BOT_TOKEN=... TELEGRAM_CHAT_ID=...
python main.py

Docker (production)

cp .env.example .env  # Configure credentials
docker compose up -d
docker compose logs -f

Debugging

  • Screenshots saved to data/ on application failures (*_nobtn_*.png)
  • HTML saved to data/debug_page.html (inberlin) and data/wgcompany_debug.html
  • Full logs in data/monitor.log with rotation
  • Debug files older than 48 hours are autocleaned

Environment Variables

  • Required: TELEGRAM_BOT_TOKEN, TELEGRAM_CHAT_ID
  • InBerlin login: INBERLIN_EMAIL, INBERLIN_PASSWORD
  • Form data: FORM_ANREDE, FORM_VORNAME, FORM_NACHNAME, FORM_EMAIL, FORM_PHONE, FORM_STRASSE, FORM_HAUSNUMMER, FORM_PLZ, FORM_ORT, FORM_PERSONS, FORM_CHILDREN, FORM_INCOME
  • WGcompany: WGCOMPANY_ENABLED, WGCOMPANY_MIN_SIZE, WGCOMPANY_MAX_SIZE, WGCOMPANY_MIN_PRICE, WGCOMPANY_MAX_PRICE, WGCOMPANY_BEZIRK
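
One way to read these variables at startup and fail early when a required one is missing. This is a sketch: the function name load_config, the returned keys, and the accepted truthy values for WGCOMPANY_ENABLED are assumptions.

```python
import os
from typing import Mapping, Optional

REQUIRED_VARS = ("TELEGRAM_BOT_TOKEN", "TELEGRAM_CHAT_ID")


def load_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read settings from the environment, raising if a required one is absent."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    return {
        "telegram_bot_token": env["TELEGRAM_BOT_TOKEN"],
        "telegram_chat_id": env["TELEGRAM_CHAT_ID"],
        # Assumption: "1", "true" or "yes" enables WGcompany; unset means disabled.
        "wgcompany_enabled": env.get("WGCOMPANY_ENABLED", "").strip().lower() in {"1", "true", "yes"},
        # Optional login credentials; None when unset.
        "inberlin_email": env.get("INBERLIN_EMAIL"),
        "inberlin_password": env.get("INBERLIN_PASSWORD"),
    }
```

Passing a plain dictionary instead of os.environ keeps the function easy to exercise in tests.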

Telegram Commands

  • /autopilot on|off - Enable or disable automatic applications
  • /status - Show current status and statistics (autopilot state, application counts by company)
  • /plot - Generate and send a weekly listing-patterns plot
  • /errorrate - Generate and send an autopilot success vs failure plot
  • /retryfailed - Retry all failed applications
  • /resetlistings - Reset seen listings (marks all current as failed to avoid spam)
  • /help - Show available commands and usage information

Common Tasks

Fix a broken company handler

Check the data/*_nobtn_*.png screenshots and data/debug_page.html to see the page structure the bot actually encountered, then update the selectors in the corresponding handler file in handlers/.

Add a Telegram command

  1. Add a case in TelegramBot._handle_update().
  2. Implement the corresponding _handle_{command}_command() method.
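
A reduced sketch of those two steps, using a hypothetical /ping command. This stand-in class only mimics the dispatch shape: the update structure follows the Telegram Bot API's message.text field, but _send_message and the in-memory sent list are placeholders for the real httpx-based sending code.

```python
import asyncio


class TelegramBot:
    """Reduced stand-in for the bot class in telegram_bot.py."""

    def __init__(self):
        self.sent = []  # placeholder for real message delivery

    async def _send_message(self, text: str) -> None:
        self.sent.append(text)

    async def _handle_update(self, update: dict) -> None:
        text = update.get("message", {}).get("text", "")
        # Step 1: add a case for the new command.
        if text.startswith("/ping"):
            await self._handle_ping_command()
        elif text.startswith("/help"):
            await self._handle_help_command()

    # Step 2: implement the matching _handle_{command}_command() method.
    async def _handle_ping_command(self) -> None:
        await self._send_message("pong")

    async def _handle_help_command(self) -> None:
        await self._send_message("Available commands: /ping, /help")


if __name__ == "__main__":
    bot = TelegramBot()
    asyncio.run(bot._handle_update({"message": {"text": "/ping"}}))
    print(bot.sent)
```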

Modify listing extraction

  • InBerlin: Update regex patterns in InBerlinMonitor.fetch_listings(). Test against data/debug_page.html.
  • WGcompany: Update parsing in WGCompanyMonitor.fetch_listings(). Test against data/wgcompany_debug.html.

Merge data from another machine

Use the helper scripts in helper_functions/:

  • merge_listing_times.py - Merge listing_times.csv files
  • merge_applications.py - Merge applications.json files
  • merge_dict_json.py - Merge listings.json and wgcompany_listings.json
  • merge_wgcompany_times.py - Merge wgcompany_times.csv files

All scripts deduplicate by key and timestamp, and output merged results to the current data folder.
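
The deduplication idea can be sketched for CSV-style rows as below. The column names listing_id and timestamp, the function name merge_rows, and the keep-first policy for duplicates are assumptions; the actual scripts may use different keys.

```python
def merge_rows(*sources, key_fields=("listing_id", "timestamp")):
    """Merge row dictionaries from several sources, dropping duplicates.

    Two rows count as duplicates when all key_fields match; the first one
    encountered is kept. Rows are returned sorted by timestamp, which works
    for ISO 8601 strings because they sort chronologically as text.
    """
    merged = {}
    for rows in sources:
        for row in rows:
            key = tuple(row[field] for field in key_fields)
            merged.setdefault(key, row)
    return sorted(merged.values(), key=lambda row: row["timestamp"])


local = [
    {"listing_id": "a1", "timestamp": "2026-01-01T10:00:00"},
    {"listing_id": "b2", "timestamp": "2026-01-01T12:00:00"},
]
remote = [
    {"listing_id": "a1", "timestamp": "2026-01-01T10:00:00"},  # duplicate of a local row
    {"listing_id": "c3", "timestamp": "2026-01-01T11:00:00"},
]
combined = merge_rows(local, remote)
```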

Unit Tests

Overview

The project includes unit tests to ensure functionality and reliability. Key test files:

  • tests/test_telegram_bot.py: Tests the Telegram bot's commands and messaging functionality.
  • tests/test_error_rate_plot.py: Tests the error rate plot generator for autopilot applications.

Running Tests

To run the tests, use:

pytest tests/

Ensure all dependencies are installed and the environment is configured correctly before running the tests.