wohn-bot

A Python bot that monitors Berlin's public housing portal (inberlinwohnen.de) and WG room listings on wgcompany.de. It sends Telegram notifications when new listings appear and can apply automatically to listings from the supported housing companies.

What it does

  • Monitors inberlinwohnen.de for new apartment listings from 6 housing companies (HOWOGE, Gewobag, Degewo, Gesobau, Stadt und Land, WBM)
  • Monitors wgcompany.de for WG room listings with configurable filters
  • Notifies via Telegram with rich listing details and application status
  • Logs listing times to CSV for pattern analysis and visualization
  • Auto-applies to new listings when autopilot is enabled (all 6 companies supported)
  • Generates weekly listing pattern plots and autopilot performance analytics
  • Autocleans debug files older than 48 hours to manage disk space
  • Tracks application history with success/failure reasons in JSON

Auto-Apply Support

All six housing companies monitored by this bot support the autopilot (automatic application) feature. Use autopilot with care: automatic form submission is irreversible and may send many requests if the bot is misconfigured.

Company         Status           Notes
HOWOGE          Working          Fully automated and tested
Degewo          Working          Uses Wohnungshelden portal; automated
Stadt und Land  Working          Embedded form handled automatically
Gewobag         Working          Wohnungshelden iframe handled automatically
Gesobau         Working          Automated form submission implemented
WBM             Working          Automated form submission implemented
WGcompany       Monitoring only  WGcompany monitoring only (no autopilot)

Recommended precautions:

  • Run with /autopilot off while testing new selectors or after changing config.
  • Inspect data/applications.json and saved screenshots in data/ after enabling autopilot.
  • Respect site terms of use and rate limits; set CHECK_INTERVAL appropriately.

Setup

cp .env.example .env
# Edit .env with your credentials
docker compose up -d

Local development

pip install -r requirements.txt
playwright install chromium

export TELEGRAM_BOT_TOKEN=your_token
export TELEGRAM_CHAT_ID=your_chat_id
# ... other env vars (see .env.example)

python main.py

Helper Scripts

The helper_functions/ directory contains utilities for merging data from multiple machines:

  • merge_listing_times.py - Merge listing_times.csv files
  • merge_applications.py - Merge applications.json files
  • merge_dict_json.py - Merge listings.json and wgcompany_listings.json
  • merge_wgcompany_times.py - Merge wgcompany_times.csv files

All scripts deduplicate by key and timestamp.
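
As a rough illustration of what "deduplicate by key and timestamp" can look like for the CSV merges, the sketch below combines rows from several files and keeps one row per (key, timestamp) pair. The column names `id` and `timestamp` and the function `merge_rows` are assumptions made for this example; the actual scripts may use different fields and logic.

```python
import csv
from pathlib import Path


def merge_rows(paths, key_field="id", time_field="timestamp"):
    """Merge CSV rows from several files, keeping one row per (key, timestamp).

    The field names are hypothetical; adjust them to the real CSV header.
    """
    seen = set()
    merged = []
    for path in paths:
        with Path(path).open(newline="", encoding="utf-8") as fh:
            for row in csv.DictReader(fh):
                marker = (row[key_field], row[time_field])
                if marker in seen:
                    continue  # same key at the same time: duplicate row
                seen.add(marker)
                merged.append(row)
    # Sort chronologically; ISO 8601 timestamps sort correctly as strings.
    merged.sort(key=lambda r: r[time_field])
    return merged
```

Sorting by timestamp at the end keeps the merged file usable for time-series analysis regardless of the order in which the input files are given.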

Configuration

Required environment variables

  • TELEGRAM_BOT_TOKEN - Bot token from @BotFather
  • TELEGRAM_CHAT_ID - Your Telegram chat ID

InBerlin login (required for auto-apply)

  • INBERLIN_EMAIL - Your inberlinwohnen.de email
  • INBERLIN_PASSWORD - Your inberlinwohnen.de password

Form data (for auto-apply)

  • FORM_ANREDE - Salutation (Herr/Frau)
  • FORM_VORNAME - First name
  • FORM_NACHNAME - Last name
  • FORM_EMAIL - Email address
  • FORM_PHONE - Phone number
  • FORM_STRASSE - Street name
  • FORM_HAUSNUMMER - House number
  • FORM_PLZ - Postal code
  • FORM_ORT - City
  • FORM_PERSONS - Number of persons in household
  • FORM_ADULTS - Number of adults (for GEWOBAG forms, defaults to 1)
  • FORM_CHILDREN - Number of children (defaults to 0)
  • FORM_INCOME - Monthly net income

WGcompany filters

  • WGCOMPANY_ENABLED - Enable WGcompany monitoring (true/false)
  • WGCOMPANY_MIN_SIZE - Minimum room size in sqm
  • WGCOMPANY_MAX_SIZE - Maximum room size in sqm
  • WGCOMPANY_MIN_PRICE - Minimum price in EUR
  • WGCOMPANY_MAX_PRICE - Maximum price in EUR
  • WGCOMPANY_BEZIRK - District filter (optional)
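
The variables above could be read at startup roughly as follows. This is a minimal sketch, not the bot's actual loader: the `Settings` class, `load_settings`, and `env_bool` are hypothetical names, only a subset of the variables is shown, and the default of `False` for `WGCOMPANY_ENABLED` as well as the accepted spellings of "true" are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def env_bool(env: Mapping[str, str], name: str, default: bool = False) -> bool:
    """Interpret true/false style values case-insensitively (assumed spellings)."""
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass
class Settings:
    telegram_bot_token: str
    telegram_chat_id: str
    form_adults: int
    form_children: int
    wgcompany_enabled: bool
    wgcompany_bezirk: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read a subset of the documented variables; fail fast on required ones."""
    required = ("TELEGRAM_BOT_TOKEN", "TELEGRAM_CHAT_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required variables: " + ", ".join(missing))
    return Settings(
        telegram_bot_token=env["TELEGRAM_BOT_TOKEN"],
        telegram_chat_id=env["TELEGRAM_CHAT_ID"],
        form_adults=int(env.get("FORM_ADULTS", "1")),      # documented default: 1
        form_children=int(env.get("FORM_CHILDREN", "0")),  # documented default: 0
        wgcompany_enabled=env_bool(env, "WGCOMPANY_ENABLED"),
        wgcompany_bezirk=env.get("WGCOMPANY_BEZIRK") or None,  # optional filter
    )
```

Passing the environment as a mapping makes the loader easy to exercise in tests without touching the real process environment.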

Telegram Commands

  • /autopilot on|off - Enable or disable automatic applications
  • /status - Show current status and statistics (autopilot state, application counts by company)
  • /plot - Generate and send a weekly listing-patterns plot with heatmap and charts (high-res, seaborn-styled)
  • /errorrate - Generate and send an autopilot performance analysis with success/failure rates by company (high-res, seaborn-styled)
  • /retryfailed - Retry all previously failed applications
  • /resetlistings - Reset seen listings (marks all current as failed to avoid spam)
  • /help - Show available commands and usage information

Important: The bot only processes commands from the configured TELEGRAM_CHAT_ID. Use /autopilot off while testing selector changes or after modifying configuration to avoid accidental submissions.
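
The chat ID restriction and the `/autopilot on|off` syntax can be sketched as a small dispatcher. The function names `parse_command` and `handle_message`, the in-memory `state` dictionary, and the reply strings are all hypothetical; the real bot is asynchronous and supports more commands.

```python
def parse_command(text):
    """Split '/autopilot on' into ('/autopilot', ['on']); non-commands give (None, [])."""
    parts = text.strip().split()
    if not parts or not parts[0].startswith("/"):
        return None, []
    return parts[0].lower(), parts[1:]


def handle_message(chat_id, text, allowed_chat_id, state):
    """Return a reply string, or None if the message is ignored."""
    if str(chat_id) != str(allowed_chat_id):
        return None  # ignore commands from any chat other than the configured one
    command, args = parse_command(text)
    if command == "/autopilot":
        if args and args[0].lower() in ("on", "off"):
            state["autopilot"] = args[0].lower() == "on"
            return "Autopilot " + ("enabled" if state["autopilot"] else "disabled")
        return "Usage: /autopilot on|off"
    if command == "/status":
        return "Autopilot is " + ("on" if state.get("autopilot") else "off")
    return None
```

Comparing chat IDs as strings avoids a mismatch when the environment variable is read as text and the incoming ID arrives as an integer.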

Plot Features: All plots are generated at 300 DPI with seaborn styling for publication-quality output.

Data files

All data is stored in the data/ directory:

Persistent State:

  • listings.json - Previously seen inberlinwohnen listings (deduplicated by hash)
  • wgcompany_listings.json - Previously seen WGcompany listings (deduplicated by hash)
  • applications.json - Application history with timestamps, success/failure status, and error messages
  • listing_times.csv - Time series data for inberlinwohnen listings (for pattern analysis)
  • wgcompany_times.csv - Time series data for WGcompany listings
  • state.json - Runtime state (autopilot toggle, persistent across restarts)
  • monitor.log - Rotating application logs (max 5MB, 5 backups)
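
A rotating log with these limits can be configured with the standard library's `RotatingFileHandler`. The sketch below is one possible setup, not necessarily the bot's own: the logger name `wohnbot`, the log format, and the reading of "5MB" as 5 * 1024 * 1024 bytes are assumptions.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path


def build_logger(log_path="data/monitor.log"):
    """Create a logger that rotates its file at about 5 MB and keeps 5 backups."""
    Path(log_path).parent.mkdir(parents=True, exist_ok=True)
    handler = RotatingFileHandler(
        log_path,
        maxBytes=5 * 1024 * 1024,  # rotate once the file reaches roughly 5 MB
        backupCount=5,             # keep monitor.log.1 through monitor.log.5
        encoding="utf-8",
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger = logging.getLogger("wohnbot")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger
```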

Generated Plots:

  • weekly_plot.png - Weekly listing patterns (heatmap + charts, 300 DPI)
  • error_rate.png - Autopilot performance analysis (3-panel chart, 300 DPI)

Debug Files (auto-cleaned after 48 hours):

  • data/<company>/*.png - Screenshots from failed applications
  • data/<company>/*.html - Page HTML snapshots for debugging
  • data/debug_page.html - InBerlin page snapshot
  • data/wgcompany_debug.html - WGcompany page snapshot

Note: Debug files (screenshots, HTML) are automatically deleted after 48 hours to save disk space. Listing data, applications, and logs are never deleted.
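
The cleanup rule can be sketched as follows. This is an illustration under assumptions rather than the contents of autoclean_debug.py: the function names are hypothetical, and file age is judged by modification time. It only considers the debug paths listed above, so plots, JSON, CSV, and log files in data/ are left alone.

```python
import time
from pathlib import Path

MAX_AGE_SECONDS = 48 * 60 * 60  # 48 hours


def debug_candidates(data_dir):
    """Yield only the debug paths described in this README."""
    data_dir = Path(data_dir)
    yield from data_dir.glob("*/*.png")   # data/<company>/*.png
    yield from data_dir.glob("*/*.html")  # data/<company>/*.html
    yield data_dir / "debug_page.html"
    yield data_dir / "wgcompany_debug.html"


def clean_debug_files(data_dir="data", now=None):
    """Delete debug files older than 48 hours and return the removed paths."""
    now = time.time() if now is None else now
    removed = []
    # Materialize the candidates first so deleting does not disturb the glob.
    for path in list(debug_candidates(data_dir)):
        if path.is_file() and now - path.stat().st_mtime > MAX_AGE_SECONDS:
            path.unlink()
            removed.append(path)
    return removed
```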

Debugging

When applications fail, the bot saves debug material to help diagnose issues:

Company-specific folders:

  • data/howoge/ - Howoge screenshots and HTML
  • data/gewobag/ - Gewobag screenshots and HTML
  • data/degewo/ - Degewo screenshots and HTML
  • data/gesobau/ - Gesobau screenshots and HTML
  • data/stadtundland/ - Stadt und Land screenshots and HTML
  • data/wbm/ - WBM screenshots and HTML

General debug files:

  • data/debug_page.html - InBerlin page snapshot
  • data/wgcompany_debug.html - WGcompany page snapshot

Check applications.json for error messages and timestamps. Debug files are automatically cleaned after 48 hours but can be manually inspected while fresh.

Code Structure

The bot has been modularized for better maintainability. The main components are:

Core:

  • main.py - Entry point, orchestrates monitoring loop and autoclean
  • application_handler.py - Delegates applications to company handlers, generates plots
  • telegram_bot.py - Async Telegram bot with httpx for commands and notifications
  • state_manager.py - Manages persistent state (autopilot toggle)
  • autoclean_debug.py - Deletes debug files older than 48 hours
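
To show how an autopilot toggle can survive restarts, here is a minimal sketch of JSON-backed state. It is not the code in state_manager.py: the class layout, the `autopilot` key, and the default of `False` when no file exists are assumptions.

```python
import json
import os
from pathlib import Path


class StateManager:
    """Hypothetical JSON-backed store for the autopilot toggle."""

    def __init__(self, path="data/state.json"):
        self.path = Path(path)

    def load(self):
        try:
            return json.loads(self.path.read_text(encoding="utf-8"))
        except (FileNotFoundError, json.JSONDecodeError):
            return {"autopilot": False}  # assumed safe default: autopilot off

    def set_autopilot(self, enabled):
        state = self.load()
        state["autopilot"] = bool(enabled)
        self.path.parent.mkdir(parents=True, exist_ok=True)
        tmp = self.path.with_suffix(".tmp")
        tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
        # Replace the old file in one step so a crash cannot leave it half-written.
        os.replace(tmp, self.path)
        return state
```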

Handlers:

  • handlers/base_handler.py - Abstract base class with shared functionality (cookie handling, consent, logging)
  • handlers/howoge_handler.py - HOWOGE application automation
  • handlers/gewobag_handler.py - Gewobag application automation
  • handlers/degewo_handler.py - Degewo application automation (Wohnungshelden)
  • handlers/gesobau_handler.py - Gesobau application automation
  • handlers/stadtundland_handler.py - Stadt und Land application automation
  • handlers/wbm_handler.py - WBM application automation
  • handlers/wgcompany_notifier.py - WGcompany monitoring (notification only, no autopilot)
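
The handler layout follows a common pattern: an abstract base class defines the interface, and each company subclass implements it. The sketch below illustrates that pattern only. The method names `matches` and `apply`, the `domains` attribute, the `select_handler` helper, and the hostname `howoge.de` are assumptions; the real handlers drive a browser and share cookie, consent, and logging code.

```python
from abc import ABC, abstractmethod
from urllib.parse import urlparse


class BaseHandler(ABC):
    """Hypothetical base class for company-specific application handlers."""

    domains = ()  # hostnames this handler is responsible for

    @classmethod
    def matches(cls, url):
        host = (urlparse(url).hostname or "").lower()
        return any(host == d or host.endswith("." + d) for d in cls.domains)

    @abstractmethod
    def apply(self, url, form_data):
        """Submit an application; return (success, message)."""


class HowogeHandler(BaseHandler):
    domains = ("howoge.de",)  # assumed hostname, for illustration only

    def apply(self, url, form_data):
        # Placeholder: a real handler would open the page and fill the form.
        return False, "not implemented in this sketch"


def select_handler(url, handlers):
    """Return the first handler class whose domains match the URL, else None."""
    for handler_cls in handlers:
        if handler_cls.matches(url):
            return handler_cls
    return None
```

Matching on the parsed hostname rather than a substring of the URL avoids false positives when a company name appears in a query string or path.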

Utilities:

  • helper_functions/ - Data merge utilities for combining stats from multiple sources
    • merge_listing_times.py
    • merge_applications.py
    • merge_dict_json.py
    • merge_wgcompany_times.py

Tests:

  • tests/ - Comprehensive unit tests (48 tests total)
    • test_telegram_bot.py - Telegram bot commands and messaging
    • test_error_rate_plot.py - Plot generation
    • test_wgcompany_notifier.py - WGcompany monitoring
    • test_handlers.py - Handler initialization
    • test_application_handler.py - Application orchestration
    • test_company_detection.py - Company detection from URLs
    • test_state_manager.py - State persistence
    • test_helper_functions.py - Merge utilities
    • test_autoclean.py - Autoclean script validation

Unit Tests

The project includes comprehensive unit tests (48 tests total) to ensure functionality and reliability:

  • test_telegram_bot.py - Telegram bot commands and messaging (13 tests)
  • test_error_rate_plot.py - Plot generation and data analysis (2 tests)
  • test_wgcompany_notifier.py - WGcompany monitoring (7 tests)
  • test_handlers.py - Handler initialization and structure (6 tests)
  • test_application_handler.py - Application orchestration (10 tests)
  • test_company_detection.py - Company detection from URLs (6 tests)
  • test_state_manager.py - State persistence (2 tests)
  • test_helper_functions.py - Merge utilities (2 tests)
  • test_autoclean.py - Autoclean script validation (1 test)

Running Tests

pytest tests/ -v

All tests use mocking to avoid external dependencies and can run offline.
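
A mock-based test in this style might look like the sketch below. The helper `send_notification` is hypothetical and synchronous for brevity (the real bot is asynchronous); the URL follows the Telegram Bot API `sendMessage` endpoint. The injected client is replaced with a `MagicMock`, so no network request is made.

```python
from unittest.mock import MagicMock


def send_notification(client, token, chat_id, text):
    """Post a message through an injected HTTP client (hypothetical helper)."""
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    response = client.post(url, json={"chat_id": chat_id, "text": text})
    return response.status_code == 200


def test_send_notification_uses_client_without_network():
    client = MagicMock()
    client.post.return_value.status_code = 200  # simulate a successful reply

    assert send_notification(client, "TOKEN", "42", "New listing") is True
    client.post.assert_called_once_with(
        "https://api.telegram.org/botTOKEN/sendMessage",
        json={"chat_id": "42", "text": "New listing"},
    )
```

Injecting the client as a parameter is what makes the mock substitution straightforward; patching a module-level client with `unittest.mock.patch` would be an alternative.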

Workflow Diagram

flowchart TD
    Start([Start Bot]) --> Init[Initialize Browser & Telegram Bot]
    Init --> Loop{Main Loop}
    
    %% InBerlin Monitoring
    Loop --> InBerlin[Fetch InBerlin Listings]
    InBerlin --> ParseIB[Parse & Hash Listings]
    ParseIB --> LoadIB[Load Previous InBerlin Listings]
    LoadIB --> DedupeIB{New InBerlin Listings?}
    
    DedupeIB -- Yes --> LogIB[Log to listing_times.csv]
    LogIB --> SaveIB[Save to listings.json]
    DedupeIB -- No --> WG
    
    %% WGcompany Monitoring
    SaveIB --> WG[Fetch WGcompany Listings]
    WG --> ParseWG[Parse & Hash Listings]
    ParseWG --> LoadWG[Load Previous WGcompany Listings]
    LoadWG --> DedupeWG{New WGcompany Listings?}
    
    DedupeWG -- Yes --> LogWG[Log to wgcompany_times.csv]
    LogWG --> SaveWG[Save to wgcompany_listings.json]
    DedupeWG -- No --> CheckAutopilot
    
    %% Autopilot Decision
    SaveWG --> CheckAutopilot{Autopilot Enabled?}
    SaveIB --> CheckAutopilot
    
    CheckAutopilot -- Off --> NotifyOnly[Send Telegram Notifications]
    NotifyOnly --> CheckClean
    
    CheckAutopilot -- On --> CheckApplied{Already Applied?}
    CheckApplied -- Yes --> Skip[Skip Listing]
    CheckApplied -- No --> DetectCompany[Detect Company]
    
    %% Application Flow
    DetectCompany --> SelectHandler[Select Handler]
    SelectHandler --> OpenPage[Open Listing Page]
    OpenPage --> Check404{404 or Deactivated?}
    
    Check404 -- Yes --> MarkPermanent[Mark deactivated]
    MarkPermanent --> SaveFail[Save to applications.json]
    SaveFail --> NotifyFail[Notify: Application Failed]
    
    Check404 -- No --> HandleCookies[Handle Cookie Banners]
    HandleCookies --> FindButton[Find Application Button]
    FindButton --> ButtonFound{Button Found?}
    
    ButtonFound -- No --> Screenshot1[Save Screenshot & HTML]
    Screenshot1 --> SaveFail
    
    ButtonFound -- Yes --> ClickButton[Click Application Button]
    ClickButton --> MultiStep{Multi-Step Form?}
    
    MultiStep -- Yes --> NavigateSteps[Navigate Form Steps]
    NavigateSteps --> FillForm
    MultiStep -- No --> FillForm[Fill Form Fields]
    
    FillForm --> SubmitForm[Submit Application]
    SubmitForm --> CheckConfirm{Confirmation Detected?}
    
    CheckConfirm -- Yes --> SaveSuccess[Save success to applications.json]
    SaveSuccess --> NotifySuccess[Notify: Application Success]
    
    CheckConfirm -- No --> Screenshot2[Save Screenshot & HTML]
    Screenshot2 --> SaveFail
    
    NotifySuccess --> CheckClean
    NotifyFail --> CheckClean
    Skip --> CheckClean
    
    %% Autoclean
    CheckClean{Time for Autoclean?}
    CheckClean -- Yes --> RunClean[Delete Debug Files >48h]
    RunClean --> Sleep
    CheckClean -- No --> Sleep[Sleep CHECK_INTERVAL]
    
    Sleep --> TelegramCmd{Telegram Command?}
    TelegramCmd -- /autopilot --> ToggleAutopilot[Toggle Autopilot State]
    TelegramCmd -- /status --> ShowStatus[Show Status & Stats]
    TelegramCmd -- /plot --> GenPlot[Generate Weekly Plot]
    TelegramCmd -- /errorrate --> GenError[Generate Error Rate Plot]
    TelegramCmd -- /retryfailed --> RetryFailed[Retry Failed Applications]
    TelegramCmd -- /resetlistings --> ResetListings[Reset Seen Listings]
    TelegramCmd -- /help --> ShowHelp[Show Help]
    TelegramCmd -- None --> Loop
    
    ToggleAutopilot --> Loop
    ShowStatus --> Loop
    GenPlot --> Loop
    GenError --> Loop
    RetryFailed --> Loop
    ResetListings --> Loop
    ShowHelp --> Loop
    
    style Start fill:#90EE90
    style SaveSuccess fill:#90EE90
    style SaveFail fill:#FFB6C1
    style MarkPermanent fill:#FFB6C1
    style RunClean fill:#87CEEB
    style CheckAutopilot fill:#FFD700
    style Check404 fill:#FFD700
    style ButtonFound fill:#FFD700
    style CheckConfirm fill:#FFD700

Key Features:

  • Dual Monitoring: Tracks both InBerlin (6 companies) and WGcompany listings
  • Smart Deduplication: MD5 hashing prevents duplicate notifications
  • Autopilot: Automated application with company-specific handlers
  • Error Handling: 404 detection, permanent fail tracking, debug screenshots
  • Autoclean: Automatic cleanup of debug files older than 48 hours
  • Rich Commands: Status, plots, retry failed, reset listings
  • High-Res Analytics: 300 DPI seaborn-styled plots for pattern analysis
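
The hash-based deduplication can be sketched as follows. The choice of fields (`address`, `rooms`, `size`, `price`), the normalization, and the function names are assumptions for illustration; the bot may hash different attributes.

```python
import hashlib


def listing_hash(listing):
    """Stable MD5 digest over the fields assumed to identify a listing."""
    fields = ("address", "rooms", "size", "price")
    parts = [str(listing.get(k, "")).strip().lower() for k in fields]
    return hashlib.md5("|".join(parts).encode("utf-8")).hexdigest()


def find_new(listings, seen):
    """Return unseen listings and record their hashes in `seen` (hash -> listing)."""
    new = []
    for listing in listings:
        digest = listing_hash(listing)
        if digest not in seen:
            seen[digest] = listing
            new.append(listing)
    return new
```

MD5 is used here only as a fingerprint for change detection, not for security; persisting `seen` as JSON between runs is what prevents repeat notifications after a restart.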

License

This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) License.

You are free to:

  • Share — copy and redistribute the material in any medium or format
  • Adapt — remix, transform, and build upon the material

Under the following terms:

  • Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
  • NonCommercial — You may not use the material for commercial purposes.

For more details, see the full license text in the LICENSE file.