prod
parent d596ed7e19
commit aa6626d80d
21 changed files with 1051 additions and 333 deletions
.github/copilot-instructions.md (49 lines changed, vendored)
@@ -8,7 +8,7 @@ A Python-based apartment monitoring bot for Berlin's public housing portal (inbe

 **Modularized structure** with the following key components:

-- `main.py`: Entry point for the bot.
+- `main.py`: Entry point for the bot. Runs the monitoring loop and autocleaning every 48 hours.
 - `handlers/`: Contains company-specific handlers for auto-apply functionality. Each handler is responsible for automating the application process for a specific housing company. Includes:
 - `howoge_handler.py`
 - `gewobag_handler.py`
@@ -16,11 +16,18 @@ A Python-based apartment monitoring bot for Berlin's public housing portal (inbe
 - `gesobau_handler.py`
 - `stadtundland_handler.py`
 - `wbm_handler.py`
+- `wgcompany_notifier.py`: Handles WGcompany listing fetching, deduplication, and notification
 - `base_handler.py`: Provides shared functionality for all handlers.
-- `application_handler.py`: Delegates application tasks to the appropriate handler based on the company.
-- `telegram_bot.py`: Handles Telegram bot commands and notifications.
+- `application_handler.py`: Delegates application tasks to the appropriate handler based on the company. Enforces valid browser context.
+- `telegram_bot.py`: Fully async Telegram bot handler for commands and notifications. Uses httpx for messaging.
+- `autoclean_debug.py`: Deletes debug files (screenshots, HTML) older than 48 hours.
+- `helper_functions/`: Contains data merge utilities for combining stats from multiple sources:
+- `merge_listing_times.py`
+- `merge_applications.py`
+- `merge_dict_json.py`
+- `merge_wgcompany_times.py`

-**Data flow**: Fetch listings → Compare with `listings.json` / `wgcompany_listings.json` → Detect new → Log to CSV → Auto-apply if autopilot enabled → Save to `applications.json` → Send Telegram notification.
+**Data flow**: Fetch listings → Compare with `listings.json` / `wgcompany_listings.json` → Detect new → Log to CSV → Auto-apply if autopilot enabled → Save to `applications.json` → Send Telegram notification → Autoclean debug files every 48 hours.

 ## Key Patterns

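
The data flow in this file compares fetched listings with `listings.json` to detect new ones, and the file describes listing IDs as `md5(key_fields)[:12]`. A minimal sketch of how those two steps might fit together is shown here. The function names, the `"|"` separator, the choice of key fields, and the JSON layout (a dict keyed by ID) are assumptions for illustration and are not taken from the repository.

```python
import hashlib
import json
from pathlib import Path


def listing_id(listing: dict, key_fields: tuple) -> str:
    """Return a stable 12-character ID: md5 over the chosen fields, truncated."""
    # Assumption: the key fields are joined with "|" in a fixed order before hashing.
    raw = "|".join(str(listing.get(field, "")) for field in key_fields)
    return hashlib.md5(raw.encode("utf-8")).hexdigest()[:12]


def detect_new(fetched: list, seen_path: Path, key_fields: tuple) -> list:
    """Return listings whose ID is not in the seen-file yet, then record them."""
    # Assumption: the seen-file is a JSON object mapping listing ID -> listing.
    seen = json.loads(seen_path.read_text()) if seen_path.exists() else {}
    new = []
    for listing in fetched:
        lid = listing_id(listing, key_fields)
        if lid not in seen:
            seen[lid] = listing
            new.append(listing)
    seen_path.write_text(json.dumps(seen, indent=2))
    return new
```

Because the ID depends only on the key fields, a listing that is fetched again with the same field values maps to the same ID and is not reported twice.
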
@@ -39,8 +46,18 @@ Listings are hashed by `md5(key_fields)[:12]` to generate stable IDs:
 - `state.json` - Runtime state (autopilot toggle)
 - `listings.json` - Previously seen inberlinwohnen listings
 - `wgcompany_listings.json` - Previously seen WGcompany listings
-- `applications.json` - Application history with success/failure status
+- `applications.json` - Application history with success/failure status, timestamps, and listing details
 - `listing_times.csv` / `wgcompany_times.csv` - Time-series data for pattern analysis
+- `monitor.log` - Centralized logs with rotation (RotatingFileHandler)
+
+### Logging
+All modules use centralized logging configured in `main.py`:
+- `RotatingFileHandler` writes to `data/monitor.log` (max 5MB, 5 backups)
+- `StreamHandler` outputs to console/Docker logs
+- All handlers, notifiers, and utilities use `logging.getLogger(__name__)` for consistent logging
+
+### Autocleaning
+Debug material (screenshots, HTML files) older than 48 hours is automatically deleted by `autoclean_debug.py`, which runs every 48 hours in the main loop.

 ## Development

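
The Logging section added in this hunk describes a `RotatingFileHandler` (5MB, 5 backups) plus a `StreamHandler`, configured once in `main.py`. The following is a sketch of what such a setup could look like with the standard library. The function name `setup_logging`, the log format, the INFO level, and the reading of "5MB" as 5 * 1024 * 1024 bytes are assumptions, not the repository's actual code.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path


def setup_logging(log_path: str = "data/monitor.log") -> None:
    """Configure the root logger once; modules then call logging.getLogger(__name__)."""
    Path(log_path).parent.mkdir(parents=True, exist_ok=True)
    # Rotate at roughly 5 MB and keep 5 old files (monitor.log.1 ... monitor.log.5).
    file_handler = RotatingFileHandler(
        log_path, maxBytes=5 * 1024 * 1024, backupCount=5, encoding="utf-8"
    )
    # Writes to stderr, which is what `docker compose logs` picks up.
    console_handler = logging.StreamHandler()
    logging.basicConfig(
        level=logging.INFO,  # assumed level
        format="%(asctime)s %(name)s %(levelname)s %(message)s",  # assumed format
        handlers=[file_handler, console_handler],
        force=True,  # replace any handlers installed earlier
    )


# In any other module (handler, notifier, utility):
logger = logging.getLogger(__name__)
```

Configuring only the root logger keeps the per-module code down to a single `getLogger(__name__)` call, since records propagate up to the root handlers.
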
@@ -65,7 +82,8 @@ docker compose logs -f
 ### Debugging
 - Screenshots saved to `data/` on application failures (`*_nobtn_*.png`)
 - HTML saved to `data/debug_page.html` (inberlin) and `data/wgcompany_debug.html`
-- Full logs in `data/monitor.log`
+- Full logs in `data/monitor.log` with rotation
+- Debug files older than 48 hours are autocleaned

 ## Environment Variables

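
The autocleaning rule stated in this commit (debug screenshots and HTML older than 48 hours are deleted) can be sketched roughly as follows. This is an illustration only: the function signature, the glob patterns, and the use of file modification time as the age are assumptions, and the real `autoclean_debug.py` may select files differently.

```python
import time
from pathlib import Path
from typing import Optional

MAX_AGE_SECONDS = 48 * 3600
# Assumption: debug material is any PNG or HTML file directly inside the data folder.
DEBUG_PATTERNS = ("*.png", "*.html")


def autoclean_debug(data_dir: str = "data", now: Optional[float] = None) -> list:
    """Delete debug files whose mtime is more than 48 hours old; return removed paths."""
    now = time.time() if now is None else now
    removed = []
    for pattern in DEBUG_PATTERNS:
        for path in Path(data_dir).glob(pattern):
            if path.is_file() and now - path.stat().st_mtime > MAX_AGE_SECONDS:
                path.unlink()
                removed.append(path)
    return removed
```

Matching by extension leaves the JSON, CSV, and log files in `data/` untouched, which matters because they live in the same folder as the debug output.
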
@@ -74,6 +92,16 @@ InBerlin login: `INBERLIN_EMAIL`, `INBERLIN_PASSWORD`
 Form data: `FORM_ANREDE`, `FORM_VORNAME`, `FORM_NACHNAME`, `FORM_EMAIL`, `FORM_PHONE`, `FORM_STRASSE`, `FORM_HAUSNUMMER`, `FORM_PLZ`, `FORM_ORT`, `FORM_PERSONS`, `FORM_CHILDREN`, `FORM_INCOME`
 WGcompany: `WGCOMPANY_ENABLED`, `WGCOMPANY_MIN_SIZE`, `WGCOMPANY_MAX_SIZE`, `WGCOMPANY_MIN_PRICE`, `WGCOMPANY_MAX_PRICE`, `WGCOMPANY_BEZIRK`

+## Telegram Commands
+
+- `/autopilot on|off` - Enable or disable automatic applications
+- `/status` - Show current status and statistics (autopilot state, application counts by company)
+- `/plot` - Generate and send a weekly listing-patterns plot
+- `/errorrate` - Generate and send an autopilot success vs failure plot
+- `/retryfailed` - Retry all failed applications
+- `/resetlistings` - Reset seen listings (marks all current as failed to avoid spam)
+- `/help` - Show available commands and usage information
+
 ## Common Tasks

 ### Fix a broken company handler
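
As an illustration of the `/autopilot on|off` command listed in this hunk, the sketch below parses the message text and persists the toggle in `state.json`, which the file describes as holding the autopilot state. The function name, the `"autopilot"` key, and the reply strings are hypothetical; the actual bot is described as fully async and sends messages via httpx, which this sketch deliberately leaves out.

```python
import json
from pathlib import Path


def handle_autopilot(text: str, state_path: Path) -> str:
    """Parse '/autopilot on|off', store the toggle, and return a reply message."""
    parts = text.strip().split()
    if len(parts) != 2 or parts[0] != "/autopilot" or parts[1].lower() not in ("on", "off"):
        return "Usage: /autopilot on|off"
    enabled = parts[1].lower() == "on"
    # Assumption: state.json is a JSON object and the toggle lives under "autopilot".
    state = json.loads(state_path.read_text()) if state_path.exists() else {}
    state["autopilot"] = enabled
    state_path.write_text(json.dumps(state))
    return "Autopilot enabled" if enabled else "Autopilot disabled"
```

Keeping the parsing and state update in a plain synchronous function like this makes it easy to unit test separately from the Telegram transport.
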
@@ -87,6 +115,15 @@ Check `data/*_nobtn_*.png` screenshots and `data/debug_page.html` to see actual
 - InBerlin: Update regex patterns in `InBerlinMonitor.fetch_listings()`. Test against `data/debug_page.html`.
 - WGcompany: Update parsing in `WGCompanyMonitor.fetch_listings()`. Test against `data/wgcompany_debug.html`.

+### Merge data from another machine
+Use the helper scripts in `helper_functions/`:
+- `merge_listing_times.py` - Merge listing_times.csv files
+- `merge_applications.py` - Merge applications.json files
+- `merge_dict_json.py` - Merge listings.json and wgcompany_listings.json
+- `merge_wgcompany_times.py` - Merge wgcompany_times.csv files
+
+All scripts deduplicate by key and timestamp, and output merged results to the current data folder.
+
 ## Unit Tests

 ### Overview
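
The merge section added in this hunk states that the helper scripts deduplicate by key and timestamp. A minimal sketch of that idea for the CSV case follows. The column names `listing_id` and `timestamp`, the assumption that all inputs share one header, and the assumption that timestamps sort correctly as strings (for example ISO 8601) are illustrative guesses, not the scripts' actual schema.

```python
import csv
from pathlib import Path


def merge_csv(sources: list, dest: Path, key_cols: tuple = ("listing_id", "timestamp")) -> int:
    """Merge CSV files, keeping the first row seen for each (key, timestamp) pair."""
    seen = set()
    merged = []
    fieldnames = None
    for src in sources:
        with src.open(newline="", encoding="utf-8") as fh:
            reader = csv.DictReader(fh)
            if fieldnames is None:
                fieldnames = reader.fieldnames  # assumes every source has this header
            for row in reader:
                key = tuple(row.get(col, "") for col in key_cols)
                if key not in seen:
                    seen.add(key)
                    merged.append(row)
    # Assumption: timestamps are strings that sort chronologically.
    merged.sort(key=lambda r: r.get("timestamp", ""))
    with dest.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames or list(key_cols))
        writer.writeheader()
        writer.writerows(merged)
    return len(merged)
```

All sources are read fully before the destination is opened for writing, so the destination may safely be one of the input files.
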