4 Commits

Author SHA1 Message Date
224b9c3122 llm_agent now responsible for extraction. 2025-12-05 17:23:31 +01:00
7dca4c9159 refactor(job_scraper): improve page loading and typing in linkedin scraper
- Change page load strategy from 'load' to 'domcontentloaded' and 'networkidle' for better performance
- Make search_keywords parameter optional to handle empty searches
- Update type imports to include List for better type hints
- Set headless mode to true by default for production use
2025-11-23 09:27:05 +01:00
458e914d71 feat(scraping): enhance job scraping with session persistence and feedback system
- Add config module for spoof data management
- Implement session persistence to reuse authenticated sessions
- Add feedback system to track success rates and adjust fingerprinting
- Improve job link collection with pagination and scroll detection
- Separate verified/unverified job listings into different folders
- Enhance error handling for CAPTCHA and Cloudflare challenges
2025-11-21 16:51:26 +01:00
f52868edfa Add job_scraper.py 2025-11-20 18:59:46 +00:00
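
Commit 458e914d71 mentions a feedback system that tracks success rates and adjusts fingerprinting. The log does not show how that is implemented, so the following is only a minimal sketch of the general idea, assuming each fingerprint profile is identified by a string name. `ProfileStats`, `FeedbackTracker`, `record`, and `best_profile` are hypothetical names for illustration and are not taken from the repository.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ProfileStats:
    """Outcome counters for one fingerprint profile (hypothetical structure)."""
    successes: int = 0
    failures: int = 0

    @property
    def success_rate(self) -> float:
        # A profile with no recorded attempts is treated as a 0.0 success rate.
        total = self.successes + self.failures
        return self.successes / total if total else 0.0


@dataclass
class FeedbackTracker:
    """Tracks scrape outcomes per profile and reports the best performer."""
    stats: Dict[str, ProfileStats] = field(default_factory=dict)

    def record(self, profile: str, success: bool) -> None:
        # Create the counter on first use, then increment the matching field.
        entry = self.stats.setdefault(profile, ProfileStats())
        if success:
            entry.successes += 1
        else:
            entry.failures += 1

    def best_profile(self) -> Optional[str]:
        # Return the profile with the highest success rate, or None if empty.
        if not self.stats:
            return None
        return max(self.stats, key=lambda name: self.stats[name].success_rate)
```

A scraper could call `record(profile, success)` after each attempt (for example, counting a CAPTCHA or Cloudflare challenge as a failure) and consult `best_profile()` before starting the next session. A real system would probably also want a minimum sample size before trusting a rate.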