Adapting Web Scraping Routines to Changing Web Pages (AutoHotkey Tip)

When the Horoscope Web Page I Use for E-mails Altered Its Format, I Quickly Adjusted the Script

Last year, I wrote a script that e-mails a daily horoscope to my wife, “E-mail the Daily Horoscope to Yourself (AutoHotkey Trick).” Every morning she receives on her tablet an e-mail containing her daily horoscope. (I don’t send it to myself because I don’t want to know that much about my future—and I don’t listen to advice.) Recently, she pointed out that the e-mail started coming up blank. I immediately realized that the target Web site had changed its source code. (I’ve experienced the same problem with the SynonymLookup.ahk script.) I knew I could repair the Regular Expression (RegEx) in the broken script fairly quickly by following some basic steps:

  1. Access the source code for the target Web page and locate the key text.
  2. Copy the critical portion of the source code, including any unique HTML tags surrounding the target text, then paste the selection into Ryan’s RegEx Tester.
  3. Adjust the RegEx to include key unique tags surrounding the text—then extracting the paragraph.
  4. In the script, replace the old RegEx found in the RegExMatch() function with the new one from Ryan’s RegEx Tester.
  5. Make any necessary adjustments to the RegEx—primarily escaping double quotation marks.

The new horoscope e-mail script now includes more details and a link to the site.
Continue reading