Adapting Web Scraping Routines to Changing Web Pages (AutoHotkey Tip)

When the Horoscope Web Page I Use for E-mails Altered Its Format, I Quickly Adjusted the Script

Last year, I wrote a script that e-mails a daily horoscope to my wife, “E-mail the Daily Horoscope to Yourself (AutoHotkey Trick).” Every morning she receives on her tablet an e-mail containing her daily horoscope. (I don’t send it to myself because I don’t want to know that much about my future—and I don’t listen to advice.) Recently, she pointed out that the e-mail started coming up blank. I immediately realized that the target Web site had changed its source code. (I’ve experienced the same problem with the SynonymLookup.ahk script.) I knew I could repair the Regular Expression (RegEx) in the broken script fairly quickly by following some basic steps:

  1. Access the source code for the target Web page and locate the key text.
  2. Copy the critical portion of the source code, including any unique HTML tags surrounding the target text, then paste the selection into Ryan’s RegEx Tester.
  3. Adjust the RegEx to include the key unique tags surrounding the text, then extract the paragraph.
  4. In the script, replace the old RegEx found in the RegExMatch() function with the new one from Ryan’s RegEx Tester.
  5. Make any necessary adjustments to the RegEx—primarily escaping double quotation marks.

The new horoscope e-mail script now includes more details and a link to the site.
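
The steps above can be sketched roughly as follows. This is only an illustration in AutoHotkey v1 syntax: the sample source, the `daily-text` class name, and the variable names are hypothetical, since the horoscope page's real markup does not appear in this post.

```autohotkey
; Hypothetical sample of page source (stands in for the downloaded Web page).
Source =
(
<div class="intro">Welcome</div>
<p class="daily-text">Today favors patience.</p>
)

; In an AutoHotkey v1 expression string, each literal double quote is written as "".
; The "s" option lets the dot match newlines in case the paragraph spans several lines.
Pattern := "s)<p class=""daily-text"">(.*?)</p>"

if RegExMatch(Source, Pattern, Match)
    MsgBox, % Match1    ; Match1 holds the text captured by the first subpattern
else
    MsgBox, No match. The page format may have changed again.
```

Anchoring the pattern on a tag and attribute that appear only once on the page keeps the match narrow. The `else` branch gives an early warning the next time the site changes its source code.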

AutoHotkey Tip of the Week: Cull Web Links from a Web Page and Activate Each in a Pop-up GUI

This Time I Combine a Number of AutoHotkey Techniques to Put Active Links in a Graphical User Interface (GUI) Pop-up, Saving Space with GUI Tabs

As I pondered the GetActiveBrowserURL() function from last time, I looked for more ways to use this unique function by reviewing Chapter Ten, “An App for Extracting Web Links from Web Pages,” from A Beginner’s Guide to Using Regular Expressions in AutoHotkey. By combining the function with the UrlDownloadToFile command and a couple of GUI controls (Link and Tab), I quickly wrote a script for collecting all of the external links from a Web page into a pop-up window displaying a list of active links. Merely click one to follow it.

The GUI contains 10 tabs—most with 20 hot links each scraped from the ComputorEdge Free AutoHotkey Scripts page.

This process included a number of learning points worth discussing:

  1. I found the GetActiveBrowserURL() function more reliable and robust than using the Standard Clipboard Routine.
  2. Depending upon the target Web site, you may need to tailor your Regular Expressions (RegEx) to produce the most useful results.
  3. The GUI Link control creates hot Internet links for immediate action.
  4. The GUI Tab control wraps long lists for scenarios where no scroll bars exist and column wrapping proves impractical.
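
As a minimal sketch of points 3 and 4 (AutoHotkey v1 syntax), the following puts one Link control on each of two tabs. The tab names, window title, and URLs are placeholders chosen for illustration. This is not the WebPageLinks script itself, which builds its tabs from scraped links.

```autohotkey
; Create a two-tab control; the names after the last comma are the tab labels.
Gui, Add, Tab2, w400 h200, Page 1|Page 2

; Controls added after "Gui, Tab, N" belong to tab N.
Gui, Tab, 1
Gui, Add, Link,, <a href="https://www.autohotkey.com">AutoHotkey</a>

Gui, Tab, 2
Gui, Add, Link,, <a href="https://www.autohotkey.com/docs/">AutoHotkey Documentation</a>

Gui, Show,, Web Links
return

GuiClose:
ExitApp
```

Clicking either link opens it in the default browser without any extra code, which is what makes the Link control convenient for a list of scraped addresses.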

In this blog, I offer the script with a short discussion of the Regular Expressions (RegEx). In a future blog, I’ll discuss how to build a GUI pop-up window with an unknown number of hot Weblinks (almost 200 in the example) while not letting it get out of hand. But first, my thoughts on the GetActiveBrowserURL() function.


AutoHotkey Tip of the Week: Dynamic Regular Expressions (RegEx) for Math Calculating Hotstrings

An AutoHotkey Classic, the Dynamic Hotstrings() Function Makes Instant RegEx Replacements Possible—Now, You Can Do Math with Your Hotstrings!

Anyone who reads my blog on a routine basis knows how I love Regular Expressions (RegEx). They make feasible all kinds of capabilities not practical by any other method. While not necessarily easy for a beginner to grasp, RegEx provides a mechanism for matching text when you don’t know exactly which characters you need (wildcards). (That’s why I wrote the book A Beginner’s Guide to Using Regular Expressions in AutoHotkey.) Although you may encounter a bit of a learning curve, RegEx gives you the ability to accomplish some pretty fancy tasks. This time I plan to demonstrate a couple of Hotstring techniques that might amaze you. They amazed me!

AutoHotkey Tip of the Week—Powerful RegEx Text Search Shorthand (~=)

AutoHotkey Provides an Abbreviated Regular Expression RegExMatch() Operator ( ~= ) for Quick Wildcard Text Matches

Regular Expressions (RegEx) can get confusing, but once understood, they pay tremendous dividends. Acting almost as another programming language, Regular Expressions in AutoHotkey provide a method for accomplishing complex search and/or replacement with only one line of code. While not impossible, doing the same thing without using RegEx often requires complex tricks and many lines of code. In the beginning, learning RegEx may feel daunting, but you’ll find it well worth the journey.

In spite of the initial learning curve, you don’t need to learn how the two primary AutoHotkey RegEx functions work (RegExMatch() and RegExReplace()) to make good use of a RegEx. The shorthand RegEx operator ( ~= ) provides a method for doing a complex string match without the limitations of the InStr() function. Regular Expressions search for patterns while the InStr() function searches for exact strings.
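
A small comparison may make the difference concrete. The sample text and variable names below are invented for illustration (AutoHotkey v1 syntax).

```autohotkey
Text := "Order 66 shipped on 2024-01-15"

; InStr() finds only an exact substring and returns its position (0 if absent).
FoundExact := InStr(Text, "2024-01-15")

; ~= is shorthand for RegExMatch(): it returns the position of the first
; match for the pattern (0 if none), so any date in this format is found.
FoundDate := Text ~= "\d{4}-\d{2}-\d{2}"

MsgBox, % "InStr: " FoundExact "`n~=: " FoundDate
```

Both calls report the same position here, but only the `~=` line would keep working if the date in the text changed.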


Why AutoHotkey for Poets?

Erstwhile Multifarious Poets Optated for Quill and Parchment. Forthwith, AutoHotkey Propounds the Furtherance of Lyrical Ruminations on Windows Computers.

Okay…I’m not a poet. My mind doesn’t work that way. But that doesn’t mean I can’t see how AutoHotkey might be useful to people who craft the English (or any other) language. Even so, I occasionally enjoy writing a short rhyming couplet. (I know…constructing rhyming poems has become cliché, at least for real poets.)

In this blog, I offer a couple of AutoHotkey scripts for assisting and inspiring(?) budding wordsmiths. The first includes a set of over 500 Hotstrings for inserting “the most beautiful words in the English language.” The second script draws upon the Web to create a pop-up menu of rhymes. Even if you never intend to write a poem, you might find these AutoHotkey techniques interesting and/or useful.

Too Much Planning Can Get in the Way of Good Scripting (AutoHotkey Quick Reference Part Five)

While Preplanning Script Writing Can Be Useful, Don’t Take It Too Seriously—Sometimes It Only Makes Sense to Rewrite Everything

The AutoHotkey script writing process rarely runs in a straight line. Often I start with a vague concept of what I want to do then start fiddling with the tools. Unlike when building a toolshed or bookcase, I rarely begin with a complete plan or blueprint for an AutoHotkey script. In fact, the code may undergo numerous changes during the debugging and problem-solving phases.

For anyone who builds things, this approach may be disconcerting. After all, you can’t afford to build a house by trial-and-error. The cost of wasted materials would be prohibitive. Traditionally, we spend a great deal of time in the planning phase to make sure we avoid expensive mistakes. Even in computer programming, large projects come together much better after extensive planning. But with smaller projects such as AutoHotkey scripts, the opposite may be true. I often start a script with only a vague idea of what I want to do. As I work on it, the possibilities expand and I often change course.

Using INI Files for Web Address Letter Case-Sensitivity Problems (AutoHotkey Quick Reference Script, Part Four)

The Wrong Capitalization of Letters in URLs Can Cause Page Access Failure—A Trick for Using an INI File to Solve Case Problems in AutoHotkey

In an effort to take advantage of the hidden index built into the AutoHotkey.com site, I’ve started writing a script I call AutoHotkeyQuickReg.ahk which parses the downloaded pages. The first step involved those searches which downloaded a command page.

The original version of the AutoHotkey Quick Reference script pops up a MsgBox which displays the syntax of the command, then offers the option to open the Web page in the default browser. Recently, I added a new feature which parses and displays information about the built-in AutoHotkey variables whenever it detects the “Variables and Expressions” page. However, I had to find a way to deal with the problem of letter case (capitalization) sensitivity. Get it wrong and either the Web page doesn’t come up or the right data won’t load.
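
One way an INI file can help with letter case is sketched below. This is an assumption about the general idea, not the Quick Reference script's actual code. It relies on the fact that Windows INI key lookups ignore case, and the file name, section name, and entries are all hypothetical (AutoHotkey v1 syntax).

```autohotkey
; Hypothetical INI file stored next to the script.
IniFile := A_ScriptDir "\CaseFix.ini"

; Store the properly capitalized name as the value under a lowercase key.
IniWrite, RegExMatch, %IniFile%, Commands, regexmatch

; Whatever capitalization the user typed...
Typed := "REGEXMATCH"

; ...the key lookup ignores case and returns the stored, correctly cased value.
; If the key is missing, the typed text is used unchanged as the default.
IniRead, Proper, %IniFile%, Commands, %Typed%, %Typed%

MsgBox, % Proper
```

The returned value can then be dropped into the Web address in place of the user's original typing.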

Regular Expressions (RegEx) for Parsing Text (AutoHotkey Quick Reference Script Part Three)

The RegExReplace() Function Makes It Easy to Extract and Clean Up Text, Plus a Quick-and-Dirty RegEx to Strip All HTML Tags

Last time, we accessed commands at AutoHotkey.com using its hidden built-in index. Whenever the script downloaded a command page, we identified it by the embedded HTML code <pre class="Syntax">. Not only do the <pre class="Syntax">…</pre> tags identify the command pages, but they also surround the proper syntax for that command. Since this easily located HTML format appears in every command page, it can be used to launch a quick reference pop-up window. We only need to parse the command syntax with the RegExReplace() function, then clean up any extraneous HTML tags.
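
The two-step approach can be sketched as follows (AutoHotkey v1 syntax). The page fragment is made up for illustration, and the tag-stripping pattern is one common quick-and-dirty form, not necessarily the exact RegEx used in the full post.

```autohotkey
; Hypothetical fragment of a downloaded command page (not copied from the real site).
Page := "<h1>MsgBox</h1><pre class=""Syntax"">MsgBox, <i>Text</i></pre><p>Remarks</p>"

; Step 1: keep only what sits between the <pre class="Syntax"> and </pre> tags.
; The "s" option lets the dot match newlines; $1 is the captured syntax block.
; If the tags are not found, RegExReplace() returns the page text unchanged.
Syntax := RegExReplace(Page, "s).*<pre class=""Syntax"">(.*?)</pre>.*", "$1")

; Step 2: quick-and-dirty tag stripper. Omitting the replacement text
; deletes everything that looks like <...>.
Syntax := RegExReplace(Syntax, "<[^>]+>")

MsgBox, % Syntax
```

The stripper is deliberately crude. It would also remove any legitimate text that happens to sit between angle brackets, which is acceptable for a quick reference pop-up but not for general HTML parsing.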

AutoHotkey Quick Reference Script (Part Two)

The AutoHotkey.com built-in Index Reappears—Now to Build a Reference Tool!

As I ventured in a new direction toward creating AutoHotkey reference scripts, I once again tested the previously discovered hidden AutoHotkey.com index (which had vanished). It re-emerged!

This left me in a quandary. Do I continue in my new direction or take up the original quick reference tool I began building with this AutoHotkey.com secret capability? Since the hidden index offers so much power, I decided to continue on my first course. (The possibility that the feature may disappear again looms over my work, but any Web site can change.)

The Problem with Accessing Web Data with AutoHotkey Scripts

The Trouble with Scripts Which Use Web Page Information, Plus AutoHotkey Tools for Downloading Web Page Source Code

While it doesn’t usually happen this fast, AutoHotkey scripts which depend on downloaded Web page data can go bad at any time. Last week, I discovered a clever index on the AutoHotkey site which allowed quick access to commands and other documentation. Now, I find that the site has changed and only a couple of the previous searches work.
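
For readers who have not downloaded page source before, a minimal sketch using the built-in UrlDownloadToFile and FileRead commands might look like this (AutoHotkey v1 syntax). The URL and temporary file name are placeholders, and the full post may use other tools.

```autohotkey
; Placeholder address and temporary file name, chosen for illustration only.
Url := "https://www.autohotkey.com/docs/"
TempFile := A_Temp "\page_source.htm"

; Download the page source to a file; ErrorLevel is set to 1 on failure.
UrlDownloadToFile, %Url%, %TempFile%
if ErrorLevel
    MsgBox, Download failed.
else
{
    ; Load the saved source into a variable for parsing with RegEx.
    FileRead, Source, %TempFile%
    MsgBox, % "Downloaded " StrLen(Source) " characters."
}
```

Checking ErrorLevel only confirms that something was downloaded. A script still needs its own test, such as a failed RegEx match, to notice that the page layout has changed.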