Sometimes I Find It Quicker to Use Ryan’s RegEx Tester to Make a Clean List
While working on my current book, I found that I needed a printed list to keep track of my progress. In many circumstances, computer-based lists work fine, but sometimes I want a piece of paper that I can mark up with checks, arrows, and extraneous comments. The paper sits on my desk as a working tool, never hiding behind an open window on the computer screen. It seems archaic but, for me, it’s just easier. My problem involved printing a clean list of chapters without a load of extras.
I use the Sigil EPUB publishing software to put together my books. To get my printout, all I needed was a list of the chapters for annotating and tracking my progress. The Content.opf file (a standard file found in all EPUBs) had what I needed, wrapped in a little extra code. I could have lived with printing the chapter list, code and all, but by using Ryan’s RegEx Tester, I quickly stripped out all of the unneeded code.
I found it really cool to watch the letters disappear from the Results pane as I added each unwanted character to the Regular Expression field. When I reached the text I planned to save, I entered the ungreedy universal wildcard .*? inside parentheses, (.*?), to capture it as a subpattern, and used the first backreference, $1, in the Replacement Text. Then, to get my list, I merely added the remainder of the unwanted code, .x?html"/>, to the RegEx field. Seeing the output I needed, I copied and pasted the list from the Results field into Notepad and printed it.
In this case, I could have used the StrReplace() function (twice) to strip out the extra text, but that would have meant writing a script. Plus, if the Text to be searched had included more varied code before or after the target text, StrReplace() would have proved inadequate. Using the RegEx Tester, I could add the same universal wildcard .*? to the beginning and end of the RegEx to remove the extra garbage:
.*?"(.*?).x?html"/>.*?
I would need to add an unseen RETURN to the end of the Replacement Text to maintain the new lines in the column:
$1[Hidden New Line Character or RETURN]
No wonder I love Ryan’s RegEx Tester!
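For comparison, here is a rough sketch of the two-call StrReplace() route in AutoHotkey v1.1. The sample line is only an assumption about what one entry in the file might look like, not text taken from the actual Content.opf.

```autohotkey
; AutoHotkey v1.1 sketch. The sample line is an assumed entry, used
; only for illustration.
Line := "<itemref idref=""Chapter01.xhtml""/>"

; The first call strips the leading code, the second the trailing code.
Chapter := StrReplace(Line, "<itemref idref=""", "")
Chapter := StrReplace(Chapter, ".xhtml""/>", "")

MsgBox, % Chapter  ; shows Chapter01
```

A file name ending in .html rather than .xhtml would slip past the second call and need a third, which is where the fixed-text approach starts to fall short of the x? in the RegEx.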
The equivalent of that RegEx Tester setup in an AutoHotkey script, using the RegExReplace() function:
NewStr := RegExReplace(OldStr, ".*?""(.*?).x?html""/>.*?", "$1`n")
Note the doubled quotation mark, which escapes a literal quotation mark inside a quoted AutoHotkey v1 expression string:
""
and the linefeed escape sequence:
`n
which replaces the hidden RETURN used in the RegEx Tester.
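For anyone who would rather run the whole job as a script, here is a minimal sketch for AutoHotkey v1.1. It is a line-by-line variant rather than the single RegExReplace() call shown above: it parses the text one line at a time and uses RegExMatch() to pull out the subpattern. It also escapes the dot as \. so that it matches only a literal period. The contents of OldStr, including the itemref form of each line, are assumed sample data.

```autohotkey
; AutoHotkey v1.1 sketch. OldStr holds assumed sample lines, not the
; actual Content.opf.
OldStr := "<spine>`n"
OldStr .= "<itemref idref=""Chapter01.xhtml""/>`n"
OldStr .= "<itemref idref=""Chapter02.html""/>`n"
OldStr .= "</spine>"

NewStr := ""
Loop, Parse, OldStr, `n, `r
{
    ; Match1 receives the first subpattern, like $1 in the Replacement Text.
    ; Lines without a matching file name are skipped.
    if (RegExMatch(A_LoopField, """(.*?)\.x?html""/>", Match))
        NewStr .= Match1 . "`n"
}

MsgBox, % NewStr  ; shows Chapter01 and Chapter02, one per line
```

Working one line at a time keeps the leftover lines out of the result without relying on the wildcards at either end of the pattern. A lazy .*? at the very end of a RegEx is free to match nothing at all, so it may not remove trailing text on its own.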