Aba Search and Replace blog
https://www.abareplace.com/blog/
Search tips and tricks, regular expression tutorials, announcements about new versions of Aba Search and Replace.
Review of Aba Search and Replace with video
<p>FindMySoft, a software download directory, published a Quick Look Video showcasing the interface and features of Aba Search and Replace 2.2. You can <a href="http://aba-search-and-replace.findmysoft.com/">watch the video and read a review</a> of my program at their site.</p> <div style="text-align:center"><a href="http://aba-search-and-replace.findmysoft.com/"><img src="/blog_findmysoft.png" alt="Find My Soft review and video" width="764" height="346" style="border:0"></a></div> <p>A few clarifications on their review. Aba cannot search and replace in MS Word documents (yet!). As for being “too simple for advanced users”: Aba’s support for regular expressions includes <a href="/docs/lookaround.php">variable-length lookbehind</a> and <a href="/docs/charListClass.php">Unicode character classes</a>; most search-and-replace tools lack these features. In general, I try to keep the interface cleaner and less cluttered than competitors' while adding advanced features for power users.</p>
Fri, 20 Apr 2012 10:00:00 +0200
https://www.abareplace.com/blog/findmysoft/
Aba 2.2 released
<p>The new version adds <a href="/docs/lookaround.php">lookaround</a> and braces in regular expressions. I also implemented the \b anchor and <a href="/docs/backref.php#noncapturing">non-capturing groups</a>.</p> <p>Several bugs were fixed, including incorrect PHP syntax highlighting and crashes when processing invalid UTF-8 or when changing a long replacement.</p> <p>Many thanks to Stefan Schuck for updating the German translation. 
I'm looking for people who can translate Aba into other languages (especially French and Spanish).</p> <p><a href="http://www.abareplace.com/setup.exe">Download the new version</a></p>
Tue, 03 Jan 2012 21:00:00 +0100
https://www.abareplace.com/blog/aba22/
Discount on Aba Search and Replace
<p>I would like to offer everybody a <b>15% discount</b> on Aba Search and Replace until the end of January. Please use the coupon code:</p> <pre>Happy2012</pre> <p>Future upgrades will be free for registered users. Thank you for using Aba. Happy Holidays and best wishes for the New Year!</p>
Tue, 20 Dec 2011 09:00:00 +0100
https://www.abareplace.com/blog/happy2012discount/
Using search and replace to rename a method
<p>It's easy to rename a method using <a href="/">Aba Search and Replace:</a></p> <ul> <li>enter the current and the new names,</li> <li>turn on the <i>Match whole word</i> and <i>Match case</i> modes,</li> <li>review the found occurrences,</li> <li>press the <i>Replace</i> button.</li> </ul> <img src="/blog_rename_method.png" alt="Replace GetFileSize with GetSize, addslashes with sqlite_escape_string." width="488" height="65"> <p>The name of a method, <i>GetFileSize</i>, collided with the name of <a href="http://msdn.microsoft.com/en-us/library/aa364955">a Win32 API function,</a> so I wanted to replace it with <i>GetSize.</i> The latter is also shorter and avoids the tautology: <i>file</i>.Get<i>File</i>Size.</p> <p>To find the references to my method, not to the Win32 API function, I added a dot (or <code>-&gt;</code> in C++) before the name: <code>.GetFileSize</code></p> <p>In PHP code, I replaced all calls to <code>addslashes</code> with <code>sqlite_escape_string</code> when porting my site to SQLite. 
The two functions escape quotes <a href="http://php.net/sqlite_escape_string">differently;</a> <i>addslashes</i> should never be used with SQLite.</p>
Mon, 05 Dec 2011 09:00:00 +0100
https://www.abareplace.com/blog/blog_rename_method/
Cleaning the output of a converter
<p>When I worked at a small web design company, clients often brought us <b>an MS Word, Excel, or PDF file</b> that had to be published on the web. Not as a downloadable file, but as a web page integrated into their site.</p> <p>Microsoft Word can certainly save files as HTML, but the resulting code was bloated and did not match our design. What we needed was simple HTML that our designer could edit and style. How could Aba S&amp;R help us?</p> <p>Here is a DOC file saved as HTML:</p> <p><code>&lt;h3 align=center style='text-align:center;'&gt;&lt;b&gt;&lt;span style='font-size:10.0pt;font-family:"Arial";'&gt;Lorem ipsum&lt;/span&gt;&lt;/b&gt;&lt;/h3&gt;</code></p> <p><code>&lt;p class=Normal align=justify style='text-indent:14.0pt;text-align:justify;'&gt;&lt;span style='font-family:"Times New Roman";'&gt;Lorem ipsum dolor sit &lt;i&gt;amet,&lt;/i&gt; consectetur adipisicing elit.&lt;/span&gt;&lt;/p&gt;</code></p> <p>We need to <b>remove all attributes and &lt;span&gt; tags:</b></p> <p><code>&lt;h3&gt;&lt;b&gt;Lorem ipsum&lt;/b&gt;&lt;/h3&gt;</code></p> <p><code>&lt;p&gt;Lorem ipsum dolor sit &lt;i&gt;amet,&lt;/i&gt; consectetur adipisicing elit.&lt;/p&gt;</code></p> <p>The following <b>replacements</b> can be used:</p> <pre>Search for: <b>&lt;</b><span style="color:blue">(</span><b>p</b><span style="color:blue">|</span><b>h1</b><span style="color:blue">|</span><b>h2</b><span style="color:blue">|</span><b>h3</b><span style="color:blue">)</span> <span style="color:olive">[^&gt;]</span><span style="color:purple">*</span><b>&gt;</b>
Replace with: &lt;<span style="color:purple">\1</span>&gt;
Search for: <b>&lt;span </b><span style="color:olive">[^&gt;]</span><span style="color:purple">*</span><b>&gt;</b>
Replace with: (nothing)
Search for: <b>&lt;/span&gt;</b>
Replace with: (nothing)</pre> <p><code><span style="color:olive">[^&gt;]</span><span style="color:purple">*</span></code> matches everything up to the next closing angle bracket &gt;, and <span style="color:purple">\1</span> means the text inside the first pair of parentheses (in our case, the tag name).</p> <div style="text-align:center"><img src="/blog_html_convertor.png" alt="Remove HTML attributes with regular expressions" width="492" height="342"></div> <p>I often used Aba to <b>clean the output of a converter.</b> For one client, I had to convert dozens of PDF files with technical specifications to HTML. There was a lot of formatting (subscripts, superscripts, tables), so I could not simply copy and paste it. There were also errors, for example, the letter O instead of zero in subscripts. Without Aba, I could not have cleaned up this mess.</p> <a name="bad-practice"></a> <h3>Is it a bad practice?</h3> <p><a href="http://www.reddit.com/r/webdev/comments/mcwh5/how_to_replace_html_tags_using_regular_expressions/">Two redditors criticized</a> my previous post about <a href="/blog/html_tags/">using regular expressions to replace HTML tags.</a></p> <p>I fully agree that regexes should never be used to parse <b>arbitrary HTML code,</b> for example, HTML code entered by a user. Never do this in your scripts; it's unreliable and insecure.</p> <p>But what if you need to replace all relative links (/blog/) <b>in your own code</b> with absolute links (http://www.example.com/blog/), because you are moving some parts of your site to a subdomain (http://myproduct.example.com)? 
<b>Would you craft a script</b> that parses your HTML code (carefully skipping &lt;?php tags, which Python's HTMLParser cannot do), searches for all <code>&lt;a&gt;</code> tags with the <code>href</code> attribute, replaces the links, and saves the result to a file?</p> <p><b>Or would you toss off a regex</b> in <a href="/">a search-and-replace tool?</a></p> <div style="text-align:center"><img src="/blog_html_convertor2.png" alt="Would you write 43 lines of Python code or one-line regex for an ad-hoc replacement?" title="Would you write 43 lines of Python code or one-line regex for an ad-hoc replacement?" width="652" height="466" style="padding-top:10px"></div>
Fri, 18 Nov 2011 09:00:00 +0100
https://www.abareplace.com/blog/html_convertor/
Aba 2.1 released
<p>The new version fixes some bugs, such as an incorrectly displayed date/time, and adds the <i>File</i> menu for viewing/editing a file or copying the results list to the clipboard.</p> <img src="/docs/fileMenu.png" width="314" height="163" alt="File menu"> <p>As always, <b>the upgrade is free</b> for registered users.</p> <p><a href="http://www.abareplace.com/">Download the new version</a></p>
Fri, 11 Nov 2011 09:00:00 +0100
https://www.abareplace.com/blog/aba21/
How to replace HTML tags using regular expressions
<p>Strictly speaking, you cannot parse HTML using only regular expressions. The reason is explained in any computer science curriculum: HTML is <a href="http://en.wikipedia.org/wiki/Context-free_language">a context-free language</a>, but regular expressions can describe only <a href="http://en.wikipedia.org/wiki/Regular_language">regular languages</a>. So, <b>you cannot match nested tags</b> with them.</p> <p>However, regexes are really useful for quick search and replace in your web pages. Full parsing is unnecessary, because you know the HTML code that you wrote. Approaches that are “impure” from a theoretical point of view work extremely well in this setting. 
You can even simplify the regexes shown below: say, if you never insert newlines between <code>a</code> and <code>href</code>, then you need not allow for them in your regular expression.</p> <h3>Match an HTML tag</h3> <pre><b>&lt;a</b><span style="color:#D2691E">\s</span><span style="color:blue">(</span><span style="color:olive">.*?</span><span style="color:blue">)</span><b>&gt;</b><span style="color:blue">(</span><span style="color:olive">.*?</span><span style="color:blue">)</span><b>&lt;/a&gt;</b></pre> <p>This regex matches an &lt;a&gt; tag with any attributes. Broken into parts:</p> <ul> <li><code>\s</code> matches a space or a newline after <code>a</code>;</li> <li><code>.*?</code> matches any text up to the next closing angle bracket <code>&gt;</code>;</li> <li>another <code>.*?</code> matches any text inside the tag.</li> </ul> <p>Parentheses are used to capture the attributes and the text inside the tag. You can then refer to them using <code>\1</code> and <code>\2</code> in the replacement. For example, you can <b>remove all links:</b></p> <div style="text-align:center"><img src="/blog_html_tags1.png" alt="Search for &lt;a\s(.*?)&gt;(.*?)&lt;/a&gt; and replace with \2" width="658" height="409"></div> <p>As mentioned above, the regex will not correctly match nested <code>&lt;a&gt;</code> tags; it just finds the next closing tag of the same type. 
But in this case it does not matter, because nested <code>&lt;a&gt;</code> tags make little sense :)</p> <h3>Match an opening HTML tag with some attribute</h3> <pre><b>&lt;a</b><span style="color:#D2691E">\s</span><span style="color:blue">(</span><span style="color:#D2691E">[^&gt;]</span><span style="color:olive">*</span><span style="color:#D2691E">\s</span><span style="color:blue">)</span><span style="color:olive">?</span><b>href=&quot;</b><span style="color:blue">(</span><span style="color:olive">.*?</span><span style="color:blue">)</span><b>&quot;</b><span style="color:blue">(</span><span style="color:olive">.*?</span><span style="color:blue">)</span><b>&gt;</b></pre> <p>This regex matches an opening &lt;a&gt; tag with an <code>href</code> attribute. The differences from the previous example are:</p> <ul> <li><code>[^&gt;]*</code> matches anything except the closing angle bracket <code>&gt;</code> (so it skips any attributes before <code>href</code>);</li> <li>the question mark <code>?</code> makes the other attributes optional, so <code>href</code> can immediately follow <code>a</code>.</li> </ul> <p>This regex is fairly simple and works in most cases. However, the HTML standard also allows spaces around <code>=</code> and single quotes instead of double quotes around attribute values. 
To match such tags, you need a more complicated regex:</p> <pre><b>&lt;a</b><span style="color:#D2691E">\s</span><span style="color:blue">(</span><span style="color:#D2691E">[^&gt;]</span><span style="color:olive">*</span><span style="color:#D2691E">\s</span><span style="color:blue">)</span><span style="color:olive">?</span><b>href</b><span style="color:#D2691E">\s</span><span style="color:olive">*</span><b>=</b><span style="color:#D2691E">\s</span><span style="color:olive">*</span><span style="color:blue">(</span><span style="color:#D2691E">[&quot;']</span><span style="color:blue">)</span><span style="color:blue">(</span><span style="color:olive">.*?</span><span style="color:blue">)</span><span style="color:blue">\2</span><span style="color:blue">(</span><span style="color:olive">.*?</span><span style="color:blue">)</span><b>&gt;</b></pre> <p>But simpler regexes usually suffice. Here is how you can <b>replace absolute links with relative ones:</b></p> <div style="text-align:center"><img src="/blog_html_tags2.png" alt="Search for &lt;a\s([^&gt;]*\s)?href=&quot;http://www.abareplace.com(.*?)&quot; and replace with &lt;a \1 href=&quot;\2&quot;" width="658" height="370"></div> <p>I hope that this short tutorial has convinced you of the power of regular expressions :)</p> <p>See also: <a href="/docs/regExprElements.php">Regular expression reference</a></p>
Thu, 03 Nov 2011 09:00:00 +0100
https://www.abareplace.com/blog/html_tags/
Video trailer for Aba
<p>Softoxi, an independent software site, published <a href="http://www.softoxi.com/aba-search--replace.html">an original review</a> of Aba Search and Replace. 
They even shot <b>a video showing the major features.</b></p> <div style="text-align:center"><a href="http://www.softoxi.com/aba-search--replace-video-trailer-screenshots.html"><img src="/blog_softoxi.jpg" width="502" height="378" alt="Aba Search and Replace video review" style="border:0"></a></div> <p>Another popular site, Softpedia, <a href="http://www.softpedia.com/progClean/Aba-Search-and-Replace-Clean-90327.html">granted Aba a “100% clean” award,</a> which means it does not contain any form of spyware or viruses.</p>
Thu, 27 Oct 2011 10:00:00 +0200
https://www.abareplace.com/blog/softoxi/
Aba 2.0 released
<div style="float:left; padding: 10px 40px 0 0"><img src="/aba2.png" width="375" height="318" alt="Aba 2.0 screenshot"></div> <p>After a month of beta testing, I released Aba 2.0. <b>The new features</b> in this version include:</p> <ul><li>Added syntax highlighting for the context viewer and for regular expressions.</li> <li>Implemented search history and favorites.</li> <li>Now you can undo a replacement even after you have started another search or closed the program (undo information is saved in a dedicated folder).</li> <li>An editor or a viewer can be launched for the selected file.</li> <li>When a search or replacement is finished, the program notifies you by playing a sound.</li> <li>Visual styles and the Windows 7 taskbar are now supported.</li> <li>When you edit the <i>Replace with</i> field, the search is not restarted (except when needed).</li> <li>Non-greedy matches are faster than they were in the previous version.</li> </ul> <p>Many <b>thanks to the beta testers:</b> Kyle Alons, Massimiliano Tiraboschi, and JJS. Without your help, I would never have found some tricky bugs :)</p> <p>Unfortunately, the German and Italian translations are still unfinished; I'm waiting for a response from our translators.</p> <div style="clear:both">&nbsp;</div>
Tue, 25 Oct 2011 10:00:00 +0200
https://www.abareplace.com/blog/aba20/