<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: How does search work? Part IV: Evolution of a Regular Expression</title>
	<atom:link href="http://www.phpbbdoctor.com/blog/2007/02/02/how-does-search-work-part-iv-dissecting-a-regular-expression/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.phpbbdoctor.com/blog/2007/02/02/how-does-search-work-part-iv-dissecting-a-regular-expression/</link>
	<description>Your premium source for custom modification services for phpBB</description>
	<lastBuildDate>Wed, 11 Jan 2012 20:39:04 -0600</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Dave Rathbun</title>
		<link>http://www.phpbbdoctor.com/blog/2007/02/02/how-does-search-work-part-iv-dissecting-a-regular-expression/comment-page-1/#comment-2581</link>
		<dc:creator>Dave Rathbun</dc:creator>
		<pubDate>Tue, 06 Nov 2007 04:49:14 +0000</pubDate>
		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/?p=74#comment-2581</guid>
		<description>Hi, Scott, thanks for your comments. In the phpBB search process, special characters like the &#039; in you&#039;re are removed prior to the regex, leaving the word &lt;strong&gt;youre&lt;/strong&gt; instead. So they have that part covered. :) I&#039;ve been using the code that I posted earlier on an English-language board without issues, but it does cause problems on non-English boards. I might still have some data that I was loaned for testing that I can use... when I get time I will try your suggestion. Thanks!

&lt;em&gt;PS - Hope you don&#039;t mind but I edited your comment to include the &quot;pre&quot; tag so that your extra space will show up as you wanted. I moved one of your inline comments to a separate line to keep the screen from scrolling horizontally.&lt;/em&gt;</description>
		<content:encoded><![CDATA[<p>Hi, Scott, thanks for your comments. In the phpBB search process, special characters like the &#8217; in you&#8217;re are removed prior to the regex, leaving the word <strong>youre</strong> instead. So they have that part covered. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  I&#8217;ve been using the code that I posted earlier on an English-language board without issues, but it does cause problems on non-English boards. I might still have some data that I was loaned for testing that I can use&#8230; when I get time I will try your suggestion. Thanks!</p>
<p><em>PS &#8211; Hope you don&#8217;t mind but I edited your comment to include the &#8220;pre&#8221; tag so that your extra space will show up as you wanted. I moved one of your inline comments to a separate line to keep the screen from scrolling horizontally.</em></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Scott</title>
		<link>http://www.phpbbdoctor.com/blog/2007/02/02/how-does-search-work-part-iv-dissecting-a-regular-expression/comment-page-1/#comment-2580</link>
		<dc:creator>Scott</dc:creator>
		<pubDate>Mon, 05 Nov 2007 17:51:53 +0000</pubDate>
		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/?p=74#comment-2580</guid>
		<description>Well, actually, the following method will work to remove one- and two-character words from a string without using the \b word-boundary anchors, which split words like &quot;don&#039;t&quot; and &quot;you&#039;re&quot;:

&lt;pre&gt;// replaces each space with two spaces, yes there are two spaces in the second argument..
$string = preg_replace( &#039;/[ ]/&#039;, &#039;  &#039;, $string );
$string = preg_replace( &#039;/\s\S{1,2}\s/&#039;, &#039; &#039;, $string );  // kill one- and two-character words that have whitespace on both sides
$string = preg_replace( &#039;/\s+/&#039;, &#039; &#039;, $string ); // replaces multiple spaces with a single space&lt;/pre&gt;

Not the prettiest solution ever, but it is working for me.</description>
		<content:encoded><![CDATA[<p>Well, actually, the following method will work to remove one- and two-character words from a string without using the \b word-boundary anchors, which split words like &#8220;don&#8217;t&#8221; and &#8220;you&#8217;re&#8221;:</p>
<pre>// replaces each space with two spaces, yes there are two spaces in the second argument..
$string = preg_replace( '/[ ]/', '  ', $string );
$string = preg_replace( '/\s\S{1,2}\s/', ' ', $string );  // kill one- and two-character words that have whitespace on both sides
$string = preg_replace( '/\s+/', ' ', $string ); // replaces multiple spaces with a single space</pre>
<p>Not the prettiest solution ever, but it is working for me.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Scott</title>
		<link>http://www.phpbbdoctor.com/blog/2007/02/02/how-does-search-work-part-iv-dissecting-a-regular-expression/comment-page-1/#comment-2579</link>
		<dc:creator>Scott</dc:creator>
		<pubDate>Mon, 05 Nov 2007 17:41:43 +0000</pubDate>
		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/?p=74#comment-2579</guid>
		<description>I just built a very similar regex for a similar purpose this morning (to remove one- and two-letter terms from a search).

It is worth mentioning that the \b boundary notation breaks words at any character other than a letter, digit, or underscore (a-z, A-Z, 0-9, _), so the regex shown above will turn the word &quot;you&#039;re&quot; into &quot;you&quot;: it sees &quot;you&#039;re&quot; as two words, &quot;you&quot; and &quot;re&quot;, and since &quot;re&quot; is only two characters, it is dropped.

I have not yet found a nice way to get around this.</description>
		<content:encoded><![CDATA[<p>I just built a very similar regex for a similar purpose this morning (to remove one- and two-letter terms from a search).</p>
<p>It is worth mentioning that the \b boundary notation breaks words at any character other than a letter, digit, or underscore (a-z, A-Z, 0-9, _), so the regex shown above will turn the word &#8220;you&#8217;re&#8221; into &#8220;you&#8221;: it sees &#8220;you&#8217;re&#8221; as two words, &#8220;you&#8221; and &#8220;re&#8221;, and since &#8220;re&#8221; is only two characters, it is dropped.</p>
<p>I have not yet found a nice way to get around this.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: dave.rathbun</title>
		<link>http://www.phpbbdoctor.com/blog/2007/02/02/how-does-search-work-part-iv-dissecting-a-regular-expression/comment-page-1/#comment-1570</link>
		<dc:creator>dave.rathbun</dc:creator>
		<pubDate>Mon, 12 Feb 2007 15:26:31 +0000</pubDate>
		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/?p=74#comment-1570</guid>
		<description>Update posted &lt;a href=&quot;http://www.phpbbdoctor.com/blog/?p=86&quot; rel=&quot;nofollow&quot;&gt;here&lt;/a&gt;. The news is not good; foreign-language posts are not being properly processed by this expression.</description>
		<content:encoded><![CDATA[<p>Update posted <a href="http://www.phpbbdoctor.com/blog/?p=86" rel="nofollow">here</a>. The news is not good; foreign-language posts are not being properly processed by this expression.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: dave.rathbun</title>
		<link>http://www.phpbbdoctor.com/blog/2007/02/02/how-does-search-work-part-iv-dissecting-a-regular-expression/comment-page-1/#comment-1519</link>
		<dc:creator>dave.rathbun</dc:creator>
		<pubDate>Sat, 10 Feb 2007 01:25:44 +0000</pubDate>
		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/?p=74#comment-1519</guid>
		<description>I have tested the regex presented in this post quite thoroughly on an English-language board with no issues. I have had less success with some foreign-language boards. I will post further results when I have something more concrete to share.</description>
		<content:encoded><![CDATA[<p>I have tested the regex presented in this post quite thoroughly on an English-language board with no issues. I have had less success with some foreign-language boards. I will post further results when I have something more concrete to share.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

