<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>PHP vs .Net &#187; Code</title>
	<atom:link href="http://www.phpvs.net/category/code/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.phpvs.net</link>
	<description>ASP.Net and PHP go head to head</description>
	<lastBuildDate>Sat, 24 Dec 2011 18:20:11 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>HTTP Signed Requests with PHP</title>
		<link>http://www.phpvs.net/2011/12/24/http-signed-requests-with-php/</link>
		<comments>http://www.phpvs.net/2011/12/24/http-signed-requests-with-php/#comments</comments>
		<pubDate>Sat, 24 Dec 2011 18:19:27 +0000</pubDate>
		<dc:creator>blake</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[REST]]></category>
		<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://www.phpvs.net/?p=268</guid>
		<description><![CDATA[I thought I'd write a quick primer on a basic implementation of HTTP request signing with PHP. I see a lot of posts dealing with the topic, especially by people writing homebrew REST services. What are signed HTTP requests? Signed HTTP requests are simply a normal HTTP request, such as a GET or a POST, [...]]]></description>
			<content:encoded><![CDATA[<p>I thought I'd write a quick primer on a basic implementation of HTTP request signing with PHP. I see a lot of posts dealing with the topic, especially by people writing homebrew REST services.</p>
<p><a href="http://www.phpvs.net/wp-content/uploads/2011/12/digitalsecurity.jpg"><img src="http://www.phpvs.net/wp-content/uploads/2011/12/digitalsecurity.jpg" alt="" title="Digital Fingerprint" width="353" height="281" class="aligncenter size-full wp-image-288" /></a></p>
<h2>What are signed HTTP requests?</h2>
<p>A signed HTTP request is simply a normal HTTP request, such as a GET or a POST, that happens to include a signature as part of the request.  A signature is just a string of characters generated in a meaningful way.  The receiving party (i.e. a REST service) can use this signature to validate the request and ensure that it hasn't been tampered with.</p>
<h4>What are signed requests good for?</h4>
<p>In a nutshell, validating an HTTP request where restricted access to resources is necessary. For example, a service that allows someone to update their personal information needs to ensure that people can only update <em>their own</em> personal information. Signed requests ensure that the service that receives the data can verify that the sender is who they claim to be (or at least possesses the proper secret key, which is taken as de facto proof that it is the right person).</p>
<p>Obviously, having someone log in and create a session is another method of verifying user identity. However, this is not always efficient or possible, especially with web services. As an example, every Facebook application developer gets a secret key when they create an application on Facebook. When Facebook sends information to someone's application, it will include a signed request that is only valid with that developer's key.  By checking the signature, a developer can verify that a request came from Facebook, and isn't someone trying to fool their application.  They also don't need to provide a complicated session procedure or have Facebook "log in" to their application; the signed request is enough.</p>
<h4> What <em>aren't</em> signed requests good for?</h4>
<p>Signed requests are not a form of encryption. If the data being sent is sensitive in nature, such that you wouldn't want it being sniffed in transit over a network connection, then SSL encrypted communication (HTTPS) is the way to go. For simple REST services however, often you just need to limit requests based on a user or other similar scope, and signed requests are sufficient for that.</p>
<h2>A problem that signed requests can solve</h2>
<p>Here is an example situation that has a security problem that can be solved by the use of signed requests. Let's say you operate a network of beer brewing web sites, and the brewers are allowed to send updates via a REST service you have provided. Each brewer has their own user id that they get when they sign up to your service. <strong>Bob's Brewery</strong> (user id <strong>1234</strong>) wants to update their email address. The simplest of implementations might allow for a request to be sent like this: <code>/user/update?email=newbob@example.com&amp;userid=1234</code>. However, if this was the only thing the REST service needed to update an email address, it would be trivial to cause serious mayhem. For example, <strong>Mr. Evil's Brewery</strong> could send in a simple request to change Bob's email address, just by finding Bob's ID somewhere, or even trying random ones:</p>
<p><code>/user/update?email=evil@example.org&amp;userid=1<br />
/user/update?email=evil@example.org&amp;userid=12<br />
/user/update?email=evil@example.org&amp;userid=1234<br />
etc.</code></p>
<p>Too easy! Obviously more security is needed, and this is where the concept of secret keys comes in.</p>
<h2>Secret Keys</h2>
<p>It is a common practice to assign each user a secret key that they use to interact with a web service, often called an API key, generally consisting of randomly generated characters. In this case, you could give Bob a secret key when he signs up with the brewery network, and it would be stored in your database along with the rest of his account information.</p>
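<p>As a rough illustration, a key of this kind could be generated as follows. This is only a sketch: the function name <code>generateSecretKey()</code> and the 32-byte length are assumptions made for the example, and <code>random_bytes()</code> requires PHP 7 or later.</p>

```php
/**
 * Hypothetical helper: returns a random secret key as a hex string.
 * @param int $numBytes Number of random bytes (32 is an assumed default)
 * @return string Hex-encoded key, twice as many characters as bytes
 */
function generateSecretKey($numBytes = 32)
{
    //random_bytes() draws from a cryptographically secure source (PHP 7+)
    return bin2hex(random_bytes($numBytes));
}

$secretKey = generateSecretKey();
echo strlen($secretKey); //64 hex characters for 32 bytes
```

<p>The key would then be stored with the account and handed to the user once, over a secure channel.</p>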
<p>Here's where I often see confusion setting in. Many people think it is logical that the secret key be <strong><em>sent</em></strong> as part of the API request, because after all, it uniquely identifies the user and it was given only to that particular person. <strong>However, this is a security problem!</strong></p>
<p>The thing about secret keys is that they have to remain secret, just like any password. As soon as a key is leaked, whoever finds it can impersonate that account, just as easily as Mr. Evil's Brewery did in the previous example. <strong>If you send a secret key as part of an HTTP request, it is no longer secret.</strong> You have just leaked it to whoever cared to be listening! A network sniffer could pick it up, or it could be recorded in proxy logs or web server logs that aren't secured. Perhaps your ISP is doing deep packet inspection, and a rogue employee makes off with the logs. Who knows!</p>
<p>The following is an example of this type of insecurity. Bob is sending his update request along with the secret key:</p>
<p><code>/user/update?email=newbob@example.com&amp;userid=1234&amp;secret_key=bobs-super-secret-key</code></p>
<p>Now, pretend that Bob goes to his local Tarborks coffee shop and jumps on the free open wireless connection. Mr. Evil happens to be there, and fires up his network sniffer program and starts watching all the HTTP traffic on the network. He sees Bob's request go out, since HTTP connections are not secure. Now he has Bob's secret key, and can make any kind of request he wants with it. Maybe he could even delete Bob's account with a request to <code>/user/delete?userid=1234&amp;secret_key=bobs-super-secret-key</code>, or send insults to Bob's customers using a mail endpoint!</p>
<p><strong>Sending the secret key as part of a network request is NOT safe.</strong> So how do we properly identify the sender if the user id can't be trusted and they can't send us their key? Well, you knew I'd get to it eventually... signed requests!</p>
<h2>Signed HTTP Requests</h2>
<p>To produce a signed HTTP request, the sender and the receiver must both know the rules on how to generate a signature. It can be any crazy method you care to dream up, but a common one is as follows:</p>
<ol>
<li>The sender organizes the data they want to send in a logical way, such as sorting it alphabetically.</li>
<li>The data is run through a keyed hashing algorithm (an HMAC, for example) using the secret key to produce a <strong>hash</strong>. Hashing algorithms produce fixed-length strings of characters that vary based on the input data, and the output is sufficiently unique such that varying the input data by even one character produces a completely different hash.</li>
<li>The hash is added to the original data and sent.</li>
<li>The receiver identifies the user and gets the secret key from their associated account information, and recreates the hash on the received data.</li>
<li>If the recreated hash matches the one included in the request, the request is valid.</li>
</ol>
<p>To return to our Tarborks coffee shop scenario, the request going out might look like this now:</p>
<p><code>/user/update?email=newbob@example.com&amp;userid=1234&amp;sig=x1zz645</code></p>
<p>If Mr. Evil sniffs this request, he might be pretty pleased with himself and try to change it to <code>/user/update?email=evil@example.org&amp;userid=1234&amp;sig=x1zz645</code>. This request would be rejected by the server however, since the <em>signature is now invalid</em>! When Mr. Evil changed the request, he needed to change the signature to match. But since he doesn't have Bob's secret key, he can't generate a correct signature, for this request or any other he cares to make up.  The only valid request he can possibly make is the exact same one he just sniffed, and that's not very malicious at all, since it's what Bob was trying to do anyway.</p>
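<p>Mr. Evil's failed attempt can be sketched in a few lines. The way the parameters are joined into one string here (sorted by key, then concatenated) is an assumption for the example, and <code>hash_equals()</code> requires PHP 5.6 or later.</p>

```php
$secretKey = "bobs-super-secret-key";

//signature Bob computed over his original request data
$bobsSig = hash_hmac("sha256", "emailnewbob@example.comuserid1234", $secretKey);

//the server recomputes the signature over the data it actually received;
//Mr. Evil changed the email but could only reuse the sniffed signature
$recomputedSig = hash_hmac("sha256", "emailevil@example.orguserid1234", $secretKey);

var_dump(hash_equals($recomputedSig, $bobsSig)); //bool(false), so the request is rejected
```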
<h4>Get to the code already</h4>
<p>Assuming Bob's user id is 1234, and the secret key you gave him is "bobs-super-secret-key", he might write the following PHP code.</p>
<h3>User code</h3>
<div class="syntax_hilite"><span class="langName">PHP:</span>
<div id="php-5">
<div>
<ol>
<li>
<div><span style="color:#0000FF;">$USER_ID</span> = <span style="color:#FF0000;">"1234"</span>;</div>
</li>
<li>
<div><span style="color:#0000FF;">$SECRET_KEY</span> = <span style="color:#FF0000;">"bobs-super-secret-key"</span>;</div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div><span style="color:#008000;">/**</span></div>
</li>
<li>
<div><span style="color:#008000;"> * @param array $data Array of key/value pairs of data</span></div>
</li>
<li>
<div><span style="color:#008000;"> * @param string $secretKey</span></div>
</li>
<li>
<div><span style="color:#008000;"> * @return string A generated signature for the $data based on $secretKey</span></div>
</li>
<li>
<div><span style="color:#008000;"> */</span></div>
</li>
<li>
<div><span style="color:#000000; font-weight:bold;">function</span> generateSignature<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#0000FF;">$data</span>,<span style="color:#0000FF;">$secretKey</span><span style="color:#006600; font-weight:bold;">&#41;</span></div>
</li>
<li>
<div><span style="color:#006600; font-weight:bold;">&#123;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; <span style="color:#FF9933; font-style:italic;">//sort data array alphabetically by key</span></div>
</li>
<li>
<div>&nbsp; &nbsp; <a href="http://www.php.net/ksort"><span style="color:#000066;">ksort</span></a><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#0000FF;">$data</span><span style="color:#006600; font-weight:bold;">&#41;</span>;</div>
</li>
<li>
<div>&nbsp; &nbsp; <span style="color:#FF9933; font-style:italic;">//combine keys and values into one long string</span></div>
</li>
<li>
<div>&nbsp; &nbsp; <span style="color:#0000FF;">$dataString</span> = <span style="color:#FF0000;">''</span>;</div>
</li>
<li>
<div>&nbsp; &nbsp; <span style="color:#616100;">foreach</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#0000FF;">$data</span> <span style="color:#616100;">as</span> <span style="color:#0000FF;">$key</span> =&gt; <span style="color:#0000FF;">$value</span><span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#006600; font-weight:bold;">&#123;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#0000FF;">$dataString</span> .= <span style="color:#0000FF;">$key</span>.<span style="color:#0000FF;">$value</span>;</div>
</li>
<li>
<div>&nbsp; &nbsp; <span style="color:#006600; font-weight:bold;">&#125;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; <span style="color:#FF9933; font-style:italic;">//lowercase everything</span></div>
</li>
<li>
<div>&nbsp; &nbsp; <span style="color:#0000FF;">$dataString</span> = <a href="http://www.php.net/strtolower"><span style="color:#000066;">strtolower</span></a><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#0000FF;">$dataString</span><span style="color:#006600; font-weight:bold;">&#41;</span>;</div>
</li>
<li>
<div>&nbsp; &nbsp; <span style="color:#FF9933; font-style:italic;">//generate signature using the SHA256 hashing algorithm</span></div>
</li>
<li>
<div>&nbsp; &nbsp; <span style="color:#616100;">return</span> hash_hmac<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#FF0000;">"sha256"</span>,<span style="color:#0000FF;">$dataString</span>,<span style="color:#0000FF;">$secretKey</span><span style="color:#006600; font-weight:bold;">&#41;</span>;</div>
</li>
<li>
<div><span style="color:#006600; font-weight:bold;">&#125;</span></div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div><span style="color:#0000FF;">$bobsData</span> = <a href="http://www.php.net/array"><span style="color:#000066;">array</span></a><span style="color:#006600; font-weight:bold;">&#40;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; <span style="color:#FF0000;">"userid"</span> =&gt; <span style="color:#0000FF;">$USER_ID</span>,</div>
</li>
<li>
<div>&nbsp; &nbsp; <span style="color:#FF0000;">"email"</span> =&gt; <span style="color:#FF0000;">"newbob@example.com"</span></div>
</li>
<li>
<div><span style="color:#006600; font-weight:bold;">&#41;</span>;</div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div><span style="color:#0000FF;">$sig</span> = generateSignature<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#0000FF;">$bobsData</span>,<span style="color:#0000FF;">$SECRET_KEY</span><span style="color:#006600; font-weight:bold;">&#41;</span>;</div>
</li>
<li>
<div><span style="color:#FF9933; font-style:italic;">//add signature to the outgoing data</span></div>
</li>
<li>
<div><span style="color:#0000FF;">$bobsData</span><span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#FF0000;">'sig'</span><span style="color:#006600; font-weight:bold;">&#93;</span> = <span style="color:#0000FF;">$sig</span>;</div>
</li>
<li>
<div><span style="color:#FF9933; font-style:italic;">//generate HTTP query string</span></div>
</li>
<li>
<div><span style="color:#0000FF;">$queryString</span> = http_build_query<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#0000FF;">$bobsData</span><span style="color:#006600; font-weight:bold;">&#41;</span>;</div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div><a href="http://www.php.net/echo"><span style="color:#000066;">echo</span></a> <span style="color:#0000FF;">$queryString</span>; </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>The code above outputs <code>userid=1234&amp;email=newbob%40example.com&amp;sig=efffd9cc30a220f2981b5124e1caa44d91b85aa2d2181f5331f48ca719983c1d</code>.  That's the HTTP query string that Bob would send to the <code>/user/update</code> endpoint to change his email address.  So how does the service verify the request?</p>
<h3>Server-side code</h3>
<div class="syntax_hilite"><span class="langName">PHP:</span>
<div id="php-6">
<div>
<ol>
<li>
<div><span style="color:#616100;">if</span> <span style="color:#006600; font-weight:bold;">&#40;</span><a href="http://www.php.net/empty"><span style="color:#000066;">empty</span></a><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#0000FF;">$_REQUEST</span><span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#FF0000;">'userid'</span><span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#41;</span><span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#006600; font-weight:bold;">&#123;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; throw <span style="color:#000000; font-weight:bold;">new</span> Exception<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#FF0000;">"No user id was sent with the request."</span><span style="color:#006600; font-weight:bold;">&#41;</span>;</div>
</li>
<li>
<div><span style="color:#006600; font-weight:bold;">&#125;</span></div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div><span style="color:#FF9933; font-style:italic;">//look up the account associated with the value in $_REQUEST['userid']</span></div>
</li>
<li>
<div><span style="color:#FF9933; font-style:italic;">//and get the secret key for that account - implement as necessary</span></div>
</li>
<li>
<div><span style="color:#0000FF;">$secretKey</span> = getSecretKeyFromUserId<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#0000FF;">$_REQUEST</span><span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#FF0000;">'userid'</span><span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#41;</span>;</div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div><span style="color:#0000FF;">$data</span> = <span style="color:#0000FF;">$_REQUEST</span>;</div>
</li>
<li>
<div><span style="color:#0000FF;">$receivedSignature</span> = <span style="color:#0000FF;">$data</span><span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#FF0000;">'sig'</span><span style="color:#006600; font-weight:bold;">&#93;</span>;</div>
</li>
<li>
<div><span style="color:#FF9933; font-style:italic;">//generate a signature using the data sent by the user, without the 'sig'</span></div>
</li>
<li>
<div><span style="color:#FF9933; font-style:italic;">//parameter of course. Note that the generateSignature() function is the</span></div>
</li>
<li>
<div><span style="color:#FF9933; font-style:italic;">//SAME ONE that the users would use!</span></div>
</li>
<li>
<div><a href="http://www.php.net/unset"><span style="color:#000066;">unset</span></a><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#0000FF;">$data</span><span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#FF0000;">'sig'</span><span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#41;</span>;</div>
</li>
<li>
<div><span style="color:#0000FF;">$generatedSignature</span> = generateSignature<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#0000FF;">$data</span>,<span style="color:#0000FF;">$secretKey</span><span style="color:#006600; font-weight:bold;">&#41;</span>;</div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div><span style="color:#616100;">if</span> <span style="color:#006600; font-weight:bold;">&#40;</span>!hash_equals<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#0000FF;">$generatedSignature</span>, <span style="color:#0000FF;">$receivedSignature</span><span style="color:#006600; font-weight:bold;">&#41;</span><span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#006600; font-weight:bold;">&#123;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; throw <span style="color:#000000; font-weight:bold;">new</span> Exception<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#FF0000;">"Received signature is invalid!"</span><span style="color:#006600; font-weight:bold;">&#41;</span>;</div>
</li>
<li>
<div><span style="color:#006600; font-weight:bold;">&#125;</span></div>
</li>
<li>
<div><span style="color:#616100;">else</span> <span style="color:#006600; font-weight:bold;">&#123;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; <span style="color:#FF9933; font-style:italic;">//continue on, knowing it is the right user making the request.</span></div>
</li>
<li>
<div><span style="color:#006600; font-weight:bold;">&#125;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
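<p>For testing, the same check can be wrapped in a function that takes the parameters as an argument instead of reading <code>$_REQUEST</code> directly. This is a sketch under assumptions: <code>isValidSignedRequest()</code> is a made-up name, <code>generateSignature()</code> is repeated so the snippet runs on its own, and <code>hash_equals()</code> needs PHP 5.6 or later.</p>

```php
//same signing rules as the user code above
function generateSignature($data, $secretKey)
{
    ksort($data);
    $dataString = '';
    foreach ($data as $key => $value) {
        $dataString .= $key.$value;
    }
    return hash_hmac("sha256", strtolower($dataString), $secretKey);
}

/**
 * Hypothetical helper wrapping the signature check.
 * @param array $params Request parameters, including 'sig'
 * @param string $secretKey Secret key looked up for the user
 * @return bool True only if a signature is present and matches
 */
function isValidSignedRequest($params, $secretKey)
{
    if (empty($params['sig']) || !is_string($params['sig'])) {
        return false;
    }
    $receivedSignature = $params['sig'];
    unset($params['sig']);
    //hash_equals() compares in constant time and avoids the
    //loose-comparison surprises of == and !=
    return hash_equals(generateSignature($params, $secretKey), $receivedSignature);
}

$key = "bobs-super-secret-key";
$request = array("userid" => "1234", "email" => "newbob@example.com");
$request['sig'] = generateSignature($request, $key);

var_dump(isValidSignedRequest($request, $key)); //bool(true)

//Mr. Evil changes the email but keeps the sniffed signature
$request['email'] = "evil@example.org";
var_dump(isValidSignedRequest($request, $key)); //bool(false)
```

<p>Returning a boolean rather than throwing leaves it up to the calling code to decide how to respond to a bad signature.</p>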
<p>There you have it. Signed requests!</p>
<h4>Advanced Usage</h4>
<p>The signature for a signed request can be sent in different ways. Some services, such as <a href="http://docs.amazonwebservices.com/AmazonS3/latest/dev/RESTAuthentication.html">Amazon's S3 REST API</a>, put the signature in the HTTP headers. This is arguably a bit cleaner than including it in the parameters of an HTTP request, since the signature doesn't get mixed in with the data, and has implications for caching as well (browser caches and proxies). If you want to do it that way, you might have the user set a header as part of their HTTP request:</p>
<div class="syntax_hilite"><span class="langName">PHP:</span>
<div id="php-7">
<div>
<ol>
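<li>
<div><span style="color:#FF9933; font-style:italic;">//header() only sets response headers; $ch is assumed to be the cURL handle for the outgoing request</span></div>
</li>
<li>
<div>curl_setopt<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#0000FF;">$ch</span>, CURLOPT_HTTPHEADER, <a href="http://www.php.net/array"><span style="color:#000066;">array</span></a><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#FF0000;">"X-Brewery-Sig: "</span>.<span style="color:#0000FF;">$sig</span><span style="color:#006600; font-weight:bold;">&#41;</span><span style="color:#006600; font-weight:bold;">&#41;</span>; </div>
</li>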
</ol>
</div>
</div>
</div>
<p></p>
<p>And the receiving server, instead of looking for the sig parameter in <code>$_REQUEST['sig']</code> (and having to remove it before running the data through the generator function), would find it in:</p>
<div class="syntax_hilite"><span class="langName">PHP:</span>
<div id="php-8">
<div>
<ol>
<li>
<div><span style="color:#0000FF;">$_SERVER</span><span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#FF0000;">'HTTP_X_BREWERY_SIG'</span><span style="color:#006600; font-weight:bold;">&#93;</span>; </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
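<p>Putting the header variant together on the server side might look something like the sketch below. The helper name <code>isValidHeaderSignature()</code> is made up for the example, <code>$server</code> stands in for <code>$_SERVER</code> so the snippet can run without a web server, and <code>hash_equals()</code> needs PHP 5.6 or later.</p>

```php
//same signing rules as the user code earlier in the post
function generateSignature($data, $secretKey)
{
    ksort($data);
    $dataString = '';
    foreach ($data as $key => $value) {
        $dataString .= $key.$value;
    }
    return hash_hmac("sha256", strtolower($dataString), $secretKey);
}

/**
 * Hypothetical helper: checks a signature sent in the X-Brewery-Sig header.
 * @param array $params Request parameters (no 'sig' entry in this variant)
 * @param array $server Stand-in for $_SERVER
 * @param string $secretKey Secret key looked up for the user
 * @return bool True only if the header is present and matches
 */
function isValidHeaderSignature($params, $server, $secretKey)
{
    if (empty($server['HTTP_X_BREWERY_SIG'])) {
        return false;
    }
    //no unset() needed: the signature was never part of $params
    return hash_equals(
        generateSignature($params, $secretKey),
        $server['HTTP_X_BREWERY_SIG']
    );
}

$key = "bobs-super-secret-key";
$params = array("userid" => "1234", "email" => "newbob@example.com");
$server = array("HTTP_X_BREWERY_SIG" => generateSignature($params, $key));

var_dump(isValidHeaderSignature($params, $server, $key)); //bool(true)
var_dump(isValidHeaderSignature($params, array(), $key)); //bool(false)
```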
<p>Hope you found this useful!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.phpvs.net/2011/12/24/http-signed-requests-with-php/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>An Exercise in WordPress Integration, or Why WordPress Sucks</title>
		<link>http://www.phpvs.net/2009/12/08/an-exercise-in-wordpress-integration-or-why-wordpress-sucks/</link>
		<comments>http://www.phpvs.net/2009/12/08/an-exercise-in-wordpress-integration-or-why-wordpress-sucks/#comments</comments>
		<pubDate>Tue, 08 Dec 2009 08:23:50 +0000</pubDate>
		<dc:creator>blake</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Software Engineering]]></category>

		<guid isPermaLink="false">http://www.phpvs.net/?p=154</guid>
		<description><![CDATA[I'd like to prefix my upcoming rant with the fact that WordPress is good at what it does: making basic blogs and publishing content. I use it, many other people use it, it works. Heck, I'm using it right now. But from a technical standpoint, WordPress sucks. I'm going to relate my experience here trying [...]]]></description>
			<content:encoded><![CDATA[<p>I'd like to prefix my upcoming rant with the fact that WordPress is good at what it does:  making basic blogs and publishing content.  I use it, many other people use it, it works.  Heck, I'm using it right now.  But from a technical standpoint, WordPress sucks.  I'm going to relate my experience here trying to write a quick function to store post output to a file, to be used by a separate application on the same server.</p>
<p>I started off to write a function (let's call it a caching function for simplicity) that stores some HTML from the most recently published post.  Sounds easy enough.  I should be able to just put a function into the functions.php file of the custom template set I'm using.  That seems to be where the "userland" custom functions go.</p>
<p>So I check the <a href="http://codex.wordpress.org/Function_Reference/" target="_blank">function reference</a> first.  Hey, <a href="http://codex.wordpress.org/Function_Reference/wp_get_recent_posts" target="_blank">wp_get_recent_posts()</a>.  Looks promising, so I give it a shot.  It goes ahead and gets the most recent post just fine.  Things are ok so far.</p>
<h2>A problem appears</h2>
<h2><img class="alignleft size-medium wp-image-160" style="margin-left: 10px; margin-right: 10px;" title="storm-at-sea" src="http://www.phpvs.net/wp-content/uploads/2009/12/storm-at-sea-300x224.jpg" alt="storm-at-sea" width="245" height="183" /></h2>
<p>Now, I want to output the post exactly as it would appear in the blog, and save that output to a file on disk.  Surely there's a basic function that will output a post's content?  You know... take the post_content field from the database record and format it properly?  Suddenly, the skies darken.  Evil laughter booms out.  Ha ha ha!  WordPress mocks the folly of simplistic functional thinking!</p>
<p>The template files use functions like <code>the_content()</code> and <code>the_title()</code>.  Just in case you can't tell from the excellent naming scheme, these actually produce echoed output.  Checking out <code>the_content()</code>, we see it dutifully calls <code>get_the_content()</code>, then runs a couple of lines of formatting stuff on the results.   So how about using <code>get_the_content()</code> for my caching function?  I could run the other few formatting bits manually after that.  Should be ok, right?  After all, the doc comment for <code>get_the_content()</code> says the following:</p>
<p><code>/**<br />
* Retrieve the post content.<br />
*</code></p>
<p>So, I can go ahead and assume it simply retrieves the basic post content then?  Ha ha.  <strong>NO</strong>.  <em>WHY WOULD IT DO THAT</em>?  Instead, it takes a bunch of globals that get set who-the-hell-knows-where, runs through a bunch of crap seemingly unrelated to the content of a post, and does a whole lot of textual modifications to <strong>some kind of content</strong>.  Reading through the function is like jabbing red-hot fire pokeys into your eyes.  Here's a portion of it:</p>
<div class="syntax_hilite"><span class="langName">PHP:</span>
<div id="php-11">
<div>
<ol>
<li>
<div><span style="color:#0000FF;">$content</span> = <span style="color:#0000FF;">$pages</span><span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#0000FF;">$page</span>-<span style="color:#CC66CC;color:#800000;">1</span><span style="color:#006600; font-weight:bold;">&#93;</span>;</div>
</li>
<li>
<div><span style="color:#616100;">if</span> <span style="color:#006600; font-weight:bold;">&#40;</span> <a href="http://www.php.net/preg_match"><span style="color:#000066;">preg_match</span></a><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#FF0000;">'/&lt;<span style="color:#000099; font-weight:bold;">\!</span>--more(.*?)?--&gt;/'</span>, <span style="color:#0000FF;">$content</span>, <span style="color:#0000FF;">$matches</span><span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#006600; font-weight:bold;">&#123;</span></div>
</li>
<li>
<div><span style="color:#0000FF;">$content</span> = <a href="http://www.php.net/explode"><span style="color:#000066;">explode</span></a><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#0000FF;">$matches</span><span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#CC66CC;color:#800000;">0</span><span style="color:#006600; font-weight:bold;">&#93;</span>, <span style="color:#0000FF;">$content</span>, <span style="color:#CC66CC;color:#800000;">2</span><span style="color:#006600; font-weight:bold;">&#41;</span>;</div>
</li>
<li>
<div><span style="color:#616100;">if</span> <span style="color:#006600; font-weight:bold;">&#40;</span> !<a href="http://www.php.net/empty"><span style="color:#000066;">empty</span></a><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#0000FF;">$matches</span><span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#CC66CC;color:#800000;">1</span><span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#41;</span> &amp;&amp; !<a href="http://www.php.net/empty"><span style="color:#000066;">empty</span></a><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#0000FF;">$more_link_text</span><span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#006600; font-weight:bold;">&#41;</span></div>
</li>
<li>
<div><span style="color:#0000FF;">$more_link_text</span> = <a href="http://www.php.net/strip_tags"><span style="color:#000066;">strip_tags</span></a><span style="color:#006600; font-weight:bold;">&#40;</span>wp_kses_no_null<span style="color:#006600; font-weight:bold;">&#40;</span><a href="http://www.php.net/trim"><span style="color:#000066;">trim</span></a><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#0000FF;">$matches</span><span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#CC66CC;color:#800000;">1</span><span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#41;</span><span style="color:#006600; font-weight:bold;">&#41;</span><span style="color:#006600; font-weight:bold;">&#41;</span>;</div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div><span style="color:#0000FF;">$hasTeaser</span> = <span style="color:#000000; font-weight:bold;">true</span>;</div>
</li>
<li>
<div><span style="color:#006600; font-weight:bold;">&#125;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p><code>$pages</code> is some kind of global that doesn't seem to have any relation to a post.  Then apparently we're looking for HTML comments of <code>&lt;!--more something --&gt;</code>, and replacing them with... well, something.  I'd hate to think what would happen if I ever wrote a post with an HTML comment in it that happened to hit on whatever random content markers WordPress has decided to use.  (<em>Oh wait!  That</em> <em><strong>just happened to me</strong></em> <em>while I was trying to publish the above code fragment!</em>)  I didn't even bother to look into things like <code>wp_kses_no_null</code>.  It probably involves dark rituals with live chicken sacrifice.  Why is there so much going on in a function called <strong>get</strong>_the_content()?</p>
<p>In the end, it seems that <code>get_the_content()</code> will eventually get the content of a post, but only if you set a half-a-dozen or so globals before you call it.  And <em>what the hell post is it even getting</em>?</p>
<h2>"The Loop"</h2>
<h2><img class="alignright size-medium wp-image-161" style="margin-left: 10px; margin-right: 10px;" title="the-broken-chain1" src="http://www.phpvs.net/wp-content/uploads/2009/12/the-broken-chain1-300x224.jpg" alt="the-broken-chain1" width="300" height="224" /></h2>
<p>Digging further, it's clear that the template functions for output are all like that.  They don't take any kind of parameters; <em>they just operate on globals</em>!  <strong>There's no way to take the post data that I just retrieved with <code>wp_get_recent_posts()</code>, and format it using these functions.</strong> You have to be in "The Loop" in order to do that.  And "The Loop" sucks.  It's not a catchy, easy-to-use method of handling posts, despite WordPress's efforts to pass it off as something neat or fun.  It's a mish-mash of global functions with random naming and variable schemes (incidentally, just like the rest of WordPress).  You can only use "The Loop" if you're accessing WordPress in a "normal", web-requested-and-template-loaded kind of way.  It doesn't work if you're outside a template file (such as in functions.php before a template gets loaded).</p>
<p>So back to square one.  Unfortunately, it appears that if I want to have the regular blog-formatted output, I need to harness "The Loop" somehow, and clearly you can't do that on your own (i.e. outside of a template file) without knowing about every global variable in the system.</p>
<p>After some quick googling, I came across the <a href="http://codex.wordpress.org/Template_Tags/query_posts" target="_blank">query_posts()</a> function, which you can use to set up "The Loop".  Reading the documentation on it, you can find this little gem:</p>
<blockquote><p>"The query_posts function overrides and replaces the main query for the page. To save your sanity, do not use it for any other purpose."</p></blockquote>
<p>To paraphrase: "We've created a public API function that is pretty much useless except in a very specific page-dependent situation.  Please enjoy how useless it is.  But don't use it."</p>
<p>The fact that there is a "main query" for a page is another indicator of just how global-happy WordPress is, and that in turn gives you an insight into why it has so many security holes.  How do you keep track of so many globals across so many functions?</p>
<h2>A solution... sort of.</h2>
<p>Fortunately, the <code>query_posts()</code> doc page links to the <a href="http://codex.wordpress.org/Function_Reference/WP_Query" target="_blank">WP_Query docs</a>, which are marginally more helpful and provide the path to a solution.  Using <code>WP_Query</code> sets up the wacky global stuff necessary to use "The Loop", which means we can hack our way through to getting some formatted post content.  It's technically feasible, but you have to pass the <code>query()</code> method a string that emulates a bunch of <code>$_REQUEST</code> parameters.  I ended up with this:</p>
<p><code>
<div class="igBar"><span id="lphp-12"><a href="#" onclick="javascript:showPlainTxt('php-12'); return false;">&gt;&gt; show as plain text</a></span></div>
<div class="syntax_hilite"><span class="langName">PHP:</span>
<div id="php-12">
<div>
<ol>
<li>
<div><span style="color:#000000; font-weight:bold;">function</span> cacheMostRecentPost<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">&#41;</span></div>
</li>
<li>
<div><span style="color:#006600; font-weight:bold;">&#123;</span></div>
</li>
<li>
<div><span style="color:#0000FF;">$featuredPosts</span> = <span style="color:#000000; font-weight:bold;">new</span> WP_Query<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">&#41;</span>;</div>
</li>
<li>
<div><span style="color:#0000FF;">$featuredPosts</span>-&gt;query<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#FF0000;">'showposts=1'</span><span style="color:#006600; font-weight:bold;">&#41;</span>;</div>
</li>
<li>
<div><span style="color:#616100;">while</span> <span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#0000FF;">$featuredPosts</span>-&gt;have_posts<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">&#41;</span><span style="color:#006600; font-weight:bold;">&#41;</span></div>
</li>
<li>
<div><span style="color:#006600; font-weight:bold;">&#123;</span></div>
</li>
<li>
<div><span style="color:#0000FF;">$featuredPosts</span>-&gt;the_post<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">&#41;</span>;</div>
</li>
<li>
<div><a href="http://www.php.net/ob_start"><span style="color:#000066;">ob_start</span></a><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">&#41;</span>;</div>
</li>
<li>
<div><span style="color:#FF9933; font-style:italic;">//do output with stuff like the_title() and the_content()</span></div>
</li>
<li>
<div><span style="color:#0000FF;">$str</span> = <a href="http://www.php.net/ob_get_contents"><span style="color:#000066;">ob_get_contents</span></a><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">&#41;</span>;</div>
</li>
<li>
<div><a href="http://www.php.net/ob_end_clean"><span style="color:#000066;">ob_end_clean</span></a><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">&#41;</span>;</div>
</li>
<li>
<div><span style="color:#FF9933; font-style:italic;">//write $str to cache fragment</span></div>
</li>
<li>
<div><span style="color:#006600; font-weight:bold;">&#125;</span></div>
</li>
<li>
<div><span style="color:#006600; font-weight:bold;">&#125;</span></div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div><span style="color:#FF9933; font-style:italic;">//set up hooks for this file when a post is changed or deleted</span></div>
</li>
<li>
<div>add_action<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#FF0000;">'save_post'</span>, <span style="color:#FF0000;">'cacheMostRecentPost'</span><span style="color:#006600; font-weight:bold;">&#41;</span>;</div>
</li>
<li>
<div>add_action<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#FF0000;">'deleted_post'</span>, <span style="color:#FF0000;">'cacheMostRecentPost'</span><span style="color:#006600; font-weight:bold;">&#41;</span>; </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>So despite relying on a specific setup of incoming HTTP parameters (as a string) for the most part, at least you can pass parameters to the query if you know the right ones.  In this case, "showposts=1" seems to be the total number of posts fetched, and they appear to come back ordered by posting date, most recent first.  This works for what I want it to do, but guess what?  It doesn't work if you try to run it anywhere that's not one of those action hooks, because "The Loop" overwrites all the globals necessary for doing output later!  So I can't use that function, say, at the top of the index.php template file if I wanted to.  If I do, thanks to the overwritten globals, WordPress decides that I actually want the "Archive" page instead of the index page(!), and switches templates accordingly. So while I achieved my goal of being able to cache a post to a file with this function, it's certainly not portable, and it's certainly not elegant.</p>
<h2>Wharrgarbl</h2>
<h2><img class="alignright size-medium wp-image-158" title="wharrgarbl" src="http://www.phpvs.net/wp-content/uploads/2009/12/wharrgarbl-300x240.jpg" alt="wharrgarbl" width="300" height="240" /></h2>
<p>The entire code flow is mind-boggling.  Basing the output functions around a bunch of globals reminds me of code someone would have written in PHP 3 a decade ago, or something a very inexperienced programmer would write.  Definitely not something you would expect in an application used by what is probably now millions of people.  What's wrong with having some data fetching functions, and some output functions?  You could, and I know I'm talking crazy here, but you could fetch some data, and then <em>pass it</em> to the output functions.  Then (bear with me here), you could probably fetch posts (or whatever) <em>at any time</em>, and get some formatted output <em>at any time</em>, without overwriting some important global that might be used later in the code flow.  Revolutionary, I know.  Sorry if I went too fast on that.  I'll repeat it louder and/or slower for any WordPress core developers that happen to be reading.</p>
<p>So, WordPress?  How about something like:</p>
<p><code>$postObjects = getRecentPostsByDate(1);<br />
$output = formatPostContent($postObjects[0]);</code></p>
<p>The mere concept of having individual posts exist inside their own little encapsulated world would make the APIs a hundred times more useful (and easier to understand).  You could even keep those crap <code>the_title()</code> and <code>the_content()</code> and <code>the_something_lol_naming_scheme_lol()</code> functions if you wanted.  Just make them take parameters.  Better yet, put them inside a formatting object, or even the post object itself.  <code>$post-&gt;the_content()</code> would still work, but it would have context!</p>
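<p>To make that a bit more concrete, here is a minimal sketch of what an encapsulated post could look like.  Everything in it is hypothetical: the <code>Post</code> class and <code>formatPostContent()</code> are names made up for illustration (the latter borrowed from the snippet above), not anything WordPress provides, and unlike the real template tags these methods return strings instead of echoing them.</p>

```php
<?php
// Hypothetical sketch only: none of these names exist in WordPress.
class Post
{
    private $title;
    private $content;

    public function __construct($title, $content)
    {
        $this->title = $title;
        $this->content = $content;
    }

    // Returns the escaped title instead of echoing it from a global.
    public function the_title()
    {
        return htmlspecialchars($this->title, ENT_QUOTES, 'UTF-8');
    }

    // Wraps each blank-line-separated block of the body in <p> tags.
    public function the_content()
    {
        $html = '';
        foreach (preg_split('/\n\s*\n/', trim($this->content)) as $block) {
            $html .= '<p>' . htmlspecialchars($block, ENT_QUOTES, 'UTF-8') . '</p>';
        }
        return $html;
    }
}

// The output function takes the post as a parameter, so it can be
// called at any time without touching shared state.
function formatPostContent(Post $post)
{
    return '<h2>' . $post->the_title() . '</h2>' . $post->the_content();
}

$postObjects = array(
    new Post('Hello & welcome', "First paragraph.\n\nSecond paragraph."),
);
$output = formatPostContent($postObjects[0]);
echo $output, PHP_EOL;
```

<p>Because nothing here reads or writes a global, you could format a second post in the middle of formatting the first one and neither would notice.</p>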
<p>The reason this gets me worked up is not that it's so frustrating to use (although that helps).  I've had to deal with a lot of frustrating code in my career.  It's more the fact that it's this kind of thing that gives PHP programmers a bad name.  The code is just bad.  The design is random.  The API functions are random.  The naming schemes are random.  Functions don't do what their name (or their doc comment) indicates they should do.  Integrating WordPress into another application or site is next to impossible (try it, I dare you), and the other way around, integrating another application or site into WordPress is much more difficult than it should be.  Global usage is rampant and ridiculous to follow.</p>
<p>You don't have to look any farther than a single WordPress code file to understand why there have been so many security holes over the last couple of years.  And there's a lot of PHP code out there that's the quality of WordPress, or worse.</p>
<p>To re-iterate my opening, if you don't need to get anything special out of it, WordPress does the job.  They've filled their market niche well, and it's encouraging that development is ongoing and releases occur often.  I've worked with it on occasion over the last few years, and the improvements are obvious, interface-wise especially, and to some extent code-wise as well (the WP_Query object is a step forward).   But working with the code is not fun.  Even modifying the template files is an exercise in counter-intuitiveness.</p>
<p>I'm sure there are reasons the code is what it is at this point, and I'm equally sure I don't have the full picture to go with my condemnations.  I guess I should just be thankful that I don't have to maintain it.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.phpvs.net/2009/12/08/an-exercise-in-wordpress-integration-or-why-wordpress-sucks/feed/</wfw:commentRss>
		<slash:comments>37</slash:comments>
		</item>
		<item>
		<title>ASP.Net MVC &#8211; How to route to images or other file types</title>
		<link>http://www.phpvs.net/2009/08/06/aspnet-mvc-how-to-route-to-images-or-other-file-types/</link>
		<comments>http://www.phpvs.net/2009/08/06/aspnet-mvc-how-to-route-to-images-or-other-file-types/#comments</comments>
		<pubDate>Fri, 07 Aug 2009 04:23:51 +0000</pubDate>
		<dc:creator>morgan</dc:creator>
				<category><![CDATA[.Net]]></category>
		<category><![CDATA[ASP.Net]]></category>
		<category><![CDATA[ASP.Net MVC]]></category>
		<category><![CDATA[Code]]></category>
		<category><![CDATA[MVC]]></category>

		<guid isPermaLink="false">http://www.phpvs.net/?p=123</guid>
		<description><![CDATA[A recent question on Stack Overflow (and subsequent answer that I wrote for it) inspired this post. I had recently been discussing URL rewriting in depth with my brother, and have also been doing some introductory work with the routing engine in ASP.Net MVC, and the question piqued my interest since I had been meaning [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.phpvs.net/wp-content/uploads/2009/08/Image.png"><img class="alignright size-full wp-image-228" style="border: 0pt none;" title="Image" src="http://www.phpvs.net/wp-content/uploads/2009/08/Image.png" alt="" width="256" height="256" /></a>A <a href="http://stackoverflow.com/questions/1146652/how-do-i-route-images-using-asp-net-mvc-routing">recent question on Stack Overflow</a> (and subsequent answer that I wrote for it) inspired this post.  I had recently been discussing URL rewriting in depth with my brother, and have also been doing some introductory work with the routing engine in ASP.Net MVC, and the question piqued my interest since I had been meaning to look at this more closely for some time.</p>
<p>The question on Stack Overflow is titled "How do I route images with ASP.Net MVC", but fundamentally the question is really asking "<strong>how can I use ASP.Net MVC to re-route URLs to actual physical files, rather than methods of a controller?</strong>"</p>
<p>To be clear, let's address the conceptual differences between routing and URL rewriting.  URL rewriting takes the requested URL and modifies it before your code ever sees it.  As far as your application is concerned, the client requested the rewritten URL.  All that URL rewriting does is change one URL into another URL, based on pattern matching.</p>
<p>Routing is a different and much more powerful beast.  The ASP.Net routing engine maps a URL to a "resource", based on a set of routes.  The first route to match the requested URL wins the prize, and sends the request off to the resource it chooses.  For the ASP.Net MVC framework (which uses <code>System.Web.Routing</code> under the hood), a resource is something that can handle the request object, which is always a piece of code.</p>
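<p>Since this is a PHP-and-.Net blog, here is the distinction as a rough, language-neutral sketch written in PHP.  It is not how <code>System.Web.Routing</code> is implemented; <code>rewriteUrl()</code>, <code>routeRequest()</code> and the handler closures are hypothetical names, and the placeholder matching assumes route patterns contain no regex metacharacters other than the <code>{name}</code> placeholders.</p>

```php
<?php
// URL rewriting: one URL string in, another URL string out. Nothing else.
function rewriteUrl($url)
{
    return preg_replace('#^graphics/(.+)$#', 'images/$1', $url);
}

// Routing (hypothetical sketch): the first pattern that matches wins,
// and the request is handed to a piece of code, not to another URL.
function routeRequest(array $routes, $url)
{
    foreach ($routes as $pattern => $handler) {
        // Turn "graphics/{filename}" into "#^graphics/(?P<filename>[^/]+)$#".
        $regex = '#^' . preg_replace('/\{(\w+)\}/', '(?P<$1>[^/]+)', $pattern) . '$#';
        if (preg_match($regex, $url, $params)) {
            return call_user_func($handler, $params);
        }
    }
    return null; // no route matched; a real app would answer with a 404
}

$routes = array(
    // More specific route first, because order decides the winner.
    'graphics/{filename}' => function ($params) {
        return 'image handler: ' . $params['filename'];
    },
    '{controller}/{action}' => function ($params) {
        return 'mvc handler: ' . $params['controller'] . '/' . $params['action'];
    },
);

echo rewriteUrl('graphics/logo.png'), PHP_EOL;
echo routeRequest($routes, 'graphics/logo.png'), PHP_EOL;
echo routeRequest($routes, 'home/index'), PHP_EOL;
```

<p>Note that "graphics/logo.png" also fits the second pattern; it only reaches the image handler because that route is listed first.</p>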
<p>So where does that leave physical files?  If a request is always parsed by the routing engine and then handed off to some function somewhere, how can we ever route a request for an image to actually return the physical image?</p>
<p>Well, it takes a tiny bit of legwork, but once we're through it, I'm confident you will see the huge advantages that routing has over simple URL rewriting.  We will show the equivalent of URL rewriting by handling a request for an image using a URL that doesn't map to a physical path, while still returning the image.</p>
<h2>Handling the Request</h2>
<p>First off, we need to handle the request that we want to re-route to a physical file.  Out of the box, ASP.Net MVC uses an instance of the <code>MvcRouteHandler</code> object to handle every request.  <code>MvcRouteHandler</code> hides all the complexities of taking the requested URL, breaking it down into parts, finding the right controller in your application, instantiating it and passing it all the data it needs.</p>
<p>The end result of <code>MvcRouteHandler</code> is not what we desire. We want to return an image, not instantiate a controller and run a method.   We want to skip dealing with controllers altogether in this case.  So let's create our own route handler that we'll use instead.</p>
<p>To do so, we simply implement <code>IRouteHandler</code>, an interface from <code>System.Web.Routing</code>.  It doesn't itself inherit from <code>IHttpHandler</code>; its job is to supply the <code>IHttpHandler</code> that will process the request.  This means that what we're writing is the ASP.Net MVC equivalent of an .ashx file for a webforms app - we're inserting our own handler into the ASP.Net pipeline, one that handles the request much closer to the webserver/HTTP level than the ASP.Net application level.</p>
<p><code>IRouteHandler</code> only has one method that we need to implement, which is <code>GetHttpHandler()</code>.</p>
<pre class="prettyprint"><code><span class="kwd">using</span><span class="pln"> </span><span class="typ">System</span><span class="pun">;</span><span class="pln">
</span><span class="kwd">using</span><span class="pln"> </span><span class="typ">System</span><span class="pun">.</span><span class="typ">Collections</span><span class="pun">.</span><span class="typ">Generic</span><span class="pun">;</span><span class="pln">
</span><span class="kwd">using</span><span class="pln"> </span><span class="typ">System</span><span class="pun">.</span><span class="pln">IO</span><span class="pun">;</span><span class="pln">
</span><span class="kwd">using</span><span class="pln"> </span><span class="typ">System</span><span class="pun">.</span><span class="typ">Linq</span><span class="pun">;</span><span class="pln">
</span><span class="kwd">using</span><span class="pln"> </span><span class="typ">System</span><span class="pun">.</span><span class="typ">Web</span><span class="pun">;</span><span class="pln">
</span><span class="kwd">using</span><span class="pln"> </span><span class="typ">System</span><span class="pun">.</span><span class="typ">Web</span><span class="pun">.</span><span class="typ">Compilation</span><span class="pun">;</span><span class="pln">
</span><span class="kwd">using</span><span class="pln"> </span><span class="typ">System</span><span class="pun">.</span><span class="typ">Web</span><span class="pun">.</span><span class="typ">Routing</span><span class="pun">;</span><span class="pln">
</span><span class="kwd">using</span><span class="pln"> </span><span class="typ">System</span><span class="pun">.</span><span class="typ">Web</span><span class="pun">.</span><span class="pln">UI</span><span class="pun">;</span><span class="pln">

</span><span class="kwd">namespace</span><span class="pln"> MvcApplication1
</span><span class="pun">{</span><span class="pln">
    </span><span class="kwd">public</span><span class="pln"> </span><span class="kwd">class</span><span class="pln"> </span><span class="typ">ImageRouteHandler</span><span class="pln"> </span><span class="pun">:</span><span class="pln"> </span><span class="typ">IRouteHandler</span><span class="pln">
    </span><span class="pun">{</span><span class="pln">
        </span><span class="kwd">public</span><span class="pln"> </span><span class="typ">IHttpHandler</span><span class="pln"> </span><span class="typ">GetHttpHandler</span><span class="pun">(</span><span class="typ">RequestContext</span><span class="pln"> requestContext</span><span class="pun">)</span><span class="pln">
        </span><span class="pun">{</span><span class="pln">
            </span><span class="kwd">string</span><span class="pln"> filename </span><span class="pun">=</span><span class="pln"> requestContext</span><span class="pun">.</span><span class="typ">RouteData</span><span class="pun">.</span><span class="typ">Values</span><span class="pun">[</span><span class="str">"filename"</span><span class="pun">]</span><span class="pln"> </span><span class="kwd">as</span><span class="pln"> </span><span class="kwd">string</span><span class="pun">;</span><span class="pln">

            </span><span class="kwd">if</span><span class="pln"> </span><span class="pun">(</span><span class="kwd">string</span><span class="pun">.</span><span class="typ">IsNullOrEmpty</span><span class="pun">(</span><span class="pln">filename</span><span class="pun">))</span><span class="pln">
            </span><span class="pun">{</span><span class="pln">
                </span><span class="com">requestContext.HttpContext.Response.Clear();
                requestContext.HttpContext.Response.StatusCode = 404;
                requestContext.HttpContext.Response.End();
</span><span class="pln">            </span><span class="pun">}</span><span class="pln">
            </span><span class="kwd">else</span><span class="pln">
            </span><span class="pun">{</span><span class="pln">
                requestContext</span><span class="pun">.</span><span class="typ">HttpContext</span><span class="pun">.</span><span class="typ">Response</span><span class="pun">.</span><span class="typ">Clear</span><span class="pun">();</span><span class="pln">
                requestContext</span><span class="pun">.</span><span class="typ">HttpContext</span><span class="pun">.</span><span class="typ">Response</span><span class="pun">.</span><span class="typ">ContentType</span><span class="pln"> </span><span class="pun">=</span><span class="pln"> </span><span class="typ">GetContentType</span><span class="pun">(</span><span class="pln">requestContext</span><span class="pun">.</span><span class="typ">HttpContext</span><span class="pun">.</span><span class="typ">Request</span><span class="pun">.</span><span class="typ">Url</span><span class="pun">.</span><span class="typ">ToString</span><span class="pun">());</span><span class="pln">

                </span><span class="com">// find physical path to image here.  </span><span class="pln">
                </span><span class="kwd">string</span><span class="pln"> filepath </span><span class="pun">=</span><span class="pln"> requestContext</span><span class="pun">.</span><span class="typ">HttpContext</span><span class="pun">.</span><span class="typ">Server</span><span class="pun">.</span><span class="typ">MapPath</span><span class="pun">(</span><span class="str">"~/test.jpg"</span><span class="pun">);</span><span class="pln">

                requestContext</span><span class="pun">.</span><span class="typ">HttpContext</span><span class="pun">.</span><span class="typ">Response</span><span class="pun">.</span><span class="typ">WriteFile</span><span class="pun">(</span><span class="pln">filepath</span><span class="pun">);</span><span class="pln">
                requestContext</span><span class="pun">.</span><span class="typ">HttpContext</span><span class="pun">.</span><span class="typ">Response</span><span class="pun">.</span><span class="typ">End</span><span class="pun">();</span><span class="pln">
            </span><span class="pun">}</span><span class="pln">
            </span><span class="kwd">return</span><span class="pln"> </span><span class="kwd">null</span><span class="pun">;</span><span class="pln">
        </span><span class="pun">}</span><span class="pln">

        </span><span class="kwd">private</span><span class="pln"> </span><span class="kwd">static</span><span class="pln"> </span><span class="kwd">string</span><span class="pln"> </span><span class="typ">GetContentType</span><span class="pun">(</span><span class="typ">String</span><span class="pln"> path</span><span class="pun">)</span><span class="pln">
        </span><span class="pun">{</span><span class="pln">
            </span><span class="kwd">switch</span><span class="pln"> </span><span class="pun">(</span><span class="typ">Path</span><span class="pun">.</span><span class="typ">GetExtension</span><span class="pun">(</span><span class="pln">path</span><span class="pun">))</span><span class="pln">
            </span><span class="pun">{</span><span class="pln">
                </span><span class="kwd">case</span><span class="pln"> </span><span class="str">".bmp"</span><span class="pun">:</span><span class="pln"> </span><span class="kwd">return</span><span class="pln"> </span><span class="str">"Image/bmp"</span><span class="pun">;</span><span class="pln">
                </span><span class="kwd">case</span><span class="pln"> </span><span class="str">".gif"</span><span class="pun">:</span><span class="pln"> </span><span class="kwd">return</span><span class="pln"> </span><span class="str">"Image/gif"</span><span class="pun">;</span><span class="pln">
                </span><span class="kwd">case</span><span class="pln"> </span><span class="str">".jpg"</span><span class="pun">:</span><span class="pln"> </span><span class="kwd">return</span><span class="pln"> </span><span class="str">"Image/jpeg"</span><span class="pun">;</span><span class="pln">
                </span><span class="kwd">case</span><span class="pln"> </span><span class="str">".png"</span><span class="pun">:</span><span class="pln"> </span><span class="kwd">return</span><span class="pln"> </span><span class="str">"Image/png"</span><span class="pun">;</span><span class="pln">
                </span><span class="kwd">default</span><span class="pun">:</span><span class="pln"> </span><span class="kwd">break</span><span class="pun">;</span><span class="pln">
            </span><span class="pun">}</span><span class="pln">
            </span><span class="kwd">return</span><span class="pln"> </span><span class="str">""</span><span class="pun">;</span><span class="pln">
        </span><span class="pun">}</span><span class="pln">
    </span><span class="pun">}</span><span class="pln">
</span><span class="pun">}</span><span class="pln">
</span></code></pre>
<p>The above <code>IRouteHandler</code> is pretty simple.  Ignoring the <code>GetContentType</code> helper method, there are really only two things happening.  First, we check for a "filename" parameter that got passed in to our handler (more on that in a second).  If it's not there, we return a 404 response.  Otherwise, we attempt to open up the physical file "test.jpg", and stream it to the browser.</p>
<p>Clearly, this should be adapted to your needs by actually using the filename parameter to find the physical files on your system.   But moving on - how do we invoke this from our MVC app?  And how do we pass in the filename parameter, which we'd like to reroute to some other physical path?</p>
<h2>Routing the Request to the Custom Handler</h2>
<p>Well, this is the easy part.  Where you'd normally define your routes in <code>Global.asax</code>, simply use <code>routes.Add()</code>, instead of <code>routes.MapRoute()</code>.  Just like this:</p>
<pre>routes.Add("ImagesRoute",
                 new Route("graphics/{filename}", new ImageRouteHandler()));</pre>
<p>This method of adding our route allows us to specify our custom <code>IRouteHandler</code>, rather than <code>routes.MapRoute()</code>, which by default uses an instance of <code>MvcRouteHandler</code>.  So now, we've defined a route that matches any requested URL of the form "graphics/{filename}", puts the rest of the URL into the "filename" entry of the route's <code>RouteValueDictionary</code> (exposed as <code>RouteData.Values</code>), and hands it off to our <code>IRouteHandler</code>.  This is how we pass the filename parameter into our custom route handler - basically the same way we pass things into controllers, by defining the variables in the route pattern.</p>
<p>We've successfully routed all URLs under "graphics/", a path that doesn't physically exist in our web application, to a handler that returns "test.jpg", which could exist anywhere.  With a bit of coding around the file IO, you could return files from anywhere.</p>
<p>And that's pretty much it!  You might be thinking, "this seems like a lot of extra work just to re-route a URL to a physical file that already existed in my web app!".   If you take a step back though, you'll see the power of this approach.  What if you wanted to log every request to the original URL to a special log file?  What if you wanted to also transform the image before returning it?  Perhaps launch a system executable or asynchronously hit a web service?  What if you wanted to...?</p>
<p>In a nutshell, by inserting your own HttpHandlers into the ASP.Net pipeline to handle routed requests, you can code <em>anything that you'd like to happen</em> when a request comes in, rather than just rewriting it to some other URL.</p>
<p><a href="http://www.dotnetkicks.com/kick/?url=http%3a%2f%2fwww.phpvs.net%2f2009%2f08%2f06%2faspnet-mvc-how-to-route-to-images-or-other-file-types%2f"><img src="http://www.dotnetkicks.com/Services/Images/KickItImageGenerator.ashx?url=http%3a%2f%2fwww.phpvs.net%2f2009%2f08%2f06%2faspnet-mvc-how-to-route-to-images-or-other-file-types%2f&amp;bgcolor=FF9933&amp;cbgcolor=D4E1FD" border="0" alt="kick it on DotNetKicks.com" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.phpvs.net/2009/08/06/aspnet-mvc-how-to-route-to-images-or-other-file-types/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>PHP Variable Test reference</title>
		<link>http://www.phpvs.net/2008/06/29/php-variable-test-reference/</link>
		<comments>http://www.phpvs.net/2008/06/29/php-variable-test-reference/#comments</comments>
		<pubDate>Mon, 30 Jun 2008 02:08:33 +0000</pubDate>
		<dc:creator>blake</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[PHP]]></category>

		<guid isPermaLink="false">http://www.phpvs.net/?p=54</guid>
		<description><![CDATA[I thought I'd post a link to this PHP Variable Tests reference page. It's a great reference that's kept up to date with the current version of PHP. I use it sometimes when I'm waffling over what function to use to validate a variable. Something I've noticed lately with the newer versions of PHP is [...]]]></description>
			<content:encoded><![CDATA[<p>I thought I'd post a link to this <a href="http://www.killersoft.com/misc/php_variable_tests.php" target="_blank">PHP Variable Tests</a> reference page.  It's a great reference that's kept up to date with the current version of PHP.  I use it sometimes when I'm waffling over what function to use to validate a variable.</p>
<p>Something I've noticed lately with the newer versions of PHP is that <code>ctype_digit</code> no longer returns <code>true</code> when you give it an empty string (i.e. <code>ctype_digit('')</code>).  This is great, since I always thought that returning true on an empty string was counter-intuitive; by definition, it should only return true "if every character in <code>$text</code> is a decimal digit", so if there are no characters, it can't be true.  It's also great because it means there are lots of places in my code where I can change </p>
<p><code>if (ctype_digit((string) $x) &amp;&amp; $x != '') </code> </p>
<p>to just </p>
<p><code>if (ctype_digit((string) $x))</code>,</p>
<p>which is cleaner and nicer.</p>
<p>It looks like they may have made this change back around PHP 5.1, but I never noticed it until I checked that variable reference page.  Nice to see it!</p>
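<p>If you want to check the behaviour on your own install, here's a quick sketch.  The <code>isDigits()</code> wrapper is just a hypothetical helper name for the cast-then-test pattern shown above; it assumes PHP 5.1 or later with the ctype extension enabled.</p>

```php
<?php
// Hypothetical helper: cast to string first, then let ctype_digit decide.
function isDigits($x)
{
    return ctype_digit((string) $x);
}

$results = array(
    'empty string' => isDigits(''),      // false on PHP 5.1 and later
    'digit string' => isDigits('12345'), // true
    'integer'      => isDigits(42),      // true, thanks to the cast
    'decimal'      => isDigits('4.2'),   // false: "." is not a digit
    'negative'     => isDigits('-5'),    // false: "-" is not a digit
);

foreach ($results as $label => $result) {
    echo $label, ': ', var_export($result, true), PHP_EOL;
}
```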
]]></content:encoded>
			<wfw:commentRss>http://www.phpvs.net/2008/06/29/php-variable-test-reference/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Ten PHP Best Practices Tips that will get you a job</title>
		<link>http://www.phpvs.net/2008/06/04/ten-php-best-practices-tips-that-will-get-you-a-job/</link>
		<comments>http://www.phpvs.net/2008/06/04/ten-php-best-practices-tips-that-will-get-you-a-job/#comments</comments>
		<pubDate>Wed, 04 Jun 2008 22:06:09 +0000</pubDate>
		<dc:creator>blake</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[best practices]]></category>
		<category><![CDATA[PHP best practices]]></category>
		<category><![CDATA[PHP tips]]></category>
		<category><![CDATA[top ten]]></category>

		<guid isPermaLink="false">http://www.phpvs.net/2008/06/04/ten-php-best-practices-tips-that-will-get-you-a-job/</guid>
		<description><![CDATA[The last couple of weeks have been quite the experience for me. I was part of a big layoff at my former company, which was interesting. I've never been in that position before, and it's hard not to take it personally. I started watching the job boards, and a nice-looking full-time PHP position caught my [...]]]></description>
			<content:encoded><![CDATA[<p>The last couple of weeks have been quite the experience for me.  I was part of a big layoff at my former company, which was interesting.  I've never been in that position before, and it's hard not to take it personally.  I started watching the job boards, and a nice-looking full-time PHP position caught my eye, so I sent out a resume and landed an interview.  Before the face-to-face portion, I chatted with the owner and head programmer on a conference call, and they ended up sending me a technical assessment quiz.  One particular question caught my eye on this quiz... it looked something like this:</p>
<p>Find the errors in the following code:</p>
<pre>
&lt;?
function baz($y $z) {
	$x = new Array();
	$x[sales]  = 60;
	$x[profit] = 20:

	foreach($x as $key = $value) {
		echo $key+" "+$value+"&lt;BR&gt;";
	}
} 

?&gt;</pre>
<p>So, give it a shot.  How many can you find?</p>
<p>If you got the missing comma in the parameter list, the "new Array()" error, the colon instead of a semicolon, the '=' instead of '=&gt;' in the foreach statement, and the erroneous use of '+' on the echo line, then congratulations, you found all the errors!  You have the basic PHP technical skills to pay the bills.</p>
<p>That's not how I answered the question though.  I noted the errors, obviously, but I went further than that.  For instance, did you notice that there were no single quotes around the array indexes ($x[sales] and $x[profit])?  That won't cause a fatal PHP error, but it is a coding error!  Did you also notice the use of double-quoted strings instead of single-quoted strings on the echo line?  Or the usage of the opening PHP short tag? Or the usage of "&lt;BR&gt;" instead of "&lt;br/&gt;"?</p>
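<p>For reference, a cleaned-up version of the quiz function might look something like the sketch below.  It applies the five syntax fixes plus the style points I just mentioned.  The <code>$y</code> and <code>$z</code> parameters go unused, exactly as in the original; I've kept them only because the quiz declared them.</p>
<pre>
&lt;?php
function baz($y, $z) {
	// array() replaces the invalid "new Array()"
	$x = array();
	$x['sales']  = 60;
	$x['profit'] = 20;

	// "=&gt;" separates key and value; echo takes a comma-separated list
	foreach ($x as $key =&gt; $value) {
		echo $key, ' ', $value, '&lt;br/&gt;';
	}
}
?&gt;</pre>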
<p>After pointing out the actual errors, I made a point of adding comments about those things I just mentioned.  It was enough to push the answer from "correct" to "impressive", and it scored me a lot of points with the programmers who were reviewing my application.  Enough so that they offered me the job!  (I eventually turned it down, as I have been seduced by the siren call of the contracting life, and I intend to flex my PHP skills to the benefit of my clients, and not a faceless corporate overlord who dabbles in telemarketing.  I need a shower).</p>
<p>So, read on for my Ten PHP Best Practices Tips that will get you a job:</p>
<p><strong>1.  Single-quoted strings are your friend.</strong> When you surround a PHP string in double quotes, the PHP interpreter scans it for variables and escape sequences, such as "\n".  If you just want to output a basic string, use single quotes!  There is a marginal performance benefit, since the string is not scanned for variables.  If you have variables or special characters, then by all means use double quotes, but pick single quotes when possible.</p>
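<p>Here is a small sketch of the difference.  The variable names are just placeholders for illustration; the point is that the same characters produce two different strings depending on the quotes.</p>
<pre>
&lt;?php
$name = 'Blake';

// Double quotes: $name is interpolated and \n becomes a newline character.
$double = "Hi $name\n";

// Single quotes: the text is taken literally (only \' and \\ are escapes).
$single = 'Hi $name\n';

echo strlen($double), "\n"; // 9
echo strlen($single), "\n"; // 10
</pre>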
<p><strong>2.  String output.</strong> Which line of code do you think runs faster?<br />
<code><br />
print "Hi my name is $a. I am $b";<br />
echo "Hi my name is $a. I am $b";<br />
echo "Hi my name is ".$a.". I am ".$b;<br />
echo "Hi my name is ",$a,". I am ",$b;<br />
</code><br />
This might seem weird to you, but the last one is actually the fastest operation.  print is slower than echo, putting variables inline in a string is slower than concatenating them, and concatenating strings is slower than using comma-separated echo values!  Not only does keeping your variables out of the string give you a performance boost, but it also makes your code easier to read in any editor that has syntax highlighting (your variables will show up in nice colors).  The little-known ability of echo to take a comma-separated list of values is the fastest of them all, since no string operations are performed; it just outputs each argument in turn.  (echo is a language construct rather than a function, which is why the list needs no parentheses.)  If you combine all this with Tip #1 and use single quotes, you're on your way to some finely-tuned strings.</p>
<p><strong>3.  Use single-quotes around array indexes.</strong> As you saw in the quiz question above, I pointed out that <code>$x[sales]</code> is technically incorrect!  You should quote associative array indexes, like so: <code>$x['sales']</code>.  This is because PHP treats the unquoted index as a "bare" string and first looks for a <em>defined constant</em> with that name.  When it can't find a matching constant, it raises a notice and falls back to treating the word as a real string, which is why your code will work.  (Newer versions are stricter: PHP 8 throws an error for an undefined constant.)  Quoting the index avoids this constant lookup, and makes it safer in case someone defines a future constant with the same name.  I've also heard that it is up to seven times faster than referencing an unquoted index, although I haven't tested this.  For more on this, see the section called "Array do's and don'ts" in the <a href="http://www.php.net/manual/en/language.types.array.php" target="_blank">Array section</a> of the PHP manual.</p>
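<p>To see why the "future constant" problem matters, consider this sketch.  The constant here is hypothetical; imagine it was added somewhere else in the codebase long after the array code was written.</p>
<pre>
&lt;?php
$x = array('sales' =&gt; 60, 'profit' =&gt; 20);

// Hypothetical constant defined elsewhere, unrelated to the array.
define('sales', 'profit');

echo $x['sales'], "\n"; // 60: the quoted key is always the string 'sales'
echo $x[sales], "\n";   // 20: the bare word now resolves to the constant's value
</pre>
<p>The unquoted lookup silently reads the wrong element, with no error to tip you off.</p>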
<p><strong>4.  Don't use short open tags.</strong> Eww... are you really using these?  <code>&lt;?</code> is just bad form.  It can cause conflicts with XML parsers, and if you ever distribute code, it's going to annoy the heck out of people who have to start modifying their PHP ini directives to get it to work.  There's just no good reason to use short open tags.  Use the full <code>&lt;?php</code>.</p>
<p><strong>5.  Don't use regular expressions if you don't need to.</strong> If you're doing basic string operations, stay away from the preg and ereg function groups whenever possible.  str_replace is much faster than preg_replace, and strtr is even faster than str_replace!  Save those crunch cycles... your enterprise applications will thank you.</p>
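<p>As a rough illustration, here are three ways to do the same simple replacement.  This sketch only shows that the results are identical, so the regular expression buys you nothing here; it doesn't measure timings, and the sample text is made up.</p>
<pre>
&lt;?php
$text = 'The quick brown fox';

// Three ways to turn spaces into underscores; all produce the same string.
$viaRegex   = preg_replace('/ /', '_', $text);
$viaReplace = str_replace(' ', '_', $text);
$viaStrtr   = strtr($text, ' ', '_');

echo $viaStrtr, "\n"; // The_quick_brown_fox
var_dump($viaRegex === $viaReplace &amp;&amp; $viaReplace === $viaStrtr); // bool(true)
</pre>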
<p><strong>6.  Don't use functions inside a loop declaration.</strong> This isn't a PHP-specific tip, but you'll see it in a lot of code.</p>
<p>Bad:<br />
<code><br />
for ($i = 0; $i &lt; count($array); $i++) {<br />
//stuff<br />
}<br />
</code><br />
Good:<br />
<code><br />
$count = count($array);<br />
for($i = 0; $i &lt; $count; $i++) {<br />
//stuff<br />
}<br />
</code><br />
That should be pretty self-explanatory, but it's amazing how many people would rather save a line of code at the expense of performance.  If you use a function like count() inside a loop declaration, it's going to get executed at every iteration!  If your loop is large, you're using a lot of extra execution time.</p>
<p><strong>7.  Never rely on register_globals or magic quotes.</strong> <a href="http://www.php.net/manual/en/ini.core.php#ini.register-globals" target="_new">register_globals</a> and <a href="http://www.php.net/manual/en/security.magicquotes.php" target="_blank">magic quotes</a> are both old features of PHP that seemed like a good idea at the time (ten years ago), but in reality turned out to be not that great.  Older installations of PHP would have these features on by default, and they cause security holes, programming errors, and all sorts of bad practices, such as relying on user input to create variables.  Both these features are now deprecated, and everyone needs to stop using them.  If you are ever working on code that relies on these features, get it out of there as soon as you can!</p>
<p><strong>8.  Always initialize your variables. </strong> PHP will automatically create a variable if it hasn't been initialized, but it's not good practice to rely on this feature.  It makes for sloppy code, and in large functions or projects can become quite confusing if you have to track down where it's being created.  In addition, incrementing an uninitialized variable is much slower than if it was initialized.  It's just a good idea.</p>
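<p>A minimal sketch of what I mean, using a made-up list of prices.  Without the <code>$total = 0;</code> line the loop would still produce a number, but only because PHP quietly creates the variable for you on the first pass.</p>
<pre>
&lt;?php
$prices = array(10, 20, 30);

// Initialize the accumulator before the loop so its starting value is explicit.
$total = 0;
foreach ($prices as $price) {
	$total += $price;
}

echo $total, "\n"; // 60
</pre>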
<p><strong>9.  Document your code.</strong> You've heard it many times, but this can't be said enough.  I know places that won't hire people who don't document code.  I even <em>got my previous job</em> after an interview where the VP sat in with the interviewer and me.  I had brought my laptop, and I was just scrolling through some code I had written for one of my sites.  He saw my documented functions and was impressed enough to ask me about my documenting habits.  A day later I had the job.</p>
<p>I know a lot of self-declared PHP gurus out there like to pretend that their code is so good that they don't have to spend time documenting it, and these people are full of <code>$largeAnimal</code> poop.  Learn docblock syntax, familiarize yourself with some PHP Documentation packages like <a href="http://manual.phpdoc.org/HTMLframesConverter/default/" target="_blank">phpDocumentor</a> or <a href="http://www.stack.nl/~dimitri/doxygen/" target="_blank">Doxygen</a>, and take the extra time to do it.  It's worth it.</p>
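<p>If you haven't seen docblock syntax before, a documented function looks roughly like this.  The function itself is a made-up example; the comment layout with <code>@param</code> and <code>@return</code> tags is the part that documentation tools read.</p>
<pre>
&lt;?php
/**
 * Calculates the profit margin as a percentage of sales.
 *
 * @param float $sales  Total sales; must be greater than zero.
 * @param float $profit Profit earned on those sales.
 * @return float Profit margin as a percentage, rounded to two decimals.
 */
function profitMargin($sales, $profit) {
	return round($profit / $sales * 100, 2);
}

echo profitMargin(60, 20), "\n"; // 33.33
</pre>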
<p><strong>10.  Code to a standard.</strong> This is something that <strong>you should ask potential employers</strong> about during interviews.  Ask them what kind of coding standards they use... <a href="http://pear.php.net/manual/en/standards.php" target="_blank">PEAR</a>?  <a href="http://framework.zend.com/manual/en/coding-standard.html" target="_blank">Zend</a>?  In-house?  Mention that you code to a specific standard, whether it be your own, or one of the more prevalent ones out there.  The problem with loosely-typed languages like PHP is that without a proper coding standard, code tends to start looking like huge piles of garbage.  Stinky, disgusting garbage.  A basic set of rules that includes whitespace standards, brace matching, naming conventions, etc. is a must-have, must-follow for anyone who prides themselves on their code quality.</p>
<p>That being said, I hate all you space-indenters.  I mean, what the hell?  4 space characters as an indent?  That's exactly four times as much whitespace to parse as a tab.  More importantly, you can set your tab-stops to any value you want if you're using any text-editor more advanced than Notepad, so every developer can have something that they like the looks of.  Set it to 4 if you want, or 0 if you're a masochist.  I don't care, but you can't do that with spaces!  You're stuck with exactly the amount that Mr. Monkey Pants in cubicle 17 decided to put in.  So why are spaces so popular?  This "4-space indent" standard everyone uses is stupid!  Stop it!  Stop doing it!</p>
<p>... sorry, pet peeve.</p>
<p>Anyway, I hope these tips are helpful.  If you want to impress at a job interview, it's the little details that will get you noticed!  Maybe don't rant about your coding standards though.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.phpvs.net/2008/06/04/ten-php-best-practices-tips-that-will-get-you-a-job/feed/</wfw:commentRss>
		<slash:comments>153</slash:comments>
		</item>
		<item>
		<title>HTML manipulation with System.Xml.XmlDocument</title>
		<link>http://www.phpvs.net/2008/02/17/html-manipulation-with-systemxmlxmldocument/</link>
		<comments>http://www.phpvs.net/2008/02/17/html-manipulation-with-systemxmlxmldocument/#comments</comments>
		<pubDate>Mon, 18 Feb 2008 06:13:10 +0000</pubDate>
		<dc:creator>morgan</dc:creator>
				<category><![CDATA[.Net]]></category>
		<category><![CDATA[Code]]></category>
		<category><![CDATA[XML]]></category>

		<guid isPermaLink="false">http://www.phpvs.net/2008/02/17/html-manipulation-with-systemxmlxmldocument/</guid>
		<description><![CDATA[HTML Table of Contents Generator Example Sometimes it's easy to forget that HTML is just one type of XML, and hence you can utilize the System.Xml library for fun and profit with your HTML. System.Xml is full of powerful tools to manipulate well-formed documents, and you really don't need to know much about XML to [...]]]></description>
			<content:encoded><![CDATA[<h2>HTML Table of Contents Generator Example</h2>
<p>Sometimes it's easy to forget that well-formed HTML (XHTML, strictly speaking) is just one type of XML, and hence you can utilize the <code>System.Xml</code> library for fun and profit with your HTML.  <code>System.Xml</code> is full of powerful tools to manipulate well-formed documents, and you really don't need to know much about XML to leverage it.  With two simple lines of code you can have a document loaded into a data structure whose manipulation methods let you do complex tasks, such as generating a table of contents.</p>
<p>Blake phoned me last night very frustrated after having spent a couple hours scouring the 'tubes for some kind of tool that would take his marked-up html document and generate a table of contents from the heading tags in it.  He started asking my advice about a C# program he had downloaded. It included three forms and over 1000 lines of code, and purported to do what he needed.  Except it didn't... it just kept crashing, and couldn't handle certain nestings of tags, etc. etc.  One look at the code made it pretty clear why... some kind of home-brewed tree structure peppered with variables like "treeUp, treeDown, treeRight, itemBegin, itemEnd".... bleeargh.  <code>XmlDocument </code>to the rescue!</p>
<p>In 45 minutes I had a program whipped up into a console app that did exactly what he needed, and it was essentially only 60 lines of code (plus some jazz for error handling/argument passing).   Let's take a look:</p>
<div class="syntax_hilite"><span class="langName">C#:</span>
<div id="csharp-16">
<div>
<ol>
<li>
<div><span style="color: #0600FF;">private</span> <span style="color: #0600FF;">void</span> GenerateTOC<span style="color: #000000;">&#40;</span>XmlNodeList nodelist, StringBuilder sb<span style="color: #000000;">&#41;</span></div>
</li>
<li>
<div><span style="color: #000000;">&#123;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; <span style="color: #0600FF;">foreach</span> <span style="color: #000000;">&#40;</span>XmlNode node <span style="color: #0600FF;">in</span> nodelist<span style="color: #000000;">&#41;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; <span style="color: #000000;">&#123;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0600FF;">if</span> <span style="color: #000000;">&#40;</span>Regex.<span style="color: #0000FF;">IsMatch</span><span style="color: #000000;">&#40;</span>node.<span style="color: #0000FF;">Name</span>, <span style="color: #808080;">"^h[1-6]$"</span><span style="color: #000000;">&#41;</span><span style="color: #000000;">&#41;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #000000;">&#123;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #008080; font-style: italic;">//We've found an &quot;h&quot; tag.&nbsp; Update our TOC stringbuilder,</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #008080; font-style: italic;">//and our original XMLDocument to add anchor tags.</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0600FF;">if</span> <span style="color: #000000;">&#40;</span><span style="color: #0600FF;">this</span>.<span style="color: #0000FF;">isVerbose</span><span style="color: #000000;">&#41;</span> <span style="color: #000000;">&#123;</span> Console.<span style="color: #0000FF;">WriteLine</span><span style="color: #000000;">&#40;</span><span style="color: #808080;">"Found "</span> + node.<span style="color: #0000FF;">Name</span><span style="color: #000000;">&#41;</span>; <span style="color: #000000;">&#125;</span></div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #FF0000;">String</span> tabs = <span style="color: #808080;">""</span>;</div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #FF0000;">int</span> hLevel = <span style="color: #FF0000;">int</span>.<span style="color: #0000FF;">Parse</span><span style="color: #000000;">&#40;</span>node.<span style="color: #0000FF;">Name</span>.<span style="color: #0000FF;">Substring</span><span style="color: #000000;">&#40;</span><span style="color: #FF0000;color:#800000;">1</span>, <span style="color: #FF0000;color:#800000;">1</span><span style="color: #000000;">&#41;</span><span style="color: #000000;">&#41;</span>;</div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0600FF;">if</span> <span style="color: #000000;">&#40;</span>hLevel != <span style="color: #0600FF;">this</span>.<span style="color: #0000FF;">lastHLevel</span><span style="color: #000000;">&#41;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #000000;">&#123;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0600FF;">if</span> <span style="color: #000000;">&#40;</span>hLevel &lt; <span style="color: #0600FF;">this</span>.<span style="color: #0000FF;">lastHLevel</span><span style="color: #000000;">&#41;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #000000;">&#123;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #008080; font-style: italic;">//Retreat to a less indented block level</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0600FF;">for</span> <span style="color: #000000;">&#40;</span><span style="color: #FF0000;">int</span> i = <span style="color: #0600FF;">this</span>.<span style="color: #0000FF;">lastHLevel</span> - <span style="color: #FF0000;color:#800000;">1</span>; i &gt; hLevel - <span style="color: #FF0000;color:#800000;">1</span>; i--<span style="color: #000000;">&#41;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #000000;">&#123;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; tabs = <a href="http://www.google.com/search?q=new+msdn.microsoft.com"><span style="color: #008000;">new</span></a> <span style="color: #FF0000;">String</span><span style="color: #000000;">&#40;</span><span style="color: #808080;">'<span style="color: #008080; font-weight: bold;">\t</span>'</span>, i<span style="color: #000000;">&#41;</span>;</div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sb.<span style="color: #0000FF;">Append</span><span style="color: #000000;">&#40;</span>tabs + <span style="color: #808080;">"&lt;/ul&gt;<span style="color: #008080; font-weight: bold;">\n</span>"</span><span style="color: #000000;">&#41;</span>;</div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #000000;">&#125;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #000000;">&#125;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0600FF;">else</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #000000;">&#123;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #008080; font-style: italic;">//Indent some more - Add the level difference in indents</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0600FF;">for</span> <span style="color: #000000;">&#40;</span><span style="color: #FF0000;">int</span> i = <span style="color: #0600FF;">this</span>.<span style="color: #0000FF;">lastHLevel</span>; i &lt; hLevel; i++<span style="color: #000000;">&#41;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #000000;">&#123;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; tabs = <a href="http://www.google.com/search?q=new+msdn.microsoft.com"><span style="color: #008000;">new</span></a> <span style="color: #FF0000;">String</span><span style="color: #000000;">&#40;</span><span style="color: #808080;">'<span style="color: #008080; font-weight: bold;">\t</span>'</span>, i<span style="color: #000000;">&#41;</span>;</div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sb.<span style="color: #0000FF;">Append</span><span style="color: #000000;">&#40;</span>tabs + <span style="color: #808080;">"&lt;ul&gt;<span style="color: #008080; font-weight: bold;">\n</span>"</span><span style="color: #000000;">&#41;</span>;</div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #000000;">&#125;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #000000;">&#125;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #008080; font-style: italic;">//Set lastHLevel to the current HLevel</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0600FF;">this</span>.<span style="color: #0000FF;">lastHLevel</span> = hLevel;</div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #000000;">&#125;</span></div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #008080; font-style: italic;">//Generate the TOC entry for this node, with a link to its anchor.</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; tabs = <a href="http://www.google.com/search?q=new+msdn.microsoft.com"><span style="color: #008000;">new</span></a> <span style="color: #FF0000;">String</span><span style="color: #000000;">&#40;</span><span style="color: #808080;">'<span style="color: #008080; font-weight: bold;">\t</span>'</span>, <span style="color: #0600FF;">this</span>.<span style="color: #0000FF;">lastHLevel</span><span style="color: #000000;">&#41;</span>;</div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sb.<span style="color: #0000FF;">Append</span><span style="color: #000000;">&#40;</span>tabs + <span style="color: #808080;">"&lt;li&gt;&lt;a href=<span style="color: #008080; font-weight: bold;">\"</span>#toc"</span> + <span style="color: #0600FF;">this</span>.<span style="color: #0000FF;">tocCount</span> + <span style="color: #808080;">"<span style="color: #008080; font-weight: bold;">\"</span>&gt;"</span> + node.<span style="color: #0000FF;">InnerXml</span> + <span style="color: #808080;">"&lt;/a&gt;&lt;/li&gt;<span style="color: #008080; font-weight: bold;">\n</span>"</span><span style="color: #000000;">&#41;</span>;</div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #008080; font-style: italic;">//Add an anchor tag to the node in the original document</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; node.<span style="color: #0000FF;">InnerXml</span> = <span style="color: #808080;">"&lt;a name=<span style="color: #008080; font-weight: bold;">\"</span>toc"</span> + <span style="color: #0600FF;">this</span>.<span style="color: #0000FF;">tocCount</span>.<span style="color: #0000FF;">ToString</span><span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span> + <span style="color: #808080;">"<span style="color: #008080; font-weight: bold;">\"</span>&gt;"</span> + node.<span style="color: #0000FF;">InnerXml</span> + <span style="color: #808080;">"&lt;/a&gt;"</span>;</div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0600FF;">this</span>.<span style="color: #0000FF;">tocCount</span>++;</div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #000000;">&#125;</span></div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #008080; font-style: italic;">//Now recurse over child nodes</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0600FF;">if</span> <span style="color: #000000;">&#40;</span>node.<span style="color: #0000FF;">ChildNodes</span>.<span style="color: #0000FF;">Count</span> &gt; <span style="color: #FF0000;color:#800000;">0</span><span style="color: #000000;">&#41;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; GenerateTOC<span style="color: #000000;">&#40;</span>node.<span style="color: #0000FF;">ChildNodes</span>, sb<span style="color: #000000;">&#41;</span>;</div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #008080; font-style: italic;">//Finish whatever &lt;ul&gt; level we have open if we're the last child of the root.</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0600FF;">if</span> <span style="color: #000000;">&#40;</span>node.<span style="color: #0000FF;">NextSibling</span> == <span style="color: #0600FF;">null</span> &amp;&amp; node.<span style="color: #0000FF;">ParentNode</span>.<span style="color: #0000FF;">ParentNode</span> == <span style="color: #0600FF;">null</span><span style="color: #000000;">&#41;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #000000;">&#123;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0600FF;">for</span> <span style="color: #000000;">&#40;</span><span style="color: #FF0000;">int</span> i = <span style="color: #FF0000;color:#800000;">0</span>; i &lt; <span style="color: #0600FF;">this</span>.<span style="color: #0000FF;">lastHLevel</span>; i++<span style="color: #000000;">&#41;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #000000;">&#123;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #FF0000;">String</span> tabs = <a href="http://www.google.com/search?q=new+msdn.microsoft.com"><span style="color: #008000;">new</span></a> <span style="color: #FF0000;">String</span><span style="color: #000000;">&#40;</span><span style="color: #808080;">'<span style="color: #008080; font-weight: bold;">\t</span>'</span>, <span style="color: #0600FF;">this</span>.<span style="color: #0000FF;">lastHLevel</span> - i - <span style="color: #FF0000;color:#800000;">1</span><span style="color: #000000;">&#41;</span>;</div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sb.<span style="color: #0000FF;">Append</span><span style="color: #000000;">&#40;</span>tabs + <span style="color: #808080;">"&lt;/ul&gt;<span style="color: #008080; font-weight: bold;">\n</span>"</span><span style="color: #000000;">&#41;</span>;</div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #000000;">&#125;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #000000;">&#125;</span></div>
</li>
<li>
<div>&nbsp; &nbsp; <span style="color: #000000;">&#125;</span></div>
</li>
<li>
<div><span style="color: #000000;">&#125;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p>
So in line 3, we start looping over every node (i.e. HTML element) in the document.  Line 5 checks to see if the current node is a header tag with a simple regular expression.  Lines 11-34 control the indent level of the TOC's HTML output: we use one &lt;ul&gt; level for each header level.  (So an h5 tag is nested in 5 &lt;ul&gt; tags.)  Lines 37-39 add the HTML output for the current node, namely a TOC list item that links to the node's anchor.  Lines 41 and 42 modify the original <code>XmlDocument</code> object by adding an anchor tag to the HTML of the current node.  Then we recursively call the function again with the current node's children.  The last bit of code closes any &lt;ul&gt; levels still open at the very end of our recursion.</p>
<p>(At this point, real purists might interject with the fact that 10 lines of code and an XSLT stylesheet could do the same thing; I'd agree, except in practice I find executing simple loop-driven tasks with XSLT quite cumbersome, and I doubt I could do anything with XSLT in 45 minutes.)</p>
<p>So to use the function above, simply harness the raw power of the <code>System.Xml.XmlDocument </code> object, like so:</p>
<div class="syntax_hilite"><span class="langName">C#:</span>
<div id="csharp-17">
<div>
<ol>
<li>
<div>XmlDocument htmldoc = <a href="http://www.google.com/search?q=new+msdn.microsoft.com"><span style="color: #008000;">new</span></a> XmlDocument<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span>;</div>
</li>
<li>
<div>htmldoc.<span style="color: #0000FF;">PreserveWhitespace</span> = <span style="color: #0600FF;">true</span>;</div>
</li>
<li>
<div>htmldoc.<span style="color: #0000FF;">Load</span><span style="color: #000000;">&#40;</span><span style="color: #808080;">"myfile.html"</span><span style="color: #000000;">&#41;</span>; </div>
</li>
</ol>
</div>
</div>
</div>
<p>Assuming your HTML is well-formed, you can now pass <code>htmldoc.ChildNodes</code> and a <code>StringBuilder</code> into the recursive function above, and your <code>StringBuilder</code> will come back full of HTML table of contents goodness.  Additionally, your <code>XmlDocument</code> variable will have the corresponding anchors added to the header tags.  Simply write your <code>StringBuilder</code> and <code>XmlDocument</code> out to files, and voila!  Instant HTML table of contents!  The calling code might look something like this:</p>
<div class="syntax_hilite"><span class="langName">C#:</span>
<div id="csharp-18">
<div>
<ol>
<li>
<div>StringBuilder sb = <a href="http://www.google.com/search?q=new+msdn.microsoft.com"><span style="color: #008000;">new</span></a> StringBuilder<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span>;</div>
</li>
<li>
<div><span style="color: #008080; font-style: italic;">//Assume that the root node is not an &lt;h&gt; tag and build our TOC from the children.</span></div>
</li>
<li>
<div>thisApp.<span style="color: #0000FF;">GenerateTOC</span><span style="color: #000000;">&#40;</span>htmldoc.<span style="color: #0000FF;">ChildNodes</span>, sb<span style="color: #000000;">&#41;</span>;</div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div><span style="color: #008080; font-style: italic;">//Output TOC</span></div>
</li>
<li>
<div>FileStream fs = <a href="http://www.google.com/search?q=new+msdn.microsoft.com"><span style="color: #008000;">new</span></a> FileStream<span style="color: #000000;">&#40;</span><span style="color: #808080;">"TOC.html"</span>, FileMode.<span style="color: #0000FF;">Create</span><span style="color: #000000;">&#41;</span>;</div>
</li>
<li>
<div>StreamWriter sw = <a href="http://www.google.com/search?q=new+msdn.microsoft.com"><span style="color: #008000;">new</span></a> StreamWriter<span style="color: #000000;">&#40;</span>fs<span style="color: #000000;">&#41;</span>;</div>
</li>
<li>
<div>sw.<span style="color: #0000FF;">Write</span><span style="color: #000000;">&#40;</span>sb.<span style="color: #0000FF;">ToString</span><span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span><span style="color: #000000;">&#41;</span>;</div>
</li>
<li>
<div>sw.<span style="color: #0000FF;">Close</span><span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span>;</div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div><span style="color: #008080; font-style: italic;">//Output original document with new &lt;a&gt; tags</span></div>
</li>
<li>
<div>XmlWriter xw = <a href="http://www.google.com/search?q=new+msdn.microsoft.com"><span style="color: #008000;">new</span></a> XmlTextWriter<span style="color: #000000;">&#40;</span><span style="color: #808080;">"OriginalWithAnchors.html"</span>, Encoding.<span style="color: #0000FF;">UTF8</span><span style="color: #000000;">&#41;</span>;</div>
</li>
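<li>
<div>htmldoc.<span style="color: #0000FF;">WriteTo</span><span style="color: #000000;">&#40;</span>xw<span style="color: #000000;">&#41;</span>;</div>
</li>
<li>
<div>xw.<span style="color: #0000FF;">Close</span><span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span>; <span style="color: #008080; font-style: italic;">//Flush the writer so the whole document reaches the file</span></div>
</li>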
</ol>
</div>
</div>
</div>
<p>All that in fewer than 80 lines of code, 45 minutes, and no XSDs, XSLT, or really, any XML at all.  XmlDocument.Load() is simply one of the greatest functions in the .Net framework.  Instant document object with an implicit tree structure.</p>
<p>Download the code here:  <a href='http://www.phpvs.net/wp-content/uploads/2008/02/htmltoc.zip' title='HTML Table of Contents Generator'>HTML Table of Contents Generator</a>.  It includes a binary .exe file in the "bin\Release" directory, so you don't need Visual Studio if you just want to run the above program. <img src='http://www.phpvs.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />   Simply call <code>htmltoc.exe infile.html</code>, and the program will write out TOC.html and OriginalWithAnchors.html.  TOC.html contains your nicely formatted table of contents, with links to all the anchors in OriginalWithAnchors.html.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.phpvs.net/2008/02/17/html-manipulation-with-systemxmlxmldocument/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

