Premature Optimization – The Root of All Evil

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil”

– Donald Knuth

Correct > Fast

I’ve always tried to keep the above quote in mind when programming. My first priority is to get my programs working correctly, and only then do I go back and try to make them more efficient, usually with the help of a good profiler. Recently, the importance of profiling before optimizing was re-inforced in a big way.

Filling the gaps

When converting clothing from one Poser figure to another, my CrossDresser clothing conversion application welds the seams of the clothing together. Different figures often have different polygon groupings, and when regrouping a clothing for a new figure, seams can appear at the boundaries of the original groups if the clothing creator didn’t weld the original object.

Welding an object in a consistent way is a somewhat complicated algorithm. I decided to tackle it in the following way:

  1. Create a nearest neighbor class using a kd-tree containing the vertices of the object.
  2. Create a graph object with each vertex of the object being a vertex in the graph.
  3. For each vertex in the object, use the nearest neighbor class to find all vertices within the weld tolerance.
  4. Add edges to the graph between each pair of vertices that are within the weld tolerance of each other.
  5. Walk the graph until each vertex has been visited, and generate a list of vertices to weld.
  6. Weld the specified vertices.

Too slow

Given the complex nature of the weld algorithm, I wasn’t surprised when it was somewhat slow. After being annoyed with it for several months, I found the time to go back and try to optimize it. Initially, I guessed that the algorithm I was using was simply inefficient, and needed to be redesigned from scratch to be more efficient. However, before I invested my time in a massive rewrite, I ran the code through the a code profiler to see where the bottlenecks were.

Nothing can stop us now

As it turns out, my guess was completely wrong! The actual bottleneck was in a place that I would have never guessed. In the code that implemented a breadth-first search of the graph, I needed a way to track which vertices had been accessed, so I initializing a boolean array filled with “false.” This array had one entry for each vertex in the graph, and the entry was set to true when the vertex was visited. Unfortunately, this array was initialized every time a new breadth-first search was started. Since the weld graph was very sparse (almost all subgraphs consisted of two or less vertices), this happened a huge number of times. Almost 90% of the running time was taken up with initializing an empty variable! When I replaced the array with an empty set of vertices which was given a new entry whenever a vertex was visited, the running time dropped immensely.

Don’t guess. Know.

Although I was already a believer in profiling before optimizing, this was a very good example of why you should do so. I’ve found in most cases that when it comes to guessing where a bottleneck might be, I might as well be blind. :)

Zend View Helper and Fluent Interfaces

Lately, I’ve been building a new website using Zend Framework as a base. In addition to having easy to use bootstrap utilities and a solid MVC structure, Zend also has numerous view helpers to assist with common tasks.

…Left, Right, Left, Right, Left…

One common styling feature commonly seen in data grids is alternating row colors. Normally in PHP, you would handle that using something like:

<table>
	...
	<?php $row = 0; ?>
	<?php foreach($items as $item) { ?>
		<?php if($row == 2) $row = 0; ?>
		<tr class="row<?=$row++;?>">
			...
		</tr>
	<?php } ?>
	...
</table>

This works, but it’s not very elegant and pollutes the view namespace with unnecessary variables and code. It would be nice if there was a way to encapsulate the code into a single statement. Fortunately, Zend has a view helper called Cycle that does just that:

<table>
	...
	<?php foreach($items as $item) { ?>
		<tr class="row<?=$this->cycle(array(1,2))->next()?>">
			...
		</tr>
	<?php } ?>
	...
</table>

Much better! Now everything regarding the alternate rows is in a single place. Now, what if instead of alternating every row, you want to alternate every other row? There are two obvious alternatives:

<table>
	...
	<?php foreach($items as $item) { ?>
		<tr class="row<?=$this->cycle(array(1,1,2,2))->next()?>">
			...
		</tr>
	<?php } ?>
	...
</table>
<table>
	...
	<?php $row = 0; ?>
	<?php $cycle = $this->cycle(array(1,2)); ?>
	<?php foreach($items as $item) { ?>
		<?php if($row++%2==0) $cycle->next(); ?>
		<tr class="row<?=$cycle?>">
			...
		</tr>
	<?php } ?>
	...
</table>

Both of these have issues. The first would need a lot of repetition in the array if you wanted to alternate between five numbers every 4 rows instead. The second example is back to polluting the view namespace.

Fluent Interfaces

Declarative languages like SQL are useful because they allow the user to specific what they want without having to worry about the hairy implementation that goes on behind the scene. The user can say “I want this data satisfied these criteria” and the database engine will take care of putting together an efficient search query to get the desired data.

One way to get this kind of declarative power in other languages is to use what is called a “fluent” interface. The idea is to design your API so that you can chain configuration calls into a kind of domain specific language for that class. For example, a fluent cycling view helper would allow us to easily specify:

  1. What values to cycle between
  2. What string to insert the values into
  3. How often to switch to the next value

Something like:

$this->alternate()->between(array(1,2,3,'a','b','c'))->every(2)->using('this is the {N} row');

Using this is a loop would output:

this is the 1 row
this is the 1 row
this is the 2 row
this is the 2 row
this is the 3 row
this is the 3 row
this is the a row
this is the a row
this is the b row
this is the b row
this is the c row
this is the c row
this is the 1 row
...

Setting up a fluent interface like this isn’t very difficult. The critical step is to simply return $this from every method which sets parameters. Below is the class I implemented for my website which uses this fluent interface to alternate values in loops.

Creative Commons License
This PHP code snippet is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

class ATP_View_Helper_Alternate
{
	protected $_every = 1;
	protected $_template = '{N}';
	protected $_modeMap = array(0=>1,1=>2);
	
	protected $_curCount = 0;
	protected $_curMode = 0;
	
	protected static $_alternators = array();
	
	public function alternate($template = '')
	{
		if(!isset(self::$_alternators[$template]))
		{
			self::$_alternators[$template] = new self();
			self::$_alternators[$template]->using($template);
		}
		
		return self::$_alternators[$template];
	}
	
	public function using($template = '')
	{
		if(strpos($template,'{N}')===false)
		{
			$template .= '{N}';
		}
		$this->_template = $template;
		return $this;
	}
	
	public function every($times)
	{
		$this->_every = intval($times) ? intval($times) : 1;
		return $this;
	}
	
	public function between($modes)
	{
		if(is_array($modes))
		{
			$this->_modeMap = array_values($modes);
		}
		else
		{
			$modes = intval($modes) ? intval($modes) : 2;
			$this->_modeMap = array();
			while(count($this->_modeMap)<$modes)
			{
				$this->_modeMap[] = count($this->_modeMap)+1;
			}
		}
		
		return $this;
	}
	
	public function __toString()
	{
		$str = str_replace("{N}",$this->_modeMap[$this->_curMode],$this->_template);
		
		$this->_curCount++;
		if($this->_curCount == $this->_every)
		{
			$this->_curCount = 0;
			$this->_curMode++;
			if($this->_curMode == count($this->_modeMap))
			{
				$this->_curMode = 0;
			}
		}
		
		return $str;
	}
}

Welcome Back (or: The Importance of Backups)

In June this year, our server went down. No problem. It gets overloaded occasionally, and kicking Apache would usually get things going again. So, I logged onto the server and realized that the problem was bigger than I thought. Something had happened to the server (I’m still not sure what), and the database was corrupted.

So, my blog database was lost…

…as well as the database for my wife’s 3d graphics business…

…and my SVN repository…

…and the last backup was 3 years old.

Whoops.

Backup. Yesterday. Do it Now!

Fortunately, we had local copies of the code for all of our projects, so the “only” thing we lost was 3 years worth of customer data. The website was back up and running in a few days, and we now have a weekly backup routine for the website.

Welcome Back!

This blog got put on the back burner for a while, but I finally found the time to get everything re-uploaded. I have some interesting projects I’m working on now, so hopefully, I’ll have more updates in the near future.

Minecraft to Max Converter (Part 3)

Now that the time-consuming hassles of the holidays, hard drive crashes, new computers, and new jobs are out of the way, I finally have time to work on the Minecraft exporter again. I dusted off the project a few days ago, and found that despite not working on the exporter since the Halloween update, it still mostly worked. It only required a few minor changes to exclude the Nether chunk files.

Refactoring the Code

I’ve never liked projects which require code changes to incorporate what are essentially configuration changes. The original version of the exporter contained custom code and a big set of ifs and switches to determine what geometry to load for a particular block. In other words, if the block IDs ever change or new objects are added, code changes and a re-compile would have been needed to include those changes into the exporter. So my first task this time around was to generalize the handling of the various Minecraft blocks and objects.

I removed all of the if torch do this else if door do this else if bleh code and replaced it with a new method that reads the block ids and custom object definitions out of a configuration file. The advantage of this is that changes to block IDs or objects now require a few changes to a text file instead of a re-compile. It also allows any users of the program to customize the output of the exporter to their needs. You can use the standard cube for objects like workbenches and furnaces, or replace them with custom objects of your own design. You can even do things like replace the bottom-most part of a tree with a custom object that has roots!

missing object test render

Error: torch geometry not found.

The new config system also allows you to test out your config file changes without needing the new custom objects immediately. If the exporter can’t find one of the objects specified in the config file, it will just put a nice big exclamation point in your scene in place of the object. That way, you can see at a glance which objects are missing. This test render shows a part of my world which contains numerous missing objects. I’m remaking all of the custom objects, including the torches and fences, which is why they show up as missing now.

Next Steps

Now that the geometry creation system has been revamped, the next step will be to create an actual interface around the test rig. Once that’s done, the utility should be ready for a beta release!