| Ben is | co-founder of Aria Glassworks, an augmented reality startup |
| formerly part of Yelp's Mobile Team, the creator of Monocle AR | |
| a graduate of the EE BS program at Stanford University | |
| formerly co-artistic director of Stanford Swingtime | |
| an alumnus of Microsoft Research Asia | |
| an alumnus of SwayLaw | |
| an alumnus of the Rotary Youth Exchange Program to Taipei | |
| on twitter @newhouseb |
It all started with this tweet by Robert Scoble on July 14th, 2009:
At the time, Yelp was not working on such a feature - no one was really sure what Scoble was talking about. While the team shrugged it off, I took it as a challenge from Scoble (especially considering I'd never really developed anything for the iPhone or done any OpenGL). I went home a couple nights later and hacked away from 8pm to 5am on my own interpretation of a Yelp Augmented Reality using Yelp's public APIs. To my surprise, I had a working demo (albeit ugly) a mere two days later. I started showing it to some other Yelp developers during my lunch break and soon enough word got to the rest of the team and I was busy porting the code I wrote in my apartment at 4am into Yelp's iPhone codebase.
After a couple weeks of polishing, we shipped it hidden as an easter egg in our v3.0 of the iPhone app (if you're curious, contrary to popular belief, we used no private APIs). Given the shortened development cycle, it still wasn't as polished as the rest of the app so we decided to hide it for all but the tech savvy users who would find it cool.
After weeks of waiting for the App Store to approve our app it was finally released, and a day later, none other than Robert Scoble himself posted to friendfeed the following message:
"New Yelp iPhone app is also out. There's a cool easter egg (Augmented Reality feature). Here's how to do it:..."
With that, the race was on. Within minutes Mashable had cobbled together a demo video, with RWW soon to follow. A few hours later, it was a trending topic on twitter.
If you're not familiar with the concept, here's a quick (eventually in focus!) video from Steve Garfield:
In the months that followed, we released a version of the Monocle for 3.1 and received a constant supply of press (that is surprisingly still coming). Here's an abbreviated list of top media stories
When I'm not coding, there's a decent probability I'm swing dancing. I've been dancing since I was a kid, and since coming to Stanford I joined Swingtime, a Lindy Hop performance troupe. During my final year, I've taken over as a co-artistic director and thus end up teaching a lot of Choreography (which I admittedly love doing).
The problem is - Choreography is incredibly hard to convey by any means other than having someone dance in front of you. Up until recently, many of us have relied on excel spreadsheets with counts marked out explicitly. Lining up these counts with the actual music secondhand is quite a challenge.

I put together this simple Django app with the aid of the jPlayer plugin for jQuery (which I modified to always use Flash instead of HTML5 Audio, which is buggy). As the song plays, the app shows the corresponding move. It's simple, effective, and by no means done. It's not particularly high on my priorities, but expect slow progress.
Between my (now retired) Mobile Safari handwriting recognition site, iFone Chinese, and the assortment of random tools online, surfing the Chinese internet is not only a mental exercise but also an exercise in alt-tabbing between different translation tools.
One weekend at Microsoft Research Asia, I was gearing up for a heavy weekend of coding. Alas, a pipe broke in the data center (apparently?) and all the servers I needed were taken offline. I had to put my pent-up momentum into something, so I decided to create a bookmarklet that could annotate Chinese characters on any domain with their pronunciation, and Add Pinyin was born (it was actually originally called Pinyinify).
The idea is that on a Chinese page you click a bookmarklet and the pronunciations are added alongside each character. Try clicking here and watching the text below
Add Pinyin has two possible underlying engines - a PHP engine that references a memory-mapped file and a C server that serves requests asynchronously from an in-memory dictionary. The C server can serve around 10,000 requests per second, but (obviously) requires more maintenance than a stateless PHP script. At the moment, the C server is tucked away until traffic spikes.
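The core of either engine is a per-character dictionary lookup. Here is a minimal sketch of that idea in Python, not the actual PHP or C code: the four-entry table and the `add_pinyin` name are made up for illustration, and a real engine would load a complete character-to-pinyin table.

```python
# Hypothetical four-entry table; a real engine would hold a full
# character-to-pinyin dictionary in memory.
PINYIN = {"你": "nǐ", "好": "hǎo", "中": "zhōng", "文": "wén"}


def add_pinyin(text, table=PINYIN):
    """Append the pronunciation in parentheses after each known character."""
    annotated = []
    for ch in text:
        reading = table.get(ch)
        # Characters missing from the table pass through untouched.
        annotated.append(f"{ch}({reading})" if reading else ch)
    return "".join(annotated)


print(add_pinyin("你好, world"))
```

Anything outside the table (Latin text, punctuation) is left alone, which is what lets the same lookup run over arbitrary pages.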
A simple webapp that only allows scheduling two states: "busy" and "not busy". I actually created this to help schedule interviews. Shadow Calendar links each user to an image that can be embedded in an e-mail, allowing for scheduling changes to occur after the e-mail is sent. My current calendar looks as follows:
This was part of a class project (for the intro biotech class at Stanford strangely enough) in which I attempted to first explore the computational power of genetic algorithms, and then use the collective power of the internet to evolve pictures of animals from collections of polygons.
In doing so I designed a numerical DNA that closely mimicked actual DNA in that it was made up of codons (including stop codons), which allowed for selective expression of each polygon. In searching for a photo of a dog, a surprisingly accurate picture converged, then later diverged into noise after someone went crazy on it.
It is currently disabled until I figure out how to host multiple Django sites on one Apache instance with mod_wsgi (I keep hitting cache interference issues)
Back before there were native iPhone apps, there were only Mobile Safari apps. Using a Japanese handwriting recognition library with Python bindings (Tomoe) I wrote a simple server to classify a group of lines as Chinese characters. Because one cannot drag on the iPhone (because such a gesture is interpreted as scrolling), I designed an interface in which single taps either started or continued a stroke, and double taps ended a stroke. In addition to just character matching, pronunciation and definitions were added for each character
This was retired when Apple embedded native character recognition into the iPhone firmware
Effectively Doodle, before I knew what Doodle was. The differentiating feature was that you created events by CC'ing tellme@whenareyoufree.net, and the service would send links with the scheduling pages to all your friends automatically via e-mail (based on who else was CC'ed). Pieces were salvaged and used in Shadow Calendar. Retired because it was redundant
Jorssis was my first real foray into web development, which began in Mo!relax (a coffee shop) in Taipei. It evolved into many things, but was originally designed as a fluid platform to save and store notes. I configured IM, SMS, e-mail, a web-interface and simple callbacks via setting the entire query string as a note to input bits of information into the system. With RSS and a web interface to get the data back out, I ended up creating quite a versatile personal information broker. I later integrated regexes to parse out links and other types of data to allow for automatic detection of metadata. It's still up, but slowly dying as the host upon which it is hosted slowly upgrades through various PHP versions, breaking compatibility.
Essays are timeless, blogs are not. So I write essays instead of blogging.
January 2010 New Year Resolutions
May 2009 Evolution in 40 lines of code
One of the most common arguments used when people debate evolution (in my opinion it’s not much of a debate), is that there is simply too much information contained in the genome to “randomly” show up. A common misconception is that one generation may succeed because of one singular mutation. Somewhat logically, the next step is to assume that each of the roughly 3 billion base pairs in the human genome resulted from billions of generations, one at a time. The Earth has only been around for about 4.5 billion years, life much less, so on a first pass, the creationists might seem to have a point. The key, though, is realizing that mutations happen in parallel and compound really, really fast.
During gene duplication, each gene has a certain probability of mutating. In reality, there are endless variables that determine this probability, but DNA is assembled one base pair (0.5 second Bio review, a base pair is one of AT, CG, GC, and TA) at a time, so one mutation upstrand shouldn’t affect base pairs downstrand. Turns out that this greatly reduces generations required to get to a “fit” goal gene.
All in all it’s pretty simple, and, per the suggestion of a book my brother and father recently read (I can’t think of the title), you can simulate evolution in quite a terse chunk of code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import string, random
goal = "To be or not to be, that is the question"
gen_size = 100
gen_count = 100000000
bases = ''.join(set(list(goal)))
# Our Adam/Eve
seed = {'gene': ''.join([random.choice(bases) for i in xrange(len(goal))]),
        'fitness': 0}
print 'Start: ' + seed['gene'] + "(" + str(seed['fitness']) + ")"
# For each generation
for i in xrange(gen_count):
    children = []
    # Make each baby
    for j in xrange(gen_size):
        child = {'gene': '', 'fitness': 0}
        # Copy each letter
        for k in xrange(len(seed['gene'])):
            nextbase = seed['gene'][k]
            # Mutation!
            if random.random() > 0.999:
                nextbase = random.choice(bases)
            # Add base to gene and calculate fitness boost
            child['gene'] += nextbase
            if nextbase == goal[k]:
                child['fitness'] += 1
        # Append each child to the generation
        children.append(child)
    # Choose a winner
    newseed = max(children,key=(lambda x : x['fitness']))
    if newseed != seed:
        print '#' + str(i) + ' ' + newseed['gene'] + \
            "(" + str(newseed['fitness']) + "/" + str(len(goal)) + ")"
    seed = newseed
    # Finish if we're done
    if seed['gene'] == goal:
        break
What this does is start with a random string of letters and then, for each generation, make 100 copies of the previous winning species, with a finite probability of mutation for each base pair (for simplicity I only consider point mutations). At the end of each generation, it evaluates the fitness of each child by comparing it to the goal string and selects the “most fit” to father the next generation. An important point in making this realistic is that no base pair is immune from mutation. That means that if all of the children randomly progress backwards, then the species as a whole progresses backwards.
So running this script, we find the output to be:
Start: iTmZ bReNyaQDtJWPbBkZSQmJPYNjCerARSqEPNE(0) #0 iTmZ bReNyaQDtJWPbBkZSQmJPYNjCerARSqEPNE(2/40) #2 iTmZ nReNyaQDtJWPbBkZSQmJPYNjCerARSqEPNE(2/40) #23 iTmZ nReNyaQDtJWPbBkZSQmJPYNjCerARSsEPNE(3/40) #29 iTmZ nReNyaQDtJWPbBktSQmJPYNjCerARSsEPNE(4/40) #40 iTmZ nReNyaQ tJWPbBktSQmJPYNjCerARSsEPNE(5/40) #53 iTmZ nReNyaQ tJWPbBktSQtJPYNjCerARSsEPNE(6/40) #62 iTmZ nReNyaQ tJWPbBktSQtJPYNtCerARSsEPNE(7/40) #70 iTmZ nReNyQQ tJWPbBktSQtJPYNtCerARSsEPNE(7/40) #71 iTmZ nReNyQQ tJWPbB tSQtJPYNtCerARSsEPNE(8/40) #78 iTmZ nReNyQQ tJWPbB tSQtJPYNtherARSsEPNE(9/40) #87 iTmZ nReNyQQ tJWPbB tSQtJPYNtherARSstPNE(10/40) #99 iTmZ nReNyQQ tJWPbB tSQtJPYNtherARSstPoE(11/40) #107 iTmZ nReNyQQ tJWPb, tSQtJPYNtherARSstPoE(12/40) #111 iTmZ nReNyQQ toWPb, tSQtJPYNtherARSstPoE(13/40) #140 iomZ nReNyQQ toWPb, tSQtJPYNtherARSstPoE(14/40) #161 iomZ nReNiQQ toWPb, tSQtJPYNtherARSstPoE(14/40) #164 iomZ nReNiQQ toWPe, tSQtJPYNtherARSstPoE(15/40) #185 domZ nReNiQQ toWPe, tSQtJPYNtherARSstPoE(15/40) #205 domZ nReNiQQ toWPe, tSJtJPYNtherARSstPoE(15/40) #217 domZ nReNiQQ toWPe, tSatJPYNtherARSstPoE(16/40) #234 domZ noeNiQQ toWPe, tSatJPYNtherARSstPoE(17/40) #249 domZ oeNiQQ toWPe, tSatJPYNtherARSstPoE(18/40) #278 domZ oeNiQQ toWPe, tSatJPYNtherAuSstPoE(19/40) #285 domZ oeNiQQ toWPe, tSatJkYNtherAuSstPoE(19/40) #297 domZ oeNiQQ toWPe, thatJkYNtherAuSstPoE(20/40) #314 domZe oeNiQQ toWPe, thatJkYNtherAuSstPoE(21/40) #315 domZe oeNYQQ toWPe, thatJkYNtherAuSstPoE(21/40) #365 domZe oeNYQQ toWPe, thatJiYNtherAuSstPoE(22/40) #372 TomZe oeNYQQ toWPe, thatJiYNtherAuSstPoE(23/40) #386 TomZe oeNYQj toWPe, thatJiYNtherAuSstPoE(23/40) #424 TomZe oeNYQc toWPe, thatJiYNtherAuSstPoE(23/40) #431 TomZe oe YQc toWPe, thatJiYNtherAuSstPoE(24/40) #432 TomZe oe YQc toWPe, thatJiYNtherAuSstPoY(24/40) #461 TomZe oe YQc toWPe, thatJiYNthebAuSstPoY(24/40) #462 TomZe oe YQc toWPe, thatJiYNthebquSstPoY(25/40) #481 TomZe oe YQc totPe, thatJiYNthebquSstPoY(25/40) #486 TomZe oe YQc totPe, thatJiYNthebquestPoY(26/40) #613 TomZe oe 
Yoc totPe, thatJiYNthebquestPoY(27/40) #647 TomZe or Yoc totPe, thatJiYNthebquestPoY(28/40) #695 TomZe or Yoc totPe, thatJiYNthebquestPon(29/40) #726 Tombe or Yoc totPe, thatJiYNthebquestPon(30/40) #727 Tombe or Yoc totbe, thatJiYNthebquestPon(31/40) #729 Tombe or Yoc totbe, thatJiYLthebquestPon(31/40) #830 Tombe or noc totbe, thatJiYLthebquestPon(32/40) #855 Tombe or not totbe, thatJiYLthebquestPon(33/40) #879 Tombe or not totbe, thatJiYLthebquestion(34/40) #880 Tombe or not totbe, thatJiYLthe question(35/40) #899 Tombe or not tojbe, thatJiYLthe question(35/40) #978 Tombe or not to be, thatJiYLthe question(36/40) #1108 Tombe or not to be, thatJipLthe question(36/40) #1276 To be or not to be, thatJipLthe question(37/40) #2171 To be or not to be, thatJisLthe question(38/40) #2383 To be or not to be, thatMisLthe question(38/40) #2543 To be or not to be, thatMisethe question(38/40) #2714 To be or not to be, thatMis the question(39/40) #3096 To be or not to be, thatfis the question(39/40) #3577 To be or not to be, that is the question(40/40)
So, as you can see, we started with the random string “iTmZ bReNyaQDtJWPbBkZSQmJPYNjCerARSqEPNE”, and after 3577 generations of 100 children with a 0.1% chance of mutation when copying each letter/base pair, we arrived at exactly the desired result. Doing some quick math, 3577*100 = 357,700 total copies of the DNA were explored (or alternatively, children born). For a completely random search to find our desired phrase “To be or not to be, that is the question”, it would have to exhaustively search 28^40 (~=7.7*10^57) solutions. That is a reduction of roughly 52 orders of magnitude! This is the beauty of evolution.
You can play around with the script if you want, changing the mutation probability and generation size. For longer “genes”, a lower mutation probability is better, while a higher probability is better for shorter genes. You could even evolve these parameters alongside the genes themselves!
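To make that kind of experimenting easier, here is a hedged sketch of the same loop as a Python 3 function with the knobs exposed as parameters. The `evolve` name, the `max_generations` cap, and the seedable `rng` argument are additions for illustration, not part of the script above.

```python
import random


def evolve(goal, gen_size=100, mutation_rate=0.001,
           max_generations=1_000_000, rng=None):
    """Return (generations_used, final_gene) for the point-mutation model."""
    rng = rng or random.Random()
    bases = sorted(set(goal))  # sorted so a seeded run is reproducible

    def fitness(gene):
        # Number of positions that already match the goal.
        return sum(a == b for a, b in zip(gene, goal))

    seed = "".join(rng.choice(bases) for _ in goal)
    for generation in range(max_generations):
        if seed == goal:
            return generation, seed
        # Every child is a copy of the current winner; each base may mutate.
        children = [
            "".join(rng.choice(bases) if rng.random() < mutation_rate else base
                    for base in seed)
            for _ in range(gen_size)
        ]
        # The fittest child always replaces the parent, even if it is worse.
        seed = max(children, key=fitness)
    return max_generations, seed


generations, gene = evolve("to be", gen_size=50, mutation_rate=0.05,
                           rng=random.Random(0))
print(generations, repr(gene))
```

Short goals like the one in the usage line tolerate a much higher mutation rate than the 40-character phrase does, which is an easy way to see the trade-off for yourself.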
There are a lot of fun things you can “evolve” with this script. Here’s a few things I evolved from random garbage data:
For anything beyond string comparisons, by far, the hardest part of “evolving” data is creating good tests for fitness. My third example was actually heavily reduced from a first attempt to “evolve” a valid C program. The problem was, I couldn’t think of an easy way to rank compilation errors in terms of fitness. For my third example, since effectively the only errors were syntax errors and python tells you at which character it dies, I used the depth by which python parsed before dying as my fitness test. So in general, if you can figure out how to phrase a problem in terms of a fitness test, evolution could potentially save you a lot of work.
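A parse-depth fitness test along those lines might look like the sketch below. This is a reconstruction under assumptions, not the code I actually ran: it leans on the `lineno` and `offset` attributes of Python 3's `SyntaxError`, and the `parse_depth_fitness` name is made up.

```python
def parse_depth_fitness(source):
    """Count how many characters Python accepts before its first syntax error."""
    try:
        compile(source, "<gene>", "exec")
    except SyntaxError as err:
        lines = source.splitlines(keepends=True)
        line_index = max((err.lineno or 1) - 1, 0)
        column = max((err.offset or 1) - 1, 0)
        depth = sum(len(line) for line in lines[:line_index]) + column
        # Clamp so a broken program never outscores a valid one of equal length.
        return min(depth, len(source) - 1)
    except ValueError:
        # Some Python versions reject null bytes with ValueError instead.
        return 0
    return len(source)


print(parse_depth_fitness("x = 1\n"))
print(parse_depth_fitness("x = 1\ny = = 2\n"))
```

The exact column reported for an error can differ between Python versions, so scores are only comparable within a single run.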
I remember from somewhere, that someone used evolution to design a radio antenna, and it turned out the antenna was better than anything a human could have conceived. Now I just need to “evolve” my homework…
Here are a number of things that are either quick (useful) hacks or things friends of mine have asked to use.
Creating a personal website normally requires either an immensely complex folder tree of HTML files or an overblown CMS that does far more than the average user needs.
Extend XHTML so that a single file can describe not only a single page, but an entire website. SWS merely requires PHP5, and can optionally use mod_rewrite for pretty URLs. You don't even need access to a database.
<site title="John Doe" template="default.tmpl" css="main.css">
  <page link="" name="Home">
    <header>My Home Page</header>
    <content>
      <p>Hello World!</p>
    </content>
  </page>
  <page link="Contact me">
    <header>To contact me</header>
    <content>
      <p>choose a medium</p>
    </content>
    <subpages>
      <page link="Email">
        <header>Email</header>
        <content>
          <p>superhappyfuntime@me.com</p>
        </content>
      </page>
      <page link="IM">
        <header>IM</header>
        <content>
          <p>IM superhappyfuntime</p>
        </content>
      </page>
    </subpages>
  </page>
</site>
<html>
  <head>
    <title />
  </head>
  <body>
    <div id="alllinks" />
    <h3 id="header" />
    <div id="content" />
  </body>
</html>
main.css - Your run of the mill CSS stylesheet (empty for this example)
Fall quarter of my sophomore year at Stanford (2007), I took CS145 - Introduction to Databases. This class spent about half of its time on RDBMS and half its time on XML. I understood why we were covering RDBMS, but I had never actually seen any usage of XML beyond XHTML and nasty Java configuration files, and no real examples of XPath being used anywhere. At this point in time I was using a shared hosting provider that provided little beyond PHP5, MySQL, and PHPMyAdmin (i.e. no ssh). I imagined that other people were often not even fortunate enough to have access to an SQL database. My website needed updating and thus I decided to build my website around an XML document, queried with XPath, and hence SWS was born.
SWS happily kept my personal website up for a couple years and eventually enough people told me I should open source it that I hunkered down for a couple days over Christmas 2009 and completely rewrote it from scratch to allow for a lot more advanced features (namely recursive page trees), and then released it. And here it is, under a BSD license (a la Django).
It's OK, I forgive you. Just add these three (easy!) steps
(1) For my Stanford web space my filepath is "WWW" and my urlpath is "/~benzn". (2) I visit my staging environment (at http://stanford.edu/~benzn/cgi-bin/SWS/), and (3) I run:
myth:~$ cp ~/cgi-bin/SWS/WWW/* ~/WWW
Ideally, I could set my filepath to "../../WWW", and skip this command, but something about Stanford's arcane file permissions is throwing up a roadblock at the moment. I will try to fix it soon!
Here's a few ground rules and bits of important information
template.sws must contain a root <site> node. A root node must contain a template attribute pointing to an HTML template. The <site> node must contain at least one page. Variables can be "defined by default" by declaring them in the <site> node. Pages are not strictly required to contain anything, but can contain variable definitions and a <subpages> tag in which sub pages are defined and potentially default variables for the sub pages are defined. Thus the node types are as follows:
<site>
Attributes:
template = The HTML template to parse
filepath (optional - static compilation) = The root directory on the filesystem to compile static html files to
urlpath (optional - static compilation) = The site URL path to the static files
css (optional) = The css stylesheet to include
title (optional) = The default title to be prepended to all page titles
Contains: <page> tags and <*> tags.

<page>
Attributes:
link = The address of this page relative to its parent (URLs are case insensitive and a space matches an underscore)
name (optional) = A name to display in autogenerated links
hidden (optional) = If "true", the page is never displayed in any menus
ref (optional) = An external url to reference, for links in menus to external web sites
title (optional) = The title to be appended to all page titles
Contains: a <subpages> tag and <*> tags.

<subpages>
Attributes: None
Contains: <page> tags and <*> tags.

<*> (any other tag, i.e. a variable definition)
Attributes: None
Contains: The arbitrary HTML to replace this variable with when found in the html template and other variable definitions
In the first pass through template.sws, all tags that are children of <site> or <page> that are not named <page> or <subpages> are treated as variables in which their value is the inner value of the given element. Variables are scoped according to depth and the selected page. Only nodes in the line of ancestors from the requested page are collected from and deeper definitions take precedence over shallower definitions. A few variables are automagically created for your convenience (see Automagic Variables).
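The scoping rule can be sketched in a few lines of Python (SWS itself is PHP, so this is an illustration of the rule rather than the real implementation). The function names and the site-level `<header>` default in the sample document are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

STRUCTURAL_TAGS = {"page", "subpages"}


def normalize(link):
    # URLs are case insensitive and a space matches an underscore.
    return link.lower().replace(" ", "_")


def inner_value(element):
    # Inner markup of the element: leading text plus serialized children.
    parts = [element.text or ""]
    parts += [ET.tostring(child, encoding="unicode") for child in element]
    return "".join(parts).strip()


def own_variables(node):
    return {child.tag: inner_value(child)
            for child in node if child.tag not in STRUCTURAL_TAGS}


def collect_variables(site, path):
    """Walk from <site> down to the requested page; deeper definitions win."""
    variables = own_variables(site)
    container = site
    for part in path:
        if container is None:
            raise KeyError(part)
        if container is not site:
            # Defaults declared directly inside <subpages>.
            variables.update(own_variables(container))
        page = next((p for p in container.findall("page")
                     if normalize(p.get("link", "")) == normalize(part)), None)
        if page is None:
            raise KeyError(part)
        variables.update(own_variables(page))
        container = page.find("subpages")
    return variables


SITE = ET.fromstring("""
<site title="John Doe" template="default.tmpl">
  <header>Default header</header>
  <page link="" name="Home"><content><p>Hello World!</p></content></page>
  <page link="Contact me">
    <header>To contact me</header>
    <subpages>
      <page link="Email"><content><p>superhappyfuntime@me.com</p></content></page>
    </subpages>
  </page>
</site>
""".strip())

print(collect_variables(SITE, ["contact_me", "email"]))
```

In the usage line, the Email page defines no `header` of its own, so it inherits the one from its parent page rather than the site-wide default.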
For convenience a few variables are automatically populated
alllinks - All links as <a> tags, collected into <div> tags by level in the navigation tree.
links - Only the top level links
sub(*n)links - Links at level (n+1)
breadcrumb - A list of <span> tags that contain each section of the URL
path - Only the top level part of the URL
sub(*n)path - Path at level (n+1)
Next, each tag in the template is replaced by its variable contents, and then the variable contents are parsed for potential replacements. Note that if you reference a variable within its definition, it will result in an infinite loop (so don't do this). The replacement rules are as follows. Assuming "foo" is defined as "bar":
| Source | Replacement | |
| <div id="foo" /> | <div id="foo">bar</div> | |
| <foo /> | bar |
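The two rules in the table can be approximated with a pair of regular expressions. The Python below is a rough sketch and not the PHP that SWS runs: the `render` name, the patterns, and the `max_passes` cap (which turns a self-referencing variable into an error instead of an infinite loop) are all assumptions for illustration.

```python
import re

# <tag ... id="name" ... />  becomes  <tag ... id="name" ...>value</tag>
ID_TAG = re.compile(
    r'<(?P<tag>\w+)(?P<attrs>[^<>]*\bid="(?P<name>[^"]+)"[^<>]*?)\s*/>')
# <name />  becomes  value
NAME_TAG = re.compile(r'<(?P<name>\w+)\s*/>')


def render(template, variables, max_passes=20):
    def fill_id(match):
        name = match.group("name")
        if name not in variables:
            return match.group(0)  # not a variable, leave the tag alone
        tag = match.group("tag")
        return f"<{tag}{match.group('attrs')}>{variables[name]}</{tag}>"

    def fill_name(match):
        return variables.get(match.group("name"), match.group(0))

    text = template
    for _ in range(max_passes):
        # Re-scan after every pass so variable contents are expanded too.
        expanded = NAME_TAG.sub(fill_name, ID_TAG.sub(fill_id, text))
        if expanded == text:
            return text
        text = expanded
    raise RecursionError("a variable appears to reference itself")


print(render('<div id="foo" /> and <foo />', {"foo": "bar"}))
```

Tags that do not name a variable, such as `<br />`, pass through unchanged.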
See the example here or the template.sws and html template for this website
Hello World!
choose a medium
superhappyfuntime@me.com
superhappyfuntime
This was a quick project started and finished at a Super Happy Dev House. The gist is that I used to do a bunch of problem sets which required a combination of computation and explanation. If I can do something on a computer, I will, so I ended up writing most of my problem sets in LaTeX. I thought it would be great if I could combine the two into one process, minimizing the amount of copying and pasting required.
PHP's inlining into HTML is quite convenient for quick and dirty jobs (as much as I hate actually coding in PHP), so I thought I would come up with a similar scheme for Matlab and LaTeX and MatTex was born. Here's a contrived, but illustrative example:
\documentclass{article}
\begin{document}
Hello World. Recently I found out that 1 + 1 =
<?ml
1 + 1
?>
\end{document}
A particularly handy function in Matlab is “latex”, which formats symbolic expressions in LaTeX. One example of this is printing a matrix:
\documentclass{article}
\begin{document}
\begin{displaymath}
<?ml
latex(sym([1,2,3,4;5,6,7,8;9,10,11,12]))
?>
\end{displaymath}
\end{document}

For anyone who’s done matrices by hand in LaTeX, this is a lot less painful.
It's all quite a hack - It’s more or less a bash script which runs some perl regular expressions to parse out the Matlab, runs the Matlab, uses python to paste the Matlab back into the LaTeX (I would use perl, but my knowledge does not extend beyond regex one-liners), then runs your LaTeX distro of choice (I use TeXShop on my MacBook).
If you want to take a look yourself, I’ve put the code up at github under an MIT License. A word of warning, I coded it amidst heavy socializing all afternoon, so it's not exactly my best work. At the moment it requires that the ‘<?ml’ and ‘?>’ reside on their own lines. You’re welcome to fix it though and commit said fix to github!
The syntax is as follows:
./mattex.sh [your tex file]
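The extract, run, paste-back step can be sketched in Python as follows. This is not the bash/perl/python script itself: the `expand` name and the regular expression are assumptions, and `stand_in_matlab` is a toy evaluator standing in for a real Matlab call.

```python
import re

# Everything between "<?ml" and "?>", across lines.
ML_BLOCK = re.compile(r"<\?ml(.*?)\?>", re.DOTALL)


def expand(tex_source, run_matlab):
    """Replace each <?ml ... ?> block with the output of run_matlab(code)."""
    # A function replacement keeps LaTeX backslashes from being treated
    # as regex escape sequences.
    return ML_BLOCK.sub(lambda m: run_matlab(m.group(1).strip()), tex_source)


def stand_in_matlab(code):
    """Toy evaluator: only understands sums of integers such as "1 + 1"."""
    return str(sum(int(term) for term in code.split("+")))


document = r"""\documentclass{article}
\begin{document}
Hello World. Recently I found out that 1 + 1 =
<?ml
1 + 1
?>
\end{document}
"""
print(expand(document, stand_in_matlab))
```

Because the pattern does not anchor on line boundaries, this version also accepts blocks written inline.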
ImpTran is a little script I wrote a few months ago for impedance transforms. What the heck is an impedance transform you ask? Let me attempt to explain in a completely non-technical way:
Say you are standing outside of a large crowd of people and you want to get inside the crowd for whatever reason. It’s a lot easier to move around outside the crowd than inside the crowd simply because there are more people inside the crowd. In the first attempt to get into the crowd, you run at it only to be bounced back, left still outside the circle and frustrated. What could you do to make it easier to get into the crowd? Well you could request that people near the edges leave, making the gradient into the crowd softer. This however would require people leaving, which is no fun at all. You could, however, require that people shuffle around and allow people into the crowd at a specified rate, this way people entering the crowd would face no resistance in the transition in.
Now, there are a lot of inconsistencies in that analogy, but the general picture is there. Often in analogue electronics you want to reduce reflections between certain circuit elements (the crowd and the not crowd), and you want to do this by dissipating as little power as possible (i.e. not asking people to leave). So what you can do is, instead of using resistors, which burn power, you can use special configurations of capacitors and inductors to “match” the impedance (resistance with an imaginary component) between different elements.
Now normally, figuring out how to construct these capacitor and inductor circuits requires the use of a Smith Chart. A Smith Chart allows you to do a lot of complex math (and by complex I mean imaginary, not necessarily hard) by just tracing some arcs. It will tell you, for example, when to add an inductor in series and a capacitor in parallel. Trouble is, when doing actual impedance matching in lab, things never work. You measure the impedance of an antenna input to be 25 - 20iΩ, and you trace some lines and solder some capacitors and inductors, remeasure the impedance of the match and invariably it has destroyed some other match somewhere else in the circuit. Next you iteratively match back and forth through countless matches until you magically arrive at one that works and doesn’t disturb the other matches.
In the world of perfect models, you wouldn’t see such unpredictable changes, but the world isn’t perfect and so I, again being lazy, decided to write a script to do the Smith Chart stuff automatically given an impedance. It took me a few hours, and in the end there is some serious code duplication (thanks to matlab’s insistence on making a file for each function, which I refuse to do), but it works. One future feature would be to make the script perform a match with only a single part, because often that’s all you have room for and/or really need.
At any rate, the syntax looks as follows for the above 25 - 20iΩ impedance:
EDU>> imptran(25,-20)
ans =
First add inductor in series of 2.864789e-07H,
then add capacitor in parallel of 1.273240e-10F
OR
First add inductor in series of -3.183099e-08H,
then add inductor in parallel of 3.183099e-07H
You can find the code here.
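For the curious, the underlying math is a standard two-element L-match, sketched here in Python rather than Matlab. The 50 Ω target and the 25 MHz frequency are not stated anywhere above; they are assumptions that happen to reproduce the printed component values. Unlike the output above, this sketch reports a negative series reactance as a capacitor rather than as a negative inductance.

```python
import math


def reactance_part(x, omega):
    # Positive series reactance is an inductor, negative is a capacitor.
    if x >= 0:
        return ("inductor", x / omega)
    return ("capacitor", -1.0 / (omega * x))


def susceptance_part(b, omega):
    # Positive shunt susceptance is a capacitor, negative is an inductor.
    if b >= 0:
        return ("capacitor", b / omega)
    return ("inductor", -1.0 / (omega * b))


def l_match(r_load, x_load, z0=50.0, freq_hz=25e6):
    """Both series-then-parallel matches of r_load + j*x_load up to z0.

    z0 and freq_hz are assumed values, not taken from the original script.
    Only the r_load < z0 case is handled.
    """
    if not 0 < r_load < z0:
        raise ValueError("this sketch only covers 0 < r_load < z0")
    omega = 2 * math.pi * freq_hz
    q = math.sqrt(z0 / r_load - 1.0)
    solutions = []
    for sign in (1.0, -1.0):
        # Series element brings the total reactance to sign * q * r_load.
        x_series = sign * q * r_load - x_load
        # Shunt element cancels the susceptance that is left over.
        b_shunt = sign * q / z0
        solutions.append((reactance_part(x_series, omega),
                          susceptance_part(b_shunt, omega)))
    return solutions


for series, parallel in l_match(25, -20):
    print("series %s %.6e, parallel %s %.6e" % (series + parallel))
```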
| You can find me on | |
| HN | |
| Stack Overflow | |
| Yelp | |