<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-11162230</id><updated>2011-07-07T16:27:19.387-07:00</updated><title type='text'>Jeremy Siek</title><subtitle type='html'>Musing about programming languages and computer science.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://siek.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11162230/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://siek.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Jeremy Siek</name><uri>http://www.blogger.com/profile/13773635290126992920</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>3</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-11162230.post-7979529342038315210</id><published>2009-12-21T15:49:00.000-08:00</published><updated>2010-07-22T05:17:03.096-07:00</updated><title type='text'>The ECD Abstract Machine, A Programmer's Operational Semantics</title><content type='html'>&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;There are many different styles of operational semantics but my favorite is not very well known. Hence this post. While in graduate school, I took a course on type systems from Amr Sabry in which we studied a miniature version of SML and used the style of operational semantics that I'm about to write about. Amr didn't give a name to this style, so I'm calling it the ECD abstract machine.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;Why do I like the ECD machine? The ECD machine works a lot like a debugger. A debugger session has three components: a view of the source code for the currently executing procedure with the current position marked, a list of in-scope variables and their values, and a stack of the procedure calls. The ECD machine has the same three components.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;Historical aside: the ECD machine is closely related to the SECD virtual machine created by Peter Landin. The ECD machine drops the operand Stack and, instead of using bytecode instruction for the Control, it uses abstract syntax trees, making ECD an abstract machine instead of a virtual machine.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;In the following I'm going to write down what an ECD machine looks like for the lambda calculus. The grammar for the lambda calculus is given below (using the keyword "fun" instead of "lambda"). Note that function application is just two expressions next to each other, where the first is the function and the second is the argument. The id terminal is for identifiers (variable names).&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-style: italic;"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;expr&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; ::= &lt;/span&gt;&lt;span class="Apple-style-span" style="font-style: italic;"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;id&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; | &lt;/span&gt;&lt;span class="Apple-style-span" style="font-style: italic;"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;expr&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; &lt;/span&gt;&lt;span class="Apple-style-span" style="font-style: italic;"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;expr&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; | fun &lt;/span&gt;&lt;span class="Apple-style-span" style="font-style: italic;"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;id&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;. &lt;/span&gt;&lt;span class="Apple-style-span" style="font-style: italic;"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;expr&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; | &lt;/span&gt;&lt;span class="Apple-style-span" style="font-style: italic;"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;value&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-style: italic;"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;value&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; ::= (fun &lt;/span&gt;&lt;span class="Apple-style-span" style="font-style: italic;"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;id&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;. &lt;/span&gt;&lt;span class="Apple-style-span" style="font-style: italic;"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;expr&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;, &lt;/span&gt;&lt;span class="Apple-style-span" style="font-style: italic;"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;env&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;The only kind of value (the result of running the program) in the lambda calculus is a closure, which is the result of evaluating a function (a lambda). A closure is just a tuple containing a lambda and an environment. An environment (&lt;/span&gt;&lt;span class="Apple-style-span" style="font-style: italic;"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;env&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;) is a function from identifiers to values. Yes, this is a bit circular!&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;Unfortunately, the lambda calculus looks rather different from your typical imperative programming language, so that may make this particular ECD machine more difficult to understand to a reader not familiar with the lambda calculus or functional programming.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;First, a word about how to represent source code with a mark on the current position. Because we're dealing with an expression-oriented&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;language, the current position is not a line number but instead a sub-expression. So the current position can be visualized as a circle drawn around the next sub-expression to be evaluated. The traditional&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;way to represent this is with two pieces: the first piece is a data structure called an evaluation context that represents the source code outside the circle. The second piece is just the sub-expression inside the circle. The following is the grammar for evaluation contexts for the call-by-value version of the lambda calculus. The [] is the hole in the context, i.e., the location of the circle.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-style: italic;"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;eval-context&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; ::= [] | &lt;/span&gt;&lt;span class="Apple-style-span" style="font-style: italic;"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;eval-context expression&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; | &lt;/span&gt;&lt;span class="Apple-style-span" style="font-style: italic;"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;value eval-context&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;The function fill takes an evaluation context and an expression and returns the result of plugging the expression into the hole and then rebuilding the rest of the program. In the following we use lowercase e's for expressions and uppercase E's for evaluation contexts.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fill([], e) = e&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fill(E e2, e) = fill(E, e) e2&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fill(e1 E, e) = e1 fill(E,e)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;Next, let's describe the ECD abstract machine. As stated above, the ECD has three components. The first is an Environment, the second is the Control, which we will represented with an expression of the lambda calculus, and the third component, the strangely named Dump, is the call stack. The following are the reduction rules for the ECD abstraction machine. The variable x ranges over variables, s over stacks, and r over environments. Each reduction rule has a name given in parenthesis on the right-hand side.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;(VAR) (r, e, s) --&amp;gt; (r, fill(E, r(x)), s)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;          where e = fill(E,x)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;(LAM) (r, e, s) --&amp;gt; (r, fill(E, (fun x. e, r)), s)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;          where e = fill(E, fun x. e)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;(APP) (r, e, s) --&amp;gt; (r'[x:=v], e', (E,r) s)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;        where e = fill(E, (fun x. e', r') v)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;(RET) (r, e, (E,r') s) --&amp;gt; (fill(E,e), r', s)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;          where e is a value&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;The VAR rule handles the case of evaluating a variable by looking it up in the environment. The LAM rule evaluates a lambda into a closure, capturing the current environment in the second part of the closure. The APP rule starts a function call whereas the RET rule finishes a function call. Each element of the call stack is a tuple containing an evaluation context and an environment.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;Let's finish with an example:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;({}, (fun x. (fun y. x)) (fun z. z) (fun w. w), [])&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;(LAM) --&amp;gt; ({}, (fun x. (fun y. x), {}) (fun z. z) (&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; w. w), [])&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;(LAM) --&amp;gt; ({}, (&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; x. (&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; y. x), {}) (&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; z. z, {}) (&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; w. w), [])&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;(APP) --&amp;gt; ({x:=(&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; z. z, {})}, (&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; y. x), [ ([] (&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; w. w), {}) ])&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;(LAM) --&amp;gt; ({x:=(&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; z. z, {})}, (&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; y. x, {x:=(&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; z. z, {})}), [ ([] (&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; w. w),{}) ])&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;(RET) --&amp;gt; ({}, (&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; y. x, {x:=(&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; z. z, {})}) (&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; w. w), [])&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;(LAM) --&amp;gt; ({}, (&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; y. x, {x:=(&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; z. z, {})}) (&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; w. w, {}), [])&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;(APP) --&amp;gt; ({x:=(&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; z. z, {}), y:=(&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; w. w, {})}, x,  [([], {})])&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;(VAR) --&amp;gt; ({x:=(&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; z. z, {}), y:=(&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; w. w, {})}, (&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; z. z, {}), [([],{})])&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;(RET) --&amp;gt; ({}, (&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;fun&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; z. z, {}), [])&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;A parting question. Is the ECD machine space efficient with regards to tail-recursive functions? If not, how would you modify it to be space efficient?&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/11162230-7979529342038315210?l=siek.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://siek.blogspot.com/feeds/7979529342038315210/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=11162230&amp;postID=7979529342038315210' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11162230/posts/default/7979529342038315210'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11162230/posts/default/7979529342038315210'/><link rel='alternate' type='text/html' href='http://siek.blogspot.com/2009/12/ecd-abstract-machine-programmers.html' title='The ECD Abstract Machine, A Programmer&apos;s Operational Semantics'/><author><name>Jeremy Siek</name><uri>http://www.blogger.com/profile/13773635290126992920</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11162230.post-2611686767703234034</id><published>2009-12-04T22:41:00.000-08:00</published><updated>2009-12-10T21:37:14.273-08:00</updated><title type='text'>Greatest Common Divisor</title><content type='html'>Euclid's algorithm for computing the greatest common divisor of two integers is beautiful because it is extremely simple and also captures an interesting property of linear equations. The equation &lt;span style="font-style: italic;"&gt;ax+by = c&lt;/span&gt; has an integer solution if and only if &lt;span style="font-style: italic;"&gt;gcd(a,b)&lt;/span&gt; divides &lt;span style="font-style: italic;"&gt;c&lt;/span&gt;, where &lt;span style="font-style: italic;"&gt;gcd&lt;/span&gt; is Euclid's algorithm written below. Recall that &lt;span style="font-style: italic;"&gt;x&lt;/span&gt; divides &lt;span style="font-style: italic;"&gt;y&lt;/span&gt; means there exists some &lt;span style="font-style: italic;"&gt;n&lt;/span&gt; such that &lt;span style="font-style: italic;"&gt;xn = y&lt;/span&gt;.&lt;br /&gt;&lt;pre&gt;gcd(a,b) =&lt;br /&gt;  if a == 0 then&lt;br /&gt;     b&lt;br /&gt;  else if b == 0 then&lt;br /&gt;     a&lt;br /&gt;  else if b &amp;lt; a then&lt;br /&gt;     gcd(a - b, b)&lt;br /&gt;  else&lt;br /&gt;     gcd(a, b - a)&lt;br /&gt;&lt;/pre&gt;Proving that Euclid's algorithm really works is a good exercise in applying strong induction.  We are going to prove that &lt;span style="font-style: italic;"&gt;gcd(a,b)&lt;/span&gt; is the greatest common divisor of &lt;span style="font-style: italic;"&gt;a&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;b&lt;/span&gt;. To apply strong induction, we need to pick a number to do induction on. The numbers &lt;span style="font-style: italic;"&gt;a&lt;/span&gt; or &lt;span style="font-style: italic;"&gt;b&lt;/span&gt; are obvious candidates, but neither does the job. Consider the two branches of the "if" expression in &lt;span style="font-style: italic;"&gt;gcd&lt;/span&gt;. If we choose to do induction on &lt;span style="font-style: italic;"&gt;a&lt;/span&gt;, then the "else" branch will cause us trouble because we won't be able to apply the induction hypothesis for &lt;span style="font-style: italic;"&gt;gcd(a, b - a)&lt;/span&gt;. If we choose to do induction on &lt;span style="font-style: italic;"&gt;b&lt;/span&gt;, then we'll have the same kind of trouble in the "then" branch. We need some number that gets smaller in both branches. It turns out that &lt;span style="font-style: italic;"&gt;a + b&lt;/span&gt; is such a number.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Theorem (Correctness of gcd).&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;gcd(a,b)&lt;/span&gt; is the greatest common divisor of &lt;span style="font-style: italic;"&gt;a&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;b&lt;/span&gt;.&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Proof&lt;/span&gt;.&lt;br /&gt;We proceed by strong induction on &lt;span style="font-style: italic;"&gt;a + b&lt;/span&gt;. When trying to prove something about a function like &lt;span style="font-style: italic;"&gt;gcd&lt;/span&gt;, it often helps to structure your proof in a way that mimics the definition of the function. That is, we'll do case analysis in the proof in a way that matches the cases in the definition of &lt;span style="font-style: italic;"&gt;gcd&lt;/span&gt;.&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Case &lt;span style="font-style: italic;"&gt;a = 0&lt;/span&gt;:&lt;/span&gt;&lt;br /&gt;In this case &lt;span style="font-style: italic;"&gt;gcd(a,b) = b&lt;/span&gt;. We know that &lt;span style="font-style: italic;"&gt;b&lt;/span&gt; divides &lt;span style="font-style: italic;"&gt;0&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;b&lt;/span&gt; divides &lt;span style="font-style: italic;"&gt;b&lt;/span&gt;. Also, for any other divisor &lt;span style="font-style: italic;"&gt;d&lt;/span&gt; of &lt;span style="font-style: italic;"&gt;a&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;b&lt;/span&gt;, it is trivially true that &lt;span style="font-style: italic;"&gt;d&lt;/span&gt; divides &lt;span style="font-style: italic;"&gt;b&lt;/span&gt;. Thus, &lt;span style="font-style: italic;"&gt;gcd(a,b)&lt;/span&gt; is the greatest common divisor of &lt;span style="font-style: italic;"&gt;a&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;b&lt;/span&gt;.&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Case &lt;span style="font-style: italic;"&gt;not (a = 0) and b = 0&lt;/span&gt;:&lt;/span&gt;&lt;br /&gt;The reasoning is the mirror image of the previous case and left for the reader.&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Case &lt;span style="font-style: italic;"&gt;not (a = 0) and not (b = 0)&lt;/span&gt;:&lt;/span&gt;&lt;br /&gt;Without loss of generality, assume that &lt;span style="font-style: italic;"&gt;b &amp;lt; a&lt;/span&gt;. Then &lt;span style="font-style: italic;"&gt;gcd(a,b) = gcd(a - b, b)&lt;/span&gt;. Note that &lt;span style="font-style: italic;"&gt;(a - b) + b &amp;lt; a + b&lt;/span&gt;. So by the induction hypothesis we know that&lt;span style="font-style: italic;"&gt; gcd(a - b, b)&lt;/span&gt; is the greatest common divisor of &lt;span style="font-style: italic;"&gt;a - b&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;b&lt;/span&gt; and so is its equal, &lt;span style="font-style: italic;"&gt;gcd(a,b)&lt;/span&gt;. Because &lt;span style="font-style: italic;"&gt;gcd(a,b)&lt;/span&gt; divides both &lt;span style="font-style: italic;"&gt;a - b&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;b&lt;/span&gt;, &lt;span style="font-style: italic;"&gt;gcd(a,b)&lt;/span&gt; divides &lt;span style="font-style: italic;"&gt;a&lt;/span&gt;, so &lt;span style="font-style: italic;"&gt;gcd(a,b)&lt;/span&gt; is a common divisor of &lt;span style="font-style: italic;"&gt;a&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;b&lt;/span&gt;. To finish we need to show it is the greatest. Assume &lt;span style="font-style: italic;"&gt;d&lt;/span&gt; is an arbitrary common divisor of &lt;span style="font-style: italic;"&gt;a&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;b&lt;/span&gt;. Then &lt;span style="font-style: italic;"&gt;d&lt;/span&gt; divides &lt;span style="font-style: italic;"&gt;a - b&lt;/span&gt; and because &lt;span style="font-style: italic;"&gt;gcd(a,b)&lt;/span&gt; is the greatest common divisor of &lt;span style="font-style: italic;"&gt;a - b&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;b&lt;/span&gt;, we can conclude that &lt;span style="font-style: italic;"&gt;d&lt;/span&gt; divides &lt;span style="font-style: italic;"&gt;gcd(a,b)&lt;/span&gt;. We therefore proved that &lt;span style="font-style: italic;"&gt;gcd(a,b)&lt;/span&gt; is the greatest common divisor of &lt;span style="font-style: italic;"&gt;a&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;b&lt;/span&gt;.&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;QED&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;A proof of this theorem in Isabelle can be found &lt;a href="http://ecee.colorado.edu/%7Esiek/GreatestCommonDivisor.thy"&gt;here&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/11162230-2611686767703234034?l=siek.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://siek.blogspot.com/feeds/2611686767703234034/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=11162230&amp;postID=2611686767703234034' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11162230/posts/default/2611686767703234034'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11162230/posts/default/2611686767703234034'/><link rel='alternate' type='text/html' href='http://siek.blogspot.com/2009/12/greatest-common-divisor.html' title='Greatest Common Divisor'/><author><name>Jeremy Siek</name><uri>http://www.blogger.com/profile/13773635290126992920</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11162230.post-6327934390542037619</id><published>2009-11-30T01:12:00.000-08:00</published><updated>2010-02-08T13:47:08.641-08:00</updated><title type='text'>Strong Induction</title><content type='html'>&lt;blockquote&gt;&lt;/blockquote&gt;Induction, in one form or another, is the main tool that computer scientists use to prove properties of the systems that they build. Induction has many forms, some of which are much more suitable to certain situations than others, so it's a good idea to know the many different forms.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Curiously enough, most of the different forms of induction boil down to good old mathematical induction. So in some sense, all one really needs to learn is mathematical induction. Nevertheless, the "boiling down" of the different forms to mathematical induction is not completely trivial, so it still makes sense to learn the others.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Recall that mathematical induction goes as follows:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If you want to prove something about all the natural numbers, &lt;span style="font-style: italic;"&gt;forall n, P(n)&lt;/span&gt;, it suffices to prove that&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li style="font-style: italic;"&gt;P(0)&lt;/li&gt;&lt;li style="font-style: italic;"&gt;forall k, P(k) implies P(k+1)&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div&gt;Mathematical induction is directly applicable in many situations, but it also falls down in some cases. For example, suppose you want to prove some property about binary trees and try to do induction on the height of the tree. You'd like to use the induction hypothesis to prove the property in question for the sub-trees. However, the height of each sub-tree is not necessarily one less than the height of the current tree (the height could be even less). Thus, you wouldn't be able to apply the induction hypothesis to each sub-tree.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In this situation, strong induction (a.k.a. complete induction or course of values induction) is a a much better fit (and structural induction is an even better fit, but I'll wait to talk about that). Recall that strong induction goes as follows:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;To prove a property about all integers, &lt;span style="font-style: italic;"&gt;forall n, P(n)&lt;/span&gt;, it suffices to prove that for any &lt;span style="font-style: italic;"&gt;k&lt;/span&gt;, if for any m where &lt;span style="font-style: italic;"&gt;m &amp;lt; k&lt;/span&gt;, you have &lt;span style="font-style: italic;"&gt;P(m)&lt;/span&gt;, then &lt;span style="font-style: italic;"&gt;P(k)&lt;/span&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So the nice thing about strong induction is that you get to assume that the property is true of all the natural numbers less than &lt;span style="font-style: italic;"&gt;k&lt;/span&gt;, instead of just &lt;span style="font-style: italic;"&gt;k - 1&lt;/span&gt; as is the case in mathematical induction.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Even though strong induction is much easier to apply in many situations than mathematical induction, strong induction boils down to mathematical induction. Here's the proof of strong induction by use of mathematical induction.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As with many proofs, the strong induction principle cannot be proved directly by induction, but instead a slightly stronger version can be proved instead. This seems unintuitive! Why would something stronger be easier to prove? The reason is that when proving something by induction, in the induction step you get to assume  what you're trying to prove (for &lt;span style="font-style: italic;"&gt;n-1&lt;/span&gt;) so the stronger the property, the more horsepower you have to get through the induction step. In this case, instead of proving &lt;span style="font-style: italic;"&gt;forall k, P(k)&lt;/span&gt; we'll prove that &lt;span style="font-style: italic;"&gt;forall j, forall k &amp;lt; j, P(k)&lt;/span&gt;. It's easy to see that the later implies the former. Just pick &lt;span style="font-style: italic;"&gt;j=k+1&lt;/span&gt; and you have &lt;span style="font-style: italic;"&gt;P(k)&lt;/span&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-weight: bold;"&gt;Lemma.&lt;/span&gt; If &lt;span style="font-style: italic;"&gt;(forall n, (forall k &amp;lt; n. P k) implies P n)&lt;/span&gt;, then &lt;span style="font-style: italic;"&gt;(forall j, forall k &amp;lt; j, P(k))&lt;/span&gt;.&lt;/div&gt;&lt;div&gt;The proof is by mathematical induction on j.&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-weight: bold;"&gt;Base case (&lt;span style="font-style: italic;"&gt;j=0&lt;/span&gt;):&lt;/span&gt;&lt;/div&gt;&lt;div&gt;There are no &lt;span style="font-style: italic;"&gt;k &amp;lt; 0&lt;/span&gt;, so this case is vacuously true.&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-weight: bold;"&gt;I&lt;/span&gt;&lt;span class="Apple-style-span" style="font-weight: bold;"&gt;nduction step:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;The induction hypothesis is: if &lt;span style="font-style: italic;"&gt;(forall n, (forall k &amp;lt; n. P(k)) implies P(n))&lt;/span&gt;, then &lt;span style="font-style: italic;"&gt;(forall k &amp;lt; j. P(k))&lt;/span&gt;. We need to show the corresponding property for &lt;span style="font-style: italic;"&gt;j + 1&lt;/span&gt;. We proceed by assuming that&lt;span style="font-style: italic;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;div style="text-align: center;"&gt;    &lt;span style="font-style: italic;"&gt; &lt;/span&gt;&lt;span style="font-style: italic;"&gt; &lt;/span&gt;&lt;span style="font-style: italic;"&gt; &lt;/span&gt;&lt;span style="font-style: italic;"&gt; &lt;/span&gt;&lt;span style="font-style: italic;"&gt;forall n, (forall k &amp;lt; n. P(k)) implies P(n)           &lt;/span&gt; (call this fact H)&lt;br /&gt;&lt;/div&gt;and need to prove that &lt;span style="font-style: italic;"&gt;forall k &amp;lt; j+1, P(k)&lt;/span&gt;. We then pick an arbitrary &lt;span style="font-style: italic;"&gt;k&lt;/span&gt; less than &lt;span style="font-style: italic;"&gt;j+1&lt;/span&gt; and need to show that &lt;span style="font-style: italic;"&gt;P(k)&lt;/span&gt;. Note that by the induction hypothesis combined with H, we know that&lt;span style="font-style: italic;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;div style="text-align: center;"&gt;&lt;span style="font-style: italic;"&gt;forall m &amp;lt; j. P(m)&lt;/span&gt; (call this fact A).&lt;br /&gt;&lt;/div&gt;Now, because &lt;span style="font-style: italic;"&gt;k &amp;lt; j+1&lt;/span&gt; we have two cases to consider:&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-weight: bold;"&gt;Case &lt;span style="font-style: italic;"&gt;k &amp;lt; j&lt;/span&gt;: &lt;span class="Apple-style-span" style="font-weight: normal;"&gt;We can apply fact A to conclude that &lt;span style="font-style: italic;"&gt;P(k)&lt;/span&gt;. Note that without the modification to strengthen this lemma, we would have been stuck here.&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-weight: bold;"&gt;Case &lt;span style="font-style: italic;"&gt;k = j&lt;/span&gt;:&lt;span class="Apple-style-span" style="font-weight: normal;"&gt; From fact A and H, we have &lt;span style="font-style: italic;"&gt;P(j)&lt;/span&gt;. But since &lt;span style="font-style: italic;"&gt;j=k&lt;/span&gt; we can conclude that &lt;span style="font-style: italic;"&gt;P(k)&lt;/span&gt;.&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-weight: bold;"&gt;QED&lt;/span&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://ecee.colorado.edu/%7Esiek/strong-ind.thy"&gt;Here&lt;/a&gt; is a version of the above proof written in a machine-checkable language (&lt;a href="http://www.cl.cam.ac.uk/research/hvg/Isabelle/index.html"&gt;Isabelle&lt;/a&gt;).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/11162230-6327934390542037619?l=siek.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://siek.blogspot.com/feeds/6327934390542037619/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=11162230&amp;postID=6327934390542037619' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11162230/posts/default/6327934390542037619'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11162230/posts/default/6327934390542037619'/><link rel='alternate' type='text/html' href='http://siek.blogspot.com/2009/11/strong-induction.html' title='Strong Induction'/><author><name>Jeremy Siek</name><uri>http://www.blogger.com/profile/13773635290126992920</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>
