Featured image of post 解析SICP-Proj4 EC

解析SICP-Proj4 EC

对SICP课程Project04EC问题的一点思考

解析SICP-Proj4 EC

Photo by Florian Klauer on Unsplash|600

本文章请搭配NJU-SICP Project04 EC (Adopted from CS61A from Berkeley) 食用

Written By SweetGargamel

Introduction - proj04: Scheme

Debugging Guide - proj04: Scheme

原问题

Why current interpreter cannot properly process tail call?

Consider such scheme code:

1
2
3
4
5
6
(define (factor n acc)
  (if (= n 0)
      acc
      (factor (- n 1) (* acc n))))

(factor 10 1)

The current evaluation process is:

1
2
3
4
     scheme_eval(`(factor 10 1))
===> return scheme_eval(`(factor 9 10))
===> return (return scheme_eval(`(factor 8 90)))
...

The problem is python interpreter does not properly perform tail call.

There’re mainly three way to solve this problem:

  • transform the scheme interpreter to an non-recursive interpreter, i.e., eliminate all recursive call to scheme_eval. That can be accomplished by
    • rewrite this interpreter to an CEK interpreter. (‘C’ means control, ‘E’ means environment, ‘K’ means continuation). See here and Chap. 5 of ESSENTIALS OF PROGRAMMING LANGUAGES.
    • rewrite this interpreter to an trampolined interpreter. We use this approach.
  • transform the source scheme code, e.g. Continuation-Passing Style transformation.

The trampoline technique is the simplest one. So we use this approach.

Trampoline

The trampoline technique is a simple technique to perform proper tail call in a “bad” language (e.g. Python, C, C++, Java, …).

For example, we can define a sum recursive function in python:

1
2
3
4
5
def sum(n, acc):
    if n == 0:
        return acc
    else:
        return sum(n - 1, acc + n)

And sum(10000, 0) will produce a StackOverflow error. Now we “trampoline” such function:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
class Unevaluated:
    def __init__(self, n, acc):
        self.n = n
        self.acc = acc

def sum_tram(n, acc):
    if n == 0:
        return acc
    else:
        return Unevaluated(n - 1, acc + n)

def sum(n, acc):
    # use do-while here is better, but Python does not support that
    res = sum_tram(n, acc)
    while isinstance(res, Unevaluated):
        res = sum_tram(res.n, res.acc)
    return res

Now sum(10000, 0) will properly works. This can be generalized to transform all recursive functions to loops, see here. But the above form is enough for our problem.

The Tasks

Complete the function optimize_tail_calls in scheme_eval_apply.py. It returns an alternative to scheme_eval that is properly tail recursive. That is, the interpreter will allow an unbounded number of active tail calls in constant space. It has a third argument tail that indicates whether the expression to be evaluated is in a tail context.

The Unevaluated class represents an expression that needs to be evaluated in an environment. When optimized_eval receives a non-atomic expression in a tail context, it returns an Unevaluated instance. Otherwise, it should repeatedly call original_scheme_eval until the result is a value, rather than an Unevaluated.

A successful implementation will require changes to several other functions, including some functions that we provided for you. All expressions throughout your interpreter that are in a tail context should be evaluated by calling scheme_eval with True as the third argument (now called tail). Your goal is to determine which expressions are in a tail context throughout your code and change calls to scheme_eval as needed.

Once you finish, uncomment the following line in scheme_eval_apply.py to use your implementation:

scheme_eval = optimize_tail_calls(scheme_eval)

疑惑引出

问题

1
2
3
4
5
6
7
8
def optimize_tail_calls(original_scheme_eval):
 def optimized_eval(expr, env, tail=False):
        if tail and not scheme_symbolp(expr) and not self_evaluating(expr):
            return Unevaluated(expr,env)
        # BEGIN PROBLEM EC

        # END PROBLEM EC
    return optimized_eval
  • 是的,很显然你会不会觉得,开始在运行第一行代码的时候,如果是尾递归的话就会直接返回一个Unevaluated的玩意,这程序不就该崩了吗?

答案

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
def optimize_tail_calls(original_scheme_eval):
 def optimized_eval(expr, env, tail=False):
        if tail and not scheme_symbolp(expr) and not self_evaluating(expr):
            return Unevaluated(expr,env)
        # BEGIN PROBLEM EC
        res=original_scheme_eval(expr, env)
        while isinstance(res, Unevaluated):
            res = original_scheme_eval(res.expr, res.env)
        return res
        # END PROBLEM EC
    return optimized_eval
  • 是的你看了答案会不会更觉得下面的代码根本就运行不到?
  • (当然除了该这些代码还应该按照助教的建议,把凡是尾递归的部分都给加一个tail=True的参数)

使用Vscode的Debug功能解析

怎么使用Debug

  • 直接按照助教的提示修改.vscode/launch.json文件即可

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Current File",
            "type": "python",
            "request": "launch",
            "program": "scheme.py",
            "args":[
                "-i",
                "tests.scm"
            ],
            "console": "integratedTerminal",
            "justMyCode": true,

        }
    ]
}
  • 你可以修改tests.scm里面的测试用例来测试,下面是我的例子
1
2
3
4
5
6
(define (factor n acc)
  (if (= n 0)
      acc
      (factor (- n 1) (* acc n))))

(factor 10 1)
  • 当然别忘打断点了

Debug,启动!

  • 点击Start Debugging

  • |300

  • 然后你会发现左边出来这些东西,分别是你的VARIABLESWATCHCALL_STACK

    • VARIABLES是当前的变量
    • WATCH可以添加自己想看的表达式或者变量的值
    • CALL_STACK看调用的堆栈

我们发现他停到了if这个语句停下来了

现在解释器在干啥?

  • 我们发现左上角的expr的第一个symdefine,说明他正在进行do_define_form

怎么让解释器往下走?

在最上面有几个按钮,分别是下面几个

  • Step Into 如果当前语句是一个call expression,他就会进入函数体让你看函数体内部是怎么执行的;否则和Step Over一样。
  • Step Over:不管是否是call expression直接当一条普通语句运行过去。
  • Step Out:跳出当前Frame(比如A里面调用了B函数,我现在通过Step Into进入了B所在的Frame,Step Out就直接回到A里面)

所以我们按Step Into,你会发现他跳过了上面的if语句;这是因为我们现在是在do_define_form,并不是尾递归操作,所以就调用原始的eval_call.

  • 我们再次选择Step Into,发现:

这个部分就比较好理解了,他就是在做define的内容,我就不解释了。

做完define form之后

  • 按照解释器解释的规则,我们把factor绑定到对应的lambda函数之后,就会执行(factor 10 1)

那么他的的执行顺序就是(请你注意看左下角的CALL_STACK

  1. eval factor
  2. eval 10
  3. eval 1 图略,类似上面
  4. apply lambda 10 1

我们发现这些都不是尾递归;

在这一步之后选择Step Into

进入apply过程!

  • 你会发现apply后,他就在do_if_form。注意,在调用do_if_form的时候参数tail=True

  • 这里继续选择Step into

  • 然后就到了最核心的部分,这里终于会有Unevaluated类产生了(下面都简写为un类)

  • 这里你要思考,我的un类返回到哪里了?你点一下Step Into,发现这个东西是返回到了scheme_eval里面。

  • 那么他是在eval谁?你看左边的Variables,这会正在做do_define_form

  • 然后你就会发现他回到了最外层的apply的过程

  • 诶你是不是发现,这里上面虽然最左边的tailfalse,但是这里返回了un类。

思路理顺

通过上面的分析,我们发现,实际上un类并不会”漏出去“,实际上他是通过好层的递归调用,在最外层的eval_calleval的是函数的call-expression)拿到了最里层eval_call(这里是do_if_form)返回的un类,然后再循环往复的eval

反思总结

  • 我和xsh和fhy同学都交流过,他们似乎都感觉最后这个un类会被返回到最外层,从而你最后的eval结果都是一个un类。说明这应该是许多同学的共同的问题
  • 然而通过Debug的逐句调用,我们发现靠空想是不行的,需要深入的一行行把自己当编译器理解。Debug应该是我们要掌握的一门技术。

最后再次感谢助教、各位同学的帮助!

Licensed under CC BY-NC-SA 4.0
Recording Myself!
使用 Hugo 构建
主题 StackJimmy 设计