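Configuration
The config module contains a number of attributes that modify Theano's behavior. Many of these attributes are checked when Theano is imported, and some of them are read-only.
As a rule, the attributes of the config module should not be modified inside user code.
Each attribute has a default value, but you can override the defaults in your .theanorc file and override those values in turn with the THEANO_FLAGS environment variable.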
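The order of precedence, from highest to lowest, is:
1. an assignment to theano.config.<property>
2. an assignment in THEANO_FLAGS
3. an assignment in the .theanorc file (or in the file indicated by THEANORC)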
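The precedence rules above can be illustrated with a small standalone sketch. It does not use Theano at all and is not how Theano implements its configuration; the helpers parse_rc, parse_flags and resolve are hypothetical names introduced only for this illustration. It assumes the usual .theanorc layout (a [global] section with key = value lines) and the usual THEANO_FLAGS layout (comma-separated key=value pairs).

```python
import configparser


def parse_rc(text):
    """Read the [global] section of a .theanorc-style text into a dict."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep the case of names such as floatX
    parser.read_string(text)
    if not parser.has_section("global"):
        return {}
    return dict(parser["global"])


def parse_flags(flags):
    """Split a THEANO_FLAGS-style string such as 'floatX=float32,device=cpu'."""
    result = {}
    for item in flags.split(","):
        if "=" in item:
            key, value = item.split("=", 1)
            result[key.strip()] = value.strip()
    return result


def resolve(rc_text, flags, assigned):
    """Merge the three sources, lowest priority first (hypothetical helper)."""
    merged = {}
    merged.update(parse_rc(rc_text))    # 3. .theanorc
    merged.update(parse_flags(flags))   # 2. THEANO_FLAGS
    merged.update(assigned)             # 1. theano.config.<property> = ...
    return merged


rc_text = "[global]\nfloatX = float64\ndevice = cpu\n"
flags = "floatX=float32,mode=FAST_RUN"
assigned = {"mode": "DebugMode"}

settings = resolve(rc_text, flags, assigned)
print(settings)
```

In this sketch floatX ends up as float32 because the flags string overrides the rc file, device keeps the rc-file value because nothing overrides it, and mode ends up as DebugMode because the direct assignment wins over the flags string.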
You can display the current configuration by printing theano.config:
    python -c 'import theano; print(theano.config)' | less
For example, modify the logistic regression function from Note (2) of this series so that it runs with float32 precision:
    #!/usr/bin/env python
    # Theano tutorial
    # Solution to Exercise in section 'Configuration Settings and Compiling Modes'
    import numpy
    import theano
    import theano.tensor as tt

    theano.config.floatX = 'float32'

    rng = numpy.random

    N = 400
    feats = 784
    D = (rng.randn(N, feats).astype(theano.config.floatX),
         rng.randint(size=N, low=0, high=2).astype(theano.config.floatX))
    training_steps = 10000

    # Declare Theano symbolic variables
    x = tt.matrix("x")
    y = tt.vector("y")
    w = theano.shared(rng.randn(feats).astype(theano.config.floatX), name="w")
    b = theano.shared(numpy.asarray(0., dtype=theano.config.floatX), name="b")
    x.tag.test_value = D[0]
    y.tag.test_value = D[1]
    # print("Initial model:")
    # print(w.get_value(), b.get_value())

    # Construct Theano expression graph
    p_1 = 1 / (1 + tt.exp(-tt.dot(x, w) - b))  # Probability of having a one
    prediction = p_1 > 0.5  # The prediction that is done: 0 or 1
    xent = -y * tt.log(p_1) - (1 - y) * tt.log(1 - p_1)  # Cross-entropy
    cost = tt.cast(xent.mean(), 'float32') + 0.01 * (w ** 2).sum()  # The cost to optimize
    gw, gb = tt.grad(cost, [w, b])

    # Compile expressions to functions
    train = theano.function(
        inputs=[x, y],
        outputs=[prediction, xent],
        updates=[(w, w - 0.01 * gw), (b, b - 0.01 * gb)],
        name="train")
    predict = theano.function(inputs=[x], outputs=prediction, name="predict")

    # Look at the ops in the compiled graph to see whether CPU or GPU BLAS was used
    op_names = [node.op.__class__.__name__
                for node in train.maker.fgraph.toposort()]
    if any(name in ['Gemv', 'CGemv', 'Gemm', 'CGemm'] for name in op_names):
        print('Used the cpu')
    elif any(name in ['GpuGemm', 'GpuGemv'] for name in op_names):
        print('Used the gpu')
    else:
        print('ERROR, not able to tell if theano used the cpu or the gpu')
        print(train.maker.fgraph.toposort())

    for i in range(training_steps):
        pred, err = train(D[0], D[1])
    # print("Final model:")
    # print(w.get_value(), b.get_value())

    print("target values for D")
    print(D[1])

    print("prediction on D")
    print(predict(D[0]))
Running it with time python file.py gives:
    real    0m15.055s
    user    0m11.527s
    sys     0m0.801s
Mode
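Every time theano.function is called, the symbolic relationships between the Theano variables given as inputs and outputs are optimized and compiled.
How this compilation is carried out is controlled by the value of the mode parameter.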
Theano defines the following modes by name, each equivalent to the expression shown beneath it:
FAST_COMPILE:
    compile.mode.Mode(linker='py', optimizer='fast_compile')
FAST_RUN:
    compile.mode.Mode(linker='cvm', optimizer='fast_run')
DebugMode:
    compile.debugmode.DebugMode()
ProfileMode (deprecated):
    compile.profilemode.ProfileMode()
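The default mode is FAST_RUN. It can be changed through the configuration variable config.mode, and that setting can in turn be overridden for a single function by passing the mode keyword argument to theano.function.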
Linkers
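A mode is composed of two parts: an optimizer and a linker.
[1] gc means garbage collection of the intermediate results of a computation. Without it, the memory used by the ops is kept between calls to the Theano function, so that it does not have to be reallocated; this lowers overhead and makes execution faster.
[2] The default linker.
[3] Deprecated.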
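Using DebugMode
In general you should use FAST_RUN or FAST_COMPILE mode. When you define new kinds of expressions or new optimizations, however, it is useful to run your code first in DebugMode (mode='DebugMode'). DebugMode runs a number of self-checks and assertions that help diagnose programming errors that could lead to incorrect output. Note that DebugMode is much slower than FAST_RUN or FAST_COMPILE, so use it only during development.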
For example:
    import theano
    import theano.tensor as T

    x = T.dvector('x')

    # Compile in DebugMode so that every call is self-checked
    f = theano.function([x], 10 * x, mode='DebugMode')

    f([5])
    f([0])
    f([7])
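If a problem is detected, DebugMode raises an exception describing what went wrong. If you still cannot resolve it, ask someone with expertise in this area.
DebugMode is not a cure-all, though, because some errors only show up under particular input conditions.
If you instantiate DebugMode through its constructor instead of passing the keyword 'DebugMode', you can configure its behavior through the constructor arguments. The keyword version is very strict.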
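ProfileMode (deprecated)
Retrieving timing information
Once the graph is compiled, simply run it. Then call profmode.print_summary(), which reports timing information such as where your graph spends most of its time.
We again use logistic regression as the example.
Create a ProfileMode instance: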
    from theano import ProfileMode
    profmode = theano.ProfileMode(optimizer='fast_run',
                                  linker=theano.gof.OpWiseCLinker())
Pass it as the last argument, mode, when compiling the function:
    train = theano.function(
        inputs=[x, y],
        outputs=[prediction, xent],
        updates=[(w, w - 0.01 * gw), (b, b - 0.01 * gb)],
        name="train",
        mode=profmode)
    # If you use a Module, declare it like this instead:
    # m = theano.Module()
    # minst = m.make(mode=profmode)
Retrieve the timing information
Add this at the end of the file:
    profmode.print_summary()
Running the script then produces output like this:
    ProfileMode.print_summary()
    ---------------------------

    Time since import 6.183s
    Theano compile time: 0.000s (0.0% since import)
        Optimization time: 0.000s
        Linker time: 0.000s
    Theano fct call 5.452s (88.2% since import)
       Theano Op time 5.003s 80.9%(since import) 91.8%(of fct call)
       Theano function overhead in ProfileMode 0.449s 7.3%(since import) 8.2%(of fct call)
    10000 Theano fct call, 0.001s per call
    Rest of the time since import 0.730s 11.8%

    Theano fct summary:
    <% total fct time> <total time> <time per call> <nb call> <fct name>
       100.0% 5.452s 5.45e-04s 10000 train

    Single Op-wise summary:
    <% of local_time spent on this kind of Op> <cumulative %> <self seconds> <cumulative seconds> <time per call> [*] <nb_call> <nb_op> <nb_apply> <Op name>
       87.9%   87.9%  4.400s  4.400s  2.20e-04s *  20000   1   2 <class 'theano.tensor.blas_c.CGemv'>
       10.8%   98.8%  0.542s  4.942s  5.42e-06s * 100000  10  10 <class 'theano.tensor.elemwise.Elemwise'>
        0.5%   99.3%  0.023s  4.966s  1.17e-06s *  20000   1   2 <class 'theano.tensor.basic.Alloc'>
        0.4%   99.6%  0.018s  4.984s  6.05e-07s *  30000   2   3 <class 'theano.tensor.elemwise.DimShuffle'>
        0.3%   99.9%  0.013s  4.997s  1.25e-06s *  10000   1   1 <class 'theano.tensor.elemwise.Sum'>
        0.1%  100.0%  0.007s  5.003s  3.35e-07s *  20000   1   2 <class 'theano.compile.ops.Shape_i'>
       ... (remaining 0 single Op account for 0.00%(0.00s) of the runtime)
    (*) Op is running a c implementation

    Op-wise summary:
    <% of local_time spent on this kind of Op> <cumulative %> <self seconds> <cumulative seconds> <time per call> [*] <nb_call> <nb apply> <Op name>
       87.9%   87.9%  4.400s  4.400s  2.20e-04s *  20000  2 CGemv{inplace}
        6.3%   94.3%  0.318s  4.718s  3.18e-05s *  10000  1 Elemwise{Composite{[Composite{[Composite{[sub(mul(i0, i1), neg(i2))]}(i0, scalar_softplus(i1), mul(i2, i3))]}(i0, i1, i2, scalar_softplus(i3))]}}
        2.1%   96.3%  0.103s  4.820s  1.03e-05s *  10000  1 Elemwise{Composite{[Composite{[Composite{[Composite{[mul(i0, add(i1, i2))]}(i0, neg(i1), true_div(i2, i3))]}(i0, mul(i1, i2, i3), i4, i5)]}(i0, i1, i2, exp(i3), i4, i5)]}}[(0, 0)]
        1.6%   98.0%  0.082s  4.902s  8.16e-06s *  10000  1 Elemwise{ScalarSigmoid{output_types_preference=transfer_type{0}}}[(0, 0)]
        0.5%   98.4%  0.023s  4.925s  1.17e-06s *  20000  2 Alloc
        0.3%   98.7%  0.013s  4.938s  1.25e-06s *  10000  1 Sum
        0.2%   98.9%  0.012s  4.950s  6.11e-07s *  20000  2 InplaceDimShuffle{x}
        0.2%   99.1%  0.008s  4.959s  8.44e-07s *  10000  1 Elemwise{gt,no_inplace}
        0.1%   99.2%  0.007s  4.965s  6.80e-07s *  10000  1 Elemwise{sub,no_inplace}
        0.1%   99.4%  0.007s  4.972s  3.35e-07s *  20000  2 Shape_i{0}
        0.1%   99.5%  0.006s  4.978s  6.11e-07s *  10000  1 Elemwise{Composite{[sub(neg(i0), i1)]}}[(0, 0)]
        0.1%   99.6%  0.006s  4.984s  5.93e-07s *  10000  1 InplaceDimShuffle{1,0}
        0.1%   99.7%  0.005s  4.989s  5.33e-07s *  10000  1 Elemwise{neg,no_inplace}
        0.1%   99.8%  0.005s  4.994s  4.85e-07s *  10000  1 Elemwise{Cast{float32}}
        0.1%   99.9%  0.005s  4.999s  4.60e-07s *  10000  1 Elemwise{inv,no_inplace}
        0.1%  100.0%  0.004s  5.003s  4.25e-07s *  10000  1 Elemwise{Composite{[sub(i0, mul(i1, i2))]}}[(0, 0)]
       ... (remaining 0 Op account for 0.00%(0.00s) of the runtime)
    (*) Op is running a c implementation

    Apply-wise summary:
    <% of local_time spent at this position> <cumulative %%> <apply time> <cumulative seconds> <time per call> [*] <nb_call> <Apply position> <Apply Op name>
       54.7%   54.7%  2.737s  2.737s  2.74e-04s *  10000   7 CGemv{inplace}(Alloc.0, TensorConstant{1.0}, x, w, TensorConstant{0.0})
       33.2%   87.9%  1.663s  4.400s  1.66e-04s *  10000  18 CGemv{inplace}(w, TensorConstant{-0.00999999977648}, x.T, Elemwise{Composite{[Composite{[Composite{[Composite{[mul(i0, add(i1, i2))]}(i0, neg(i1), true_div(i2, i3))]}(i0, mul(i1, i2, i3), i4, i5)]}(i0, i1, i2, exp(i3), i4, i5)]}}[(0, 0)].0, TensorConstant{0.999800026417})
        6.3%   94.3%  0.318s  4.718s  3.18e-05s *  10000  13 Elemwise{Composite{[Composite{[Composite{[sub(mul(i0, i1), neg(i2))]}(i0, scalar_softplus(i1), mul(i2, i3))]}(i0, i1, i2, scalar_softplus(i3))]}}(y, Elemwise{Composite{[sub(neg(i0), i1)]}}[(0, 0)].0, Elemwise{sub,no_inplace}.0, Elemwise{neg,no_inplace}.0)
        2.1%   96.3%  0.103s  4.820s  1.03e-05s *  10000  16 Elemwise{Composite{[Composite{[Composite{[Composite{[mul(i0, add(i1, i2))]}(i0, neg(i1), true_div(i2, i3))]}(i0, mul(i1, i2, i3), i4, i5)]}(i0, i1, i2, exp(i3), i4, i5)]}}[(0, 0)](Elemwise{ScalarSigmoid{output_types_preference=transfer_type{0}}}[(0, 0)].0, Alloc.0, y, Elemwise{Composite{[sub(neg(i0), i1)]}}[(0, 0)].0, Elemwise{sub,no_inplace}.0, Elemwise{Cast{float32}}.0)
        1.6%   98.0%  0.082s  4.902s  8.16e-06s *  10000  14 Elemwise{ScalarSigmoid{output_types_preference=transfer_type{0}}}[(0, 0)](Elemwise{neg,no_inplace}.0)
        0.3%   98.3%  0.015s  4.917s  1.53e-06s *  10000  12 Alloc(Elemwise{inv,no_inplace}.0, Shape_i{0}.0)
        0.3%   98.5%  0.013s  4.930s  1.25e-06s *  10000  17 Sum(Elemwise{Composite{[Composite{[Composite{[Composite{[mul(i0, add(i1, i2))]}(i0, neg(i1), true_div(i2, i3))]}(i0, mul(i1, i2, i3), i4, i5)]}(i0, i1, i2, exp(i3), i4, i5)]}}[(0, 0)].0)
        0.2%   98.7%  0.008s  4.938s  8.44e-07s *  10000  15 Elemwise{gt,no_inplace}(Elemwise{ScalarSigmoid{output_types_preference=transfer_type{0}}}[(0, 0)].0, TensorConstant{(1,) of 0.5})
        0.2%   98.9%  0.008s  4.946s  8.14e-07s *  10000   5 Alloc(TensorConstant{0.0}, Shape_i{0}.0)
        0.1%   99.0%  0.007s  4.953s  6.80e-07s *  10000   4 Elemwise{sub,no_inplace}(TensorConstant{(1,) of 1.0}, y)
        0.1%   99.1%  0.006s  4.959s  6.16e-07s *  10000   6 InplaceDimShuffle{x}(Shape_i{0}.0)
        0.1%   99.2%  0.006s  4.965s  6.11e-07s *  10000   9 Elemwise{Composite{[sub(neg(i0), i1)]}}[(0, 0)](CGemv{inplace}.0, InplaceDimShuffle{x}.0)
        0.1%   99.4%  0.006s  4.972s  6.07e-07s *  10000   0 InplaceDimShuffle{x}(b)
        0.1%   99.5%  0.006s  4.977s  5.93e-07s *  10000   2 InplaceDimShuffle{1,0}(x)
        0.1%   99.6%  0.005s  4.983s  5.33e-07s *  10000  11 Elemwise{neg,no_inplace}(Elemwise{Composite{[sub(neg(i0), i1)]}}[(0, 0)].0)
       ... (remaining 5 Apply instances account for 0.41%(0.02s) of the runtime)
    (*) Op is running a c implementation
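To make the percentages in the summary above easier to read, the following small sketch recomputes a few of them from the raw timings. It is plain Python arithmetic on figures copied from the output, not part of Theano; the variable names and the percent helper are introduced here only for illustration.

```python
# Timings (in seconds) copied from the ProfileMode summary above.
time_since_import = 6.183
fct_call_time = 5.452   # "Theano fct call"
op_time = 5.003         # "Theano Op time"
cgemv_time = 4.400      # self seconds of CGemv in the Op-wise summary
n_calls = 10000


def percent(part, whole):
    """Return part / whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Time spent inside function calls but outside the ops themselves.
overhead = round(fct_call_time - op_time, 3)

report = {
    "fct call, % since import": percent(fct_call_time, time_since_import),
    "Op time, % of fct call": percent(op_time, fct_call_time),
    "overhead, % of fct call": percent(overhead, fct_call_time),
    "CGemv, % of Op time": percent(cgemv_time, op_time),
    "seconds per call": fct_call_time / n_calls,
}

for label, value in report.items():
    print(label, value)
```

Read this way, the two CGemv applications (the matrix-vector products in the forward pass and in the gradient) account for almost nine tenths of the time spent in ops, so they are the first place to look when trying to speed this model up.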
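If you repost this article, please respect the author's work and keep the text and the article link intact. Thank you for your support!
Original article: http://blog.csdn.net/ycheng_sjtu/article/details/39010413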