I ran a set of parse-ranking experiments on the JHPSTG corpus. I chose a reasonable set of learning parameters and varied the prior variance and the relative tolerance. The highest accuracy I got was 40.6, which seems very low and well below the state of the art. Do I have a bug?<div>
<br></div><div>Here's the batch-experiment call I use:</div><div><br></div><div><div>(in-package :tsdb)</div><div><br></div><div>(load "parsing.lisp")</div><div><br></div><div>(batch-experiment</div><div> :source "jhpstg" :skeleton "jhpstg"</div>
<div> :nfold 10 :niterations 2 :type :mem</div><div> :prefix "jhpstg"</div><div> :score-similarities nil</div><div> :grandparenting 4</div><div> :active-edges-p t</div><div> :lexicalization-p nil</div><div> :constituent-weight 0</div>
<div> :ngram-size 4 :ngram-back-off-p t</div><div> :lm-p nil</div><div> :random-sample-size nil</div><div> :counts-absolute 0 :counts-contexts 0 :counts-events 0 :counts-relevant 1</div><div> :variance '(nil 1e4 1e2 1e0 1e-2 1e-4 1e-6)</div>
<div> :relative-tolerance '(1e-6 1e-8 1e-10))</div><div><br></div><div>I ran scoring with the following Lisp script:</div><div><br></div><div><div>(setf *tsdb-home* "/home/billmcn/logon/lingo/redwoods/tsdb/home")</div>
<div>(summarize-folds :output "/home/billmcn/temp/jhpstg.results" :pattern "\\[jhpstg\\]")</div><div><br></div><div>I got the following results:</div><div><br></div><div><div>39.449802 9.746283 13.783325 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4] NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-6] AT[1.0e-20] VA[1.0e-6] PC[100]'</div>
<div>38.287010 11.390718 16.108908 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4] NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-6] AT[1.0e-20] VA[1.0e-4] PC[100]'</div><div>34.571754 2.847678 4.027225 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4] NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-6] AT[1.0e-20] VA[1.0e+4] PC[100]'</div>
<div>38.173570 4.652545 6.579693 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4] NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-10] AT[1.0e-20] VA[1.0e-2] PC[100]'</div><div>34.571754 2.847678 4.027225 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4] NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-8] AT[1.0e-20] VA[1.0e+0] PC[100]'</div>
<div>38.287010 11.390718 16.108908 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4] NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-10] AT[1.0e-20] VA[] PC[100]'</div><div>34.571754 2.847678 4.027225 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4] NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-10] AT[1.0e-20] VA[1.0e+0] PC[100]'</div>
<div>34.571754 2.847678 4.027225 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4] NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-10] AT[1.0e-20] VA[1.0e+4] PC[100]'</div><div>40.612595 8.101849 11.457745 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4] NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-10] AT[1.0e-20] VA[1.0e-4] PC[100]'</div>
<div>34.571754 2.847678 4.027225 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4] NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-6] AT[1.0e-20] VA[1.0e+2] PC[100]'</div><div>38.173570 4.652545 6.579693 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4] NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-6] AT[1.0e-20] VA[1.0e-2] PC[100]'</div>
<div>34.571754 2.847678 4.027225 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4] NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-6] AT[1.0e-20] VA[1.0e+0] PC[100]'</div><div>34.571754 2.847678 4.027225 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4] NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-8] AT[1.0e-20] VA[1.0e+4] PC[100]'</div>
<div>34.571754 2.847678 4.027225 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4] NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-8] AT[1.0e-20] VA[1.0e+2] PC[100]'</div><div>38.287010 11.390718 16.108908 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4] NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-8] AT[1.0e-20] VA[] PC[100]'</div>
<div>34.571754 2.847678 4.027225 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4] NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-10] AT[1.0e-20] VA[1.0e+2] PC[100]'</div><div>38.287010 11.390718 16.108908 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4] NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-10] AT[1.0e-20] VA[1.0e-6] PC[100]'</div>
<div>38.173570 4.652545 6.579693 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4] NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-8] AT[1.0e-20] VA[1.0e-2] PC[100]'</div><div>40.612595 8.101849 11.457745 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4] NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-8] AT[1.0e-20] VA[1.0e-4] PC[100]'</div>
<div>39.449802 9.746283 13.783325 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4] NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-8] AT[1.0e-20] VA[1.0e-6] PC[100]'</div><div><br></div></div></div><br>-- <br>W.P. McNeill<br>
<a href="http://staff.washington.edu/billmcn/index.shtml">http://staff.washington.edu/billmcn/index.shtml</a>
</div>