Corpus:
https://docs.google.com/document/d/1n24mDhheoHtj-NJB2JN0Cz0Ztr8jRLOkhl9b_N7o6mk/edit

Mutual information:
left-word right-word mutual-info fractional-mutual-info
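
For reference, a minimal sketch of how a per-pair MI value could be computed from ordered word-pair counts. The formula (pointwise MI from relative frequencies), the function name pair_mi and the toy pairs are assumptions for illustration; the notes do not state the exact formula the pipeline uses, and the fractional-mutual-info column is not reproduced here.

```python
import math
from collections import Counter

def pair_mi(pairs):
    """Return {(left, right): MI in bits} for a list of ordered word pairs.

    Assumed formula: MI(l, r) = log2( p(l, r) / (p(l, *) * p(*, r)) ),
    with probabilities taken as relative frequencies over the observed pairs.
    """
    pair_counts = Counter(pairs)
    left_counts = Counter(left for left, _ in pairs)
    right_counts = Counter(right for _, right in pairs)
    total = len(pairs)
    mi = {}
    for (left, right), n in pair_counts.items():
        # p(l, r) / (p(l, *) * p(*, r)) simplifies to n * total / (N_l * N_r)
        mi[(left, right)] = math.log2(
            n * total / (left_counts[left] * right_counts[right])
        )
    return mi

# Toy ordered pairs (hypothetical data, not taken from the corpus above).
pairs = [("the", "cat"), ("the", "dog"), ("a", "cat"), ("the", "cat")]
for (left, right), value in sorted(pair_mi(pairs).items()):
    print(left, right, round(value, 3))
```

Note that this value depends only on pair and marginal counts; it carries no information about how far apart the two words were in the sentence.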

Questions:
a) Are only N-1 links allowed per N-gram sentence?
b) Does the MI calculation not account for distance and word frequency?
c) Can we extract the MI sheet used for the parse?
d) Do the lack of WSD and the failure to account for distance lead to incorrectly parsed links that break the structure of the entire parse?
e) Can the presence or absence of one link in the parse depend on statistical bias with respect to other links?
f) Can the lack of aggregation of words in a sentence into "phrases" (see my earlier email) lead to inconsistent links, such as either v->n or v->a & a->n, in different sentences of the same structure?
g) To actually get the pipeline POC, should we start not with an elementary English corpus but with an elementary Turtle corpus?
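
To make questions a) and b) concrete, here is a minimal sketch of a maximum-spanning-tree parse over word positions. It assumes the parse is a plain spanning tree built from pair scores; the function name mst_links and the toy scores are hypothetical, and any no-crossing-links constraint the actual parser may enforce is omitted. A spanning tree over N words always has exactly N-1 links, and the score here ignores the distance between the two words.

```python
def mst_links(words, score):
    """Greedy (Prim-style) maximum spanning tree over word positions.

    score(left_word, right_word) returns a pair score such as MI.
    Returns a list of (i, j) index pairs with i < j.
    """
    n = len(words)
    if n < 2:
        return []
    in_tree = {0}          # positions already connected
    links = []
    while len(in_tree) < n:
        best = None
        for i in in_tree:
            for j in range(n):
                if j in in_tree:
                    continue
                # Score the pair in sentence order (left word first).
                a, b = (i, j) if i < j else (j, i)
                s = score(words[a], words[b])
                if best is None or s > best[0]:
                    best = (s, a, b, j)
        _, a, b, new = best
        links.append((a, b))
        in_tree.add(new)
    return links

# Hypothetical scores for illustration only.
toy_mi = {("the", "cat"): 1.0, ("cat", "sat"): 2.0, ("the", "sat"): 0.5}
words = ["the", "cat", "sat"]
links = mst_links(words, lambda l, r: toy_mi.get((l, r), float("-inf")))
print(links)  # [(0, 1), (1, 2)]
```

Because each step picks the single best remaining link, one mis-scored pair can displace other links, which is the kind of interaction questions d) and e) ask about.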