Unit 5: Sequence tagging with MEMMs
1. Assign Penn Treebank POS tags to the words in the following sentence:
2. What is the difference between an "adverb phrase" and an "adverbial"?
3. What are open class and closed class grammatical categories?
   - Name at least two parts of speech that are open class.
   - Name at least two parts of speech that are closed class.
4. What are the assumptions of a first-order hidden Markov model?
5. Using the given probability table, calculate the probability of the weather forecast for the next 7 days being "sunny-sunny-rainy-rainy-sunny-cloudy-sunny", given that today is sunny (i.e., the probability of it being sunny today is 1).
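A minimal sketch of this kind of calculation. The transition table below is a made-up placeholder (the actual table from the question is not reproduced here); the point is the chain-rule product under the first-order Markov assumption.

```python
# Hypothetical transition table; replace with the one given in the question.
transitions = {
    'sunny':  {'sunny': 0.6, 'rainy': 0.2, 'cloudy': 0.2},
    'rainy':  {'sunny': 0.3, 'rainy': 0.5, 'cloudy': 0.2},
    'cloudy': {'sunny': 0.4, 'rainy': 0.3, 'cloudy': 0.3},
}

forecast = ['sunny', 'sunny', 'rainy', 'rainy', 'sunny', 'cloudy', 'sunny']

# Under the first-order Markov assumption the forecast probability is the
# product of one-step transition probabilities starting from today's state:
# P(w1..w7 | w0 = sunny) = prod_t P(w_t | w_{t-1}).
prob, previous = 1.0, 'sunny'
for state in forecast:
    prob *= transitions[previous][state]
    previous = state
print(prob)
```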
6. What are two components (probabilities) of an HMM tagger?
7. What are the advantages of using a MEMM instead of an HMM?
8. What is the Viterbi algorithm? How is it used?
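For illustration, a minimal Viterbi sketch over a toy HMM. The states, transition, and emission probabilities here are assumptions made up for the sketch, not part of the question; the structure (a trellis of best scores plus backpointers, followed by a backtrace) is the algorithm itself.

```python
import math

# Toy HMM: all numbers below are illustrative assumptions.
states = ['DT', 'NN', 'VB']
start_p = {'DT': 0.6, 'NN': 0.3, 'VB': 0.1}
trans_p = {
    'DT': {'DT': 0.05, 'NN': 0.85, 'VB': 0.10},
    'NN': {'DT': 0.10, 'NN': 0.30, 'VB': 0.60},
    'VB': {'DT': 0.50, 'NN': 0.30, 'VB': 0.20},
}
emit_p = {
    'DT': {'the': 0.7, 'dog': 0.0, 'barks': 0.0},
    'NN': {'the': 0.0, 'dog': 0.6, 'barks': 0.1},
    'VB': {'the': 0.0, 'dog': 0.1, 'barks': 0.5},
}

def log(p):
    return math.log(p) if p > 0 else float('-inf')

def viterbi(words):
    # trellis[t][s] = (log-prob of the best tag path ending in s at position t,
    #                  backpointer to the best previous tag)
    trellis = [{s: (log(start_p[s] * emit_p[s][words[0]]), None) for s in states}]
    for t in range(1, len(words)):
        column = {}
        for s in states:
            score, prev = max(
                (trellis[t - 1][p][0] + log(trans_p[p][s] * emit_p[s][words[t]]), p)
                for p in states
            )
            column[s] = (score, prev)
        trellis.append(column)
    # Backtrace from the best final state to recover the most probable tag sequence.
    best = max(states, key=lambda s: trellis[-1][s][0])
    path = [best]
    for t in range(len(words) - 1, 0, -1):
        best = trellis[t][best][1]
        path.append(best)
    return list(reversed(path))

print(viterbi(['the', 'dog', 'barks']))  # expected: ['DT', 'NN', 'VB']
```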
9. Describe some approaches to decoding.
10. You are training a maximum entropy Markov model to identify biomedical entities. Identify \( \geq 5 \) useful features for your MEMM.
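A sketch of the kind of per-token feature function such an MEMM might use. The particular features here (word identity, capitalization and shape, affixes, surrounding tokens, previous tag) are common choices for biomedical NER, offered as examples rather than a prescribed answer.

```python
def token_features(tokens, i, prev_tag):
    """Illustrative feature function for one position in the sentence."""
    token = tokens[i]
    return {
        'token': token.lower(),
        'prev_tag': prev_tag,                    # label context for a left-to-right MEMM
        'startsWithUpper': token[0].isupper(),
        'allCaps': token.isupper(),
        'containsDigit': any(c.isdigit() for c in token),
        'containsHyphen': '-' in token,
        'prefix3': token[:3].lower(),
        'suffix3': token[-3:].lower(),
        'prev_token': tokens[i - 1].lower() if i > 0 else '<S>',
        'next_token': tokens[i + 1].lower() if i + 1 < len(tokens) else '</S>',
    }

print(token_features(['IL-2', 'gene', 'expression'], 0, 'O'))
```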
11. Complete the code to fit a DictVectorizer using the given feature functions:

"The North Wind and the Sun had a quarrel about which of them was the stronger" \(\rightarrow\)

```python
[
    'startsWithUpper',
    'token=North',
    'token=Sun',
    'token=The',
    'token=Wind',
    'token=a',
    'token=about',
    'token=and',
    'token=had',
    'token=of',
    'token=quarrel',
    'token=stronger',
    'token=the',
    'token=them',
    'token=was',
    'token=which'
]
```
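A minimal sketch of one way this could be completed, assuming a per-token feature function that emits a token-identity feature plus a capitalization indicator; the function name and feature choices are assumptions, but DictVectorizer's convention of turning string values into `key=value` feature names is what produces names like `token=North`.

```python
from sklearn.feature_extraction import DictVectorizer

sentence = ("The North Wind and the Sun had a quarrel "
            "about which of them was the stronger").split()

def features(token):
    # String values become "token=<word>" features; the boolean becomes the
    # numeric feature 'startsWithUpper'.
    return {'token': token, 'startsWithUpper': token[0].isupper()}

vectorizer = DictVectorizer()
vectorizer.fit([features(token) for token in sentence])
print(list(vectorizer.get_feature_names_out()))
```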
12. Complete the code to calculate the frequency of letters in the given input, and transform it into a vector using DictVectorizer.

\( \text{input} \rightarrow \text{output} \)
- \( \text{c c a a} \rightarrow [[2. 0. 2.]] \)
- \( \text{a a c c} \rightarrow [[2. 0. 2.]] \)
- \( \text{a a a a} \rightarrow [[4. 0. 0.]] \)
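A sketch of one possible completion. It assumes the vectorizer was fit on the letters a, b, and c, which would explain the middle zero column in the expected three-dimensional output; that fitting data is an assumption, not given in the question.

```python
from collections import Counter
from sklearn.feature_extraction import DictVectorizer

# Assumed setup: fitting on {a, b, c} yields the three columns [a, b, c]
# seen in the expected output.
vectorizer = DictVectorizer()
vectorizer.fit([{'a': 1, 'b': 1, 'c': 1}])

def letter_frequencies(text):
    # Count each letter, ignoring whitespace, e.g. "c c a a" -> {'c': 2, 'a': 2}.
    return Counter(ch for ch in text if not ch.isspace())

for text in ['c c a a', 'a a c c', 'a a a a']:
    print(vectorizer.transform(letter_frequencies(text)).toarray())
```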
13. What one feature is required in a left \(\rightarrow\) right MEMM?