AI needs to provide context to be useful.
-
NLP is a pipeline.
-
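A minimal sketch of the pipeline idea, assuming three illustrative stages (tokenize, drop stopwords, count words). The stage names, the stopword list, and the sample sentence are made up for this example and are not from the notes; each stage just takes the previous stage's output.

```python
import re

def tokenize(text):
    # Naive tokenizer: lowercase, then keep runs of letters and digits.
    return re.findall(r"[a-z0-9]+", text.lower())

def remove_stopwords(tokens, stopwords=frozenset({"the", "of", "a", "is", "was"})):
    # Drop very common words (hypothetical, tiny stopword list).
    return [t for t in tokens if t not in stopwords]

def count_features(tokens):
    # Turn the token list into a word -> count mapping.
    counts = {}
    for t in tokens:
        counts[t] = counts.get(t, 0) + 1
    return counts

def run_pipeline(text, stages):
    # Feed the output of each stage into the next one.
    data = text
    for stage in stages:
        data = stage(data)
    return data

PIPELINE = [tokenize, remove_stopwords, count_features]
features = run_pipeline("The patient was seen for a cough. The cough is dry.", PIPELINE)
print(features)
```

Because every stage is an ordinary function, a stage can be swapped or inserted without touching the others.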
Many methods can be considered NLP.
-
The National Library of Medicine did early work on NLP.
-
Processing language into tokens is difficult.
- language has lots of ambiguity that makes it hard to process.
-
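To make the tokenization problem concrete, here is a small sketch comparing three naive strategies on one made-up sentence. The sentence and the regular expressions are assumptions for illustration only.

```python
import re

text = "Pt. denies chest pain. Temp 38.5 C."

# Strategy 1: split on whitespace. Punctuation stays glued to the words.
whitespace_tokens = text.split()

# Strategy 2: keep runs of word characters and drop everything else.
# This breaks the decimal "38.5" into two separate tokens.
word_tokens = re.findall(r"\w+", text)

# Strategy 3: try a decimal number first, then a word,
# then any single punctuation mark.
pattern = r"\d+\.\d+|\w+|[^\w\s]"
mixed_tokens = re.findall(pattern, text)

print(whitespace_tokens)
print(word_tokens)
print(mixed_tokens)
```

Even the third strategy cannot tell that the period after "Pt" marks an abbreviation while the period after "pain" ends a sentence. That is the kind of ambiguity that makes tokenization hard.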
Every word counts in unstructured data.
- judging the presence of a particular feature is very difficult.
-
Word extraction alone is not adequate.
-
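One way to see why pulling out words is not enough: a keyword match finds "fever" in all three made-up sentences below, although two of them say the patient does not have it. The negation cue list and the fixed three-word window are crude assumptions for this sketch, not an established algorithm.

```python
import re

# Hypothetical and far from complete list of negation cues.
NEGATION_CUES = ("no", "denies", "without")

def mentions(term, sentence):
    # Plain word extraction: is the term present at all?
    return term in re.findall(r"[a-z]+", sentence.lower())

def asserted(term, sentence, window=3):
    # The term counts only if none of the `window` words before it
    # is a negation cue.
    words = re.findall(r"[a-z]+", sentence.lower())
    for i, word in enumerate(words):
        if word == term:
            preceding = words[max(0, i - window):i]
            if not any(cue in preceding for cue in NEGATION_CUES):
                return True
    return False

notes = [
    "Patient reports fever and cough.",
    "Patient denies fever.",
    "No evidence of fever.",
]

keyword_hits = [mentions("fever", n) for n in notes]
asserted_hits = [asserted("fever", n) for n in notes]
print(keyword_hits)   # every note mentions the word
print(asserted_hits)  # only the first note asserts it
```

The window rule is easy to fool (for example by a cue that appears after the term), so it illustrates the problem rather than solving it.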
Medical language often uses short, abbreviated words that do not follow normal English rules.
-
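A small sketch of what abbreviations do to a lookup-based approach. The abbreviation table below is a hypothetical, incomplete example; real clinical abbreviation lists are much larger and many entries have more than one sense.

```python
# Hypothetical abbreviation table: each entry maps to its possible senses.
ABBREVIATIONS = {
    "pt": ["patient"],
    "hx": ["history"],
    "sob": ["shortness of breath"],
    "ms": ["multiple sclerosis", "mitral stenosis", "morphine sulfate"],
}

def expand(token):
    # Return every known sense; an unknown token comes back unchanged.
    return ABBREVIATIONS.get(token.lower().rstrip("."), [token])

note = "Pt with hx of MS presents with SOB"
expanded = [expand(tok) for tok in note.split()]

# Tokens with more than one sense cannot be resolved by lookup alone.
ambiguous = [tok for tok in note.split() if len(expand(tok)) > 1]
print(expanded)
print(ambiguous)
```

A dictionary can list the candidate senses of "MS", but choosing between them needs the surrounding context.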
Data must be hand-annotated to train supervised classifiers.
-
Models can be evaluated quantitatively or qualitatively to determine performance.
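
On the quantitative side, a common choice is precision, recall, and F1 computed against hand-annotated labels. The sketch below assumes binary labels; the gold and predicted lists are made-up values for illustration, not results from any model.

```python
def precision_recall_f1(gold, predicted, positive=1):
    # Count true positives, false positives, and false negatives.
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Illustrative labels only: 1 = feature present, 0 = feature absent.
gold      = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(precision, recall, f1)
```

A qualitative check would instead read through the individual errors (here, the one missed case and the one false alarm) to see what kind of text the model gets wrong.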